This template is the metatemplate behind {{chset-ctrl}}, {{chset-ctrl3}}, {{chset-ctrl4}}, {{chset-cell}}, {{chset-cell3}}, and {{chset-cell4}}. The intention is to implement them using this template and thus make it easier to keep them in sync.

Usage


Used with Template:chset-tableformat to indicate a table cell.

  • First row:
    • Parameter char: the character in question. It may link to the relevant article or Wiktionary page. Only provide it for a non-control, non-whitespace printing character. Separate alternative characters with a slash; write a sequence of characters with no separator between them.
    • Parameter ctrl: XX, the name of a whitespace, control, format, separator or otherwise non-printing character (e.g., SP, LF, HT, NBSP, ZWNJ, PDO), linked to the appropriate article if one exists. Do not provide it at the same time as char. It simply applies template:sc2, so you can use that template directly if you need to combine a control with a normal character. Lower-case letters produce smaller text, which helps fit a longer string in.
    • Parameter fn: printed in normal (small) size after the letter. This is useful to add a reference or template:efn footnote to the glyph.
  • Second row:
    • Parameter unic: hhhh, the Unicode value in hexadecimal: 4 digits for most code points (those in the Basic Multilingual Plane) and 5 otherwise (e.g., 0020, 1D44A).
      • A little-used feature: if the char field is blank, the matching Unicode character is placed there. This only works when the value is a single hex number.
      • If there are multiple mappings, separate them with a slash (such as 0020/00A0). If the cell maps to a series of characters, separate the code points with a space.
      • Set to &nbsp; for a character without a Unicode mapping. Alternatively, if a Private Use Area mapping is in established, documented use for such a character (e.g. the Apple logo in Mac OS Roman), it may be given, but do not invent one.
      • Set to LEAD for a lead byte (rather than a character). L is not a hex digit so this is unambiguous (or use the hex code to indicate something about what lead byte this is, for example in UTF-8).
  • Subsequent rows:
    • Parameter deci: arbitrary text drawn in bold, for displaying input methods. This is most often a decimal number for the Windows Alt code input.
    • Parameter octl: a second line of arbitrary text drawn in bold. You probably should not use this unless the input method really uses a second form.
    • Parameter kuten: arbitrary text not in bold. For JIS (men)kuten, GB quwei, KS hangyol or equivalent code (English: (plane-)row-cell, or (plane-)section-position).
      • This is an important identifier for characters in CJK DBCSs such as JIS X 0208 (more so than e.g. deci, which is not usually used for a DBCS).
      • Format: (d(d)-)d(d)-d(d), i.e. two or three numbers of up to two digits each (e.g., 91-1, 2-2-1). Numbers 1 through 94 generally correspond to encoding bytes 0x21 through 0x7E or 0xA1 through 0xFE.
      • For a lead byte, put underscores in place of the subsequent numbers, for example 16-_.
      • For visual consistency, may be set to - for a byte which is not within the lead/trail byte range, but which is in the same line as those which are.
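The byte correspondence mentioned for kuten can be sketched in a few lines of Python. This is only an illustration of the arithmetic (number + 0x20 or number + 0xA0); the function name `kuten_to_bytes` and the `high` flag are hypothetical, and the template itself does not perform this conversion. A leading plane number is ignored here because how it is encoded differs between character sets, and lead-byte forms such as 16-_ are not handled.

```python
def kuten_to_bytes(kuten, high=False):
    """Convert a row-cell code such as "91-1" to its two encoding bytes.

    Each number in 1..94 maps to 0x20 + n (0x21..0x7E), or to 0xA0 + n
    (0xA1..0xFE) when `high` is true. An optional leading plane number
    is accepted but ignored (assumption: plane selection is handled
    elsewhere, e.g. by a single shift).
    """
    parts = [int(p) for p in kuten.split("-")]
    if len(parts) not in (2, 3):
        raise ValueError(f"expected 2 or 3 numbers, got {kuten!r}")
    row, cell = parts[-2:]
    for n in (row, cell):
        if not 1 <= n <= 94:
            raise ValueError(f"{n} is outside 1..94")
    offset = 0xA0 if high else 0x20
    return bytes([row + offset, cell + offset])


print(kuten_to_bytes("1-1").hex())              # 2121
print(kuten_to_bytes("91-1", high=True).hex())  # fba1
```

Which of the two byte ranges applies depends on the encoding being documented, so the choice is left to the caller.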

You should use the same entries for every cell in a table (or at least in a table row); otherwise they will not line up horizontally. Use &nbsp; if a field should be blank.
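The alignment rule above lends itself to a quick mechanical check. The following Python sketch assumes the cells of one table row have already been parsed into dictionaries of parameter names and values (the parsing is not shown, and the names `ROW_FIELDS`, `row_signature` and `mismatched_cells` are hypothetical). It reports the cells whose set of row-producing fields differs from the first cell.

```python
# Fields that each add a line below the character (assumption based on
# the parameter list above; char/ctrl/fn share the first line).
ROW_FIELDS = ("unic", "deci", "octl", "kuten")


def row_signature(cell):
    """Return the row-producing fields present in one cell, in order."""
    return tuple(name for name in ROW_FIELDS if name in cell)


def mismatched_cells(cells):
    """Return indexes of cells whose fields differ from the first cell."""
    if not cells:
        return []
    expected = row_signature(cells[0])
    return [i for i, cell in enumerate(cells)
            if row_signature(cell) != expected]


row = [
    {"unic": "3000", "ctrl": "IDSP", "deci": "33", "kuten": "1-1"},
    {"unic": "00A0", "ctrl": "NBSP", "deci": "160"},           # kuten missing
    {"unic": "26E3", "deci": "33", "kuten": "&nbsp;"},         # blank but present
]
print(mismatched_cells(row))  # [1]
```

A field set to &nbsp; counts as present, which matches the advice to blank a field rather than omit it.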

Examples


A few examples:

{| {{chset-tableformat}}
<!-- ctrl4 plus kuten -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=3000|ctrl=[[space character|IDSP]]|deci=33|octl=041|kuten=1-1}}
<!-- ctrl4 -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=00A0|ctrl=[[non-breaking space|NBSP]]|deci=160|octl=240}}
<!-- cell4 plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|char=[[⛣]]|deci=33|octl=041|kuten=91-1}}
<!-- cell4 plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|deci=33|octl=041|kuten=91-1}}
<!-- cell4 -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1|char=[[inverted exclamation mark|¡]]|deci=161|octl=241}}
<!-- cell4 -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1|deci=161|octl=241}}
<!-- ctrl3 plus kuten -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=3000|ctrl=[[space character|IDSP]]|deci=33|kuten=1-1}}
<!-- ctrl3 -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=00A0|ctrl=[[non-breaking space|NBSP]]|deci=160}}
<!-- cell3 plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|char=[[⛣]]|deci=33|kuten=91-1}}
<!-- cell3 plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|deci=33|kuten=91-1}}
<!-- cell3 -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1|char=[[inverted exclamation mark|¡]]|deci=161}}
<!-- cell3 -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1|deci=161}}
<!-- ctrl plus kuten -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=3000|ctrl=[[space character|IDSP]]|kuten=1-1}}
<!-- ctrl -->
|{{Character set color|misc}}|{{chset-cell-unified|unic=00A0|ctrl=[[non-breaking space|NBSP]]}}
<!-- cell plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|char=[[⛣]]|kuten=91-1}}
<!-- cell plus kuten -->
|{{Character set color|graph}}|{{chset-cell-unified|unic=26E3|kuten=91-1}}
<!-- cell -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1|char=[[inverted exclamation mark|¡]]|fn={{efn|A footnote next to character}}}}
<!-- cell -->
|{{Character set color|punct}}|{{chset-cell-unified|unic=00A1}}{{efn|A trailing footnote}}
|}
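Rendered result, one cell per line (each cell's rows listed top to bottom):

IDSP | 3000 | 33 | 041 | 1-1
NBSP | 00A0 | 160 | 240
⛣ | 26E3 | 33 | 041 | 91-1
⛣ | 26E3 | 33 | 041 | 91-1
¡ | 00A1 | 161 | 241
¡ | 00A1 | 161 | 241
IDSP | 3000 | 33 | 1-1
NBSP | 00A0 | 160
⛣ | 26E3 | 33 | 91-1
⛣ | 26E3 | 33 | 91-1
¡ | 00A1 | 161
¡ | 00A1 | 161
IDSP | 3000 | 1-1
NBSP | 00A0
⛣ | 26E3 | 91-1
⛣ | 26E3 | 91-1
¡[a] | 00A1
¡ | 00A1 | [b]
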
  1. ^ A footnote next to character
  2. ^ A trailing footnote
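Several of the example cells omit char and rely on unic to supply the glyph. The Python sketch below illustrates the rule described for that feature: the character is derived only when unic is a single plain hex number. It is not the template's actual implementation; the name `default_char` and the accepted length of 4 to 6 hex digits are assumptions.

```python
import re

# Assumption: a plain code point is 4 to 6 hex digits with nothing else.
HEX_CODEPOINT = re.compile(r"[0-9A-Fa-f]{4,6}")


def default_char(unic):
    """Return the character for a plain hex `unic` value, else None.

    Values such as "LEAD", "&nbsp;", "0020/00A0" (alternatives) or
    "0041 0301" (a sequence) are not single hex numbers, so no
    character is derived for them.
    """
    if not HEX_CODEPOINT.fullmatch(unic):
        return None
    codepoint = int(unic, 16)
    if codepoint > 0x10FFFF:  # beyond the Unicode code space
        return None
    return chr(codepoint)


print(default_char("00A1"))       # ¡
print(default_char("LEAD"))       # None
print(default_char("0020/00A0"))  # None
```

Because L is not a hex digit, LEAD falls through the same check as any other non-numeric value, which is why the documentation can call it unambiguous.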

Chset family of templates


See PETSCII and Computer Braille Code for examples of usage.


Character row header


Character cell colors


Note: if adjusting these colors, consult Template:Chset-table-header/family-test-sheet to see how well they work together, and whether the base, variant, boxed and legend colors are properly in sync.

Boxed and slightly shaded variants of these colors exist to indicate additional information, the meaning of which depends on the article: for example, a derivation from a base code page, a difference in how sources define the code page (to be explained in the article), or a difference between revisions of a code page.

For generating colors for cells by Unicode category, this script may be helpful.

Do not use the boxed variants where an unmarked cell is surrounded by four marked cells, because the borders of its neighbours would make the central cell appear marked as well. The shaded variants do not exhibit this problem.
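The boxed-variant pitfall above can be checked mechanically before choosing a style. The Python sketch below assumes the chart is available as a rectangular grid of booleans (True for a cell that is to be marked); the function name `falsely_boxed` is hypothetical. It lists unmarked cells whose four orthogonal neighbours are all marked.

```python
def falsely_boxed(marked):
    """Return (row, column) of unmarked cells enclosed by marked ones.

    `marked` is a rectangular list of lists of booleans. Cells on the
    outer edge have fewer than four neighbours and are skipped.
    """
    hits = []
    for r in range(1, len(marked) - 1):
        for c in range(1, len(marked[r]) - 1):
            if marked[r][c]:
                continue
            if (marked[r - 1][c] and marked[r + 1][c]
                    and marked[r][c - 1] and marked[r][c + 1]):
                hits.append((r, c))
    return hits


grid = [
    [False, True,  False],
    [True,  False, True],
    [False, True,  False],
]
print(falsely_boxed(grid))  # [(1, 1)]
```

If the result is non-empty, the shaded variants are the safer choice for that table.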

Character cell contents


Test table


The following colours should be in sync with one another and with the legend.

  Letter  Number  Punctuation  Symbol  Other  Lead byte  Undefined

JIS-Roman with alternative codings of single shifts and with tests for box and var colours
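    | _0   | _1   | _2   | _3   | _4   | _5   | _6   | _7   | _8   | _9   | _A   | _B   | _C   | _D   | _E   | _F   |

0_  | NUL  | SOH  | STX  | ETX  | EOT  | ENQ  | ACK  | BEL  | BS   | HT   | LF   | VT   | FF   | CR   | SO   | SI   |
0   | 0000 | 0001 | 0002 | 0003 | 0004 | 0005 | 0006 | 0007 | 0008 | 0009 | 000A | 000B | 000C | 000D |      |      |

1_  | DLE  | DC1  | DC2  | DC3  | DC4  | NAK  | SYN  | ETB  | CAN  | EM   | SUB  | ESC  | SS2  | SS3  | RS   | US   |
16  | 0010 | 0011 | 0012 | 0013 | 0014 | 0015 | 0016 | 0017 | 0018 | 0019 | 001A | 001B |      |      | 001E | 001F |

2_  | SP   | !    | "    | #    | $    | %    | &    | '    | (    | )    | *    | +    | ,    | -    | .    | /    |
32  | 0020 | 0021 | 0022 | 0023 | 0024 | 0025 | 0026 | 0027 | 0028 | 0029 | 002A | 002B | 002C | 002D | 002E | 002F |

3_  | 0    | 1    | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | :    | ;    | <    | =    | >    | ?    |
48  | 0030 | 0031 | 0032 | 0033 | 0034 | 0035 | 0036 | 0037 | 0038 | 0039 | 003A | 003B | 003C | 003D | 003E | 003F |

4_  | @    | A    | B    | C    | D    | E    | F    | G    | H    | I    | J    | K    | L    | M    | N    | O    |
64  | 0040 | 0041 | 0042 | 0043 | 0044 | 0045 | 0046 | 0047 | 0048 | 0049 | 004A | 004B | 004C | 004D | 004E | 004F |

5_  | P    | Q    | R    | S    | T    | U    | V    | W    | X    | Y    | Z    | [    | ¥    | ]    | ^    | _    |
80  | 0050 | 0051 | 0052 | 0053 | 0054 | 0055 | 0056 | 0057 | 0058 | 0059 | 005A | 005B | 00A5 | 005D | 005E | 005F |

6_  | `    | a    | b    | c    | d    | e    | f    | g    | h    | i    | j    | k    | l    | m    | n    | o    |
96  | 0060 | 0061 | 0062 | 0063 | 0064 | 0065 | 0066 | 0067 | 0068 | 0069 | 006A | 006B | 006C | 006D | 006E | 006F |

7_  | p    | q    | r    | s    | t    | u    | v    | w    | x    | y    | z    | {    | |    | }    | ‾    | DEL  |
112 | 0070 | 0071 | 0072 | 0073 | 0074 | 0075 | 0076 | 0077 | 0078 | 0079 | 007A | 007B | 007C | 007D | 203E | 007F |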

  Letter  Number  Punctuation  Symbol  Other  Lead byte  Undefined