Teletext character set

This article covers technical details of the character encoding system defined by ETS 300 706, the ETSI standard for World System Teletext, which is used for the Viewdata and Teletext variants of Videotex in Europe.

Character sets

The following tables show various Teletext character sets. Each character is shown with a potential Unicode equivalent if available. Space and control characters are represented by the abbreviations for their names.

Control characters

Control characters set the foreground and background color (black, red, green, yellow, blue, magenta, cyan, white), flashing, character size (normal, double height, double width, double size), the current default character set, and other attributes.[1][2]

In formats where compatibility with ECMA-48 C0 control codes such as TAB and LF is not required, these control codes are sometimes mapped transparently to the Unicode C0 control code range (U+0000 through U+001F).[3] Among C1 control code sets, the ITU T.101 C1 control codes for "Serial" Data Syntax 2[4] are mostly a transposition of the Teletext spacing controls, except for the inclusion of CSI at 0x9B.
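
The two mappings described above amount to simple arithmetic on the byte value. The following Python sketch is illustrative only: the function names are hypothetical, and because the C1 transposition is only "mostly" exact, the sketch declines to map the one exception named above (0x9B, which holds CSI) instead of guessing at others.

```python
from typing import Optional


def to_unicode_c0(code: int) -> str:
    """Map a Teletext spacing attribute (0x00-0x1F) transparently to U+0000-U+001F."""
    if not 0x00 <= code <= 0x1F:
        raise ValueError(f"not a spacing attribute: {code:#04x}")
    return chr(code)


def to_c1_position(code: int) -> Optional[int]:
    """Transpose a spacing attribute into the C1 range (0x80-0x9F).

    Returns None for the position that lands on 0x9B, which the C1 set
    uses for CSI, so the transposition does not hold there.
    """
    if not 0x00 <= code <= 0x1F:
        raise ValueError(f"not a spacing attribute: {code:#04x}")
    c1 = code + 0x80
    return None if c1 == 0x9B else c1
```

For example, `to_c1_position(0x01)` gives `0x81`, while `to_c1_position(0x1B)` gives `None`.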

Teletext spacing attributes[2]
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x ABK ANR ANG ANY ANB ANM ANC ANW FSH STD EBX SBX NSZ DBH DBW DBS
1x MBK MSR MSG MSY MSB MSM MSC MSW CDY SPL[a] STL[b] ESC[c] BBD NBD HMS RMS
  a. ^ The ETS 300 706 name of this control code is "Contiguous Mosaic Graphics"; it switches mosaic characters to contiguous (connected) display.[2] Its other name, "Stop Lining", comes from other formats which, besides using it to switch mosaic characters to contiguous display, also use it to switch alphanumeric characters to non-underlined display.[4]
  b. ^ The ETS 300 706 name of this control code is "Separated Mosaic Graphics"; it switches mosaic characters to separated display.[2] Its other name, "Start Lining", comes from other formats which, besides using it to switch mosaic characters to separated display, also use it to switch alphanumeric characters to underlined display.[4]
  c. ^ ETS 300 706 also gives ESC the alternative name "Switch". In certain contexts it toggles between two G0 sets previously designated by dedicated packets.[2]
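
The table above can be read as a 32-entry lookup from byte value to mnemonic. A minimal Python sketch of that lookup follows; the names `MNEMONICS`, `mnemonic` and `annotate` are hypothetical, and the mnemonics are copied from the table, not from any library.

```python
# Mnemonics for codes 0x00-0x0F and 0x10-0x1F, in table order.
ROW_0 = "ABK ANR ANG ANY ANB ANM ANC ANW FSH STD EBX SBX NSZ DBH DBW DBS".split()
ROW_1 = "MBK MSR MSG MSY MSB MSM MSC MSW CDY SPL STL ESC BBD NBD HMS RMS".split()
MNEMONICS = ROW_0 + ROW_1


def mnemonic(code: int) -> str:
    """Return the mnemonic of a spacing attribute (0x00-0x1F)."""
    if not 0x00 <= code <= 0x1F:
        raise ValueError(f"not a spacing attribute: {code:#04x}")
    return MNEMONICS[code]


def annotate(data: bytes) -> list:
    """Replace spacing attributes with their mnemonics; pass other bytes through as characters."""
    return [mnemonic(b) if b < 0x20 else chr(b) for b in data]
```

For example, `annotate(b"\x01Hi")` yields `["ANR", "H", "i"]`.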

Latin

Teletext (Latin G0)[5][6]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP  ! " # ¤ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~
  National option subset (see table below)


Latin G0 national option subsets[7]
23 24 40 5B 5C 5D 5E 5F 60 7B 7C 7D 7E
Primary set # ¤ @ [ \ ] ^ _ ` { | } ~
Czech/Slovak # ů č ť ž ý í ř é á ě ú š
English £ $ @ ½ #/ ¼ ¾ ÷
Estonian # õ Š Ä Ö Ž Ü Õ š ä ö ž ü
French é ï à ë ê ù î # è â ô û ç
German # $ § Ä Ö Ü ^ _ ° ä ö ü ß
Italian £ $ é ° ç # ù à ò è ì
Latvian/Lithuanian # $ Š ė ę Ž č ū š ą ų ž į
Polish # ń ą Ƶ Ś Ł ć ó ę ż ś ł ź
Portuguese/Spanish ç $ ¡ á é í ó ú ¿ ü ñ è à
Romanian # ¤ Ţ/Ț Â Ş/Ș Ă Î ı ţ/ț â ş/ș ă î
Serbian/Croatian/Slovenian # Ë Č Ć Ž Đ Š ë č ć ž đ š
Swedish/Finnish/Hungarian # ¤ É Ä Ö Å Ü _ é ä ö å ü
Turkish ğ İ Ş Ö Ç Ü Ğ ı ş ö ç ü
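
A national option subset replaces the 13 code positions listed in the header row while leaving the rest of the Latin G0 set unchanged. The sketch below illustrates this in Python under stated assumptions: the names `NATIONAL_POSITIONS`, `SUBSETS` and `decode_g0` are hypothetical, only three rows that appear complete in the table above are included, and bytes outside 0x20-0x7E are replaced with U+FFFD because control codes are out of scope here.

```python
# The 13 code positions that a national option subset may redefine.
NATIONAL_POSITIONS = (0x23, 0x24, 0x40, 0x5B, 0x5C, 0x5D, 0x5E,
                      0x5F, 0x60, 0x7B, 0x7C, 0x7D, 0x7E)

# One character per position, copied from the table rows above.
SUBSETS = {
    "primary": "#¤@[\\]^_`{|}~",
    "german": "#$§ÄÖÜ^_°äöüß",
    "french": "éïàëêùî#èâôûç",
}


def decode_g0(data: bytes, subset: str = "primary") -> str:
    """Decode Latin G0 bytes, applying the chosen national option subset."""
    overrides = dict(zip(NATIONAL_POSITIONS, SUBSETS[subset]))
    out = []
    for b in data:
        if 0x20 <= b <= 0x7E:
            out.append(overrides.get(b, chr(b)))
        else:
            out.append("\ufffd")  # not handled in this sketch
    return "".join(out)
```

With the German subset, `decode_g0(b"Gr\x7d\x7ee", "german")` decodes to `Grüße`.
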
Teletext (Latin G2)[8][9]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP  ¡ ¢ £ $ ¥ # § ¤ «
3x ° ± ² ³ × µ · ÷ » ¼ ½ ¾ ¿
4x NBSP ̀ ́ ̂ ̃ ̄ ̆ ̇ ̈ ̣ ̊ ̧ ̲ ̋ ̨ ̌
5x ¹ ® © α
6x Ω Æ Ð ª Ħ IJ Ŀ Ł Ø Œ º Þ Ŧ Ŋ ʼn
7x ĸ æ đ ð ħ ı ij ŀ ł ø œ ß þ ŧ ŋ
  Diacritical marks for use with G0 characters
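
When converting to Unicode, a diacritical mark from row 4x can be combined with a G0 base letter by appending the corresponding combining character and normalizing. The Python sketch below assumes the caller has already paired a mark code with a base letter (how that pairing is transmitted is not covered here); `G2_DIACRITICS` and `compose` are hypothetical names, and the code points are the Unicode equivalents shown in the table row above.

```python
import unicodedata

# Latin G2 codes 0x41-0x4F mapped to Unicode combining characters.
G2_DIACRITICS = {
    0x41: "\u0300",  # grave
    0x42: "\u0301",  # acute
    0x43: "\u0302",  # circumflex
    0x44: "\u0303",  # tilde
    0x45: "\u0304",  # macron
    0x46: "\u0306",  # breve
    0x47: "\u0307",  # dot above
    0x48: "\u0308",  # diaeresis
    0x49: "\u0323",  # dot below
    0x4A: "\u030A",  # ring above
    0x4B: "\u0327",  # cedilla
    0x4C: "\u0332",  # low line
    0x4D: "\u030B",  # double acute
    0x4E: "\u0328",  # ogonek
    0x4F: "\u030C",  # caron
}


def compose(mark_code: int, base: str) -> str:
    """Combine a G2 diacritical mark with a base letter, precomposed where Unicode allows."""
    return unicodedata.normalize("NFC", base + G2_DIACRITICS[mark_code])
```

For example, `compose(0x42, "e")` returns `é`. Pairs with no precomposed form stay as a base letter followed by a combining character.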

Greek

Teletext (Greek G0)[10]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP  ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; « = » ?
4x ΐ Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο
5x Π Ρ ʹ Σ Τ Υ Φ Χ Ψ Ω Ϊ Ϋ ά έ ή ί
6x ΰ α β γ δ ε ζ η θ ι κ λ μ ν ξ ο
7x π ρ ς σ τ υ φ χ ψ ω ϊ ϋ ό ύ ώ
Teletext (Greek G2)[11]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP  a b £ e h i § : k
3x ° ± ² ³ × m n p ÷ t ¼ ½ ¾ x
4x ̀ ́ ̂ ̃ ̄ ̆ ̇ ̈ ̣ ̊ ̧ ̲ ̋ ̨ ̌
5x ? ¹ ® © ɑ Ί Ύ Ώ
6x C D F G J L Q R S U V W Y Z Ά Ή
7x c d f g j l q r s u v w y z Έ
  Diacritical marks for use with G0 characters

Cyrillic

Teletext (Cyrillic G0, Russian/Bulgarian)[12]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP  ! " # $ % ы ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x Ю А Б Ц Д Е Ф Г Х И Ѝ К Л М Н О
5x П Я Р С Т У Ж В Ь Ъ З Ш Э Щ Ч Ы
6x ю а б ц д е ф г х и ѝ к л м н о
7x п я р с т у ж в ь ъ з ш э щ ч
Teletext (Cyrillic G0, Serbian/Croatian)[13]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP  ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x Ч А Б Ц Д Е Ф Г Х И Ј К Л М Н О
5x П Ќ Р С Т У В Ѓ Љ Њ З Ћ Ж Ђ Ш Џ
6x ч а б ц д е ф г х и ј к л м н о
7x п ќ р с т у в ѓ љ њ з ћ ж ђ ш
Teletext (Cyrillic G0, Ukrainian)[14]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP  ! " # $ % ї ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x Ю А Б Ц Д Е Ф Г Х И Ѝ К Л М Н О
5x П Я Р С Т У Ж В Ь І З Ш Є Щ Ч Ї
6x ю а б ц д е ф г х и ѝ к л м н о
7x п я р с т у ж в ь і з ш є щ ч
Teletext (Cyrillic G2)[15]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP  ¡ ¢ £ $ ¥ § «
3x ° ± ² ³ × µ · ÷ » ¼ ½ ¾ ¿
4x ̀ ́ ̂ ̃ ̄ ̆ ̇ ̈ ̣ ̊ ̧ ̲ ̋ ̨ ̌
5x ¹ ® © α Ł ł β
6x D E F G I J K L N Q R S U V W Z
7x d e f g i j k l n q r s u v w z
  Diacritical marks for use with G0 characters

Arabic

Note that each Arabic contextual/positional character in the tables below is shown with the non-positional Unicode equivalent if available.

Teletext (Arabic G0)[16]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP  ! " £ $ % ) ( * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ؛ > = < ؟
4x
5x ﺿ #
6x ـ
7x
Teletext (Arabic G2)[17]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP 
3x ٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩
4x à A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z ë ê ù î
6x é a b c d e f g h i j k l m n o
7x p q r s t u v w x y z â ô û ç


Hebrew

Teletext (Hebrew G0)[18]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP  ! " £ $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z ½ #
6x א ב ג ד ה ו ז ח ט י ך כ ל ם מ ן
7x נ ס ע ף פ ץ צ ק ר ש ת ¾ ÷
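
In the table above, the 27 Hebrew letters at 0x60 through 0x7A appear in the same order as the Unicode Hebrew letters U+05D0 through U+05EA, so that part of the set can be converted with a fixed offset. The Python sketch below covers only that range; `hebrew_g0_letter` is a hypothetical name, and the remaining positions are deliberately left out because they do not follow the same pattern.

```python
def hebrew_g0_letter(code: int) -> str:
    """Convert a Hebrew G0 letter code (0x60-0x7A) to its Unicode character."""
    if not 0x60 <= code <= 0x7A:
        raise ValueError(f"not a Hebrew letter position: {code:#04x}")
    return chr(0x05D0 + (code - 0x60))
```

For example, `hebrew_g0_letter(0x60)` returns alef (U+05D0) and `hebrew_g0_letter(0x7A)` returns tav (U+05EA).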

Graphics character sets

G1 block mosaics

Teletext (G1)[19][20]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x  SP  🬀 🬁 🬂 🬃 🬄 🬅 🬆 🬇 🬈 🬉 🬊 🬋 🬌 🬍 🬎
3x 🬏 🬐 🬑 🬒 🬓 🬔 🬕 🬖 🬗 🬘 🬙 🬚 🬛 🬜 🬝
4x
5x
6x 🬞 🬟 🬠 🬡 🬢 🬣 🬤 🬥 🬦 🬧 🬨 🬩 🬪 🬫 🬬
7x 🬭 🬮 🬯 🬰 🬱 🬲 🬳 🬴 🬵 🬶 🬷 🬸 🬹 🬺 🬻
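
The ordering of the table above suggests a direct computation of the Unicode equivalent: bits 0 to 4 and bit 6 of the code select the six cells of the mosaic, and the Unicode block sextant characters (U+1FB00 through U+1FB3B) follow the same binary order. The Python sketch below is an illustration under assumptions, not a normative mapping: `g1_to_unicode` is a hypothetical name, and the empty, left-half, right-half and full patterns are mapped to a space and to the older block element characters U+258C, U+2590 and U+2588, which the sextant range omits.

```python
def g1_to_unicode(code: int) -> str:
    """Convert a G1 block mosaic code (0x20-0x3F or 0x60-0x7F) to a Unicode character."""
    if not (0x20 <= code <= 0x3F or 0x60 <= code <= 0x7F):
        raise ValueError(f"not a G1 mosaic position: {code:#04x}")
    # Pack the six cell bits (bits 0-4 and bit 6) into a value from 0 to 63.
    pattern = (code & 0x1F) | ((code & 0x40) >> 1)
    # Patterns that exist elsewhere in Unicode and are skipped by the sextant range.
    special = {0: " ", 21: "\u258C", 42: "\u2590", 63: "\u2588"}
    if pattern in special:
        return special[pattern]
    # Shift down by one for the missing empty pattern, and once more for each
    # half-block pattern that has already been passed.
    return chr(0x1FB00 + pattern - 1 - (pattern > 21) - (pattern > 42))
```

For example, `g1_to_unicode(0x21)` returns U+1FB00 and `g1_to_unicode(0x60)` returns U+1FB1E, matching the first cells of rows 2x and 6x in the table.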

G3 smooth mosaics and line drawing

Teletext (G3)[21][22]
0 1 2 3 4 5 6 7 8 9 A B C D E F
2x 🬼 🬽 🬾 🬿 🭀 🭁 🭂 🭃 🭄 🭅 🭆 🭨 🭩 🭰
3x 🭇 🭈 🭉 🭊 🭋 🭌 🭍 🭎 🭏 🭐 🭑 🭪 🭫 🭵
4x 🮤 🮥 🮦 🮧 🮠 🮡 🮢 🮣
5x NBSP
6x 🭒 🭓 🭔 🭕 🭖 🭗 🭘 🭙 🭚 🭛 🭜 🭬 🭭
7x 🭝 🭞 🭟 🭠 🭡 🭢 🭣 🭤 🭥 🭦 🭧 🭮 🭯

References

  1. ^ "4. Teletext and Minitel" (PDF), L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS, 2019-01-04, p. 2
  2. ^ a b c d e "12.2 Spacing attributes" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, pp. 76–80, retrieved 4 April 2020
  3. ^ Ewell, Doug (2020-10-16). "Teletext separated mosaic graphics". Unicode Mailing List Archive. Unicode Consortium.
  4. ^ a b c British Standards Institution (1982-06-01). Attribute Control Set for UK Videotex (PDF). ITSCJ/IPSJ. ISO-IR-56.
  5. ^ "15.6.1, 15.6.2" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, pp. 114–115, retrieved 4 April 2020
  6. ^ "TELTXTG0.TXT" (PDF), L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS, 2019-01-04
  7. ^ "15.6.2 Latin National Option Sub-Sets" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 115, retrieved 4 April 2020
  8. ^ "15.6.3 Latin G2 Set" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 116, retrieved 4 April 2020
  9. ^ "TELTXTG2.TXT" (PDF), L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS, 2019-01-04
  10. ^ "15.6.8 Greek G0 Set" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 121, retrieved 4 April 2020
  11. ^ "15.6.9 Greek G2 Set" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 122, retrieved 4 April 2020
  12. ^ "15.6.5 Cyrillic G0 Set - Option 2 - Russian/Bulgarian" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 118, retrieved 4 April 2020
  13. ^ "15.6.4 Cyrillic G0 Set - Option 1 - Serbian/Croatian" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 117, retrieved 4 April 2020
  14. ^ "15.6.6 Cyrillic G0 Set - Option 3 - Ukrainian" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 119, retrieved 4 April 2020
  15. ^ "15.6.7 Cyrillic G2 Set" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 120, retrieved 4 April 2020
  16. ^ "15.6.10 Arabic G0 Set" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 123, retrieved 4 April 2020
  17. ^ "15.6.11 Arabic G2 Set" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 124, retrieved 4 April 2020
  18. ^ "15.6.12 Hebrew G0 Set" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 125, retrieved 4 April 2020
  19. ^ "15.7.1 G1 Block Mosaics Set" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 126, retrieved 4 April 2020
  20. ^ "TELTXTG1.TXT" (PDF), L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS, 2019-01-04
  21. ^ "15.7.2 G3 Smooth Mosaics and Line Drawing Set" (PDF), Enhanced Teletext specification, European Telecommunications Standards Institute (ETSI), May 1997, p. 127, retrieved 4 April 2020
  22. ^ "TELTXTG3.TXT" (PDF), L2/19-025: Proposal to add characters from legacy computers and teletext to the UCS, 2019-01-04