User:Drmccreedy/roadmap multilingual

Roadmap images: BMP, SIP, SMP, SSP, TIP

I use a Perl script to generate Unicode character roadmap images for the BMP, SIP, SMP, SSP and TIP planes.

These images are multilingual, currently supporting English, Belarusian, Chinese using simplified characters, Chinese using traditional characters, Czech, Dutch, French, German, Hungarian, Korean, Persian, Russian, Turkish and Ukrainian.
(You can see them all on my test page.)

I'm happy to add languages if provided with the necessary translations.

What to translate

This is the text to be translated if you want to add a new language to the roadmap images:

key | text to translate
Africa | African scripts
Americas | American scripts
AsiaEast | East Asian scripts
AsiaSC | South and Central Asian scripts
AsiaSE | Southeast Asian scripts
asOfVersion | As of Unicode 15.1
cuneiform | Cuneiform
Europe | Non-Latin European scripts
Han | CJK characters
hieroglyphs | Hieroglyphs
IndOcean | Indonesian and Oceanic scripts
Latin | Latin script
ME | Middle Eastern and Southwest Asian scripts
misc | Miscellaneous characters
notation | Notational systems
private | Private use
surrogates | UTF-16 surrogates
symbols | Symbols
tags | Tags
unallocated | Unallocated code points
variation | Variation Selectors
xDescBMP | A graphical representation of Unicode's Basic Multilingual Plane (BMP).
xDescSIP | A graphical representation of Unicode's Supplementary Ideographic Plane (SIP).
xDescSMP | A graphical representation of Unicode's Supplementary Multilingual Plane (SMP).
xDescSSP | A graphical representation of Unicode's Supplementary Special-purpose Plane (SSP).
xDescTIP | A graphical representation of Unicode's Tertiary Ideographic Plane (TIP).
xEachBox | Each numbered box represents 256 code points.
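
For translators unsure what the xDesc* and xEachBox strings refer to, the arithmetic behind "each numbered box represents 256 code points" can be sketched as follows. This is a Python illustration, not the Perl script that generates the images. The function name `locate` and the `PLANE_NAMES` table are hypothetical, and the way boxes are actually labeled in the images is not stated on this page.

```python
# Plane numbers for the five planes that have a roadmap image on this page.
PLANE_NAMES = {0: "BMP", 1: "SMP", 2: "SIP", 3: "TIP", 14: "SSP"}

def locate(code_point):
    """Return (plane abbreviation, box index 0-255) for a code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    plane = code_point >> 16          # each plane holds 65,536 code points
    box = (code_point >> 8) & 0xFF    # each box holds 256 code points
    return PLANE_NAMES.get(plane, f"plane {plane}"), box

plane, box = locate(0x1F600)
print(plane, f"{box:02X}")
```

Each plane therefore holds 256 boxes (65,536 / 256), which is why a single image can show a whole plane.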

Context

The legend groups in the images roughly follow the chapters of the Unicode Standard.
The Standard's table of contents is especially useful in understanding the meaning of the legend groups:

Chapters 7-8 cover Latin script and Non-Latin European scripts like Greek, Cyrillic, Armenian, and Georgian.

Chapters 9-10 cover Middle Eastern and Southwest Asian scripts including Hebrew, Arabic, Syriac, and Phoenician.

Chapter 11 covers Cuneiform and Hieroglyphs.

Chapters 12-15 cover South and Central Asian scripts including Tibetan, Mongolian, and the official scripts of India.

Chapter 16 covers Southeast Asian scripts including Thai, Lao, Burmese, and Khmer.

Chapter 17 covers Indonesian and Oceanic scripts like the Philippine scripts, Balinese, and Javanese.

Chapter 18 covers East Asian scripts like Han, Bopomofo, Hiragana, Katakana, Hangul, Yi, Tangut, and others.
CJK characters are a subset of East Asian scripts made up of Chinese characters or ideographs that are, or have been, used in China, Japan, Korea, and Vietnam.

Chapter 19 covers African scripts like Ethiopic, Osmanya, and Tifinagh.

Chapter 20 covers American scripts (that is, scripts used in North and South America) such as Cherokee, Canadian syllabics, Osage, and Deseret.

Chapter 21 covers Notational systems like Braille, musical symbols, Duployan shorthand, and Sutton SignWriting.

Chapter 22 covers Symbols like currency symbols, numerals, and math symbols.

Chapter 23 covers "Special Areas and Format Characters" which are split out here into four groups:

  • Private use covers the code points in the Private Use Areas. They are intentionally left undefined so that third parties may define their own characters without conflicting with Unicode Consortium assignments.
  • Tags are special-use tag characters that enable the spelling out of ASCII-based string tags (A-Z, a-z, 0-9, and a few other characters) using characters that can be strictly separated from ordinary text content characters in Unicode.
  • UTF-16 surrogates are code points that are used in pairs as surrogates (or stand-ins) in UTF-16 to represent Unicode characters above U+FFFF (ones that can't fit into a single 16-bit code unit).
  • Variation Selectors are format characters used to select a specific glyph variant for a Unicode character.
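
Two of these groups involve simple arithmetic that may help translators picture what the terms mean. The following Python sketch is illustrative only and is not part of the image-generating script. The function names `to_surrogate_pair` and `to_tags` are hypothetical. It shows how a code point above U+FFFF splits into a pair of surrogates, and how an ASCII string maps onto tag characters, which sit at U+E0000 plus the ASCII value.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000        # a 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low

def to_tags(text):
    """Spell out printable ASCII text using the corresponding tag characters."""
    if not all(0x20 <= ord(c) <= 0x7E for c in text):
        raise ValueError("tag characters mirror printable ASCII only")
    return "".join(chr(0xE0000 + ord(c)) for c in text)

high, low = to_surrogate_pair(0x1F600)
print(f"U+1F600 -> {high:04X} {low:04X}")
```

The surrogates themselves occupy U+D800 to U+DFFF in the BMP, and the tag characters live in the SSP, which is why both appear as separate legend groups.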

Unallocated code points are the unused areas: code points that have not yet been assigned.
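
As a rough illustration of how such groups can be told apart, Python's standard `unicodedata` module reports a general category for every code point. This is a sketch under assumptions, not the method used for the images. The `GROUPS` mapping and the function name `special_group` are hypothetical, the answer depends on the Unicode version bundled with the Python build (which may not be 15.1), and noncharacters such as U+FFFF also report `Cn`.

```python
import unicodedata

# General categories mapped to the legend keys used on this page.
GROUPS = {"Cn": "unallocated", "Co": "private", "Cs": "surrogates"}

def special_group(code_point):
    """Return a legend key for the code point, or None for ordinary assigned characters."""
    return GROUPS.get(unicodedata.category(chr(code_point)))

# The Unicode version this Python build knows about (an assumption to check
# before comparing results against the images).
print(unicodedata.unidata_version)
print(special_group(0xE000), special_group(0xD800), special_group(0x41))
```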

Miscellaneous characters is just a catch-all for any characters that don't fit elsewhere.

Lastly, As of Unicode 15.1 tells the viewer what version of the Unicode Standard was used to create the images. It's a way of knowing if the chart is current or out-of-date.


Languages to proofread

These languages need to be fully translated and proofread: Belarusian (be) and Korean (ko).

Belarusian

key | Belarusian | English
Africa | Пісьменства Афрыкі | African scripts
Americas | Пісьменства Амерыкі | American scripts
AsiaEast | Пісьменства Усходняй Азіі | East Asian scripts
AsiaSC | Пісьменства Паўднёвай і Цэнтральнай Азіі | South and Central Asian scripts
AsiaSE | Пісьменства Паўднёва-Усходняй Азіі | Southeast Asian scripts
asOfVersion | Па стане на версію Унікода 15.1 | As of Unicode 15.1
cuneiform | Клінапіс | Cuneiform
Europe | Нелацінскія еўрапейскія пісьменства | Non-Latin European scripts
Han | Ідэаграмы ККЯ | CJK characters
hieroglyphs | Іерогліфы | Hieroglyphs
IndOcean | Пісьменства Інданезіі і Акіяніі | Indonesian and Oceanic scripts
Latin | Лацінская пісьменнасць | Latin script
ME | Пісьменства Сярэдняга Усходу і Паўднёва-Заходняй Азіі | Middle Eastern and Southwest Asian scripts
misc | Розныя сімвалы | Miscellaneous characters
notation | Сістэмы нотапісу | Notational systems
private | Вобласць для прыватнага выкарыстання | Private use
surrogates | Сурагатныя пары UTF-16 | UTF-16 surrogates
symbols | Знакі | Symbols
tags | Тэгі | Tags
unallocated | Свабодныя кодавыя пазіцыі | Unallocated code points
variation | Варыянтныя селектары | Variation Selectors
xDescBMP | ? | A graphical representation of Unicode's Basic Multilingual Plane (BMP).
xDescSIP | ? | A graphical representation of Unicode's Supplementary Ideographic Plane (SIP).
xDescSMP | ? | A graphical representation of Unicode's Supplementary Multilingual Plane (SMP).
xDescSSP | ? | A graphical representation of Unicode's Supplementary Special-purpose Plane (SSP).
xDescTIP | ? | A graphical representation of Unicode's Tertiary Ideographic Plane (TIP).
xEachBox | ? | Each numbered box represents 256 code points.

Korean

key | Korean | English
Africa | 아프리카 문자 | African scripts
Americas | 북미 및 남미 문자 | American scripts
AsiaEast | 동아시아 문자 | East Asian scripts
AsiaSC | 남부와 중앙 아시아 문자 | South and Central Asian scripts
AsiaSE | 동남아시아 문자 | Southeast Asian scripts
asOfVersion | 유니 코드 버전 15.1 | As of Unicode 15.1
cuneiform | 쐐기 문자 | Cuneiform
Europe | 기타 유럽 문자 | Non-Latin European scripts
Han | CJK 문자 | CJK characters
hieroglyphs | 상형 문자 | Hieroglyphs
IndOcean | 인도네시아, 오세아니아 문자 | Indonesian and Oceanic scripts
Latin | 로마자, 로마자권 기호 | Latin script
ME | 중동·서남아시아 문자 | Middle Eastern and Southwest Asian scripts
misc | 기타 문자 | Miscellaneous characters
notation | ? | Notational systems
private | 사용자 정의 영역 | Private use
surrogates | UTF-16 상·하위 대체 영역 | UTF-16 surrogates
symbols | 기호 | Symbols
tags | ? | Tags
unallocated | 쓰이지 않음 | Unallocated code points
variation | ? | Variation Selectors
xDescBMP | ? | A graphical representation of Unicode's Basic Multilingual Plane (BMP).
xDescSIP | ? | A graphical representation of Unicode's Supplementary Ideographic Plane (SIP).
xDescSMP | ? | A graphical representation of Unicode's Supplementary Multilingual Plane (SMP).
xDescSSP | ? | A graphical representation of Unicode's Supplementary Special-purpose Plane (SSP).
xDescTIP | ? | A graphical representation of Unicode's Tertiary Ideographic Plane (TIP).
xEachBox | ? | Each numbered box represents 256 code points.
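
The "?" entries in the two tables above mark strings that still lack a translation. For anyone preparing a new language, a completeness check along these lines may be handy. It is a Python sketch, not part of the image-generating script. The function name `missing_keys` and the dictionary layout are hypothetical, and the key list is copied from the "What to translate" table.

```python
# Keys copied from the "What to translate" table on this page.
KEYS = [
    "Africa", "Americas", "AsiaEast", "AsiaSC", "AsiaSE", "asOfVersion",
    "cuneiform", "Europe", "Han", "hieroglyphs", "IndOcean", "Latin", "ME",
    "misc", "notation", "private", "surrogates", "symbols", "tags",
    "unallocated", "variation", "xDescBMP", "xDescSIP", "xDescSMP",
    "xDescSSP", "xDescTIP", "xEachBox",
]

def missing_keys(translations):
    """Return the keys that are absent, empty, or still the "?" placeholder."""
    return [k for k in KEYS if translations.get(k, "").strip() in ("", "?")]

# A few Korean rows from the table above; the dict is deliberately partial,
# so every key not listed here is reported as missing too.
korean = {"Africa": "아프리카 문자", "symbols": "기호", "tags": "?"}
print(missing_keys(korean))
```

A translation is ready to hand over once the function returns an empty list.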

In progress

Bulgarian

Bulgarian is currently being translated:

key | Bulgarian | English
Africa | Африканско писане | African scripts
Americas | Американско писане | American scripts
AsiaEast | Източноазиатска писменост | East Asian scripts
AsiaSC | Писане в Южна и Централна Азия | South and Central Asian scripts
AsiaSE | Писане в Югоизточна Азия | Southeast Asian scripts
Europe | Европейско писане, различно от латинското | Non-Latin European scripts
Han | Китайска, японска и корейска писменост | CJK characters
IndOcean | Индонезийска и океанска писменост | Indonesian and Oceanic scripts
Latin | Латинска писменост | Latin script
ME | Писане от Близкия изток и Югозападна Азия | Middle Eastern and Southwest Asian scripts
asOfVersion | От версия Unicode 15.1 | As of Unicode 15.1
cuneiform | Cuneiform | Cuneiform
hieroglyphs | Йероглифи | Hieroglyphs
misc | Различни герои | Miscellaneous characters
notation | Нотационни системи | Notational systems
private | Частна употреба | Private use
surrogates | заместители на UTF-16 | UTF-16 surrogates
symbols | Символи | Symbols
tags | Етикети | Tags
unallocated | Неразпределени кодови точки | Unallocated code points
variation | Селектори на вариации | Variation Selectors
xDescBMP | Графично представяне на основната многоезична равнина (BMP) на Unicode. | A graphical representation of Unicode's Basic Multilingual Plane (BMP).
xDescSIP | Графично представяне на допълнителната идеографска равнина (SIP) на Unicode. | A graphical representation of Unicode's Supplementary Ideographic Plane (SIP).
xDescSMP | Графично представяне на допълнителната многоезична равнина (SMP) на Unicode. | A graphical representation of Unicode's Supplementary Multilingual Plane (SMP).
xDescSSP | Графично представяне на допълнителната равнина със специално предназначение (SSP) на Unicode. | A graphical representation of Unicode's Supplementary Special-purpose Plane (SSP).
xDescTIP | Графично представяне на третостепенната идеографска равнина (TIP) на Unicode. | A graphical representation of Unicode's Tertiary Ideographic Plane (TIP).
xEachBox | Всяко номерирано поле представлява 256 кодови точки. | Each numbered box represents 256 code points.

Legend:

  • Red text = Machine translated by DeepL, pending human proofreading.
  • Green text = Human proofread, ready for adoption.

Croatian

Croatian is currently being translated:

key | Croatian | English
Africa | | African scripts
Americas | | American scripts
AsiaEast | | East Asian scripts
AsiaSC | | South and Central Asian scripts
AsiaSE | | Southeast Asian scripts
Europe | | Non-Latin European scripts
Han | | CJK characters
IndOcean | | Indonesian and Oceanic scripts
Latin | | Latin script
ME | | Middle Eastern and Southwest Asian scripts
asOfVersion | | As of Unicode 15.1
cuneiform | | Cuneiform
hieroglyphs | | Hieroglyphs
misc | | Miscellaneous characters
notation | | Notational systems
private | | Private use
surrogates | | UTF-16 surrogates
symbols | | Symbols
tags | | Tags
unallocated | | Unallocated code points
variation | | Variation Selectors
xDescBMP | | A graphical representation of Unicode's Basic Multilingual Plane (BMP).
xDescSIP | | A graphical representation of Unicode's Supplementary Ideographic Plane (SIP).
xDescSMP | | A graphical representation of Unicode's Supplementary Multilingual Plane (SMP).
xDescSSP | | A graphical representation of Unicode's Supplementary Special-purpose Plane (SSP).
xDescTIP | | A graphical representation of Unicode's Tertiary Ideographic Plane (TIP).
xEachBox | | Each numbered box represents 256 code points.

Japanese

Japanese is currently being translated:

key | Japanese | English
Africa | アフリカ文字 | African scripts
Americas | アメリカの脚本 | American scripts
AsiaEast | 東アジア文字 | East Asian scripts
AsiaSC | 南・中央アジア文字 | South and Central Asian scripts
AsiaSE | 東南アジア文字 | Southeast Asian scripts
Europe | 非ラテン系ヨーロッパ文字 | Non-Latin European scripts
Han | 日中韓文字 | CJK characters
IndOcean | インドネシア文字とオセアニア文字 | Indonesian and Oceanic scripts
Latin | ラテン文字 | Latin script
ME | 中東・西南アジア文字 | Middle Eastern and Southwest Asian scripts
asOfVersion | ユニコード15.1現在 | As of Unicode 15.1
cuneiform | 楔形文字 | Cuneiform
hieroglyphs | ヒエログリフ | Hieroglyphs
misc | その他のキャラクター | Miscellaneous characters
notation | 表記システム | Notational systems
private | 個人使用 | Private use
surrogates | UTF-16サロゲート | UTF-16 surrogates
symbols | シンボル | Symbols
tags | タグ | Tags
unallocated | 未割り当てのコードポイント | Unallocated code points
variation | バリエーション・セレクター | Variation Selectors
xDescBMP | Unicodeの基本多言語面(BMP)をグラフィカルに表現したもの。 | A graphical representation of Unicode's Basic Multilingual Plane (BMP).
xDescSIP | UnicodeのSIP(Supplementary Ideographic Plane)を図式化したもの。 | A graphical representation of Unicode's Supplementary Ideographic Plane (SIP).
xDescSMP | Unicodeの補足多言語面(SMP)を図式化したもの。 | A graphical representation of Unicode's Supplementary Multilingual Plane (SMP).
xDescSSP | UnicodeのSSP(Supplementary Special-purpose Plane)を図式化したもの。 | A graphical representation of Unicode's Supplementary Special-purpose Plane (SSP).
xDescTIP | UnicodeのTIP(Tertiary Ideographic Plane)を図式化したもの。 | A graphical representation of Unicode's Tertiary Ideographic Plane (TIP).
xEachBox | 各番号のボックスは256のコードポイントを表す。 | Each numbered box represents 256 code points.

Legend:

  • Red text = Machine translated by DeepL, pending human proofreading.
  • Green text = Human proofread, ready for adoption.

Portuguese

Portuguese is currently being translated:

key | Portuguese | English
Africa | Escrita africana | African scripts
Americas | Escrita americana | American scripts
AsiaEast | Escrita da Ásia Oriental | East Asian scripts
AsiaSC | Escrita da Ásia Central e do Sul | South and Central Asian scripts
AsiaSE | Escrita do Sudeste Asiático | Southeast Asian scripts
Europe | Escrita europeia não latina | Non-Latin European scripts
Han | Caracteres CJK | CJK characters
IndOcean | Escrita indonésia e oceânica | Indonesian and Oceanic scripts
Latin | Escrita latina | Latin script
ME | Escrita do Oriente Médio e do Sudoeste Asiático | Middle Eastern and Southwest Asian scripts
asOfVersion | A partir do Unicode 15.1 | As of Unicode 15.1
cuneiform | Cuneiforme | Cuneiform
hieroglyphs | Hieróglifos | Hieroglyphs
misc | Caracteres diversos | Miscellaneous characters
notation | Sistemas de notação | Notational systems
private | Uso privado | Private use
surrogates | Substitutos do UTF-16 | UTF-16 surrogates
symbols | Símbolos | Symbols
tags | Etiquetas | Tags
unallocated | Pontos de código não atribuídos | Unallocated code points
variation | Seletores de variação | Variation Selectors
xDescBMP | Uma representação gráfica do Plano Multilíngue Básico (BMP) do Unicode. | A graphical representation of Unicode's Basic Multilingual Plane (BMP).
xDescSIP | Uma representação gráfica do Plano Ideográfico Suplementar (SIP) do Unicode. | A graphical representation of Unicode's Supplementary Ideographic Plane (SIP).
xDescSMP | Uma representação gráfica do Plano Suplementar Multilíngue (SMP) do Unicode. | A graphical representation of Unicode's Supplementary Multilingual Plane (SMP).
xDescSSP | Uma representação gráfica do Plano Suplementar para Fins Especiais (SSP) do Unicode. | A graphical representation of Unicode's Supplementary Special-purpose Plane (SSP).
xDescTIP | Uma representação gráfica do Plano Ideográfico Terciário (TIP, na sigla em inglês) do Unicode. | A graphical representation of Unicode's Tertiary Ideographic Plane (TIP).
xEachBox | Cada caixa numerada representa 256 pontos de código. | Each numbered box represents 256 code points.

Legend:

  • Red text = Machine translated by DeepL, pending human proofreading.
  • Green text = Human proofread, ready for adoption.

Spanish

Spanish is currently being translated:

key | Spanish | English
Africa | Escrituras africanas | African scripts
Americas | Escrituras americanas | American scripts
AsiaEast | Escrituras de Asia Oriental | East Asian scripts
AsiaSC | Escrituras de Asia Meridional y Asia Central | South and Central Asian scripts
AsiaSE | Escrituras del Sudeste Asiático | Southeast Asian scripts
asOfVersion | A partir de Unicode 15.1 | As of Unicode 15.1
cuneiform | Cuneiforme | Cuneiform
Europe | Escrituras europeas no latinas | Non-Latin European scripts
Han | Caracteres CJK | CJK characters
hieroglyphs | Jeroglíficos | Hieroglyphs
IndOcean | Escrituras indonesias y oceánicas | Indonesian and Oceanic scripts
Latin | Escritura latina | Latin script
ME | Escrituras del Oriente Medio y del Asia sudoccidental | Middle Eastern and Southwest Asian scripts
misc | Caracteres varios | Miscellaneous characters
notation | Sistemas notacionales | Notational systems
private | Uso privado | Private use
surrogates | Sustitutos de UTF-16 | UTF-16 surrogates
symbols | Símbolos | Symbols
tags | Etiquetas | Tags
unallocated | Puntos de código no asignados | Unallocated code points
variation | Selectores de variación | Variation Selectors
xDescBMP | Representación gráfica del Plano Multilingüe Básico (PMB) de Unicode. | A graphical representation of Unicode's Basic Multilingual Plane (BMP).
xDescSIP | Una representación gráfica del Plano Ideográfico Suplementario (SIP) de Unicode. | A graphical representation of Unicode's Supplementary Ideographic Plane (SIP).
xDescSMP | Una representación gráfica del Plano Multilingüe Suplementario (SMP) de Unicode. | A graphical representation of Unicode's Supplementary Multilingual Plane (SMP).
xDescSSP | Una representación gráfica del Plano Suplementario de Propósito Especial (SSP) de Unicode. | A graphical representation of Unicode's Supplementary Special-purpose Plane (SSP).
xDescTIP | Una representación gráfica del Plano Ideográfico Terciario (TIP) de Unicode. | A graphical representation of Unicode's Tertiary Ideographic Plane (TIP).
xEachBox | Cada casilla numerada representa 256 puntos de código. | Each numbered box represents 256 code points.

Legend:

  • Red text = Machine translated by DeepL, pending human proofreading.
  • Green text = Human proofread, ready for adoption.