ⓘ Basic Latin (Unicode block)


The Basic Latin or C0 Controls and Basic Latin Unicode block is the first block of the Unicode Standard, and the only block whose characters are encoded in a single byte in UTF-8. The block contains all the characters and control codes of the ASCII encoding. It ranges from U+0000 to U+007F, contains 128 characters, and includes the C0 controls, ASCII punctuation and symbols, ASCII digits, the uppercase and lowercase letters of the English alphabet, and the Delete control character.
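The one-byte property can be checked directly. The following is a minimal sketch in Python that relies only on the built-in `str.encode`; the helper name `utf8_length` is introduced here for illustration and is not part of any standard.

```python
def utf8_length(code_point: int) -> int:
    """Return the number of bytes used to encode one code point in UTF-8."""
    return len(chr(code_point).encode("utf-8"))


# The Basic Latin block spans U+0000 through U+007F inclusive.
BASIC_LATIN = range(0x0000, 0x0080)

# Every code point in the block should occupy exactly one byte.
all_single_byte = all(utf8_length(cp) == 1 for cp in BASIC_LATIN)

# The first code point after the block (U+0080) needs two bytes.
bytes_after_block = utf8_length(0x0080)

print(len(BASIC_LATIN), all_single_byte, bytes_after_block)
```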

The Basic Latin block has been included in its present form since version 1.0.0 of the Unicode Standard, without addition or alteration of the character repertoire.


1. Table of characters

A. The character U+005C \ may be displayed as a yen sign (¥) or won sign (₩) by Japanese or Korean fonts that treat Unicode text, especially UTF-8, as a legacy character set in which the backslash was replaced by these signs.
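The substitution described in this note is a font and rendering matter; in Unicode the three characters are distinct code points. A small sketch using Python's standard `unicodedata` module illustrates the difference (the helper `describe` is a name chosen here for the example):

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, int]:
    """Return (code point label, Unicode name, UTF-8 byte length) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), len(ch.encode("utf-8")))


# Backslash, yen sign, and won sign, written as escapes to avoid font ambiguity.
for ch in ("\u005c", "\u00a5", "\u20a9"):
    print(describe(ch))
```

Only the backslash belongs to Basic Latin and fits in one UTF-8 byte; the yen and won signs live in other blocks and need two and three bytes respectively.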

2. Subheadings

The C0 Controls and Basic Latin block contains six subheadings.

C0 controls

The C0 controls, referred to as C0 ASCII control codes in version 1.0, are inherited from ASCII and other 7-bit and 8-bit encoding schemes. The alias names for the C0 controls are taken from the ISO/IEC 6429:1992 standard.
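As an illustration, the C0 controls can be picked out by their Unicode general category, `Cc` (control). The sketch below assumes the C0 range is U+0000 to U+001F, as in ASCII, and uses Python's standard `unicodedata` module; the variable names are chosen for this example only.

```python
import unicodedata

# Assumption: the C0 controls occupy U+0000..U+001F, as in ASCII.
C0_RANGE = range(0x00, 0x20)

# Collect the code points in that range whose general category is "Cc".
c0_controls = [cp for cp in C0_RANGE if unicodedata.category(chr(cp)) == "Cc"]

# Control characters carry no formal character name, so name() falls back
# to the default value supplied as its second argument.
unnamed = [cp for cp in c0_controls if unicodedata.name(chr(cp), None) is None]

print(len(c0_controls), len(unnamed))
```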

ASCII punctuation and symbols

This subheading refers to standard punctuation characters, simple mathematical operators, and symbols like the dollar sign, percent, ampersand, underscore, and pipe.

ASCII digits

The ASCII digits subheading contains the ten standard European digit characters, 0–9.

Uppercase Latin alphabet

The Uppercase Latin alphabet subheading contains the standard 26-letter unaccented Latin alphabet in the majuscule.

Lowercase Latin alphabet

The Lowercase Latin alphabet subheading contains the standard 26-letter unaccented Latin alphabet in the minuscule.

Control character

The Control character subheading contains the "Delete" character, U+007F.
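The six subheadings together partition the 128 code points of the block. The sketch below assigns each code point to a subheading; the exact ranges are not stated in the text above and are taken here from the ASCII layout, so they should be read as an assumption of this example. The names `SUBHEADINGS` and `subheading_of` are likewise illustrative.

```python
# Assumed ranges, following the ASCII layout (end points are exclusive).
SUBHEADINGS = {
    "C0 controls": [range(0x00, 0x20)],
    "ASCII punctuation and symbols": [
        range(0x20, 0x30),  # space through solidus
        range(0x3A, 0x41),  # colon through commercial at
        range(0x5B, 0x61),  # left square bracket through grave accent
        range(0x7B, 0x7F),  # left curly bracket through tilde
    ],
    "ASCII digits": [range(0x30, 0x3A)],
    "Uppercase Latin alphabet": [range(0x41, 0x5B)],
    "Lowercase Latin alphabet": [range(0x61, 0x7B)],
    "Control character": [range(0x7F, 0x80)],
}


def subheading_of(code_point: int) -> str:
    """Return the subheading that a Basic Latin code point falls under."""
    for name, ranges in SUBHEADINGS.items():
        if any(code_point in r for r in ranges):
            return name
    raise ValueError(f"U+{code_point:04X} is outside the Basic Latin block")


# Number of code points under each subheading.
counts = {name: sum(len(r) for r in ranges) for name, ranges in SUBHEADINGS.items()}

for name, count in counts.items():
    print(f"{name}: {count}")
```

Under these assumed ranges the counts add up to 128, so every code point in the block is covered exactly once.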


3. Variants

Several of the characters are defined to render as a standardized variant when followed by a variation selector.

A variant is defined for a zero with a short diagonal stroke: U+0030 DIGIT ZERO, U+FE00 VS1 0︀.
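In code, such a variation sequence is simply the base character followed by the selector. A minimal Python sketch (the constant names are chosen for this example); whether the stroke is actually drawn depends on the font and renderer:

```python
import unicodedata

DIGIT_ZERO = "\u0030"
VS1 = "\ufe00"  # VARIATION SELECTOR-1

# The variation sequence: base character immediately followed by the selector.
slashed_zero = DIGIT_ZERO + VS1

# The sequence is two code points long even though it displays as one glyph.
print(len(slashed_zero), [unicodedata.name(ch) for ch in slashed_zero])
```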

Twelve characters (#, *, and the ten digits) can be followed by U+FE0E VS15 or U+FE0F VS16 to create emoji variants. They are keycap base characters; for example, #️⃣ is the sequence U+0023 NUMBER SIGN, U+FE0F VS16, U+20E3 COMBINING ENCLOSING KEYCAP. The VS15 version is "text presentation" while the VS16 version is "emoji-style".
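The keycap sequences can be assembled from their code points. The following Python sketch builds the emoji-style and text-presentation forms for the twelve base characters; the function names `emoji_keycap` and `text_presentation` are hypothetical helpers for this example, and the final appearance depends on font support.

```python
VS15 = "\ufe0e"    # VARIATION SELECTOR-15, text presentation
VS16 = "\ufe0f"    # VARIATION SELECTOR-16, emoji presentation
KEYCAP = "\u20e3"  # COMBINING ENCLOSING KEYCAP

# The twelve keycap base characters: number sign, asterisk, and the digits.
KEYCAP_BASES = tuple("#*0123456789")


def emoji_keycap(base: str) -> str:
    """Return the emoji-style keycap sequence: base, VS16, enclosing keycap."""
    if base not in KEYCAP_BASES:
        raise ValueError(f"{base!r} is not a keycap base character")
    return base + VS16 + KEYCAP


def text_presentation(base: str) -> str:
    """Return the base character followed by VS15 (text presentation)."""
    if base not in KEYCAP_BASES:
        raise ValueError(f"{base!r} is not a keycap base character")
    return base + VS15


for base in KEYCAP_BASES:
    sequence = emoji_keycap(base)
    print(" ".join(f"U+{ord(ch):04X}" for ch in sequence), sequence)
```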
