Code page

A code page maps each character of text to a character in a character set (for FOCA fonts) or to the character associated with a Unicode point (for WorldType fonts). Two types of code pages exist:

  • A traditional code page contains the mapping information between a code point and a character ID. It can be used with FOCA character sets and TrueType and OpenType fonts.
  • An extended code page contains the mapping information for a code point, a character ID, and a Unicode point. It can be used with TrueType and OpenType fonts.

A character ID is an 8-byte character data string. A code point is an 8-bit binary number representing a character. Code points are usually shown as hexadecimal representations of their binary values.

Code Point Representations

System         Value
Binary         11000001
Decimal        193
Hexadecimal    C1
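
The three representations above describe the same code point. A minimal Python sketch of the conversion (the variable name code_point is only illustrative):

```python
# One code point, shown in the three number systems from the table.
code_point = 0xC1

binary_text = format(code_point, "08b")   # 8 binary digits, zero padded
decimal_text = str(code_point)            # ordinary decimal
hex_text = format(code_point, "02X")      # 2 uppercase hexadecimal digits

print("Binary     ", binary_text)
print("Decimal    ", decimal_text)
print("Hexadecimal", hex_text)
```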

When a code page is used with a FOCA font character set, each keyboard character is translated into a code point. When the text is printed, each code point is matched to a character ID on the code page you specified. The character ID is then matched to the image (raster pattern or outline pattern) of the character in the character set you specified. The image in the character set is the image that is printed in your text. A code page is valid for a particular character set only if all character IDs in the code page are included in that character set (Figure 11).
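
The two-step lookup and the validity rule can be sketched in Python as follows. This is only an illustration: real code pages and character sets are binary printer resources, and the dictionaries, image placeholders, and function names here are hypothetical. The two code point assignments are the ones this topic gives for code page T1V10037.

```python
# Hypothetical, heavily truncated stand-ins for printer resources.
# code point -> character ID (two entries taken from code page T1V10037)
CODE_PAGE = {0xC1: "LA020000", 0x51: "LE110000"}

# character ID -> character image (placeholder strings, not real patterns)
CHARACTER_SET = {
    "LA020000": "<image of A>",
    "LE110000": "<image of e acute>",
}


def is_valid_for(code_page, character_set):
    """A code page is valid only if every character ID it uses is in the set."""
    return all(char_id in character_set for char_id in code_page.values())


def image_for(code_point, code_page, character_set):
    """Code point -> character ID -> image, as described in the text."""
    char_id = code_page[code_point]
    return character_set[char_id]


print(is_valid_for(CODE_PAGE, CHARACTER_SET))
print(image_for(0xC1, CODE_PAGE, CHARACTER_SET))
```

A character set that lacks one of the character IDs makes the code page invalid for that set, which is why the check runs over every entry rather than only the code points used in a given job.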

Translation of a keyboard character into a printed character with a code page and FOCA font character set


When a code page is used with a TrueType and OpenType font, each code point is matched to the character ID on the code page you specified. The character ID is matched to a Unicode point in the graphic character global identifier to Unicode mapping (GUM) table on your printer. The Unicode point is then matched to the image in the TrueType and OpenType font you specified (Figure 12).
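
As a rough sketch, this path adds one table between the character ID and the image. The table contents and names below are hypothetical placeholders; the only facts assumed are that code point C1 maps to character ID LA020000 (from this topic) and that uppercase A is Unicode point U+0041.

```python
# Hypothetical one-entry tables; the real GUM table lives on the printer.
CODE_PAGE = {0xC1: "LA020000"}               # code point -> character ID
GUM_TABLE = {"LA020000": 0x0041}             # character ID -> Unicode point
FONT_GLYPHS = {0x0041: "<glyph for A>"}      # Unicode point -> font image


def glyph_via_gum(code_point):
    """Code point -> character ID -> Unicode point (GUM) -> glyph."""
    char_id = CODE_PAGE[code_point]
    unicode_point = GUM_TABLE[char_id]
    return FONT_GLYPHS[unicode_point]


print(glyph_via_gum(0xC1))
```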

Translation of a keyboard character into a printed character using a code page and a TrueType and OpenType font


When an extended code page is used with a TrueType and OpenType font, each code point is matched to the Unicode point on the extended code page you specified, without referring to the GUM table on your printer. The Unicode point is then matched to the image in the TrueType and OpenType font you specified (Figure 13).
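
Because an extended code page carries the Unicode point itself, the lookup can skip the GUM step. A hedged sketch, using a hypothetical ExtendedEntry record and placeholder glyph table rather than any real resource format:

```python
from typing import NamedTuple


class ExtendedEntry(NamedTuple):
    """Hypothetical record: what an extended code page stores per code point."""
    char_id: str
    unicode_point: int


# Hypothetical one-entry tables for illustration only.
EXTENDED_CODE_PAGE = {0xC1: ExtendedEntry("LA020000", 0x0041)}
FONT_GLYPHS = {0x0041: "<glyph for A>"}      # Unicode point -> font image


def glyph_via_extended(code_point):
    """Code point -> Unicode point (from the code page itself) -> glyph."""
    entry = EXTENDED_CODE_PAGE[code_point]
    return FONT_GLYPHS[entry.unicode_point]


print(glyph_via_extended(0xC1))
```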

Translation of a keyboard character into a printed character using an extended code page and a TrueType and OpenType font


The next figure shows an example of a code page. In the example, when the printer receives hexadecimal code point C1 for the code page T1V10037, it prints an uppercase A (character ID LA020000).

Code page T1V10037

This picture shows a partial grid for the T1V10037 Country Extended: United States, Canada code page.
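
One way to check the C1 example outside a printer is Python's built-in cp037 codec. This assumes that the codec's EBCDIC code page 037 agrees with T1V10037 for this code point; the codec maps bytes straight to Unicode and has no notion of character IDs.

```python
# Assumption: Python's "cp037" codec matches code page T1V10037 for this
# code point. The codec returns Unicode text, not a character ID.
code_point = 0xC1
character = bytes([code_point]).decode("cp037")

print(f"{code_point:02X} -> {character}")
```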

Code pages for different languages

Code pages accommodate various national languages by using characters and special symbols appropriate to the language. Different code pages can have identical character IDs assigned to different code points. For example, the character é (lowercase e accent acute, character ID LE110000) has these code point assignments in two different code pages:

  • Hexadecimal code point 51 in code page T1V10037 (Country Extended: United States, Canada)
  • Hexadecimal code point 5A in code page T1V10280 (Country Extended: Italy)
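
The same idea as a small Python sketch: each code page is a separate table, so one character ID can sit at different code points. The dictionaries hold only the two assignments listed above, and the helper name code_point_for is hypothetical.

```python
# Only the two assignments named in the text; real code pages hold many more.
CODE_PAGES = {
    "T1V10037": {0x51: "LE110000"},   # Country Extended: United States, Canada
    "T1V10280": {0x5A: "LE110000"},   # Country Extended: Italy
}


def code_point_for(char_id, code_page_name):
    """Reverse lookup: find the code point assigned to a character ID."""
    for code_point, candidate in CODE_PAGES[code_page_name].items():
        if candidate == char_id:
            return code_point
    raise KeyError(f"{char_id} is not in code page {code_page_name}")


for name in CODE_PAGES:
    print(name, format(code_point_for("LE110000", name), "02X"))
```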

Single- and double-byte code pages

A single-byte code page contains 256 or fewer 1-byte code points. Single-byte code pages are large enough for languages with alphabetic writing systems, such as English, Greek, and Arabic. A single-byte character set (SBCS) is used with a single-byte code page.

A double-byte code page can contain as many as 65,536 two-byte code points. Languages with non-alphabetic writing systems, such as Chinese, Japanese, and Korean, require double-byte code pages. A double-byte character set (DBCS) is used with a double-byte code page.
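
The two limits follow directly from the number of bits in a code point. A short sketch of the arithmetic (the function name is illustrative):

```python
def max_code_points(bytes_per_code_point):
    """Number of distinct values that fit in the given number of bytes."""
    return 2 ** (8 * bytes_per_code_point)


print(max_code_points(1))   # single-byte code page
print(max_code_points(2))   # double-byte code page
```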

DBCSs contain some single-byte characters, usually romaji (Western characters) and katakana. Single-byte code pages are used with these characters. Because the characters are either half width (see Font spacing characteristics) or proportionally spaced, these code pages are sometimes called half-width code pages.

Code page sections

If you think of a double-byte code page as a collection of single-byte code pages, a double-byte character code has two parts: the first byte indicates a section of the code page, and the second byte indicates a code point in the section.
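
Splitting a double-byte character code into its two parts is a matter of separating the high and low bytes. A minimal sketch; the function name and the sample value 0x4341 are arbitrary illustrations, not taken from a real code page:

```python
def split_double_byte(code):
    """Return (section, code point in section) for a two-byte character code."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("a double-byte code must fit in two bytes")
    # divmod by 256 yields the first (high) byte and the second (low) byte.
    section, point = divmod(code, 0x100)
    return section, point


section, point = split_double_byte(0x4341)
print(format(section, "02X"), format(point, "02X"))
```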

Raster coded fonts treat double-byte code pages this way: the coded font is divided into sections, each with its own single-byte code page. Each character in the section has a single-byte code point.

Outline coded fonts treat double-byte code pages as single, large code pages. Each character has a double-byte code point.
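
The difference between the two treatments can be pictured as a nested table versus a flat one. This is a conceptual sketch only: the table layouts, the sample code 0x4341, and the placeholder character ID are hypothetical and do not reflect the actual coded font formats.

```python
# Hypothetical placeholder; not a real character ID.
SAMPLE_ID = "<hypothetical character ID>"

# Raster view: one single-byte code page per section.
RASTER_SECTIONS = {0x43: {0x41: SAMPLE_ID}}

# Outline view: one large code page keyed by the full two-byte code point.
OUTLINE_CODE_PAGE = {0x4341: SAMPLE_ID}


def lookup_raster(code):
    """First byte selects the section, second byte the code point in it."""
    section, point = code >> 8, code & 0xFF
    return RASTER_SECTIONS[section][point]


def lookup_outline(code):
    """The whole two-byte value is used as a single code point."""
    return OUTLINE_CODE_PAGE[code]


print(lookup_raster(0x4341) == lookup_outline(0x4341))
```

Both lookups reach the same character; only the way the code page is organized differs.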