<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-us">
<meta name="GENERATOR" content="Microsoft FrontPage 4.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<link rel="stylesheet" href="http://www.unicode.org/unicode.css" type="text/css">
<title>Unicode Character Database</title>
<h1>UNICODE CHARACTER DATABASE<br>
<table border="1" cellspacing="2" cellpadding="0" height="87" width="100%">
<td valign="TOP" width="144">Revision</td>
<td valign="TOP">3.0.0</td>
<td valign="TOP" width="144">Authors</td>
<td valign="TOP">Mark Davis and Ken Whistler</td>
<td valign="TOP" width="144">Date</td>
<td valign="TOP">1999-09-11</td>
<td valign="TOP" width="144">This Version</td>
<td valign="TOP"><a href="ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html">ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html</a></td>
<td valign="TOP" width="144">Previous Version</td>
<td valign="TOP">n/a</td>
<td valign="TOP" width="144">Latest Version</td>
<td valign="TOP"><a href="ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html">ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html</a></td>
<p align="center">Copyright © 1995-1999 Unicode, Inc. All Rights reserved.</p>
<p>The Unicode Character Database is provided as is by Unicode, Inc. No claims
are made as to fitness for any particular purpose. No warranties of any kind are
expressed or implied. The recipient agrees to determine applicability of
information provided. If this file has been purchased on magnetic or optical
media from Unicode, Inc., the sole remedy for any claim will be exchange of
defective media within 90 days of receipt.</p>
<p>This disclaimer is applicable for all other data files accompanying the
Unicode Character Database, some of which have been compiled by the Unicode
Consortium, and some of which have been supplied by other sources.</p>
<h2>Limitations on Rights to Redistribute This Data</h2>
<p>Recipient is granted the right to make copies in any form for internal
distribution and to freely use the information supplied in the creation of
products supporting the Unicode<sup>TM</sup> Standard. The files in the Unicode
Character Database can be redistributed to third parties or other organizations
(whether for profit or not) as long as this notice and the disclaimer notice are
retained. Information can be extracted from these files and used in
documentation or programs, as long as there is an accompanying notice indicating
the source.</p>
<h2>Introduction</h2>
<p>The Unicode Character Database is a set of files that define the Unicode
character properties and internal mappings. For more information about character
properties and mappings, see <i><a href="http://www.unicode.org/unicode/uni2book/u2.html">The
Unicode Standard</a></i>.</p>
<p>The Unicode Character Database has been updated to reflect Version 3.0 of the
Unicode Standard, with many characters added to those published in Version 2.0.
A number of corrections have also been made to case mappings and to other errors
in the database noted since the publication of Version 2.0. Normative bidirectional
properties have also been modified to reflect decisions of the Unicode Technical
Committee.</p>
<p>For more information on versions of the Unicode Standard and how to reference
them, see <a href="http://www.unicode.org/unicode/standard/versions/">http://www.unicode.org/unicode/standard/versions/</a>.</p>
<p>Character properties may be either normative or informative. <i>Normative</i>
means that implementations that claim conformance to the Unicode Standard (at a
particular version) and that make use of a particular property or field must
follow the specifications of the standard for that property or field in order to
be conformant. The term <i>normative</i>, when applied to a property or field of
the Unicode Character Database, does <i>not</i> mean that the value of that
field will never change. Corrections and extensions to the standard in the
future may require minor changes to normative values, even though the Unicode
Technical Committee strives to minimize such changes. An <i>informative</i> property
or field is strongly recommended, but a conformant implementation is free to use
or change such values as it may require while still being conformant to the
standard. Particular implementations may choose to override the properties and
mappings that are not normative. In that case, it is up to the implementer to
establish a protocol to convey that information.</p>
<p>The following summarizes the files in the Unicode Character Database. For
more information about these files, see the referenced technical report or
section of the Unicode Standard, Version 3.0.</p>
<p><b>UnicodeData.txt (Chapter 4)</b>
<li>The main file in the Unicode Character Database.</li>
<li>For detailed information on the format, see <a href="UnicodeData.html">UnicodeData.html</a>.
That document also identifies which properties are normative and which are
informative.</li>
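<p>As a rough illustration of how a program might read this file, the following
Python sketch splits one record into named fields. The 15-field,
semicolon-separated layout, the field names, and the helper
parse_unicode_data_line are assumptions made for this example and are not
defined in this document; UnicodeData.html remains the authoritative description
of the format.</p>

```python
# Field names are hypothetical labels chosen for this sketch. The layout
# (15 semicolon-separated fields per line) is an assumption; consult
# UnicodeData.html for the authoritative field list.
FIELDS = (
    "code", "name", "general_category", "combining_class", "bidi_category",
    "decomposition", "decimal_digit", "digit", "numeric", "mirrored",
    "unicode_1_name", "comment", "uppercase", "lowercase", "titlecase",
)


def parse_unicode_data_line(line):
    """Split one UnicodeData.txt record into a dict keyed by FIELDS."""
    parts = line.rstrip("\n").split(";")
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    record = dict(zip(FIELDS, parts))
    # The code value is written in hexadecimal; store it as an integer.
    record["code"] = int(record["code"], 16)
    return record


# Sample record for U+0041; empty fields mean "no value".
SAMPLE = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"
record = parse_unicode_data_line(SAMPLE)
print(hex(record["code"]), record["name"], record["lowercase"])
```

<p>Empty strings are kept as-is so that a caller can tell an absent mapping from
a present one.</p>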
<p><b>PropList.txt (Chapter 4)</b>
<li>Additional informative properties list: <i>Alphabetic, Ideographic,</i>
and <i>Mathematical</i>, among others.</li>
<p><b>SpecialCasing.txt (Chapter 4)</b>
<li>List of informative special casing properties, including one-to-many
mappings such as SHARP S => "SS", and locale-specific mappings,
such as for Turkish <i>dotless i</i>.</li>
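<p>The two kinds of mapping mentioned above can be sketched in Python as
follows. The tables are a tiny hand-written excerpt, and the function to_upper
and its locale parameter are hypothetical names for this example; a real
implementation would load its entries from SpecialCasing.txt.</p>

```python
# Hand-written excerpt for illustration only; not read from SpecialCasing.txt.
UNCONDITIONAL_UPPER = {
    "\u00DF": "SS",  # LATIN SMALL LETTER SHARP S: one character becomes two
}
TURKISH_UPPER = {
    "i": "\u0130",   # i uppercases to LATIN CAPITAL LETTER I WITH DOT ABOVE
}


def to_upper(text, locale=""):
    """Uppercase text, applying special mappings before the simple ones."""
    out = []
    for ch in text:
        if locale == "tr" and ch in TURKISH_UPPER:
            out.append(TURKISH_UPPER[ch])        # locale-specific mapping
        elif ch in UNCONDITIONAL_UPPER:
            out.append(UNCONDITIONAL_UPPER[ch])  # one-to-many mapping
        else:
            out.append(ch.upper())               # ordinary simple mapping
    return "".join(out)


print(to_upper("stra\u00DFe"))
print(to_upper("istanbul", locale="tr"))
```

<p>Because one character can map to several, the result may be longer than the
input, so callers should not assume that case conversion preserves length.</p>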
<p><b>Blocks.txt (Chapter 14)</b>
<li>List of normative block names.</li>
<p><b>Jamo.txt (Chapter 4)</b>
<li>List of normative Jamo short names, used in deriving HANGUL SYLLABLE names
algorithmically.</li>
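<p>The derivation of HANGUL SYLLABLE names from Jamo short names can be sketched
in Python as below. The short-name lists are typed in by hand for this example
and follow the values in current versions of the standard, which is an
assumption; Jamo.txt for the version in use is the authoritative source. The
constants and the function name hangul_syllable_name are likewise choices made
for this sketch.</p>

```python
# Arithmetic constants for the precomposed Hangul syllable range.
S_BASE = 0xAC00
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT   # syllables per leading consonant (588)
S_COUNT = L_COUNT * N_COUNT   # total precomposed syllables (11172)

# Jamo short names, hand-copied for this sketch (assumed, see lead-in).
JAMO_L = ["G", "GG", "N", "D", "DD", "R", "M", "B", "BB", "S",
          "SS", "", "J", "JJ", "C", "K", "T", "P", "H"]
JAMO_V = ["A", "AE", "YA", "YAE", "EO", "E", "YEO", "YE", "O", "WA", "WAE",
          "OE", "YO", "U", "WEO", "WE", "WI", "YU", "EU", "YI", "I"]
JAMO_T = ["", "G", "GG", "GS", "N", "NJ", "NH", "D", "L", "LG", "LM", "LB",
          "LS", "LT", "LP", "LH", "M", "B", "BS", "S", "SS", "NG", "J", "C",
          "K", "T", "P", "H"]


def hangul_syllable_name(code_point):
    """Build the character name of a precomposed Hangul syllable."""
    s_index = code_point - S_BASE
    if not 0 <= s_index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    l_index = s_index // N_COUNT               # leading consonant
    v_index = (s_index % N_COUNT) // T_COUNT   # vowel
    t_index = s_index % T_COUNT                # trailing consonant, 0 = none
    return ("HANGUL SYLLABLE "
            + JAMO_L[l_index] + JAMO_V[v_index] + JAMO_T[t_index])


print(hangul_syllable_name(0xAC00))
print(hangul_syllable_name(0xD55C))
```

<p>Deriving the names this way avoids storing one name per syllable in the main
data file.</p>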
<p><b>ArabicShaping.txt (Section 8.2)</b>
<li>Basic Arabic and Syriac character shaping properties, such as initial,
medial and final shapes. These properties are normative for minimal shaping
of Arabic and Syriac.</li>
<p><b>NamesList.txt (Chapter 14)</b>
<li>This file duplicates some of the material in the UnicodeData file, and
adds informative annotations used in the character charts, as printed in the
Unicode Standard.</li>
<li><b>Note: </b>The information in NamesList.txt and Index.txt files matches
the appropriate version of the book. Changes in the Unicode Character
Database since then may not be reflected in these files, since they are
primarily of archival interest.</li>
<p><b>Index.txt (Chapter 14)</b>
<li>Informative index to Unicode characters, as printed in the Unicode
Standard.</li>
<li><b>Note: </b>The information in NamesList.txt and Index.txt files matches
the appropriate version of the book. Changes in the Unicode Character
Database since then may not be reflected in these files, since they are
primarily of archival interest.</li>
<p><b>CompositionExclusions.txt (<a href="http://www.unicode.org/unicode/reports/tr15/">UTR
#15: Unicode Normalization Forms</a>)</b>
<li>Normative properties for normalization.</li>
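<p>The effect of a composition exclusion can be observed with the Python
standard library, as in the sketch below. Python's unicodedata module embeds a
later version of the database than 3.0, and the helper survives_nfc is a
hypothetical name for this example. The check is only suggestive: singleton
decompositions also fail it, so a failure does not by itself prove that a
character is listed in CompositionExclusions.txt.</p>

```python
import unicodedata


def survives_nfc(ch):
    """Return True if decomposing and then recomposing ch gives ch back."""
    decomposed = unicodedata.normalize("NFD", ch)
    return unicodedata.normalize("NFC", decomposed) == ch


# U+00E9 LATIN SMALL LETTER E WITH ACUTE recomposes from e + combining acute.
print(survives_nfc("\u00E9"))

# U+0958 DEVANAGARI LETTER QA is excluded from composition, so it stays
# decomposed as U+0915 U+093C even in the composed form.
print(survives_nfc("\u0958"))
print([hex(ord(c)) for c in unicodedata.normalize("NFC", "\u0958")])
```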
<p><b>LineBreak.txt (<a href="http://www.unicode.org/unicode/reports/tr14/">UTR
#14: Line Breaking Properties</a>)</b>
<li>Normative and informative properties for line breaking. To see which
properties are informative and which are normative, consult UTR #14.</li>
<p><b>EastAsianWidth.txt (<a href="http://www.unicode.org/unicode/reports/tr11/">UTR
#11: East Asian Character Width</a>)</b>
<li>Informative properties for determining the choice of wide vs. narrow
glyphs in East Asian contexts.</li>
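<p>One possible use of these width properties is estimating how many columns a
string occupies on a fixed-width display. The Python sketch below reads the
property through the standard unicodedata module, which reflects a later version
of the database than 3.0. The function display_columns, its ambiguous_wide
parameter, and the rule of counting wide and fullwidth characters as two columns
are assumptions made for this example; combining marks and control characters
are not treated specially.</p>

```python
import unicodedata


def display_columns(text, ambiguous_wide=False):
    """Estimate the column count of text on a fixed-width display."""
    total = 0
    for ch in text:
        width = unicodedata.east_asian_width(ch)
        if width in ("W", "F"):
            total += 2   # wide or fullwidth
        elif width == "A" and ambiguous_wide:
            total += 2   # ambiguous, treated as wide in an East Asian context
        else:
            total += 1   # narrow, halfwidth, neutral, or ambiguous-as-narrow
    return total


print(display_columns("abc"))
print(display_columns("\u65E5\u672C"))
print(display_columns("\u03B1", ambiguous_wide=True))
```

<p>The ambiguous_wide switch reflects that the choice for some characters depends
on context, which is why the properties are informative rather than
normative.</p>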
<p><b>diffXvY.txt</b>
<li>Mechanically generated informative files containing accumulated
differences between successive versions of UnicodeData.txt.</li>