Commit | Line | Data |
c7b62a68 |
1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" |
2 | |
3 | "http://www.w3.org/TR/REC-html40/loose.dtd"> |
4 | |
5 | <html> |
6 | |
7 | |
8 | |
9 | <head> |
10 | |
11 | <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
12 | |
13 | <meta http-equiv="Content-Language" content="en-us"> |
14 | |
15 | <meta name="GENERATOR" content="Microsoft FrontPage 4.0"> |
16 | |
17 | <meta name="ProgId" content="FrontPage.Editor.Document"> |
18 | |
19 | <link rel="stylesheet" href="http://www.unicode.org/unicode.css" type="text/css"> |
20 | |
21 | <title>Unicode Character Database</title> |
22 | |
23 | </head> |
24 | |
25 | |
26 | |
27 | <body> |
28 | |
29 | |
30 | |
31 | <h1>UNICODE CHARACTER DATABASE<br> |
32 | Version 3.0.0</h1> |
33 | |
34 | <table border="1" cellspacing="2" cellpadding="0" height="87" width="100%"> |
35 | |
36 | <tr> |
37 | |
38 | <td valign="TOP" width="144">Revision</td> |
39 | |
40 | <td valign="TOP">3.0.0</td> |
41 | |
42 | </tr> |
43 | |
44 | <tr> |
45 | |
46 | <td valign="TOP" width="144">Authors</td> |
47 | |
48 | <td valign="TOP">Mark Davis and Ken Whistler</td> |
49 | |
50 | </tr> |
51 | |
52 | <tr> |
53 | |
54 | <td valign="TOP" width="144">Date</td> |
55 | |
56 | <td valign="TOP">1999-09-11</td> |
57 | |
58 | </tr> |
59 | |
60 | <tr> |
61 | |
62 | <td valign="TOP" width="144">This Version</td> |
63 | |
64 | <td valign="TOP"><a href="ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html">ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html</a></td> |
65 | |
66 | </tr> |
67 | |
68 | <tr> |
69 | |
70 | <td valign="TOP" width="144">Previous Version</td> |
71 | |
72 | <td valign="TOP">n/a</td> |
73 | |
74 | </tr> |
75 | |
76 | <tr> |
77 | |
78 | <td valign="TOP" width="144">Latest Version</td> |
79 | |
80 | <td valign="TOP"><a href="ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html">ftp://ftp.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html</a></td> |
81 | |
82 | </tr> |
83 | |
84 | </table> |
85 | |
86 | <p align="center">Copyright © 1995-1999 Unicode, Inc. All Rights reserved.</p> |
87 | |
88 | <h2>Disclaimer</h2> |
89 | |
90 | <p>The Unicode Character Database is provided as is by Unicode, Inc. No claims |
91 | |
92 | are made as to fitness for any particular purpose. No warranties of any kind are |
93 | |
94 | expressed or implied. The recipient agrees to determine applicability of |
95 | |
96 | information provided. If this file has been purchased on magnetic or optical |
97 | |
98 | media from Unicode, Inc., the sole remedy for any claim will be exchange of |
99 | |
100 | defective media within 90 days of receipt.</p> |
101 | |
102 | <p>This disclaimer is applicable for all other data files accompanying the |
103 | |
104 | Unicode Character Database, some of which have been compiled by the Unicode |
105 | |
106 | Consortium, and some of which have been supplied by other sources.</p> |
107 | |
108 | <h2>Limitations on Rights to Redistribute This Data</h2> |
109 | |
110 | <p>Recipient is granted the right to make copies in any form for internal |
111 | |
112 | distribution and to freely use the information supplied in the creation of |
113 | |
114 | products supporting the Unicode<sup>TM</sup> Standard. The files in the Unicode |
115 | |
116 | Character Database can be redistributed to third parties or other organizations |
117 | |
118 | (whether for profit or not) as long as this notice and the disclaimer notice are |
119 | |
120 | retained. Information can be extracted from these files and used in |
121 | |
122 | documentation or programs, as long as there is an accompanying notice indicating |
123 | |
124 | the source.</p> |
125 | |
126 | <h2>Introduction</h2> |
127 | |
128 | <p>The Unicode Character Database is a set of files that define the Unicode |
129 | |
130 | character properties and internal mappings. For more information about character |
131 | |
132 | properties and mappings, see <i><a href="http://www.unicode.org/unicode/uni2book/u2.html">The |
133 | |
134 | Unicode Standard</a></i>.</p> |
135 | |
136 | <p>The Unicode Character Database has been updated to reflect Version 3.0 of the |
137 | |
138 | Unicode Standard, with many characters added to those published in Version 2.0. |
139 | |
140 | Errors in case mappings and other errors in the database noted since the |
141 | |
142 | publication of Version 2.0 have also been corrected. Normative bidirectional |
143 | |
144 | properties have also been modified to reflect decisions of the Unicode Technical |
145 | |
146 | Committee.</p> |
147 | |
148 | <p>For more information on versions of the Unicode Standard and how to reference |
149 | |
150 | them, see <a href="http://www.unicode.org/unicode/standard/versions/">http://www.unicode.org/unicode/standard/versions/</a>.</p> |
151 | |
152 | <h2>Conformance</h2> |
153 | |
154 | <p>Character properties may be either normative or informative. <i>Normative</i> |
155 | |
156 | means that implementations that claim conformance to the Unicode Standard (at a |
157 | |
158 | particular version) and which make use of a particular property or field must |
159 | |
160 | follow the specifications of the standard for that property or field in order to |
161 | |
162 | be conformant. The term <i>normative</i>, when applied to a property or field of |
163 | |
164 | the Unicode Character Database, does <i>not</i> mean that the value of that |
165 | |
166 | field will never change. Corrections and extensions to the standard in the |
167 | |
168 | future may require minor changes to normative values, even though the Unicode |
169 | |
170 | Technical Committee strives to minimize such changes. An <i>informative</i> property |
171 | |
172 | or field is strongly recommended, but a conformant implementation is free to use |
173 | |
174 | or change such values as it may require while still being conformant to the |
175 | |
176 | standard. Particular implementations may choose to override the properties and |
177 | |
178 | mappings that are not normative. In that case, it is up to the implementer to |
179 | |
180 | establish a protocol to convey that information.</p> |
181 | |
182 | <h2>Files</h2> |
183 | |
184 | <p>The following summarizes the files in the Unicode Character Database. For |
185 | |
186 | more information about these files, see the referenced technical report or |
187 | |
188 | section of the Unicode Standard, Version 3.0.</p> |
189 | |
190 | <p><b>UnicodeData.txt (Chapter 4)</b> |
191 | |
192 | <ul> |
193 | |
194 | <li>The main file in the Unicode Character Database.</li> |
195 | |
196 | <li>For detailed information on the format, see <a href="UnicodeData.html">UnicodeData.html</a>. |
197 | |
198 | This file also characterizes which properties are normative and which are |
199 | |
200 | informative.</li> |
201 | |
202 | </ul> |
203 | |
204 | <p><b>PropList.txt (Chapter 4)</b> |
205 | |
206 | <ul> |
207 | |
208 | <li>Additional informative properties list: <i>Alphabetic, Ideographic,</i> |
209 | |
210 | and <i>Mathematical</i>, among others.</li> |
211 | |
212 | </ul> |
213 | |
214 | <p><b>SpecialCasing.txt (Chapter 4)</b> |
215 | |
216 | <ul> |
217 | |
218 | <li>List of informative special casing properties, including one-to-many |
219 | |
220 | mappings such as SHARP S => "SS", and locale-specific mappings, |
221 | |
222 | such as for Turkish <i>dotless i</i>.</li> |
223 | |
224 | </ul> |
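<p>The two kinds of special casing mentioned above can be illustrated with a minimal Python sketch. It does not read SpecialCasing.txt. It relies on Python's built-in <code>str.upper()</code>, which applies the one-to-many SHARP S mapping, and on a hypothetical helper, <code>upper_for_language</code>, that hard-codes the Turkish and Azerbaijani handling of <i>i</i> as an assumption for illustration only.</p>

```python
# Hypothetical helper, not part of the Unicode Character Database.
# It hard-codes one locale-specific rule instead of parsing SpecialCasing.txt.

DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE


def upper_for_language(text, lang=""):
    """Uppercase text, applying the Turkish/Azerbaijani rule for 'i' if asked."""
    if lang in ("tr", "az"):
        # In these languages, lowercase i uppercases to dotted capital I.
        # Dotless i (U+0131) already uppercases to plain I by default.
        text = text.replace("i", DOTTED_CAPITAL_I)
    # str.upper() applies one-to-many mappings such as SHARP S to "SS".
    return text.upper()


if __name__ == "__main__":
    print(upper_for_language("stra\u00dfe"))        # one-to-many mapping
    print(upper_for_language("istanbul", "tr"))     # locale-specific mapping
```

<p>A production implementation would take these mappings from SpecialCasing.txt itself rather than hard-coding them, since the file is the source of record for the conditions under which each mapping applies.</p>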
225 | |
226 | <p><b>Blocks.txt (Chapter 14)</b> |
227 | |
228 | <ul> |
229 | |
230 | <li>List of normative block names.</li> |
231 | |
232 | </ul> |
233 | |
234 | <p><b>Jamo.txt (Chapter 4)</b> |
235 | |
236 | <ul> |
237 | |
238 | <li>List of normative Jamo short names, used in deriving HANGUL SYLLABLE names |
239 | |
240 | algorithmically.</li> |
241 | |
242 | </ul> |
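<p>As a rough illustration of how Jamo short names feed into HANGUL SYLLABLE names, the following Python sketch decomposes a syllable code point arithmetically and concatenates three short names. The constants and the short-name tables are written out by hand here as assumptions; an implementation should read the short names from Jamo.txt and take the arithmetic from the Unicode Standard itself.</p>

```python
# Sketch only: the tables below are hand-copied assumptions, not parsed
# from Jamo.txt. Empty strings mark the silent initial and "no final".

S_BASE = 0xAC00                      # first precomposed Hangul syllable
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT          # syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT          # total precomposed syllables

JAMO_L = ["G", "GG", "N", "D", "DD", "R", "M", "B", "BB", "S",
          "SS", "", "J", "JJ", "C", "K", "T", "P", "H"]
JAMO_V = ["A", "AE", "YA", "YAE", "EO", "E", "YEO", "YE", "O", "WA",
          "WAE", "OE", "YO", "U", "WEO", "WE", "WI", "YU", "EU", "YI", "I"]
JAMO_T = ["", "G", "GG", "GS", "N", "NJ", "NH", "D", "L", "LG",
          "LM", "LB", "LS", "LT", "LP", "LH", "M", "B", "BS", "S",
          "SS", "NG", "J", "C", "K", "T", "P", "H"]


def hangul_syllable_name(code_point):
    """Derive the character name of a precomposed Hangul syllable."""
    s_index = code_point - S_BASE
    if not 0 <= s_index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    l_index = s_index // N_COUNT                 # leading consonant
    v_index = (s_index % N_COUNT) // T_COUNT     # vowel
    t_index = s_index % T_COUNT                  # trailing consonant, 0 = none
    return ("HANGUL SYLLABLE "
            + JAMO_L[l_index] + JAMO_V[v_index] + JAMO_T[t_index])


if __name__ == "__main__":
    print(hangul_syllable_name(0xD55C))
```

<p>Deriving the names this way means the roughly eleven thousand syllable names need not be listed individually in the database.</p>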
243 | |
244 | <p><b>ArabicShaping.txt (Section 8.2)</b> |
245 | |
246 | <ul> |
247 | |
248 | <li>Basic Arabic and Syriac character shaping properties, such as initial, |
249 | |
250 | medial and final shapes. These properties are normative for minimal shaping |
251 | |
252 | of Arabic and Syriac. </li> |
253 | |
254 | </ul> |
255 | |
256 | <p><b>NamesList.txt (Chapter 14)</b> |
257 | |
258 | <ul> |
259 | |
260 | <li>This file duplicates some of the material in the UnicodeData file, and |
261 | |
262 | adds informative annotations used in the character charts, as printed in the |
263 | |
264 | Unicode Standard. </li> |
265 | |
266 | <li><b>Note: </b>The information in NamesList.txt and Index.txt files matches |
267 | |
268 | the appropriate version of the book. Changes in the Unicode Character |
269 | |
270 | Database since then may not be reflected in these files, since they are |
271 | |
272 | primarily of archival interest.</li> |
273 | |
274 | </ul> |
275 | |
276 | <p><b>Index.txt (Chapter 14)</b> |
277 | |
278 | <ul> |
279 | |
280 | <li>Informative index to Unicode characters, as printed in the Unicode |
281 | |
282 | Standard.</li> |
283 | |
284 | <li><b>Note: </b>The information in NamesList.txt and Index.txt files matches |
285 | |
286 | the appropriate version of the book. Changes in the Unicode Character |
287 | |
288 | Database since then may not be reflected in these files, since they are |
289 | |
290 | primarily of archival interest.</li> |
291 | |
292 | </ul> |
293 | |
294 | <p><b>CompositionExclusions.txt (<a href="http://www.unicode.org/unicode/reports/tr15/">UTR#15 |
295 | |
296 | Unicode Normalization Forms</a>)</b> |
297 | |
298 | <ul> |
299 | |
300 | <li>Normative properties for normalization.</li> |
301 | |
302 | </ul> |
303 | |
304 | <p><b>LineBreak.txt (<a href="http://www.unicode.org/unicode/reports/tr14/">UTR |
305 | |
306 | #14: Line Breaking Properties</a>)</b> |
307 | |
308 | <ul> |
309 | |
310 | <li>Normative and informative properties for line breaking. To see which |
311 | |
312 | properties are informative and which are normative, consult UTR#14.</li> |
313 | |
314 | </ul> |
315 | |
316 | <p><b>EastAsianWidth.txt (<a href="http://www.unicode.org/unicode/reports/tr11/">UTR |
317 | |
318 | #11: East Asian Character Width</a>)</b> |
319 | |
320 | <ul> |
321 | |
322 | <li>Informative properties for determining the choice of wide vs. narrow |
323 | |
324 | glyphs in East Asian contexts.</li> |
325 | |
326 | </ul> |
327 | |
328 | <p><b>diffXvY.txt</b> |
329 | |
330 | <ul> |
331 | |
332 | <li>Mechanically-generated informative files containing accumulated |
333 | |
334 | differences between successive versions of UnicodeData.txt.</li> |
335 | |
336 | </ul> |
337 | |
338 | |
339 | |
340 | </body> |
341 | |
342 | |
343 | |
344 | </html> |
345 | |