=head1 NAME

perli18n - Perl i18n (internationalization)

=head1 DESCRIPTION

Perl supports language-specific notions of data such as
"is this a letter" and "which letter comes first". These
are important issues, especially for languages other
than English -- but also for English: it would be very
naive indeed to think that C<A-Za-z> defines all the letters.

Perl understands language-specific data via the standardized
(ISO C, XPG4, POSIX 1.c) method called "the locale system".
The locale system is controlled per application using several
environment variables.

=head1 USING LOCALES

If your operating system supports the locale system, the locale
support is installed, and you have set your locale environment
variables correctly (please see below) before running Perl, Perl will
understand your data correctly.

At run time you can switch locales using C<POSIX::setlocale()>.

 use POSIX qw(setlocale LC_CTYPE);

 # query and save the old locale.
 $old_locale = setlocale(LC_CTYPE);

 setlocale(LC_CTYPE, "fr_CA.ISO8859-1");
 # LC_CTYPE is now in locale "French, Canada, codeset ISO 8859-1"

 setlocale(LC_CTYPE, "");
 # LC_CTYPE is now in whatever locale LC_ALL / LC_CTYPE / LANG define.
 # see below for documentation about LC_ALL / LC_CTYPE / LANG.

 # restore the old locale
 setlocale(LC_CTYPE, $old_locale);
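
The save, switch, and restore sequence above can be wrapped so that the
old locale is always restored and a failed switch is noticed. This is
only a sketch: C<with_locale> is a hypothetical helper name, not part of
POSIX or of Perl, and it relies on C<POSIX::setlocale()> returning
C<undef> when the requested locale cannot be set.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helper: run $code with $category switched to $locale,
# then restore whatever was in effect before.
sub with_locale {
    my ($category, $locale, $code) = @_;

    my $old = setlocale($category);              # query only, changes nothing
    defined setlocale($category, $locale)
        or die "cannot set locale '$locale'\n";

    my @result = eval { $code->() };
    my $error  = $@;

    setlocale($category, $old) if defined $old;  # restore even after a die
    die $error if $error;
    return wantarray ? @result : $result[0];
}

# The "C" locale is the one locale every system is expected to provide.
my $name = with_locale(LC_CTYPE, "C", sub { setlocale(LC_CTYPE) });
print "inside the block LC_CTYPE was: $name\n";
```

Passing the work as a code reference keeps the switch and the restore
in one place, so a caller cannot forget the second half.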

The first argument of setlocale() is called the category and the
second argument the locale. The category tells in which area of data
processing language-specific rules are to be applied; the locale names
the language, country/territory, and codeset whose rules are used.
For further information
about the categories, please consult your L<setlocale(3)> manual. For
the locales available on your system, also consult the L<setlocale(3)>
manual and see whether it leads you to the list of the available
locales (search for the C<SEE ALSO> section). If that fails, try
the following commands at the command line:

=over 12

=item locale -a

=item nlsinfo

=item ls /usr/lib/nls/loc

=item ls /usr/lib/locale

=item ls /usr/lib/nls

=back

and see whether they list something resembling these

 en_US.ISO8859-1   de_DE.ISO8859-1   ru_RU.ISO8859-5
 en_US             de_DE             ru_RU
 english           german            russian
 english.iso88591  german.iso88591   russian.iso88595

Sadly, although the calling interface has been standardized,
the names of the locales have not.
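
Because the names differ between systems, a program can try several
spellings and keep the first one that is accepted. The following is a
sketch under assumptions: C<first_available_locale> is a hypothetical
helper name, and the candidate names are simply the example spellings
listed above, which may or may not exist on your system.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helper: try each candidate name in turn and return the
# first one that setlocale() accepts, or nothing if none is accepted.
# Side effect: the category is left set to the accepted locale.
sub first_available_locale {
    my ($category, @candidates) = @_;
    for my $name (@candidates) {
        return $name if defined setlocale($category, $name);
    }
    return;
}

# "C" is listed last as a fallback that should exist everywhere.
my @candidates = qw(de_DE.ISO8859-1 de_DE german.iso88591 german C);
my $chosen = first_available_locale(LC_CTYPE, @candidates);
print "using locale: $chosen\n";
```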

=head2 CHARACTER TYPES

Starting from Perl version 5.002, perl has obeyed the C<LC_CTYPE>
environment variable, which controls an application's notion of
which characters are alphabetic. In Perl this affects
the regular expression metanotation

 \w

which stands for alphanumeric characters, that is, alphabetic
and numeric characters. Depending on your locale settings,
characters like C<E<AElig>>, C<E<Eacute>>, C<E<szlig>>, C<E<oslash>> can be understood
as C<\w> characters.
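
To see which characters C<\w> accepts under a given C<LC_CTYPE>, one can
test every single-byte character. This is a sketch under two
assumptions: the helper name C<word_chars> is made up for illustration,
and the code uses the C<use locale> pragma, which current Perl releases
require before regular expressions consult the locale.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helper: return the characters in the given range of
# character codes that match \w under the current LC_CTYPE locale.
sub word_chars {
    my ($from, $to) = @_;
    use locale;    # lexically scoped: \w below follows LC_CTYPE
    return grep { /\w/ } map { chr($_) } $from .. $to;
}

setlocale(LC_CTYPE, "");    # adopt LC_ALL / LC_CTYPE / LANG from the environment
my @all = word_chars(0, 255);
printf "%d of 256 characters match \\w in the environment's locale\n",
    scalar @all;

setlocale(LC_CTYPE, "C");
my @ascii = word_chars(0, 127);
printf "%d of the 128 ASCII characters match \\w in the C locale\n",
    scalar @ascii;
```

The first count depends on the locale your environment selects, which is
exactly the effect described above.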

=head2 COLLATION

Starting from Perl version 5.003_06, perl has obeyed the C<LC_COLLATE>
environment variable, which controls an application's notion of the
ordering (collation) of characters. In most Latin alphabets C<B>
follows C<A>, but where do C<E<Aacute>> and C<E<Auml>> belong?

Here is a code snippet that tells you which characters are alphanumeric
in the current locale, in the locale's order:

 perl -le 'print sort grep /\w/, map { chr() } 0..255'

As noted above, this works only for Perl versions 5.003_06 and up.

B<NOTE>: in pre-5.003_06 Perl releases, per-locale collation
was possible using the C<I18N::Collate> library module. This is now
mildly obsolete and to be avoided. The C<LC_COLLATE> functionality is
integrated into the Perl core language and one can use scalar data
completely normally -- there is no need to juggle with the scalar
references of C<I18N::Collate>.
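
Sorting plain scalars by locale might look like the sketch below. The
helper name C<locale_sort> is hypothetical, and the code assumes a
current Perl release, where the C<use locale> pragma is what makes
C<cmp> and C<sort> follow C<LC_COLLATE>.

```perl
use strict;
use warnings;
use POSIX qw(setlocale strcoll LC_COLLATE);

# Hypothetical helper: sort strings by the current LC_COLLATE locale.
sub locale_sort {
    my @strings = @_;
    use locale;    # lexically scoped: cmp below follows LC_COLLATE
    return sort { $a cmp $b } @strings;
}

setlocale(LC_COLLATE, "C");
my @in_c = locale_sort(qw(b A a B));
print "C locale order: @in_c\n";

setlocale(LC_COLLATE, "");    # adopt LC_ALL / LC_COLLATE / LANG
my @in_env = locale_sort(qw(b A a B));
print "environment locale order: @in_env\n";

# POSIX::strcoll() compares two strings by the current collation rules
# without needing the pragma.
printf "strcoll('a', 'B') = %d\n", strcoll("a", "B");
```

In the C<"C"> locale the order is by character code, so capitals sort
before lower-case letters; many language locales interleave them
instead.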

=head1 ENVIRONMENT

=over 12

=item PERL_BADLANG

A string that controls whether Perl warns at startup about failed
language-specific "locale" settings. This can happen if the locale
support in the operating system is lacking in some way. If this string
has an integer value differing from zero, Perl will not complain.
B<NOTE>: this only hides the warning message. The message tells
about some problem in your system's locale support and you should
investigate what the problem is.

=back

The following environment variables are not specific to Perl: they are
part of the standardized (ISO C, XPG4, POSIX 1.c) setlocale method to
control an application's opinion on data.

=over 12

=item LC_ALL

C<LC_ALL> is the "override-all" locale environment variable. If it is
set, it overrides all the rest of the locale environment variables.

=item LC_CTYPE

C<LC_CTYPE> controls the classification of characters, see above.

If this is unset and C<LC_ALL> is set, C<LC_ALL> is used as
the C<LC_CTYPE>. If both this and C<LC_ALL> are unset but C<LANG>
is set, C<LANG> is used as the C<LC_CTYPE>.
If none of these three is set, the default locale C<"C">
is used as the C<LC_CTYPE>.

=item LC_COLLATE

C<LC_COLLATE> controls the collation of characters, see above.

If this is unset and C<LC_ALL> is set, C<LC_ALL> is used as
the C<LC_COLLATE>. If both this and C<LC_ALL> are unset but
C<LANG> is set, C<LANG> is used as the C<LC_COLLATE>.
If none of these three is set, the default locale C<"C">
is used as the C<LC_COLLATE>.

=item LANG

C<LANG> is the "catch-all" locale environment variable. If it is set,
it is used as the last resort if neither C<LC_ALL> nor the
category-specific C<LC_...> is set.

=back
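
The precedence described in the list above can be written down as a
small lookup. This is a sketch, not how perl itself is implemented:
C<effective_locale> is a hypothetical helper, and it assumes that a
variable set to the empty string counts as unset, which the text above
does not spell out.

```perl
use strict;
use warnings;

# Hypothetical helper: which locale name does a category end up with?
# $env is a hash reference shaped like %ENV.
sub effective_locale {
    my ($category, $env) = @_;
    for my $var ("LC_ALL", $category, "LANG") {
        my $value = $env->{$var};
        # Assumption: an empty value is treated the same as unset.
        return $value if defined $value && length $value;
    }
    return "C";    # the default locale when none of the three is set
}

for my $category (qw(LC_CTYPE LC_COLLATE)) {
    printf "%-10s => %s\n", $category, effective_locale($category, \%ENV);
}
```

Taking the environment as an argument rather than reading C<%ENV>
directly makes the rule easy to try out with made-up settings.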

There are further locale-controlling environment variables
(C<LC_MESSAGES>, C<LC_MONETARY>, C<LC_NUMERIC>, C<LC_TIME>) but
Perl B<does not> currently obey them.