Commit | Line | Data |
0a378802 |
1 | package encoding; |
2 | |
3 | use Encode; |
4 | |
5 | sub import { |
6 | my ($class, $name) = @_; |
7 | $name = $ENV{PERL_ENCODING} if @_ < 2; |
8 | my $enc = find_encoding($name); |
9 | unless (defined $enc) { |
10 | require Carp; |
11 | Carp::croak "Unknown encoding '$name'"; |
12 | } |
13 | ${^ENCODING} = $enc; |
14 | } |
15 | |
16 | =pod |
17 | |
18 | =head1 NAME |
19 | |
20 | encoding - pragma to control the conversion of legacy data into Unicode |
21 | |
22 | =head1 SYNOPSIS |
23 | |
24 | use encoding "iso 8859-7"; |
25 | |
26 | $a = "\xDF"; |
27 | $b = "\x{100}"; |
28 | |
29 | $c = $a . $b; |
30 | |
31 | # $c will be "\x{3af}\x{100}", not "\x{df}\x{100}". |
32 | # The \xDF of ISO 8859-7 is \x{3af} in Unicode. |
33 | |
34 | =head1 DESCRIPTION |
35 | |
36 | Normally when legacy 8-bit data is converted to Unicode the data is |
37 | expected to be Latin-1 (or EBCDIC on EBCDIC platforms). With the |
38 | encoding pragma you can change this default. |
39 | |
40 | The pragma applies per script, not per lexical block. Only the last |
9f4817db |
41 | C<use encoding> matters, and it affects B<the whole script>. |
0a378802 |
42 | |
43 | =head1 FUTURE POSSIBILITIES |
44 | |
9f4817db |
45 | The C<\x..> and C<\0...> in regular expressions are not |
46 | affected by this pragma. They probably should be. |
47 | |
1768d7eb |
48 | C<chr()>, C<ord()>, and C<\N{...}> might also become affected. |
0a378802 |
49 | |
d521382b |
50 | =head1 KNOWN PROBLEMS |
51 | |
52 | This pragma cannot be combined with C<use utf8>. Note that this is a |
53 | problem B<only> if you would like to have Unicode identifiers in your |
54 | scripts. You should not need C<use utf8> for anything else these days |
55 | (since Perl 5.8.0). |
56 | |
0a378802 |
57 | =head1 SEE ALSO |
58 | |
59 | L<perlunicode> |
60 | |
61 | =cut |
62 | |
63 | 1; |
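
The conversion described in the SYNOPSIS can also be reproduced without the pragma, by decoding the legacy byte explicitly. The following is a minimal sketch, assuming the Encode module is installed and provides the ISO 8859-7 encoding. The variable names are illustrative and are not part of the module above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Illustrative names; not taken from encoding.pm.
my $legacy = "\xDF";       # a single byte, meant as ISO 8859-7
my $wide   = "\x{100}";    # already a Unicode character

# Decode the byte explicitly instead of relying on the implicit
# Latin-1 upgrade that happens on concatenation.
my $joined = decode("iso-8859-7", $legacy) . $wide;

# Print the code point of each character in the result.
printf "U+%04X\n", ord($_) for split //, $joined;
```

Here the first character comes out as U+03AF rather than U+00DF, which matches the result the SYNOPSIS describes for C<use encoding "iso 8859-7">.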