package encoding;

use Encode;

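sub import {
    my ($class, $name) = @_;
    # Fall back to the PERL_ENCODING environment variable when no
    # encoding name was passed on the "use encoding" line.
    $name = $ENV{PERL_ENCODING} if @_ < 2;
    # find_encoding() returns undef for names Encode does not know.
    my $enc = find_encoding($name);
    unless (defined $enc) {
        require Carp;
        Carp::croak("Unknown encoding '$name'");
    }
    # The encoding object is stored globally, so it applies to the whole script.
    ${^ENCODING} = $enc;
}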

=pod

=head1 NAME

encoding - pragma to control the conversion of legacy data into Unicode

=head1 SYNOPSIS

    use encoding "iso 8859-7";

    $a = "\xDF";
    $b = "\x{100}";

    $c = $a . $b;

    # $c will be "\x{3af}\x{100}", not "\x{df}\x{100}".
    # The \xDF of ISO 8859-7 is \x{3af} in Unicode.

=head1 DESCRIPTION

Normally, when legacy 8-bit data is converted to Unicode, the data is
expected to be Latin-1 (or EBCDIC on EBCDIC platforms).  With the
encoding pragma you can change this default.
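
The difference between the two interpretations can be sketched with
explicit calls to C<Encode::decode>.  This is only an illustration of
how the same byte maps to different characters; it does not use the
pragma itself, and the variable names are chosen for this example.

    use strict;
    use warnings;
    use Encode qw(decode);

    my $byte = "\xDF";

    # Default assumption: the byte is Latin-1.
    my $as_latin1 = decode("iso-8859-1", $byte);

    # What "use encoding 'iso 8859-7'" asks for instead: Greek.
    my $as_greek = decode("iso-8859-7", $byte);

    printf "Latin-1:    U+%04X\n", ord $as_latin1;   # U+00DF
    printf "ISO 8859-7: U+%04X\n", ord $as_greek;    # U+03AF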

The pragma applies per script; it is not lexically scoped to a block.
Only the last C<use encoding> matters, and it affects B<the whole script>.

=head1 FUTURE POSSIBILITIES

The C<\x..> and C<\0...> escapes in regular expressions are not
affected by this pragma.  They probably should be.

C<chr()>, C<ord()>, and C<\N{...}> might also become affected.

=head1 KNOWN PROBLEMS

This pragma cannot be combined with C<use utf8>.  Note that this is a
problem B<only> if you would like to have Unicode identifiers in your
scripts.  You should not need C<use utf8> for anything else since
Perl 5.8.0.

=head1 SEE ALSO

L<perlunicode>

=cut

1;