package Encode::TW;
BEGIN {
    if (ord("A") == 193) {
        die "Encode::TW not supported on EBCDIC\n";
    }
}
our $VERSION = do { my @r = (q$Revision: 1.23 $ =~ /\d+/g); sprintf "%d."."%02d" x $#r, @r };

use Encode;
use XSLoader;
XSLoader::load(__PACKAGE__,$VERSION);

1;
__END__

=head1 NAME

Encode::TW - Taiwan-based Chinese Encodings

=head1 SYNOPSIS

    use Encode qw/encode decode/; 
    $big5 = encode("big5", $utf8); # loads Encode::TW implicitly
    $utf8 = decode("big5", $big5); # ditto

=head1 DESCRIPTION

This module implements Taiwan-based Chinese charset encodings.
Encodings supported are as follows.

  Canonical   Alias                 Description
  --------------------------------------------------------------------------
  big5-eten   /\bbig-?5$/i          Big5 encoding (with ETen extensions)
              /\bbig5-?et(en)?$/i
  big5-hkscs  /\bbig5-?hk(scs)?$/i  Big5 + Cantonese characters in Hong Kong
  MacChineseTrad                    Big5 + Apple Vendor Mappings
  cp950                             Code Page 950
                                    = Big5 + Microsoft vendor mappings
  --------------------------------------------------------------------------

To find out how to use this module in detail, see L<Encode>.
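
As a minimal sketch of how the alias patterns in the table resolve,
assuming a standard L<Encode> installation (the list of sample aliases
is only an illustration), C<Encode::resolve_alias> returns the canonical
name for a given alias:

    use Encode ();

    # Each alias should resolve to a canonical name from the table above.
    for my $alias (qw(big5 Big-5 big5-eten big5et big5-hk big5hkscs cp950)) {
        my $canon = Encode::resolve_alias($alias) || '(unknown)';
        print "$alias => $canon\n";
    }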

=head1 NOTES

Due to size concerns, C<EUC-TW> (Extended Unix Code), C<CCCII>
(Chinese Character Code for Information Interchange), C<BIG5PLUS>
(CMEX's Big5+) and C<BIG5EXT> (CMEX's Big5e) are distributed separately
on CPAN, under the name L<Encode::HanExtra>. That module also contains
extra China-based encodings.
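
Code that needs one of those encodings can load the extra module
defensively. The following is a sketch only, assuming
L<Encode::HanExtra> may or may not be installed; the messages are
illustrative:

    use Encode ();

    # Encode::HanExtra is a separate CPAN distribution, so guard the load.
    if (eval { require Encode::HanExtra; 1 }) {
        my $enc = Encode::find_encoding("euc-tw");
        print "EUC-TW available as ", $enc->name, "\n" if $enc;
    }
    else {
        print "Encode::HanExtra not installed; EUC-TW is unavailable\n";
    }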

=head1 BUGS

Since the original C<big5> encoding (1984) is not supported anywhere
(glibc and DOS-based systems use C<big5> to mean C<big5-eten>; Microsoft
uses C<big5> to mean C<cp950>), a conscious decision was made to alias
C<big5> to C<big5-eten>, which is the de facto superset of the original
big5.
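
To illustrate the aliasing decision, the following sketch (the sample
text is an assumption chosen for illustration) encodes one string under
both names and checks that the resulting bytes agree:

    use Encode qw(encode);

    my $text  = "\x{4e2d}\x{6587}";          # two common CJK characters
    my $alias = encode("big5",      $text);  # resolved through the alias
    my $canon = encode("big5-eten", $text);  # canonical name
    print $alias eq $canon ? "same bytes\n" : "different bytes\n";

    # Callers that want Microsoft's mappings must ask for cp950 by name.
    my $cp950 = encode("cp950", $text);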

The C<CNS11643> encoding files are not complete. For common C<CNS11643>
manipulation, please use C<EUC-TW> in L<Encode::HanExtra>, which contains
planes 1-7.

The ASCII part (0x00-0x7f) is preserved for all encodings, even though it
conflicts with mappings by the Unicode Consortium.  See

L<http://www.debian.or.jp/~kubota/unicode-symbols.html.en>

to find out why it is implemented that way.
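
A quick way to observe the ASCII behaviour described here is to
round-trip the printable ASCII range through each encoding. This is a
sketch under the assumption that all three encodings are loadable:

    use Encode qw(encode decode);

    # Printable ASCII, including backslash and tilde.
    my $ascii = join "", map { chr } 0x20 .. 0x7e;

    for my $name (qw(big5-eten big5-hkscs cp950)) {
        my $bytes = encode($name, $ascii);   # Perl string to octets
        my $back  = decode($name, $bytes);   # octets back to Perl string
        my $ok    = $bytes eq $ascii && $back eq $ascii;
        print "$name: ", ($ok ? "unchanged" : "changed"), "\n";
    }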

=head1 SEE ALSO

L<Encode>

=cut