1 # Revision history for Perl extension Encode.
3 # $Id: Changes,v 1.30 2002/04/08 02:34:51 dankogai Exp $
6 1.30 $Date: 2002/04/08 02:34:51 $
7 + lib/Encode/Encoder.pm
8 Object Oriented Encoder. I reckon something like this is needed.
11 ! lib/Encode/Supported.pod
12 * autoloading bug that prevented upper-case canonicals such as UTF-16
13 is fixed. Now even UTF/UCS are autoloaded!
14 * encodings() is now more intuitive.
15 * t/Unicode.t fixed to explicitly use Unicode.pm -- BOM values are
17 * Obligatory fixes to the POD.
18 ! lib/Encode/Supported.pod
19 Patch from Anton applied.
20 Message-Id: <66641479.20020408033300@motor.ru>
22 ! lib/Encode/Unicode.pm
23 Cosmetic changes: "bless $obj, $class" => "bless $obj => $class"
25 1.28 2002/04/07 18:58:42
29 Just a MANIFEST for those missing files.
31 1.26 2002/04/07 15:22:04
34 Schwern's patches against Makefile.PL have zapped jis*.ucm. Restored.
35 And t/Aliases.t fixed to make sure they all exist.
37 1.25 2002/04/07 15:01:25 (Unreleased)
39 ! lib/Encode/Unicode.pm
42 - lib/Encode/UTF_EBCDIC.pm
43 - lib/Encode/Internal.pm
45 Integrated into Encode.pm as closures. That way "one package, one file"
46 rule is preserved yet fewer files to require.
48 commented out binmode(STDERR ...
57 ! Encode/Makefile_PL.e2x
58 Schwern's MM-compliance patch merged
59 Message-Id: <20020406082609.GA28758@blackrider>
61 ! lib/Encode/Unicode.pm
62 + lib/Encode/UTF_EBCDIC.pm
64 - lib/Encode/10646_1.pm
65 - lib/Encode/ucs2_le.pm
66 (UCS-2|UTF-(16|32))(LE|BE)? implementation and cleanups. Instead of
67 per-module based (en|de)code, I saved a number of .pm by
68 reorganizing it as per-object base (Well, this is what Encode::XS
69 does under the hood). See Encode::Unicode for details.
70 The original Unicode.pm is now correctly renamed to UTF_EBCDIC.pm.
71 This module is used only on EBCDIC environments.
73 1.21 2002/04/05 14:46:34 (Not Released)
79 Are back to make Perl/Tk happy Smile, NI-S.
82 ! lib/Encode/Supported.pm
83 ! lib/Encode/10646_1.pm
84 ! lib/Encode/ucs2_le.pm
85 UCS-16BE is now canonical for UCS-2/ISO-10646-1.
86 Leftover implicit aliases in ucs2_le.pm removed. Tests and documents
87 updated to reflect changes.
88 Message-Id: <20020405114024.1290.17@bactrian.ni-s.u-net.com>
90 ! lib/Encode/Supported.pm
91 Anton's revision committed. Added Dan's own fixes as well.
92 Message-Id: <159103166906.20020405161134@motor.ru>
95 < qr/^UCS2-le$/i => '"UCS-2"', );
97 > qr/^UCS2-LE$/i => '"UTF-16LE"');
98 Sigh. Thank you, Anton.
99 Message-Id: <14567692196.20020405062020@motor.ru>
100 Message-Id: <69FEC0B4-483E-11D6-A045-00039301D480@dan.co.jp>
102 1.20 2002/04/04 19:50:52
104 the last minute addition. Just give it a try. Docs remain to be done.
105 Not installed by default.
106 ! lib/Encode/Supported.pod
109 ! lib/Encode/Alias.pm
111 ! lib/Encode/10646_1.pm
112 ! lib/Encode/ucs2_le.pm
113 Canonical name for "UCS-2le" is now "UTF-16LE". UCS-2 left
114 unchanged but UTF-16BE is added as an alias. Implicit aliases
115 move to Encode::Alias so init_alias() works more as expected.
116 Also, 'utf8' is now canonical with 'UTF-8' being an alias.
117 Though pedantically wrong, this should make perl mongers happier.
118 t/Alias.t is enhanced to test all these.
119 Message-Id: <9C39BD58-47AF-11D6-9D82-00039301D480@dan.co.jp>
121 Now all .ucm are stacked in byte_t; they all share the ascii part so 50%
122 of the codepoints are common. CJKT left as is because the saving is
128 ! Encode/Makefile_PL.e2x
136 All occurrences of _def.h replaced with .exh so djgpp works happily
137 ever after! To credit this amazing discovery, Laszlo is now in
139 Message-Id: <20020403181424.GA8778@freemail.hu>
140 Message-Id: <B5BF0C6F-4732-11D6-B13D-00039301D480@dan.co.jp>
143 ! Encode/Makefile_PL.skel
145 No more @INC fiddling! Uses $ENV{PERL_CORE} instead
146 Message-Id: <20020401222744.GX2000@blackrider>, et al.
148 Two more tests added by jhi
149 Message-Id: <200204020000.DAA25121@alpha.hut.fi>
152 The showstopper fixed -- Memory reallocation bug was causing
153 Encode::XS to fall into an infinite loop under certain conditions.
155 Message-Id: <9572CAC4-463C-11D6-ABA5-00039301D480@dan.co.jp>, et al
160 ! lib/Encode/Supported.pod
161 Vendor encodings rebuilt out of original map files at unicode.org.
162 Indic languages such as MacDevanagari remain unsupported due to the
163 shortcomings of encengine's capabilities (they need algorithmic
164 conversion and I have no knowledge on that!). Pods fixed for added
166 Oh, macJapan.ucm renamed to macJapanese.ucm.
167 macROMnn is macRomanian and macRUMnn is macRumanian.
168 txt2ucm is a crude script that is used to convert them.
170 Unicode Compound Characters (used extensively on Mac) supported
172 Typo fixes and improvements by jhi
173 Message-Id: <200204010201.FAA03564@alpha.hut.fi>, et al.
175 1.11 $Date: 2002/04/08 02:34:51 $
179 Missing files from the MANIFEST fixed.
180 Message-Id: <20020401010156.H10509@alpha.hut.fi>
181 Version incremented just to make CPAN happy.
183 1.10 2002/03/31 21:32:42
186 INSTALL_UCM option added to Makefile.PL so you can install *.ucm
187 if you want. This should make Autrijus happy. Also, piconv
188 is added to default install.
191 Here-documented files that enc2xs generates are now exported
192 to *.e2x. Much cleaner and easier to debug.
194 encoding enhanced so you can make it act more like such
195 (now prehistoric) "localized" variations of perl as Jperl.
197 Further test for encoding.pm. Written in euc-jp
201 Message-Id: <20020330174618.B10154@alpha.hut.fi>
206 *.ucm relocated to ucm/ so MakeMaker will not install'em by default.
213 ! Encode/macIceland.ucm
214 ! lib/Encode/Alias.pm
215 ! lib/Encode/Supported.pod
216 MacIceland fixes and Pod Typo fixes. This adds Andreas to AUTHORS.
217 Message-Id: <m3lmcavhjt.fsf@anima.de>
219 1.01 2002/03/29 20:59:39
222 s/USE_SCRIPTS/MORE_SCRIPTS/
224 installs enc2xs by default for external Encode:: modules in CPAN,
225 such as Encode::HanExtra
227 More sensible perl core detection via $ENV{PERL_CORE}
229 Message-Id: <200203291007.FAA07329@Orb.Nashua.NH.US>
231 Perl core detection via $^X =~ m/\bminiperl$/o
232 Message-Id: <A5C7B0CA-42F1-11D6-B5AD-00039301D480@dan.co.jp>
236 The version of all files is updated to 1.00 via "ci -f -l1.00",
237 commemorating version 1.00. All files, including *.ucm are now
238 under version control.
241 encode.h moved to Encode/ so it will be installed for the later
244 h2xs-like feature added via "h2xs -M Name *.(enc|ucm)"
249 compile renamed to enc2xs.
250 Affected Makefile.PL updated
252 "Punt it. HanExtra can take care of that later." -- Autrijus
253 Message-Id: <20020328154338.GA7351@not.autrijus.org>
257 ! lib/Encode/CJKConstants.pm
258 ! lib/Encode/KR/2022_KR.pm
259 Table patches for Euro Signs, 2022-KR fixups by Jungshik
260 Message-Id: <Pine.LNX.4.44.0203280616190.2259-200000@www.ykga.org>
264 bin/ added for example scripts. They are not installed by default.
265 To install them, "perl Makefile.PL USE_SCRIPTS".
266 piconv is iconv reinvented in perl. In addition to all features
267 of iconv, it also adds perlish features. See L<piconv/1> for more
269 ! lib/Encode/Alias.pm
270 qr/^ replaced with qr/\b so it directly matches locale names
271 such as en_US.US-ASCII
274 Patch by MJD to fix the following problem applied.
275 Subject: [PATCH 5.7.3 Encode]
276 Aliases.t not properly skipped when Encode extension not built
277 Message-Id: <20020328091850.18677.qmail@plover.com>
278 ! lib/Encode/KR/2022_KR.pm
279 ! lib/Encode/CJKConstants.pm
280 Another patch from Jungshik to make iso-2022-kr actually work
281 Message-Id: <Pine.LNX.4.44.0203271745210.30462-200000@www.ykga.org>
282 ! Encode/Encode/euc-kr.ucm
283 + Encode/Encode/johab.ucm
284 ! Encode/Encode/ksc5601.ucm
286 ! Encode/KR/Makefile.PL
287 ! Encode/lib/Encode/Alias.pm
289 Johab support and complete revision of Korean Encoding by Jungshik
290 Message-Id: <Pine.LNX.4.44.0203271105060.30462-200000@www.ykga.org>
292 Revised to make up for the now-dropped Encode::Details.
293 - lib/Encode/Details.pod
294 Dropped. Besides being obsolete, the topics are now covered in
300 Korean aliases fixed thanks to Jungshik Shin
301 /ks[-_ ]?c[-_ ]?5601-1987$/i => cp936
302 Message-Id: <Pine.LNX.4.44.0203262102250.1237-100000@www.ykga.org>
304 =head1 NAME added to all modules to make buildtoc happy
305 Message-Id: <20020327041151.A10618@alpha.hut.fi>
306 - lib/Encode/CJKguide.pod
307 Too controversial and dropped from the dist. Will be available
308 separately on the web.
310 RCS tags added so table debugging gets easier (should that be
311 needed! I hope they all stay 1.00!)
312 + lib/Encode/CJKguide.pod
313 A detailed guide to mainly, but not limited to, CJK multibyte
316 + Encode/hp-roman8.ucm
318 ! Encode/Supported.pod
319 All occurrences of "roman8" replaced with "hp-roman8" to avoid
321 ! Encode/Supported.pod
324 Mac Encodings now comply with the Inside Macintosh
326 Test for '-raw' conventions added.
328 aliased gb2312 -> euc-cn, ksc5601 -> euc-kr
332 "-raw" appended to canonical names.
333 File names stay unchanged thanks to UCM format.
334 ! lib/Encode/CN/HZ.pm
335 Patch from Autrijus to fix gb2312 -> gb2312-raw + code linting
336 Message-Id: <20020326035210.GA2091@not.autrijus.org>
339 - lib/Encode/JP/Const.pm
340 + lib/Encode/CJKConstants.pm
341 + lib/Encode/CN/2022_CN.pm
342 + lib/Encode/KR/2022_KR.pm
351 * Support for ISO-2022-KR and ISO-2022-CN added.
353 * more t/*.{euc,ref} added, which were autogenerated by ucm2table
354 * ucm2table autogenerates character table out of UCM files.
357 - lib/Encode/Supports.pod
358 + lib/Encode/Supported.pod
359 Names reverted due to popular demand.
360 8.3 rule applies only when there is a conflict.
361 Message-Id: <20020325095924.GD44120@not.autrijus.org>
366 - lib/Encode/Format/Enc.pod
368 * Character tables are now 100% ucm.
369 * All files under Encode/ are now 8.3-compliant
370 * some of the missing encodings added (i.e. gsm0338 and nextstep)
371 * Vendor mappings aggregated with appropriate national std in
372 Makefile.PL, resulting in smaller *.so, especially for CJK.
373 The following is the result on Dan's FreeBSD box.
375 ---------------------------------------------------------------
376 blib/arch/auto/Encode/Byte/Byte.so 157,279 171,042
377 blib/arch/auto/Encode/CN/CN.so 1,634,476 1,626,685
378 blib/arch/auto/Encode/EBCDIC/EBCDIC.so 18,476 18,476
379 blib/arch/auto/Encode/Encode.so 27,791 27,791
380 blib/arch/auto/Encode/JP/JP.so 1,408,056 1,832,811
381 blib/arch/auto/Encode/KR/KR.so 1,156,518 1,329,587
382 blib/arch/auto/Encode/Symbol/Symbol.so 23,940 20,990
383 blib/arch/auto/Encode/TW/TW.so* 948,761 1,316,437
384 ---------------------------------------------------------------
385 Total 5,375,297 6,343,819
387 * As a result of ucm-transition, Encode::Tcl dropped because
388 Encode::Tcl demands *.enc.
389 Encode::Tcl will be supplied in a separate tarball with *.enc.
390 Message-Id: <C024E294-3FC3-11D6-8347-00039301D480@dan.co.jp>
395 -lib/Encode/Supported.pod
396 +lib/Encode/Supports.pod
397 -lib/Encode/iso10646_1.pm
398 +lib/Encode/10646_1.pm
399 -lib/Encode/EncFormat.pod
400 +lib/Encode/Format/Enc.pod
401 Files renamed for 8.3 filename compliance. Affected modules/scripts revised.
402 - lib/Encode/JP/Constants.pm
403 + lib/Encode/JP/Consts.pm
404 ! lib/Encode/JP/JIS.pm
405 ! lib/Encode/JP/H2Z.pm
406 Version nit problem and 8.3 rule fix.
407 > Package namespace installed latest in CPAN file
408 > Encode::JP::Constants 0.92 1.02 J/JH/JHI/perl-5.7.3.tar.gz
409 was noted by jhi; then Dan discovered "Constants.pm" does not comply with the 8.3
410 rule. Constants.pm renamed to Consts.pm and affected modules are fixed
411 accordingly. In addition, legacy "use vars qw()..." are replaced with
413 Message-Id: <20020325011248.D1561@alpha.hut.fi>
414 Message-Id: <41023D51-3FB5-11D6-8347-00039301D480@dan.co.jp>
416 - lib/Encode/JP/ISO_2022_JP.pm
417 - lib/Encode/JP/ISO_2022_JP_1.pm
418 + lib/Encode/JP/2022_JP.pm
419 + lib/Encode/JP/2022_JP1.pm
421 8.3 naming conflict for vanilla FAT addressed by jhi
422 Message-Id: <20020324201931.V22596@alpha.hut.fi>
425 Typecast fix addressed by jhi
426 Message-Id: <20020324185540.T22596@alpha.hut.fi>
429 ! lib/Encode/Supported.pod
431 + lib/Encode/JP/ISO_2022_JP_1.pm
432 ! lib/Encode/JP/ISO_2022_JP.pm
433 ! lib/Encode/JP/JIS.pm
435 Now Encode::JP is more strict on the difference between ISO-2022-JP
436 and ISO-2022-JP-1. See JP/JP.pm for details. I hope this move
437 makes Anton happier :) FYI the previous version implements
438 ISO-2022-JP as ISO-2022-JP-1 since it had X0212 support.
439 ! lib/Encode/Supported.pod
442 Avoid core-dump in Encode with PERLIO=mmap by NI-S
443 Message-Id: <20020324104139.1326.7@bactrian.ni-s.u-net.com>
448 ! lib/Encode/Supported.pod
449 pod fixes to replace F<http://...> with L<http://...>,
450 as suggested by Autrijus in:
451 Message-Id: <20020324083943.GA14901@not.autrijus.org>
452 ! lib/Encode/Supported.pod
453 fixes and enhancements by Anton
454 Message-Id: <10632060120.20020324103753@motor.ru>
455 ! lib/Encode/Alias.pm
456 > define_alias( qr/^GB[- ]?(\d+)$/i => '"gb$1"' );
457 added. Suggested by Anton then deobfuscated by Autrijus
458 Message-Id: <20020324064455.GA3667@not.autrijus.org>
460 Further fix by Nicholas Clark
461 Message-Id: <20020323145840.GD304@Bagpuss.unfortu.net>
462 - lib/EncodeFormat.pod
463 + lib/Encode/EncFormat.pod
465 File renamed as suggested by Autrijus
467 ! lib/Encode/Details.pod
468 ! lib/Encode/Supported.pod Sun Mar 24 13:29:35 2002
469 ! Encode.pm Sun Mar 24 13:43:47 2002
470 pod fixes by Autrijus.
471 Message-Id: <20020324062804.GA3595@not.autrijus.org>
472 Message-Id: <20020324075627.GB11986@not.autrijus.org>
474 ! lib/Encode/Alias.pm
476 now more EBCDIC conscious;
477 %ExtModules on EBCDIC system excludes CJK so that you don't
478 have to worry about the matched alias resulting in croaking.
479 t/Alias.t also revised to reflect changes. Verified by jhi
480 Message-Id: <20020324022929.D22596@alpha.hut.fi>
486 EBCDIC detection mechanism installed as in JP/JP.pm
487 Message-Id: <20020323211847.G19148@alpha.hut.fi>
495 Now all table files used by compile are postfixed '_t' to avoid
496 namespace collisions in case insensitive file systems once and for all!
498 Message-ID: <58290227735.20020323195659@familiehaase.de>
500 Since the Encode::JP is unsupported under EBCDIC we
501 cannot run this test (aliases as such should work fine) -- jhi
502 Message-Id: <20020323202119.D19148@alpha.hut.fi>
504 duplicate occurrence of ascii.ucm and 8859-1.ucm
505 causes MacOS X dyld to croak
511 < chdir 't' if -d 't';
513 > if (! -d 'blib' and -d 't'){ chdir 't' };
514 When you are "make test"-ing on Encode/ directory, you must not
515 change $ENV{PWD}. t/JP.t has been fixed before but others somehow
516 remained unchanged. Also the situation detection was made simpler
517 in t/JP.t, which was originally:
518 > chdir 't' if -d 't' and $ENV{PWD} !~ m,/Encode[^/]*$,o;
520 "Use of uninitialized value in string eq at Encode.pm line 96."
524 -- Problem on case insensitive file systems
525 "coexist of ebcdic.c <> EBCDIC.c on Cygwin not possible"
526 Message-ID: <88254111953.20020323095503@familiehaase.de>
529 "So I think it's a bug in gcc, not perl. But it still needs to be
531 Message-Id: <20020323145840.GD304@Bagpuss.unfortu.net>
532 Message-Id: <20020323170509.C96475@plum.flirble.org>
536 ! lib/Encode/Encoding.pm
537 ! lib/Encode/Alias.pm
538 ! lib/Encode/Supported.pod
540 Pod Fixes by Michael G Schwern <schwern@pobox.com> via jhi
541 Message-ID: <20020322073908.GB10539@blackrider>
544 "...I think we should include ISO 8859-1 as well." -- NI-S
545 Message-Id: <20020322120230.1332.8@bactrian.elixent.com>
550 ! lib/Encode/Alias.pm
551 alias definitions relocated to Encode::Alias so module autoloading
552 works for aliases also.
554 encodings() now accepts args to check ExtModules.
563 Latin and single byte encodings are reorganized so they are
564 demand-loaded like Encode::XX. Now only ascii is compiled into
566 ! lib/Encode/Alias.pm
567 for my $k (keys %hash){ delete $hash{$k}; }
568 is deprecated; fixed.
571 In this update, pod rewrites and alias fixes are the main issues
572 + lib/Encode/Supported.pod
573 Describes supported encodings
575 streamlined compiled-in encodings.
576 ! lib/Encode/Description.pod -> lib/Encode/Details.pod
578 + Encode/ibm-125?.ucm
579 Added from the icu distribution with every occurrence of
580 "IBM-125?" changed to "cp125?". Filenames remain unchanged to pay
581 some respect to icu staff, however.
582 + lib/Encode/Alias.pm
584 Alias definitions in Encode.pm relocated.
587 packWARN patch from Paul Marquess via jhi
588 Message-Id: <20020321010101.O28978@alpha.hut.fi>
589 Paul added to AUTHORS as a result.
590 ! t/CJKalias.t -> t/Aliases.t
591 Renamed. Checks even more aliases and alias overloading
594 duplicate alias for ujis => euc-jp removed (Encode::JP has one)
595 gbk => cp936 relocated to CN.pm
597 Test::More with plans (by jhi)
600 + lib/Encode/Description.pod
601 ! lib/Encode/Encoding.pm
602 Now the pod in Encode.pm is abridged as programming references.
603 lib/Encode/Description.pod contains the original, detailed description
604 and Encode::Encoding explains how to write your own module to
605 add new encodings. So far, lib/Encode/Description.pod contains
606 the whole pod once in Encode.pm. This is intentional.
608 Pod revisions by Anton Tagunov
609 Message-Id: <517178431.20020320174824@motor.ru>
611 all occurrences of Encode::Tcl::Extended removed including pod
613 test now checks $encoding->name only; $encoding->{name} is
614 no longer checked to find the canonical name.
615 ! lib/Encode/JP/JIS.pm
616 ! lib/Encode/JP/ISO_2022_JP.pm
617 ->name() added to be more compliant with API
623 Patch by Autrijus to add aliases to TW and fixes to POD
624 Message-Id: <20020320090619.GA24774@not.autrijus.org>
626 SADAHIRO Tomoyuki added, as he should have been. My apologies.
629 * First release to be uploaded to CPAN. For prehistoric changes,
630 please see Changes file of perl distribution as well as
631 perl-unicode@perl.org archive, available at:
632 http://archive.develooper.com/perl-unicode@perl.org/
634 Changes since 0.92 include:
639 + Mention of perl-unicode@perl.org added
641 + Encoding aliases added so you can feed locale names
642 and MIME Charset="" directly.
643 - Mention of JISX0212 removed because it's fixed
646 + Encoding aliases added. Note TW is left untouched because
647 euc-tw is not implemented in TW but in Encode::HanExtra.
648 Autrijus, you may fix Encode::HanExtra.
650 + to test encode aliases added