Document the changes with regards to running of END blocks.

[p5sagit/p5-mst-13.2.git] / pod / perltodo.pod
diff --git a/pod/perltodo.pod b/pod/perltodo.pod

index 5d280e6..0f4ea63 100644 (file)
--- a/pod/perltodo.pod
+++ b/pod/perltodo.pod
@@ -20,21 +20,7 @@ of archives may be found at:
 C<perlio> provides this, but the interface could be a lot more
 straightforward.
 
-=head2 Eliminate need for "use utf8";
-
-While the C<utf8> pragma is autoloaded when necessary, it's still needed
-for things like Unicode characters in a source file. The UTF8 hint can
-always be set to true, but it needs to be set to false when F<utf8.pm>
-is being compiled. (To stop Perl trying to autoload the C<utf8>
-pragma...)
-
-=head2 Create a char *sv_printify(sv, STRLEN *lenp, UV flags) function
-
-For displaying PVs with control characters, embedded nulls, and Unicode.
-This would be useful for printing warnings, or data and regex dumping,
-not_a_number(), and so on.
-
-=head2 Autoload byte.pm
+=head2 Autoload bytes.pm
 
 When the lexer sees, for instance, C<bytes::length>, it should
 automatically load the C<bytes> pragma.
@@ -45,6 +31,29 @@ Danger, Will Robinson! Discussing the semantics of C<"\x{F00}">,
 C<"\xF00"> and C<"\U{F00}"> on P5P I<will> lead to a long and boring
 flamewar.
 
+=head2 Create a char *sv_pvprintify(sv, STRLEN *lenp, UV flags)
+
+For displaying PVs with control characters, embedded nulls, and Unicode.
+This would be useful for printing warnings, or data and regex dumping,
+not_a_number(), and so on.
+
+Requirements: should handle both byte and UTF8 strings.  isPRINT()
+characters printed as-is, character less than 256 as \xHH, Unicode
+characters as \x{HHH}.  Don't assume ASCII-like, either, get somebody
+on EBCDIC to test the output.
+
+Possible options, controlled by the flags:
+- whitespace (other than ' ' of isPRINT()) printed as-is
+- use isPRINT_LC() instead of isPRINT()
+- print control characters like this: "\cA"
+- print control characters like this: "^A"
+- non-PRINTables printed as '.' instead of \xHH
+- use \OOO instead of \xHH
+- use the C/Perl-metacharacters like \n, \t
+- have a maximum length for the produced string (read it from *lenp)
+- append a "..." to the produced string if the maximum length is exceeded
+- really fancy: print unicode characters as \N{...}
+
 =head2 Overloadable regex assertions
 
 This may or may not be possible with the current regular expression
@@ -52,20 +61,14 @@ engine. The idea is that, for instance, C<\b> needs to be
 algorithmically computed if you're dealing with Thai text. Hence, the
 B<\b> assertion wants to be overloaded by a function.
 
-=head2 Unicode collation and normalization
-
-Simon Cozens promises to work on this.
-
-    Collation?     http://www.unicode.org/unicode/reports/tr10/
-    Normalization? http://www.unicode.org/unicode/reports/tr15/
-
 =head2 Unicode case mappings 
 
     Case Mappings? http://www.unicode.org/unicode/reports/tr21/
 
 =head2 Unicode regular expression character classes
 
-They have some tricks Perl doesn't yet implement.
+They have some tricks Perl doesn't yet implement like character
+class subtraction.
 
        http://www.unicode.org/unicode/reports/tr18/
 
@@ -273,6 +276,12 @@ is the bootstrapping build process of Perl: if the filesystem the
 target systems sees is not the same what the build host sees, various
 input, output, and (Perl) library files need to be copied back and forth.
 
+As of 5.8.0 Configure mostly works for cross-compilation
+(used successfully for iPAQ Linux), miniperl gets built,
+but then building DynaLoader (and other extensions) fails
+since MakeMaker knows nothing of cross-compilation.
+(See INSTALL/Cross-compilation for the state of things.)
+
 =head2 Perl preprocessor / macros
 
 Source filters help with this, but do not get us all the way. For
@@ -311,6 +320,12 @@ has changed. Detecting a change is perhaps the difficult bit.
 
 There should be a way of restarting the debugger on demand.
 
+=head2 Test Suite for the Debugger
+
+The debugger is a complex piece of software and fixing something
+here may inadvertently break something else over there.  To tame
+this chaotic behaviour, a test suite is necessary. 
+
 =head2 my sub foo { }
 
 The basic principle is sound, but there are problems with the semantics
@@ -475,7 +490,7 @@ Ideas which have been discussed, and which may or may not happen.
 It's unclear what this should do or how to do it without breaking old
 code.
 
-=head2 Make tr/// return histogram
+=head2 Make tr/// return histogram of characters in list context
 
 There is a patch for this, but it may require Unicodification.
 
@@ -764,6 +779,7 @@ Suggesting this on P5P B<will> cause a boring and interminable flamewar.
 =head2 "class"-based lexicals
 
 Use flyweight objects, secure hashes or, dare I say it, pseudo-hashes instead.
+(Or whatever will replace pseudohashes in 5.10.)
 
 =head2 byteperl
 
@@ -771,18 +787,27 @@ C<ByteLoader> covers this.
 
 =head2 Lazy evaluation / tail recursion removal
 
-C<List::Util> in core gives some of these; tail recursion removal is
-done manually, with C<goto &whoami;>. (However, MJD has found that
-C<goto &whoami> introduces a performance penalty, so maybe there should
-be a way to do this after all: C<sub foo {START: ... goto START;> is
-better.)
+C<List::Util> gives first() (a short-circuiting grep); tail recursion
+removal is done manually, with C<goto &whoami;>. (However, MJD has
+found that C<goto &whoami> introduces a performance penalty, so maybe
+there should be a way to do this after all: C<sub foo {START: ... goto
+START;> is better.)
 
 =head2 Make "use utf8" the default
 
-There is a patch available for this, search p5p archives for
-the Subject "[EXPERIMENTAL PATCH] make unicode (utf8) default"
-but this would be unacceptable because of backward compatibility:
-scripts could not contain B<any legacy eight-bit data>.  Also would
-introduce a measurable slowdown of at least few percentages since all
-regular expression operations would be done in full UTF-8.
+Because of backward compatibility this is difficult: scripts could not
+contain B<any legacy eight-bit data> (like Latin-1) anymore, even in
+string literals or pod.  Also would introduce a measurable slowdown of
+at least few percentages since all regular expression operations would
+be done in full UTF-8.  But if you want to try this, add
+-DUSE_UTF8_SCRIPTS to your compilation flags.
+
+=head2 Unicode collation and normalization
+
+The Unicode::Collate and Unicode::Normalize modules
+by SADAHIRO Tomoyuki have been included since 5.8.0.
+
+    Collation?     http://www.unicode.org/unicode/reports/tr10/
+    Normalization? http://www.unicode.org/unicode/reports/tr15/
 
+=cut