C<perlio> provides this, but the interface could be a lot more
straightforward.
-=head2 Eliminate need for "use utf8";
+=head2 Autoload bytes.pm
-While the C<utf8> pragma is autoloaded when necessary, it's still needed
-for things like Unicode characters in a source file. The UTF8 hint can
-always be set to true, but it needs to be set to false when F<utf8.pm>
-is being compiled. (To stop Perl trying to autoload the C<utf8>
-pragma...)
+When the lexer sees, for instance, C<bytes::length>, it should
+automatically load the C<bytes> pragma.
+
+=head2 Make "\u{XXXX}" et al work
+
+Danger, Will Robinson! Discussing the semantics of C<"\x{F00}">,
+C<"\xF00"> and C<"\U{F00}"> on P5P I<will> lead to a long and boring
+flamewar.
=head2 Create a char *sv_pvprintify(sv, STRLEN *lenp, UV flags)
- append a "..." to the produced string if the maximum length is exceeded
- really fancy: print unicode characters as \N{...}
-=head2 Autoload byte.pm
-
-When the lexer sees, for instance, C<bytes::length>, it should
-automatically load the C<bytes> pragma.
-
-=head2 Make "\u{XXXX}" et al work
-
-Danger, Will Robinson! Discussing the semantics of C<"\x{F00}">,
-C<"\xF00"> and C<"\U{F00}"> on P5P I<will> lead to a long and boring
-flamewar.
-
=head2 Overloadable regex assertions
This may or may not be possible with the current regular expression
http://www.unicode.org/unicode/reports/tr18/
-=head2 Unicode Scripts support
-
-Currently the C<\p{In...}> supports only the Blocks database, like
-C<\p{BasicLatin}>, C<\p{InGreek}>, C<\p{InThai}>, but there's also the
-Scripts database, which has members like C<Latin>, C<Greek>,
-C<Armenian>, C<Han>. It is desireable that also the script names
-could be used for the C<\p{In...}> construct. Note: needs to be
-researched whether this is possible, that is, are there conflicts
-between the Blocks and the Scripts, is the Blocks Greek the same as
-the Scripts Greek?
-
=head2 use Thread for iThreads
Artur Bergman's C<iThreads> module is a start on this, but needs to
target systems sees is not the same what the build host sees, various
input, output, and (Perl) library files need to be copied back and forth.
+As of 5.8.0 Configure mostly works for cross-compilation
+(used successfully for iPAQ Linux), miniperl gets built,
+but then building DynaLoader (and other extensions) fails
+since MakeMaker knows nothing of cross-compilation.
+(See INSTALL/Cross-compilation for the state of things.)
+
=head2 Perl preprocessor / macros
Source filters help with this, but do not get us all the way. For
It's unclear what this should do or how to do it without breaking old
code.
-=head2 Make tr/// return histogram
+=head2 Make tr/// return histogram of characters in list context
There is a patch for this, but it may require Unicodification.
=head2 "class"-based lexicals
Use flyweight objects, secure hashes or, dare I say it, pseudo-hashes instead.
+(Or whatever will replace pseudohashes in 5.10.)
=head2 byteperl
=head2 Lazy evaluation / tail recursion removal
-C<List::Util> in core gives some of these; tail recursion removal is
-done manually, with C<goto &whoami;>. (However, MJD has found that
-C<goto &whoami> introduces a performance penalty, so maybe there should
-be a way to do this after all: C<sub foo {START: ... goto START;> is
-better.)
+C<List::Util> gives first() (a short-circuiting grep); tail recursion
+removal is done manually, with C<goto &whoami;>. (However, MJD has
+found that C<goto &whoami> introduces a performance penalty, so maybe
+there should be a way to do this after all: C<sub foo {START: ... goto
+START;> is better.)
=head2 Make "use utf8" the default
-There is a patch available for this, search p5p archives for
-the Subject "[EXPERIMENTAL PATCH] make unicode (utf8) default"
-but this would be unacceptable because of backward compatibility:
-scripts could not contain B<any legacy eight-bit data>. Also would
-introduce a measurable slowdown of at least few percentages since all
-regular expression operations would be done in full UTF-8.
+Because of backward compatibility this is difficult: scripts could not
+contain B<any legacy eight-bit data> (like Latin-1) anymore, even in
+string literals or pod. Also would introduce a measurable slowdown of
+at least few percentages since all regular expression operations would
+be done in full UTF-8. But if you want to try this, add
+-DUSE_UTF8_SCRIPTS to your compilation flags.
+