C<perlio> provides this, but the interface could be a lot more
straightforward.
-=head2 Eliminate need for "use utf8";
-
-While the C<utf8> pragma is autoloaded when necessary, it's still needed
-for things like Unicode characters in a source file. The UTF8 hint can
-always be set to true, but it needs to be set to false when F<utf8.pm>
-is being compiled. (To stop Perl trying to autoload the C<utf8>
-pragma...)
-
-=head2 Autoload byte.pm
+=head2 Autoload bytes.pm
When the lexer sees, for instance, C<bytes::length>, it should
automatically load the C<bytes> pragma.
C<"\xF00"> and C<"\U{F00}"> on P5P I<will> lead to a long and boring
flamewar.
+=head2 Create a char *sv_pvprintify(sv, STRLEN *lenp, UV flags)
+
+For displaying PVs with control characters, embedded nulls, and Unicode.
+This would be useful for printing warnings, or data and regex dumping,
+not_a_number(), and so on.
+
+Requirements: should handle both byte and UTF-8 strings. isPRINT()
+characters are printed as-is, characters below 256 as \xHH, and Unicode
+characters as \x{HHH}. Don't assume an ASCII-like platform, either: get
+somebody on EBCDIC to test the output.
+
+Possible options, controlled by the flags:
+- whitespace (other than ' ' of isPRINT()) printed as-is
+- use isPRINT_LC() instead of isPRINT()
+- print control characters like this: "\cA"
+- print control characters like this: "^A"
+- non-PRINTables printed as '.' instead of \xHH
+- use \OOO instead of \xHH
+- use the C/Perl-metacharacters like \n, \t
+- have a maximum length for the produced string (read it from *lenp)
+- append a "..." to the produced string if the maximum length is exceeded
+- really fancy: print Unicode characters as \N{...}
+
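As a rough illustration of the requirements above, here is a minimal standalone sketch in plain C. It is not the proposed perl API: the names C<printify_bytes> and C<PRINTIFY_ELLIPSIS> are hypothetical. It handles byte strings only (no UTF-8, no \x{HHH}), and it uses the C library's isprint() rather than perl's isPRINT(). It shows just the \xHH escaping, the maximum length, and the optional "..." suffix.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag for this sketch; not part of any perl header. */
#define PRINTIFY_ELLIPSIS 0x01u

/*
 * Escape len bytes from src into out (capacity outsz, always NUL-terminated).
 * Printable bytes are copied as-is, everything else becomes \xHH.
 * maxlen caps the length of the escaped text (0 means no cap); an escape
 * sequence is never split. Returns the length written, excluding the NUL.
 */
static size_t
printify_bytes(const unsigned char *src, size_t len, size_t maxlen,
               unsigned flags, char *out, size_t outsz)
{
    size_t o = 0;
    size_t i;
    int truncated = 0;

    assert(out != NULL && outsz > 0);

    for (i = 0; i < len; i++) {
        char piece[5];          /* room for "\xHH" plus NUL */
        size_t n;

        if (isprint(src[i])) {
            piece[0] = (char)src[i];
            piece[1] = '\0';
            n = 1;
        } else {
            n = (size_t)snprintf(piece, sizeof piece, "\\x%02X",
                                 (unsigned)src[i]);
        }

        /* Stop before exceeding the cap or the output buffer. */
        if ((maxlen && o + n > maxlen) || o + n >= outsz) {
            truncated = 1;
            break;
        }
        memcpy(out + o, piece, n);
        o += n;
    }

    /* Append "..." only if requested and only if it still fits. */
    if (truncated && (flags & PRINTIFY_ELLIPSIS) && o + 3 < outsz) {
        memcpy(out + o, "...", 3);
        o += 3;
    }
    out[o] = '\0';
    return o;
}
```

With this sketch, the four bytes 'a', NUL, 'b', newline come out as C<a\x00b\x0A>, and "hello world" with a maximum length of 5 and the ellipsis flag comes out as C<hello...>. A real sv_pvprintify() would take an SV, honour the UTF-8 flag, and be tested on EBCDIC, none of which is attempted here.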
=head2 Overloadable regex assertions
This may or may not be possible with the current regular expression
algorithmically computed if you're dealing with Thai text. Hence, the
B<\b> assertion wants to be overloaded by a function.
-=head2 Unicode collation and normalization
+=head2 Unicode
-Simon Cozens promises to work on this.
+=over 4
- Collation? http://www.unicode.org/unicode/reports/tr10/
- Normalization? http://www.unicode.org/unicode/reports/tr15/
+=item *
+
+Allow for the long form of the General Category Properties, e.g.
+C<\p{IsOpenPunctuation}>, not just the abbreviated form, e.g.
+C<\p{IsPs}>.
+
+=item *
+
+Allow for the metaproperties: C<XID Start>, C<XID Continue>,
+C<NF*_NO>, C<NF*_MAYBE> (require the DerivedCoreProperties and
+DerivedNormalizationProperties files).
+
+There are also enumerated properties: C<Decomposition Type>,
+C<Numeric Type>, C<East Asian Width>, C<Line Break>. These
+properties have multiple values: for uniqueness the property
+value should be appended. For example, C<\p{IsAlphabetic}>
+would be the binary property, while C<\p{AlphabeticLineBreak}>
+would mean the enumerated property.
-=head2 Unicode case mappings
+=item *
Case Mappings? http://www.unicode.org/unicode/reports/tr21/
-=head2 Unicode regular expression character classes
+lc(), uc(), lcfirst(), and ucfirst() work only for some of the
+simplest cases, where the mapping goes from a single Unicode character
+to another single Unicode character. See lib/unicore/SpecCase.txt
+(and CaseFold.txt).
-They have some tricks Perl doesn't yet implement.
+=item *
+
+Unicode regular expressions have some tricks Perl doesn't yet implement,
+like character class subtraction.
http://www.unicode.org/unicode/reports/tr18/
+=back
+
+See L<perlunicode/UNICODE REGULAR EXPRESSION SUPPORT LEVEL> for what's
+there and what's missing. Almost all of Levels 2 and 3 is missing,
+and as of 5.8.0 not even all of Level 1 is there.
+
=head2 use Thread for iThreads
Artur Bergman's C<iThreads> module is a start on this, but needs to
be more mature.
+=head2 make perl_clone optionally clone ops
+
+So that pseudoforking, mod_perl, iThreads and nvi will work properly
+(but not as efficiently) until the regex engine is fixed to be threadsafe.
+
=head2 Work out exit/die semantics for threads
=head2 Typed lexicals for compiler
target system sees is not the same as what the build host sees, various
input, output, and (Perl) library files need to be copied back and forth.
+As of 5.8.0 Configure mostly works for cross-compilation
+(used successfully for iPAQ Linux), miniperl gets built,
+but then building DynaLoader (and other extensions) fails
+since MakeMaker knows nothing of cross-compilation.
+(See INSTALL/Cross-compilation for the state of things.)
+
=head2 Perl preprocessor / macros
Source filters help with this, but do not get us all the way. For
=head2 All ARGV input should act like E<lt>E<gt>
+For example, C<read(ARGV, ...)> doesn't currently read across multiple files.
+
=head2 Support for rerunning debugger
There should be a way of restarting the debugger on demand.
+=head2 Test Suite for the Debugger
+
+The debugger is a complex piece of software and fixing something
+here may inadvertently break something else over there. To tame
+this chaotic behaviour, a test suite is necessary.
+
=head2 my sub foo { }
The basic principle is sound, but there are problems with the semantics
full-text search, an index function, locating pages on a particular
high-level subject, and so on.
-=head2 Install .3p man pages
+=head2 Install .3p manpages
-This is a bone of contention; we can create C<.3p> man pages for each
+This is a bone of contention; we can create C<.3p> manpages for each
built-in function, but should we install them by default? Tcl does this,
and it clutters up C<apropos>.
Simon Cozens promises to do this before he gets old.
=head2 Update POSIX.pm for 1003.1-2
+
=head2 Retargetable installation
Allow C<@INC> to be changed after Perl is built.
It's unclear what this should do or how to do it without breaking old
code.
-=head2 Make tr/// return histogram
+=head2 Make tr/// return histogram of characters in list context
There is a patch for this, but it may require Unicodification.
=head2 Compile to real threaded code
+
=head2 Structured types
+
=head2 Modifiable $1 et al.
($x = "elephant") =~ /e(ph)/;
Not needed now we have lexical IO handles.
=head2 format BOTTOM
+
=head2 report HANDLE
Damian Conway's text formatting modules seem to be the Way To Go.
=head2 Generalised want()/caller()
+
=head2 Named prototypes
These both seem to be delayed until Perl 6.
=head2 "class"-based lexicals
Use flyweight objects, secure hashes or, dare I say it, pseudo-hashes instead.
+(Or whatever will replace pseudo-hashes in 5.10.)
=head2 byteperl
=head2 Lazy evaluation / tail recursion removal
-C<List::Util> in core gives some of these; tail recursion removal is
-done manually, with C<goto &whoami;>. (However, MJD has found that
-C<goto &whoami> introduces a performance penalty, so maybe there should
-be a way to do this after all: C<sub foo {START: ... goto START;> is
-better.)
+C<List::Util> gives first() (a short-circuiting grep); tail recursion
+removal is done manually, with C<goto &whoami;>. (However, MJD has
+found that C<goto &whoami> introduces a performance penalty, so maybe
+there should be a way to do this after all: C<sub foo {START: ... goto
+START; }> is better.)
+
+=head2 Make "use utf8" the default
+
+Because of backward compatibility this is difficult: scripts could not
+contain B<any legacy eight-bit data> (like Latin-1) anymore, even in
+string literals or pod. It would also introduce a measurable slowdown of
+at least a few percent, since all regular expression operations would
+be done in full UTF-8. But if you want to try this, add
+-DUSE_UTF8_SCRIPTS to your compilation flags.
+
+=head2 Unicode collation and normalization
+
+The Unicode::Collate and Unicode::Normalize modules
+by SADAHIRO Tomoyuki have been included since 5.8.0.
+
+ Collation? http://www.unicode.org/unicode/reports/tr10/
+ Normalization? http://www.unicode.org/unicode/reports/tr15/
+
+=head2 Create debugging macros
+
+Debugging macros (like printsv, dump) can make debugging perl inside a
+C debugger much easier. A good set for gdb comes with mod_perl.
+Something similar should be distributed with perl.
+
+The proper way to do this is to use and extend Devel::DebugInit.
+Devel::DebugInit also needs to be extended to support threads.
+
+See p5p archives for late May/early June 2001 for a recent discussion
+on this topic.
+
+=cut