X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperltodo.pod;h=534580ac183373da363229e1f69624413bdea26e;hb=8a36125691db1d8f79e98507373cbc6ea47271d4;hp=a8bfbabcb4c0b747ae39ce733d7947bcffdedaad;hpb=3958b146226a1f41d96ef1aa1e0dde87d11c5498;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perltodo.pod b/pod/perltodo.pod
index a8bfbab..534580a 100644
--- a/pod/perltodo.pod
+++ b/pod/perltodo.pod
@@ -20,15 +20,7 @@ of archives may be found at:
 C provides this, but the interface could be a lot more straightforward.

-=head2 Eliminate need for "use utf8";
-
-While the C pragma is autoloaded when necessary, it's still needed
-for things like Unicode characters in a source file. The UTF8 hint can
-always be set to true, but it needs to be set to false when F
-is being compiled. (To stop Perl trying to autoload the C
-pragma...)
-
-=head2 Autoload byte.pm
+=head2 Autoload bytes.pm

 When the lexer sees, for instance, C, it should automatically load the C pragma.
@@ -39,6 +31,32 @@ Danger, Will Robinson! Discussing the semantics of C<"\x{F00}">,
 C<"\xF00"> and C<"\U{F00}"> on P5P I lead to a long and boring flamewar.

+=head2 Create a char *sv_pvprintify(sv, STRLEN *lenp, UV flags)
+
+For displaying PVs with control characters, embedded nulls, and Unicode.
+This would be useful for printing warnings, or data and regex dumping,
+not_a_number(), and so on.
+
+Requirements: should handle both byte and UTF8 strings. isPRINT()
+characters printed as-is, character less than 256 as \xHH, Unicode
+characters as \x{HHH}. Don't assume ASCII-like, either, get somebody
+on EBCDIC to test the output.
+
+Possible options, controlled by the flags:
+- whitespace (other than ' ' of isPRINT()) printed as-is
+- use isPRINT_LC() instead of isPRINT()
+- print control characters like this: "\cA"
+- print control characters like this: "^A"
+- non-PRINTables printed as '.' instead of \xHH
+- use \OOO instead of \xHH
+- use the C/Perl-metacharacters like \n, \t
+- have a maximum length for the produced string (read it from *lenp)
+- append a "..." to the produced string if the maximum length is exceeded
+- really fancy: print unicode characters as \N{...}
+
+NOTE: pv_display(), pv_uni_display(), sv_uni_display() are doing
+something like the above.
+
 =head2 Overloadable regex assertions

 This may or may not be possible with the current regular expression
@@ -46,28 +64,57 @@ engine. The idea is that, for instance, C<\b> needs to be
 algorithmically computed if you're dealing with Thai text. Hence, the B<\b> assertion wants to be overloaded by a function.

-=head2 Unicode collation and normalization
+=head2 Unicode

-Simon Cozens promises to work on this.
+=over 4

- Collation? http://www.unicode.org/unicode/reports/tr10/
- Normalization? http://www.unicode.org/unicode/reports/tr15/
+=item *

-=head2 Unicode case mappings
+Allow for long form of the General Category Properties, e.g
+C<\p{IsOpenPunctuation}>, not just the abbreviated form, e.g.
+C<\p{IsPs}>.
+
+=item *
+
+Allow for the metaproperties: C, C,
+C, C (require the DerivedCoreProperties and
+DerviceNormalizationProperties files).
+
+There are also multiple value properties still unimplemented:
+C, C.
+
+=item *

  Case Mappings? http://www.unicode.org/unicode/reports/tr21/

-=head2 Unicode regular expression character classes
+lc(), uc(), lcfirst(), and ucfirst() work only for some of the
+simplest cases, where the mapping goes from a single Unicode character
+to another single Unicode character. See lib/unicore/SpecCase.txt
+(and CaseFold.txt).
+
+=item *

-They have some tricks Perl doesn't yet implement.
+They have some tricks Perl doesn't yet implement like character
+class subtraction.

  http://www.unicode.org/unicode/reports/tr18/

+=back
+
+See L for what's
+there and what's missing. Almost all of Levels 2 and 3 is missing,
+and as of 5.8.0 not even all of Level 1 is there.
+
 =head2 use Thread for iThreads

 Artur Bergman's C module is a start on this, but needs to be more mature.

+=head2 make perl_clone optionally clone ops
+
+So that pseudoforking, mod_perl, iThreads and nvi will work properly
+(but not as efficiently) until the regex engine is fixed to be threadsafe.
+
 =head2 Work out exit/die semantics for threads

 =head2 Typed lexicals for compiler
@@ -145,11 +192,6 @@ Have a way to introduce user-defined opcodes without the subroutine call
 overhead of an XSUB; the user should be able to create PP code. Simon Cozens has some ideas on this.

-=head2 spawnvp() on Win32
-
-Win32 has problems spawning processes, particularly when the arguments
-to the child process contain spaces, quotes or tab characters.
-
 =head2 DLL Versioning

 Windows needs a way to know what version of a XS or C DLL it's
@@ -254,7 +296,7 @@ That's to say, C would be the same as C
 =head2 Cross compilation

 Make Perl buildable with a cross-compiler. This will play havoc with
-Configure, which needs to how how the target system will respond to
+Configure, which needs to know how the target system will respond to
 its tests; maybe C will be a good starting point here. (Indeed, Bart Schuller reports that he compiled up C for the Agenda PDA and it works fine.) A really big spanner in the works
@@ -262,6 +304,12 @@ is the bootstrapping build process of Perl: if the filesystem the
 target systems sees is not the same what the build host sees, various input, output, and (Perl) library files need to be copied back and forth.

+As of 5.8.0 Configure mostly works for cross-compilation
+(used successfully for iPAQ Linux), miniperl gets built,
+but then building DynaLoader (and other extensions) fails
+since MakeMaker knows nothing of cross-compilation.
+(See INSTALL/Cross-compilation for the state of things.)
+
 =head2 Perl preprocessor / macros

 Source filters help with this, but do not get us all the way. For
@@ -296,10 +344,18 @@ has changed. Detecting a change is perhaps the difficult bit.

 =head2 All ARGV input should act like EE

+eg C doesn't currently read across multiple files.
+
 =head2 Support for rerunning debugger

 There should be a way of restarting the debugger on demand.

+=head2 Test Suite for the Debugger
+
+The debugger is a complex piece of software and fixing something
+here may inadvertently break something else over there. To tame
+this chaotic behaviour, a test suite is necessary.
+
 =head2 my sub foo { }

 The basic principle is sound, but there are problems with the semantics
@@ -455,6 +511,12 @@ Hugo van der Sanden plans to look at this.
 This has been done in places, but needs a thorough code review. Also fchdir is available in some platforms.

+=head2 Make v-strings overloaded objects
+
+Instead of having to guess whether a string is a v-string and thus
+needs to be displayed with %vd, make v-strings (readonly) objects
+(class "vstring"?) with a stringify overload.
+
 =head1 Vague ideas

 Ideas which have been discussed, and which may or may not happen.
@@ -464,7 +526,7 @@ Ideas which have been discussed, and which may or may not happen.
 It's unclear what this should do or how to do it without breaking old code.

-=head2 Make tr/// return histogram
+=head2 Make tr/// return histogram of characters in list context

 There is a patch for this, but it may require Unicodification.

@@ -753,6 +815,7 @@ Suggesting this on P5P B cause a boring and interminable flamewar.
 =head2 "class"-based lexicals

 Use flyweight objects, secure hashes or, dare I say it, pseudo-hashes instead.
+(Or whatever will replace pseudohashes in 5.10.)

 =head2 byteperl

@@ -760,8 +823,39 @@ C covers this.

 =head2 Lazy evaluation / tail recursion removal

-C in core gives some of these; tail recursion removal is
-done manually, with C. (However, MJD has found that
-C introduces a performance penalty, so maybe there should
-be a way to do this after all: C is
-better.)
+C gives first() (a short-circuiting grep); tail recursion
+removal is done manually, with C. (However, MJD has
+found that C introduces a performance penalty, so maybe
+there should be a way to do this after all: C is better.)
+
+=head2 Make "use utf8" the default
+
+Because of backward compatibility this is difficult: scripts could not
+contain B (like Latin-1) anymore, even in
+string literals or pod. Also would introduce a measurable slowdown of
+at least few percentages since all regular expression operations would
+be done in full UTF-8. But if you want to try this, add
+-DUSE_UTF8_SCRIPTS to your compilation flags.
+
+=head2 Unicode collation and normalization
+
+The Unicode::Collate and Unicode::Normalize modules
+by SADAHIRO Tomoyuki have been included since 5.8.0.
+
+ Collation? http://www.unicode.org/unicode/reports/tr10/
+ Normalization? http://www.unicode.org/unicode/reports/tr15/
+
+=head2 Create debugging macros
+
+Debugging macros (like printsv, dump) can make debugging perl inside a
+C debugger much easier. A good set for gdb comes with mod_perl.
+Something similar should be distributed with perl.
+
+The proper way to do this is to use and extend Devel::DebugInit.
+Devel::DebugInit also needs to be extended to support threads.
+
+See p5p archives for late May/early June 2001 for a recent discussion
+on this topic.
+
+=cut
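
Note on the sv_pvprintify() item added in this diff: the byte-string part of its requirements (isPRINT() characters as-is, everything else as \xHH, a maximum length, a trailing "...") can be sketched in plain C. This is only an illustration under stated assumptions, not the perl API. The function name printify_bytes, its signature, and the choice of not counting the "..." against the maximum length are all hypothetical. It uses the C library's isprint() in place of perl's isPRINT(), does not handle UTF8 strings or the \x{HHH} form, and has not been tried on EBCDIC.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of perl): escape srclen bytes of src into
 * dst, which has room for dstsize bytes and is always NUL-terminated.
 * Printable bytes are copied as-is; every other byte becomes \xHH.
 * If the escaped text would exceed maxlen characters, or would not fit
 * in dst, output stops and "..." is appended when there is room for it.
 * Returns the number of characters written, not counting the NUL.
 */
size_t
printify_bytes(const unsigned char *src, size_t srclen,
               char *dst, size_t dstsize, size_t maxlen)
{
    size_t out = 0;
    size_t i;

    assert(src != NULL && dst != NULL && dstsize > 0);

    for (i = 0; i < srclen; i++) {
        char piece[5];          /* "\xHH" plus the terminating NUL */
        size_t need;

        if (isprint(src[i])) {
            piece[0] = (char)src[i];
            piece[1] = '\0';
        } else {
            snprintf(piece, sizeof piece, "\\x%02X", (unsigned)src[i]);
        }
        need = strlen(piece);

        if (out + need > maxlen || out + need + 1 > dstsize) {
            /* Truncated: mark it with an ellipsis if it still fits. */
            if (out + 3 + 1 <= dstsize) {
                memcpy(dst + out, "...", 3);
                out += 3;
            }
            break;
        }
        memcpy(dst + out, piece, need);
        out += need;
    }
    dst[out] = '\0';
    return out;
}
```

The length is passed explicitly rather than found with strlen(), so embedded nulls come out as \x00 instead of ending the string. The flags listed in the item (\cA versus ^A, '.' for non-printables, \OOO) would each replace the single snprintf() branch above.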