3 perltodo - Perl TO-DO List
7 This is a list of wishes for Perl. Send updates to
8 I<perl5-porters@perl.org>. If you want to work on any of these
9 projects, be sure to check the perl5-porters archives for past ideas,
10 flames, and propaganda. This will save you time and also prevent you
11 from implementing something that Larry has already vetoed. One set
12 of archives may be found at:
14 http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/
16 =head1 To do during 5.6.x
18 =head2 Support for I/O disciplines
20 C<perlio> provides this, but the interface could be a lot more
23 =head2 Autoload bytes.pm
25 When the lexer sees, for instance, C<bytes::length>, it should
26 automatically load the C<bytes> pragma.
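As a rough illustration of what the autoloading would save, the sketch below loads C<bytes> by hand before calling C<bytes::length>. The variable names are made up for the example; only C<bytes::length> and the built-in C<length> come from Perl itself.

```perl
use strict;
use warnings;

# Load the module without enabling byte semantics for this scope.
# This explicit step is what the wish above would make unnecessary.
use bytes ();

my $smiley = "\x{263A}";                 # one character, three bytes in UTF-8

my $chars  = length($smiley);            # counts characters
my $octets = bytes::length($smiley);     # counts bytes of the internal encoding

print "characters: $chars, bytes: $octets\n";
```

Without the C<use bytes ();> line the call to C<bytes::length> would not find the function, which is the inconvenience this item describes.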
28 =head2 Make "\u{XXXX}" et al work
30 Danger, Will Robinson! Discussing the semantics of C<"\x{F00}">,
31 C<"\xF00"> and C<"\U{F00}"> on P5P I<will> lead to a long and boring
34 =head2 Create a char *sv_pvprintify(sv, STRLEN *lenp, UV flags)
36 For displaying PVs with control characters, embedded nulls, and Unicode.
37 This would be useful for printing warnings, or data and regex dumping,
38 not_a_number(), and so on.
40 Requirements: should handle both byte and UTF8 strings. isPRINT()
41 characters printed as-is, characters less than 256 as \xHH, Unicode
42 characters as \x{HHH}. Don't assume ASCII-like, either, get somebody
43 on EBCDIC to test the output.
45 Possible options, controlled by the flags:
46 - whitespace (other than ' ' of isPRINT()) printed as-is
47 - use isPRINT_LC() instead of isPRINT()
48 - print control characters like this: "\cA"
49 - print control characters like this: "^A"
50 - non-PRINTables printed as '.' instead of \xHH
51 - use \OOO instead of \xHH
52 - use the C/Perl-metacharacters like \n, \t
53 - have a maximum length for the produced string (read it from *lenp)
54 - append a "..." to the produced string if the maximum length is exceeded
55 - really fancy: print unicode characters as \N{...}
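The basic requirement (before any of the optional flags) can be sketched at the Perl level. This is only an illustration of the escaping rule, not the proposed C function: C<printify> is a hypothetical name, and the printable test assumes ASCII, which is exactly the assumption the real implementation is asked not to make.

```perl
use strict;
use warnings;

# Hypothetical Perl-level sketch of the escaping rule described above.
# ASSUMPTION: "printable" means ASCII 0x20..0x7E; an EBCDIC-safe version
# would need isPRINT() semantics instead.
sub printify {
    my ($str) = @_;
    return join '', map {
        my $ord = ord;
        ($ord >= 0x20 && $ord <= 0x7E) ? $_                        # printable: as-is
      : $ord < 256                     ? sprintf('\\x%02X', $ord)  # byte: \xHH
      :                                  sprintf('\\x{%X}', $ord); # Unicode: \x{HHH}
    } split //, $str;
}

my $escaped = printify("a\0b\x{263A}\n");
print "$escaped\n";
```

The optional behaviours in the list (maximum length, C<"\cA"> style, and so on) would each become a flag test inside the C<map> block.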
57 =head2 Overloadable regex assertions
59 This may or may not be possible with the current regular expression
60 engine. The idea is that, for instance, C<\b> needs to be
61 algorithmically computed if you're dealing with Thai text. Hence, the
62 B<\b> assertion wants to be overloaded by a function.
70 Allow for the long form of the General Category Properties, e.g.
71 C<\p{IsOpenPunctuation}>, not just the abbreviated form, e.g.
76 Allow for the metaproperties C<Any> and C<Assigned>, and C<Common>;
77 C<Alphabetic>, C<Ideographic>, C<Lowercase>, C<Uppercase> (note that
78 the latter two are larger classes than the general categories C<Lu> and C<Ll>),
79 C<White Space>, C<Bidi Control>, C<Join Control>, C<ASCII Hex Digit>,
80 C<Hex Digit>, C<Noncharacter Code Point>, C<ID Start>, C<ID Continue>,
81 C<XID Start>, C<XID Continue>, C<NF*_NO>, C<NF*_MAYBE>.
83 There are also enumerated properties: C<Decomposition Type>,
84 C<Numeric Type>, C<East Asian Width>, C<Line Break>. These
85 properties have multiple values: for uniqueness the property
86 value should be appended. For example, C<\p{IsAlphabetic}>
87 would be the binary property, while C<\p{AlphabeticLineBreak}>
88 would mean the enumerated property.
92 Case Mappings? http://www.unicode.org/unicode/reports/tr21/
94 lc(), uc(), lcfirst(), and ucfirst() work only for some of the
95 simplest cases, where the mapping goes from a single Unicode character
96 to another single Unicode character. See lib/unicore/SpecCase.txt
101 They have some tricks Perl doesn't yet implement like character
104 http://www.unicode.org/unicode/reports/tr18/
108 See L<perlunicode/UNICODE REGULAR EXPRESSION SUPPORT LEVEL> for what's
109 there and what's missing. Almost all of Levels 2 and 3 is missing,
110 and as of 5.8.0 not even all of Level 1 is there.
112 =head2 use Thread for iThreads
114 Artur Bergman's C<iThreads> module is a start on this, but needs to
117 =head2 make perl_clone optionally clone ops
119 So that pseudoforking, mod_perl, iThreads and nvi will work properly
120 (but not as efficiently) until the regex engine is fixed to be threadsafe.
122 =head2 Work out exit/die semantics for threads
124 =head2 Typed lexicals for compiler
126 =head2 Compiler workarounds for Win32
128 =head2 AUTOLOADing in the compiler
130 =head2 Fixing comppadlist when compiling
132 =head2 Cleaning up exported namespace
134 =head2 Complete signal handling
136 Add C<PERL_ASYNC_CHECK> to opcodes which loop; replace C<sigsetjmp> with
137 C<sigjmp>; check C<wait> for signal safety.
139 =head2 Out-of-source builds
141 This was done for 5.6.0, but needs reworking for 5.7.x
143 =head2 POSIX realtime support
145 POSIX 1003.1 1996 Edition support--realtime stuff: POSIX semaphores,
146 message queues, shared memory, realtime clocks, timers, signals (the
147 metaconfig units mostly already exist for these)
149 =head2 UNIX98 support
151 Reader-writer locks, realtime/asynchronous IO
155 There are non-core modules, such as C<Net::IPv6>, but these will need
156 integrating when IPv6 actually starts to really happen. See RFC 2292
159 =head2 Long double conversion
161 Floating point formatting is still causing some weird test failures.
165 Locales and Unicode interact with each other in unpleasant ways.
166 One possible solution would be to adopt/support ICU:
168 http://oss.software.ibm.com/developerworks/opensource/icu/project/
170 =head2 Thread-safe regexes
172 The regular expression engine is currently non-threadsafe.
174 =head2 Arithmetic on non-Arabic numerals
176 C<[1234567890]> aren't the only numerals any more.
178 =head2 POSIX Unicode character classes
180 ([=a=] for equivalence classes, [.ch.] for collation.)
181 These are dependent on Unicode normalization and collation.
183 =head2 Factoring out common suffixes/prefixes in regexps (trie optimization)
185 Currently, the user has to optimize C<foo|far> and C<foo|goo> into
186 C<f(?:oo|ar)> and C<[fg]oo> by hand; this could be done automatically.
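The equivalence being relied on can be checked with a small sketch. The word list is invented for the example; the patterns are the ones quoted above, anchored so that whole words are compared.

```perl
use strict;
use warnings;

my @words = qw(foo far fur goo boo);

# Alternations as a user would naturally write them ...
my @plain_prefix = grep { /^(?:foo|far)$/ } @words;
my @plain_suffix = grep { /^(?:foo|goo)$/ } @words;

# ... and the hand-factored forms that match the same strings.
my @fact_prefix  = grep { /^f(?:oo|ar)$/ } @words;
my @fact_suffix  = grep { /^[fg]oo$/ }     @words;

print "prefix: @plain_prefix / @fact_prefix\n";
print "suffix: @plain_suffix / @fact_suffix\n";
```

Both pairs select the same words; the wish is for the regex compiler to perform the factoring itself.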
188 =head2 Security audit shipped utilities
190 All the code we ship with Perl needs to be sensible about temporary file
191 handling, locking, input validation, and so on.
193 =head2 Custom opcodes
195 Have a way to introduce user-defined opcodes without the subroutine call
196 overhead of an XSUB; the user should be able to create PP code. Simon
197 Cozens has some ideas on this.
199 =head2 spawnvp() on Win32
201 Win32 has problems spawning processes, particularly when the arguments
202 to the child process contain spaces, quotes or tab characters.
204 =head2 DLL Versioning
206 Windows needs a way to know what version of an XS or C<libperl> DLL it's
209 =head2 Introduce @( and @)
211 C<$(> may return "foo bar baz". Unfortunately, since groups can
212 theoretically have spaces in their names, this could be one, two or
215 =head2 Floating point handling
217 C<NaN> and C<inf> support is particularly troublesome.
218 (fp_classify(), fp_class(), fp_class_d(), class(), isinf(),
219 isfinite(), finite(), isnormal(), unordered(), <ieeefp.h>,
220 <fp_class.h> (there are metaconfig units for all these) (I think),
221 fp_setmask(), fp_getmask(), fp_setround(), fp_getround()
222 (no metaconfig units yet for these). Don't forget finitel(), fp_classl(),
223 fp_class_l(), (yes, both do, unfortunately, exist), and unorderedl().)
225 As of Perl 5.6.1, there is a Perl macro, Perl_isnan().
227 =head2 IV/UV preservation
229 Nicholas Clark has done a lot of work on this, but work is continuing.
230 C<+>, C<-> and C<*> work, but guards need to be in place for C<%>, C</>,
231 C<&>, C<oct>, C<hex> and C<pack>.
233 =head2 Replace pod2html with something using Pod::Parser
235 The CPAN module C<Malik::Pod::Html> may be a more suitable basis for a
236 C<pod2html> convertor; the current one duplicates the functionality
237 abstracted in C<Pod::Parser>, which makes updating the POD language
240 =head2 Automate module testing on CPAN
242 When a new Perl is being beta tested, porters have to manually grab
243 their favourite CPAN modules and test them - this should be done
246 =head2 sendmsg and recvmsg
248 We have all the other BSD socket functions but these. There are
249 metaconfig units for these functions which can be added. To avoid these
250 being new opcodes, a solution similar to the way C<sockatmark> was added
251 would be preferable. (Autoload the C<IO::whatever> module.)
253 =head2 Rewrite perlre documentation
255 The new-style patterns need full documentation, and the whole document
256 needs to be a lot clearer.
258 =head2 Convert example code to IO::Handle filehandles
260 =head2 Document Win32 choices
262 =head2 Check new modules
264 =head2 Make roffitall find pods and libs itself
266 Simon Cozens has done some work on this but it needs a rethink.
268 =head1 To do at some point
270 These are ideas that have been regularly tossed around, that most
271 people believe should be done maybe during 5.8.x
273 =head2 Remove regular expression recursion
275 Because the regular expression engine is recursive, badly designed
276 expressions can lead to lots of recursion filling up the stack. Ilya
277 claims that it is easy to convert the engine to being iterative, but
278 this has still not yet been done. There may be a regular expression
279 engine hit squad meeting at TPC5.
281 =head2 Memory leaks after failed eval
283 Perl will leak memory if you C<eval "hlagh hlagh hlagh hlagh">. This is
284 partially because it attempts to build up an op tree for that code and
285 doesn't properly free it. The same goes for non-syntactically-correct
286 regular expressions. Hugo looked into this, but decided it needed a
287 mark-and-sweep GC implementation.
289 Alan notes that: The basic idea was to extend the parser token stack
290 (C<YYSTYPE>) to include a type field so we knew what sort of thing each
291 element of the stack was. The F<perly.c> code would then have to be
292 postprocessed to record the type of each entry on the stack as it was
293 created, and the parser patched so that it could unroll the stack
296 This is possible to do, but would be pretty messy to implement, as it
297 would rely on even more sed hackery in F<perly.fixer>.
299 =head2 pack "(stuff)*"
301 That's to say, C<pack "(sI)40"> would be the same as C<pack "sI"x40>
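A minimal sketch of the intended equivalence follows. It assumes it is run on a perl whose C<pack> accepts parenthesised groups, which is an assumption about the interpreter rather than something this list states; the data is arbitrary.

```perl
use strict;
use warnings;

my @values = (1 .. 80);                      # 40 (short, unsigned int) pairs

# Template repeated by string repetition, as it has to be written by hand.
my $flat    = pack "sI" x 40, @values;

# Grouped form proposed above.
# ASSUMPTION: the running perl supports "(...)" groups in pack templates.
my $grouped = pack "(sI)40", @values;

print $flat eq $grouped ? "identical\n" : "different\n";
```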
303 =head2 bitfields in pack
305 =head2 Cross compilation
307 Make Perl buildable with a cross-compiler. This will play havoc with
308 Configure, which needs to know how the target system will respond to
309 its tests; maybe C<microperl> will be a good starting point here.
310 (Indeed, Bart Schuller reports that he compiled up C<microperl> for
311 the Agenda PDA and it works fine.) A really big spanner in the works
312 is the bootstrapping build process of Perl: if the filesystem the
313 target system sees is not the same as what the build host sees, various
314 input, output, and (Perl) library files need to be copied back and forth.
316 As of 5.8.0 Configure mostly works for cross-compilation
317 (used successfully for iPAQ Linux), miniperl gets built,
318 but then building DynaLoader (and other extensions) fails
319 since MakeMaker knows nothing of cross-compilation.
320 (See INSTALL/Cross-compilation for the state of things.)
322 =head2 Perl preprocessor / macros
324 Source filters help with this, but do not get us all the way. For
325 instance, it should be possible to implement the C<??> operator somehow;
326 source filters don't (quite) cut it.
328 =head2 Perl lexer in Perl
330 Damian Conway is planning to work on this, but it hasn't happened yet.
332 =head2 Using POSIX calls internally
334 When faced with a BSD vs. SysV-style interface to some library or
335 system function, perl's roots show in that it typically prefers the BSD
336 interface (but falls back to the SysV one). One example is getpgrp().
337 Other examples include C<memcpy> vs. C<bcopy>. There are others, mostly in
340 Mostly, this item is a suggestion for which way to start a journey into
341 an C<#ifdef> forest. It is not primarily a suggestion to eliminate any of
342 the C<#ifdef> forests.
344 POSIX calls are perhaps more likely to be portable to unexpected
345 architectures. They are also perhaps more likely to be actively
346 maintained by a current vendor. They are also perhaps more likely to be
347 available in thread-safe versions, if appropriate.
349 =head2 -i rename file when changed
351 When editing in place, it's only necessary to rename a file if the file
352 has changed. Detecting a change is perhaps the difficult bit.
354 =head2 All ARGV input should act like E<lt>E<gt>
356 e.g. C<read(ARGV, ...)> doesn't currently read across multiple files.
358 =head2 Support for rerunning debugger
360 There should be a way of restarting the debugger on demand.
362 =head2 Test Suite for the Debugger
364 The debugger is a complex piece of software and fixing something
365 here may inadvertently break something else over there. To tame
366 this chaotic behaviour, a test suite is necessary.
368 =head2 my sub foo { }
370 The basic principle is sound, but there are problems with the semantics
371 of self-referential and mutually referential lexical subs: how to
374 =head2 One-pass global destruction
376 Sweeping away all the allocated memory in one go is a laudable goal, but
377 it's difficult and in most cases, it's easier to let the memory get
380 =head2 Rewrite regexp parser
382 There has been talk recently of rewriting the regular expression parser
383 to produce an optree instead of a chain of opcodes; it's unclear whether
384 or not this would be a win.
386 =head2 Cache recently used regexps
390 for my $re (@regexps) {
394 C<qr//> already gives us a way of saving compiled regexps, but it should
395 be done automatically.
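For comparison, this is roughly what the manual approach with C<qr//> looks like today. The pattern strings and input lines are invented for the example.

```perl
use strict;
use warnings;

my @patterns = ('^foo', 'bar$', '\d+');

# Compile each pattern once, up front; this is the step that the
# wish above would have perl do behind the scenes.
my @regexps = map { qr/$_/ } @patterns;

my @lines = ('foo 1', 'no match', 'crowbar');
my %hits;
for my $line (@lines) {
    for my $re (@regexps) {
        $hits{$line}++ if $line =~ $re;     # reuses the compiled regexp
    }
}

printf "%-10s %d\n", $_, $hits{$_} || 0 for @lines;
```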
397 =head2 Re-entrant functions
399 Add configure probes for C<_r> forms of system calls and fit them to the
400 core. Unfortunately, calling conventions for these functions and not
403 =head2 Cross-compilation support
405 Bart Schuller reports that using C<microperl> and a cross-compiler, he
406 got Perl working on the Agenda PDA. However, one cannot build a full
407 Perl because Configure needs to get the results for the target platform,
410 =head2 Bit-shifting bitvectors
414 vec($v, 1000, 1) = 1;
416 One should be able to do
420 and have the 999'th bit set.
422 Currently, if you try to shift bitvectors, you shift the NV/UV instead
423 of the bits in the PV. Not very logical.
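Until something like this exists, the effect can be emulated by hand. The sketch below is one possible workaround, not the proposed feature: it goes through a bit string with C<unpack "b*">, drops the lowest bit so that every other bit moves down by one, and packs the result again.

```perl
use strict;
use warnings;

my $v = '';
vec($v, 1000, 1) = 1;                 # set bit 1000

# Emulate a one-bit shift towards bit 0 on the string itself.
my $bits = unpack 'b*', $v;           # '0'/'1' string, bit 0 first
substr($bits, 0, 1) = '';             # discard bit 0; bit N becomes bit N-1
my $shifted = pack 'b*', $bits;       # pack pads the last byte with zero bits

print "bit 999:  ", vec($shifted, 999, 1),  "\n";
print "bit 1000: ", vec($shifted, 1000, 1), "\n";
```

Building an intermediate string of one character per bit is wasteful for large vectors, which is part of the case for native support.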
425 =head2 debugger pragma
427 The debugger is implemented in Perl in F<perl5db.pl>; turning it into a
428 pragma should be easy, but making it work lexically might be more
429 difficult. Fiddling with C<$^P> would be necessary.
431 =head2 use less pragma
433 Identify areas where speed/memory tradeoffs can be made and have a hint
434 to switch between them.
436 =head2 switch structures
438 Although we have C<Switch.pm> in core, Larry points to the dormant
439 C<nswitch> and C<cswitch> ops in F<pp.c>; using these opcodes would be
442 =head2 Cache eval tree
446 =head2 Shrink opcode tables
448 =head2 Optimize away @_
450 Look at the "reification" code in C<av.c>
452 =head2 Prototypes versus indirect objects
454 Currently, indirect object syntax bypasses prototype checks.
458 HTML versions of the documentation need to be installed by default; a
459 call to C<installhtml> from C<installperl> may be all that's necessary.
461 =head2 Prototype method calls
463 =head2 Return context prototype declarations
467 =head2 Garbage collection
469 There have been persistent mumblings about putting a mark-and-sweep
470 garbage detector into Perl; Alan Burlison has some ideas about this.
474 Mark-Jason Dominus has the beginnings of one of these.
476 =head2 pack/unpack tutorial
478 Simon Cozens has the beginnings of one of these.
480 =head2 Rewrite perldoc
482 There are a few suggestions for what to do with C<perldoc>: maybe a
483 full-text search, an index function, locating pages on a particular
484 high-level subject, and so on.
486 =head2 Install .3p manpages
488 This is a bone of contention; we can create C<.3p> manpages for each
489 built-in function, but should we install them by default? Tcl does this,
490 and it clutters up C<apropos>.
492 =head2 Unicode tutorial
494 Simon Cozens promises to do this before he gets old.
496 =head2 Update POSIX.pm for 1003.1-2
498 =head2 Retargetable installation
500 Allow C<@INC> to be changed after Perl is built.
502 =head2 POSIX emulation on non-POSIX systems
504 Make C<POSIX.pm> behave as POSIXly as possible everywhere, meaning we
505 have to implement POSIX equivalents for some functions if necessary.
507 =head2 Rename Win32 headers
509 =head2 Finish off lvalue functions
511 They don't work in the debugger, and they don't work for list or hash
514 =head2 Update sprintf documentation
516 Hugo van der Sanden plans to look at this.
518 =head2 Use fchown/fchmod internally
520 This has been done in places, but needs a thorough code review.
521 Also fchdir is available in some platforms.
525 Ideas which have been discussed, and which may or may not happen.
527 =head2 ref() in list context
529 It's unclear what this should do or how to do it without breaking old
532 =head2 Make tr/// return histogram of characters in list context
534 There is a patch for this, but it may require Unicodification.
536 =head2 Compile to real threaded code
538 =head2 Structured types
540 =head2 Modifiable $1 et al.
542 ($x = "elephant") =~ /e(ph)/;
543 $1 = "g"; # $x = "elegant"
545 What happens if there are multiple (nested?) brackets? What if the
546 string changes between the match and the assignment?
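The effect in the example can already be had with a little more typing, using the match offsets in C<@-> and C<@+> together with four-argument-style C<substr> assignment. This is a sketch of a workaround, not of the proposed semantics, and it sidesteps the open questions by acting immediately after the match.

```perl
use strict;
use warnings;

my $x = "elephant";

if ($x =~ /e(ph)/) {
    # $-[1] and $+[1] are the start and end offsets of the first group.
    my ($start, $end) = ($-[1], $+[1]);
    substr($x, $start, $end - $start) = "g";
}

print "$x\n";
```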
548 =head2 Procedural interfaces for IO::*, etc.
550 Some core modules have been accused of being overly-OO. Adding
551 procedural interfaces could demystify them.
555 =head2 Attach/detach debugger from running program
557 With C<gdb>, you can attach the debugger to a running program if you
558 pass the process ID. It would be good to do this with the Perl debugger
559 on a running Perl program, although I'm not sure how it would be done.
561 =head2 Alternative RE syntax module
564 $re = Regex::Newbie->new
567 ->repeat(Regex::Newbie->class("char"),3)
573 A non-core module that would use "native" GUI to create graphical
576 =head2 foreach(reverse ...)
580 foreach (reverse @_) { ... }
582 puts C<@_> on the stack, reverses it putting the reversed version on the
583 stack, then iterates forwards. Instead, it could be special-cased to put
584 C<@_> on the stack then iterate backwards.
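The optimisation amounts to doing internally what the index-based loop below does explicitly. The array contents are invented; both loops visit the elements in the same order, but only the first builds a reversed copy.

```perl
use strict;
use warnings;

my @items = qw(a b c d);

# What people write: a reversed copy is built, then walked forwards.
my @via_reverse;
push @via_reverse, $_ foreach reverse @items;

# What the special case would effectively do: walk the original backwards.
my @via_index;
for (my $i = $#items; $i >= 0; $i--) {
    push @via_index, $items[$i];
}

print "@via_reverse\n@via_index\n";
```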
586 =head2 Constant function cache
588 =head2 Approximate regular expression matching
592 These items B<always> need doing:
594 =head2 Update guts documentation
596 Simon Cozens tries to do this when possible, and contributions to the
597 C<perlapi> documentation are welcome.
599 =head2 Add more tests
601 Michael Schwern will donate $500 to Yet Another Society when all core
604 =head2 Update auxiliary tools
606 The code we ship with Perl should look like good Perl 5.
608 =head1 Recently done things
610 These are things which have been on the todo lists in previous releases
611 but have recently been completed.
613 =head2 Safe signal handling
615 A new signal model went into 5.7.1 without much fanfare. Operations and
616 C<malloc>s are no longer interrupted by signals, which are handled
617 between opcodes. This means that C<PERL_ASYNC_CHECK> now actually does
618 something. However, there are still a few things that need to be done.
622 Modules which implement arrays in terms of strings, substrings or files
623 can be found on the CPAN.
627 C<Time::HiRes> has been integrated into the core.
629 =head2 setitimer and getitimer
631 Adding C<Time::HiRes> got us this too.
633 =head2 Testing __DIE__ hook
635 Tests have been added.
637 =head2 CPP equivalent in Perl
639 A C Yardley will probably have done this by the time you can read this.
640 This allows for a generalization of the C constant detection used in
641 building C<Errno.pm>.
643 =head2 Explicit switch statements
645 C<Switch.pm> has been integrated into the core to give you all manner of
646 C<switch...case> semantics.
654 Nick Ing-Simmons has made UTF-EBCDIC (UTR13) work with Perl.
656 EBCDIC? http://www.unicode.org/unicode/reports/tr16/
660 Although there are probably some small bugs to be rooted out, Jarkko
661 Hietaniemi has made regular expressions polymorphic between bytes and
664 =head2 perlcc to produce executable
666 C<perlcc> was recently rewritten, and can now produce standalone
669 =head2 END blocks saved in compiled output
671 =head2 Secure temporary file module
673 Tim Jenness' C<File::Temp> is now in core.
675 =head2 Integrate Time::HiRes
677 This module is now part of core.
679 =head2 Turn Cwd into XS
681 Benjamin Sugars has done this.
683 =head2 Mmap for input
685 Nick Ing-Simmons' C<perlio> supports an C<mmap> IO method.
687 =head2 Byte to/from UTF8 and UTF8 to/from local conversion
689 C<Encode> provides this.
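A short round trip shows the kind of conversion meant here, using the C<encode> and C<decode> functions that C<Encode> exports. The sample character is arbitrary.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $chars = "\x{263A}";                    # a character string
my $bytes = encode('UTF-8', $chars);       # characters -> UTF-8 bytes
my $back  = decode('UTF-8', $bytes);       # UTF-8 bytes -> characters

printf "characters: %d, bytes: %d, round trip %s\n",
    length($chars), length($bytes), ($back eq $chars ? "ok" : "failed");
```

Conversions to and from a local encoding follow the same pattern with a different encoding name.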
691 =head2 Add sockatmark support
695 =head2 Mailing list archives
697 http://lists.perl.org/, http://archive.develooper.com/
701 Richard Foley has written the bug tracking system at http://bugs.perl.org/
703 =head2 Integrate MacPerl
705 Chris Nandor and Matthias Neeracher have integrated the MacPerl changes
708 =head2 Web "nerve center" for Perl
710 http://use.perl.org/ is what you're looking for.
712 =head2 Regular expression tutorial
714 C<perlretut>, provided by Mark Kvale.
716 =head2 Debugging Tutorial
718 C<perldebtut>, written by Richard Foley.
720 =head2 Integrate new modules
722 Jarkko has been integrating madly into 5.7.x
724 =head2 Integrate profiler
726 C<Devel::DProf> is now a core module.
728 =head2 Y2K error detection
730 There's a configure option to detect unsafe concatenation with "19", and
731 a CPAN module. (C<D'oh::Year>)
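The mistake being detected looks like the following. The year value is hard-coded here to keep the example reproducible; it stands in for element 5 of what C<localtime> returns, which counts years since 1900.

```perl
use strict;
use warnings;

# Stand-in for (localtime)[5] in the year 2001: years since 1900.
my $tm_year = 101;

my $wrong = "19" . $tm_year;     # the unsafe concatenation
my $right = 1900 + $tm_year;     # the arithmetic that was intended

print "concatenated: $wrong\n";
print "added:        $right\n";
```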
733 =head2 Regular expression debugger
735 While not part of core, Mark-Jason Dominus has written C<Rx> and has
736 also come up with a generalised strategy for regular expression
741 That's, uh, F<podchecker>
743 =head2 "Dynamic" lexicals
745 =head2 Cache precompiled modules
747 =head1 Deprecated Wishes
749 These are items which used to be in the todo file, but have been
750 deprecated for some reason.
752 =head2 Loop control on do{}
754 This would break old code; use C<do{{ }}> instead.
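The doubled braces give C<next> an ordinary block to act on inside the C<do>. A minimal sketch, with an invented counter:

```perl
use strict;
use warnings;

my @seen;
my $i = 0;

do {{
    next if $i == 2;        # leaves the inner block, then the condition runs
    push @seen, $i;
}} while (++$i < 5);

print "@seen\n";
```

Here C<next> skips the rest of the body for one value only; C<last> needs a different arrangement, with the extra block placed around the whole C<do ... while> statement.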
756 =head2 Lexically scoped typeglobs
758 Not needed now we have lexical IO handles.
764 Damian Conway's text formatting modules seem to be the Way To Go.
766 =head2 Generalised want()/caller()
768 =head2 Named prototypes
770 These both seem to be delayed until Perl 6.
772 =head2 Built-in globbing
774 The C<File::Glob> module has been used to replace the C<glob> function.
776 =head2 Regression tests for suidperl
778 C<suidperl> is deprecated in favour of common sense.
780 =head2 Cached hash values
782 We have shared hash keys, which perform the same job.
784 =head2 Add compression modules
786 The compression modules are a little heavy; meanwhile, Nick Clark is
787 working on experimental pragmata to do transparent decompression on
790 =head2 Reorganise documentation into tutorials/references
792 Could not get consensus on P5P about this.
794 =head2 Remove distinction between functions and operators
796 Caution: highly flammable.
798 =head2 Make XS easier to use
800 Use C<Inline> instead, or SWIG.
802 =head2 Make embedding easier to use
808 See the Perl Power Tools. (http://language.perl.com/ppt/)
810 =head2 my $Package::variable
814 =head2 "or" tests defined, not truth
816 Suggesting this on P5P B<will> cause a boring and interminable flamewar.
818 =head2 "class"-based lexicals
820 Use flyweight objects, secure hashes or, dare I say it, pseudo-hashes instead.
821 (Or whatever will replace pseudohashes in 5.10.)
825 C<ByteLoader> covers this.
827 =head2 Lazy evaluation / tail recursion removal
829 C<List::Util> gives first() (a short-circuiting grep); tail recursion
830 removal is done manually, with C<goto &whoami;>. (However, MJD has
831 found that C<goto &whoami> introduces a performance penalty, so maybe
832 there should be a way to do this after all: C<sub foo {START: ... goto
835 =head2 Make "use utf8" the default
837 Because of backward compatibility this is difficult: scripts could not
838 contain B<any legacy eight-bit data> (like Latin-1) anymore, even in
839 string literals or pod. Also would introduce a measurable slowdown of
840 at least a few percent since all regular expression operations would
841 be done in full UTF-8. But if you want to try this, add
842 -DUSE_UTF8_SCRIPTS to your compilation flags.
844 =head2 Unicode collation and normalization
846 The Unicode::Collate and Unicode::Normalize modules
847 by SADAHIRO Tomoyuki have been included since 5.8.0.
849 Collation? http://www.unicode.org/unicode/reports/tr10/
850 Normalization? http://www.unicode.org/unicode/reports/tr15/
852 =head2 Create debugging macros
854 Debugging macros (like printsv, dump) can make debugging perl inside a
855 C debugger much easier. A good set for gdb comes with mod_perl.
856 Something similar should be distributed with perl.
858 The proper way to do this is to use and extend Devel::DebugInit.
859 Devel::DebugInit also needs to be extended to support threads.
861 See p5p archives for late May/early June 2001 for a recent discussion