X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperltodo.pod;h=53158e7848e2ef7a3453dba9c2808eeadfe15047;hb=e99d581a4aaa3c92d0b0dda6799157fe7a569f31;hp=a675b7a77e01319693f4f873672fb10d1159be0c;hpb=b2e2905cd6316367cb36fd419288b5b5df9c574c;p=p5sagit%2Fp5-mst-13.2.git
diff --git a/pod/perltodo.pod b/pod/perltodo.pod
index a675b7a..53158e7 100644
--- a/pod/perltodo.pod
+++ b/pod/perltodo.pod
@@ -4,10 +4,11 @@ perltodo - Perl TO-DO List
=head1 DESCRIPTION
-This is a list of wishes for Perl. The tasks we think are smaller or easier
-are listed first. Anyone is welcome to work on any of these, but it's a good
-idea to first contact I to avoid duplication of
-effort. By all means contact a pumpking privately first if you prefer.
+This is a list of wishes for Perl. The tasks we think are smaller or
+easier are listed first. Anyone is welcome to work on any of these,
+but it's a good idea to first contact I to
+avoid duplication of effort, and to learn from any previous attempts.
+By all means contact a pumpking privately first if you prefer.
Whilst patches to make the list shorter are most welcome, ideas to add to
the list are also encouraged. Check the perl5-porters archives for past
@@ -20,39 +21,14 @@ not, but if your patch is incorporated, then we'll add your name to the
F file, which ships in the official distribution. How many other
programming languages offer you 1 line of immortality?
-=head1 The roadmap to 5.10
-
-The roadmap to 5.10 envisages feature based releases, as various items in this
-TODO are completed.
-
-=head2 Needed for a 5.9.5 release
-
-=over
-
-=item *
-
-Implement L
-
-=item *
-
-Review smart match semantics in light of Perl 6 developments.
-
-=item *
-
-Review assertions. Review syntax to combine assertions. Assertions could take
-advantage of the lexical pragmas work. L
-
-=item *
-
-C should be turned into a lexical pragma (probably).
-
-=back
-
-=head2 Needed for a 5.9.6 release
+=head1 Tasks that only need Perl knowledge
-Stabilisation. If all goes well, this will be the equivalent of a 5.10-beta.
+=head2 Remove duplication of test setup.
-=head1 Tasks that only need Perl knowledge
+Schwern notes that there's duplication of code - lots and lots of tests have
+some variation on the big block of C<$Is_Foo> checks. We can safely put this
+into a file, change it to build an C<%Is> hash and require it. Maybe just put
+it into F. Throw in the handy tainting subroutines.
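
A minimal sketch of the %Is idea, assuming a hypothetical helper named platform_flags and an illustrative (not exhaustive) list of platforms. Neither the helper name nor the list comes from the existing test suite:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the core test suite): map an OS name,
# as found in $^O, to a hash of 0/1 platform flags.
sub platform_flags {
    my ($os) = @_;

    # Illustrative subset only: keys are flag names, values are the
    # $^O strings they correspond to.
    my %os_name = (
        VMS     => 'VMS',
        MSWin32 => 'MSWin32',
        NetWare => 'NetWare',
        Dos     => 'dos',
        Cygwin  => 'cygwin',
    );

    return map { $_ => ($os eq $os_name{$_} ? 1 : 0) } keys %os_name;
}

# A test file would then require the shared file and consult one hash
# instead of repeating a block of $Is_Foo assignments.
our %Is = platform_flags($^O);
print "Running on VMS\n" if $Is{VMS};
```

Taking the OS name as an argument, rather than reading $^O inside the helper, keeps the helper itself testable on any platform.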
=head2 common test code for timed bail out
@@ -60,7 +36,7 @@ Write portable self destruct code for tests to stop them burning CPU in
infinite loops. This needs to avoid using alarm, as some of the tests are
testing alarm/sleep or timers.
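
One possible shape for this, sketched under the assumption that fork is available (so it is not portable to every platform perl runs on). The names start_watchdog and stop_watchdog are hypothetical. The timer lives in a separate process, so the test process itself never calls alarm:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: fork a child that kills the calling process if it
# is still running after $timeout seconds. Returns the child's pid.
sub start_watchdog {
    my ($timeout) = @_;
    my $parent = $$;
    my $pid    = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: the sleep happens here, not in the test process.
        sleep $timeout;
        kill 'KILL', $parent;
        POSIX::_exit(0);    # skip END blocks and destructors in the child
    }
    return $pid;
}

# Hypothetical helper: cancel the watchdog once the test has finished.
sub stop_watchdog {
    my ($pid) = @_;
    kill 'KILL', $pid;
    waitpid $pid, 0;
    return;
}

my $watchdog = start_watchdog(300);
# ... body of the test goes here ...
stop_watchdog($watchdog);
```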
-=head2 POD -> HTML conversion in the core still sucks
+=head2 POD -E<gt> HTML conversion in the core still sucks
Which is crazy given just how simple POD purports to be, and how simple HTML
can be. It's not actually I<as> simple as it sounds, particularly with the
@@ -72,6 +48,13 @@ is needed to improve the cross-linking.
The addition of C and its related modules may make this task
easier to complete.
+=head2 merge checkpods and podchecker
+
+F (and C in the F subdirectory)
+implements a very basic check for pod files, but the errors it discovers
+aren't found by podchecker. Add this check to podchecker, get rid of
+checkpods and have C use podchecker.
+
=head2 Parallel testing
(This probably impacts much more than the core: also the Test::Harness
@@ -119,6 +102,28 @@ tests that are currently missing.
A full test suite for the B module would be nice.
+=head2 Deparse inlined constants
+
+Code such as this
+
+ use constant PI => 4;
+ warn PI
+
+will currently deparse as
+
+ use constant ('PI', 4);
+ warn 4;
+
+because the tokenizer inlines the value of the constant subroutine C.
+This allows various compile time optimisations, such as constant folding
+and dead code elimination. Where these haven't happened (such as the example
+above) it ought to be possible to make B::Deparse work out the name of the
+original constant, because just enough information survives in the symbol
+table to do this. Specifically, the same scalar is used for the constant in
+the optree as is used for the constant subroutine, so by iterating over all
+symbol tables and generating a mapping of SV address to constant name, it
+would be possible to provide B::Deparse with this functionality.
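
A rough sketch of the mapping step only, assuming the B module's svref_2object function and the const_sv method on B::CV objects. The helper name constant_names is hypothetical, it walks a single package rather than all symbol tables, and it does not touch B::Deparse itself:

```perl
use strict;
use warnings;
use B ();
use constant PI => 4;

# Hypothetical helper: walk one package's symbol table and return a hash
# mapping the address of each constant subroutine's SV to the sub's name.
sub constant_names {
    my ($pkg) = @_;
    my %name_for;
    no strict 'refs';
    for my $name (keys %{"${pkg}::"}) {
        next if $name =~ /::\z/;            # skip nested packages
        my $full = "${pkg}::$name";
        next unless defined &{$full};       # only names that have a sub
        my $cv = B::svref_2object(\&{$full});
        my $sv = $cv->const_sv;             # a B::SPECIAL if not a constant sub
        next if ref($sv) eq 'B::SPECIAL';
        $name_for{$$sv} = $full;            # $$sv is the address of the SV
    }
    return %name_for;
}

my %name_for = constant_names('main');
print "$_ => $name_for{$_}\n" for sort keys %name_for;
```

A deparser could then look up the address of a constant found in the optree in this hash and, on a hit, emit the name instead of the literal value.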
+
=head2 A decent benchmark
C seems impervious to any recent changes made to the perl core. It
@@ -140,6 +145,18 @@ distribution needs to be dual lifed. Anything else can be too. Figure out what
changes would be needed to package that module and its tests up for CPAN, and
do so. Test it with older perl releases, and fix the problems you find.
+To make a minimal perl distribution, it's useful to look at
+F.
+
+=head2 Bundle dual life modules in ext/
+
+For maintenance (and branch merging) reasons, it would be useful to move
+some architecture-independent dual-life modules from lib/ to ext/, if this
+has no negative impact on the build of perl itself.
+
+However, we need to make sure that they are still installed in
+architecture-independent directories by C.
+
=head2 Improving C
Investigate whether C could share aggregates properly with
@@ -156,16 +173,37 @@ for example POSIX passes Exporter some very memory hungry data structures.
There is a script F that generates several header files to prefix
all of Perl's symbols in a consistent way, to provide some semblance of
namespace support in C. Functions are declared in F, variables
-in F and F. Quite a few of the functions and variables
+in F. Quite a few of the functions and variables
are conditionally declared there, using C<#ifdef>. However, F
doesn't understand the C macros, so the rules about which symbols are present
when are duplicated in F. Writing things twice is bad, m'kay.
It would be good to teach C to understand the conditional
compilation, and hence remove the duplication, and the mistakes it has caused.
+=head2 use strict; and AutoLoad
+
+Currently if you write
+ package Whack;
+ use AutoLoader 'AUTOLOAD';
+ use strict;
+ 1;
+ __END__
+ sub bloop {
+ print join (' ', No, strict, here), "!\n";
+ }
+then C
would be roughly equivalent to:
-so we can override qx// as well.
+ do { local $"='|'; /\b(?:P)\b/ }
+
+See L
+for the discussion.
=head2 optional optimizer
@@ -571,37 +903,171 @@ perl and XS subroutines. Subroutine implementations rarely change between
perl and XS at run time, so investigate using 2 ops to enter subs (one for
XS, one for perl) and swap between if a sub is redefined.
-=head2 Self ties
+=head2 Self-ties
-self ties are currently illegal because they caused too many segfaults. Maybe
-the causes of these could be tracked down and self-ties on all types re-
-instated.
+Self-ties are currently illegal because they caused too many segfaults. Maybe
+the causes of these could be tracked down and self-ties on all types
+reinstated.
=head2 Optimize away @_
The old perltodo notes "Look at the "reification" code in C".
-=head2 What hooks would assertions need?
-
-Assertions are in the core, and work. However, assertions needed to be added
-as a core patch, rather than an XS module in ext, or a CPAN module, because
-the core has no hooks in the necessary places. It would be useful to
-investigate what hooks would need to be added to make it possible to provide
-the full assertion support from a CPAN module, so that we aren't constraining
-the imagination of future CPAN authors.
-
-=head2 Properly Unicode safe tokeniser and pads.
-
-The tokeniser isn't actually very UTF-8 clean. C is a hack -
-variable names are stored in stashes as raw bytes, without the utf-8 flag
-set. The pad API only takes a C pointer, so that's all bytes too. The
-tokeniser ignores the UTF-8-ness of C, or any SVs returned from
-source filters. All this could be fixed.
+=head2 Virtualize operating system access
+
+Implement a set of "vtables" that virtualizes operating system access
+(open(), mkdir(), unlink(), readdir(), getenv(), etc.) At the very
+least these interfaces should take SVs as "name" arguments instead of
+bare char pointers; probably the most flexible and extensible way
+would be for the Perl-facing interfaces to accept HVs. The system
+needs to be per-operating-system and per-file-system
+hookable/filterable, preferably both from XS and Perl level
+(L is good reading at this point,
+in fact, all of L is.)
+
+This has actually already been implemented (but only for Win32),
+take a look at F and F. While all Win32
+variants go through a set of "vtables" for operating system access,
+non-Win32 systems currently go straight for the POSIX/UNIX-style
+system/library call. A system similar to the Win32 one should be
+implemented for all platforms. The existing Win32 implementation
+probably does not need to survive alongside this proposed new
+implementation, the approaches could be merged.
+
+What would this give us? One often-asked-for feature this would
+enable is using Unicode for filenames, and other "names" like %ENV,
+usernames, hostnames, and so forth.
+(See L.)
+
+But this kind of virtualization would also allow for things like
+virtual filesystems, virtual networks, and "sandboxes" (though as long
+as dynamic loading of random object code is allowed, not very safe
+sandboxes, since external code of course knows nothing of Perl's vtables).
+An example of a smaller "sandbox" is that this feature can be used to
+implement per-thread working directories: Win32 already does this.
+
+See also L</"Extend PerlIO and PerlIO::Scalar">.
+
+=head2 Investigate PADTMP hash pessimisation
+
+The peephole optimiser converts constants used for hash key lookups to shared
+hash key scalars. Under ithreads, something is undoing this work.
+See http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-09/msg00793.html
+
+=head2 Store the current pad in the OP slab allocator
+
+=for clarification
+I hope that I got that "current pad" part correct
+
+Currently we leak ops in various cases of parse failure. I suggested that we
+could solve this by always using the op slab allocator, and walking it to
+free ops. Dave comments that as some ops are already freed during optree
+creation one would have to mark which ops are freed, and not double free them
+when walking the slab. He notes that one problem with this is that for some ops
+you have to know which pad was current at the time of allocation, which does
+change. I suggested storing a pointer to the current pad in the memory allocated
+for the slab, and swapping to a new slab each time the pad changes. Dave thinks
+that this would work.
+
+=head2 repack the optree
+
+Repacking the optree after execution order is determined could allow
+removal of NULL ops, and optimal ordering of OPs with respect to cache-line
+filling. The slab allocator could be reused for this purpose. I think that
+the best way to do this is to make it an optional step just before the
+completed optree is attached to anything else, and to use the slab allocator
+unchanged, so that freeing ops is identical whether or not this step runs.
+Note that the slab allocator allocates ops downwards in memory, so one would
+have to actually "allocate" the ops in reverse-execution order to get them
+contiguous in memory in execution order.
+
+See http://www.nntp.perl.org/group/perl.perl5.porters/2007/12/msg131975.html
+
+Note that running this copy, and then freeing all the old location ops would
+cause their slabs to be freed, which would eliminate possible memory wastage if
+the previous suggestion is implemented, and we swap slabs more frequently.
+
+=head2 eliminate incorrect line numbers in warnings
+
+This code
+
+ use warnings;
+ my $undef;
+
+ if ($undef == 3) {
+ } elsif ($undef == 0) {
+ }
+
+used to produce this output:
+
+ Use of uninitialized value in numeric eq (==) at wrong.pl line 4.
+ Use of uninitialized value in numeric eq (==) at wrong.pl line 4.
+
+where the line of the second warning was misreported - it should be line 5.
+Rafael fixed this - the problem arose because there was no nextstate OP
+between the execution of the C and the C, hence C still
+reports that the currently executing line is line 4. The solution was to inject
+a nextstate OP for each C, although it turned out that the nextstate
+OP needed to be a nulled OP, rather than a live nextstate OP, else other line
+numbers became misreported. (Jenga!)
+
+The problem is more general than C (although the C case is the
+most common and the most confusing). Ideally this code
+
+ use warnings;
+ my $undef;
+
+ my $a = $undef + 1;
+ my $b
+ = $undef
+ + 1;
+
+would produce this output
+
+ Use of uninitialized value $undef in addition (+) at wrong.pl line 4.
+ Use of uninitialized value $undef in addition (+) at wrong.pl line 7.
+
+(rather than lines 4 and 5), but this would seem to require every OP to carry
+(at least) line number information.
+
+What might work is to have an optional line number in memory just before the
+BASEOP structure, with a flag bit in the op to say whether it's present.
+Initially during compile every OP would carry its line number. Then add a late
+pass to the optimiser (potentially combined with L) which
+looks at the two ops on every edge of the graph of the execution path. If
+the line number changes, flag the destination OP with this information.
+Once all paths are traced, replace every op with the flag with a
+nextstate-light op (that just updates C), which in turn then passes
+control on to the true op. All ops would then be replaced by variants that
+do not store the line number. (Which, logically, is why it would work best in
+conjunction with L, as that is already copying/reallocating
+all the OPs)
+
+(Although I should note that we're not certain that doing this for the general
+case is worth it)
+
+=head2 optimize tail-calls
+
+Tail-calls present an opportunity for broadly applicable optimization;
+anywhere that C<< return foo(...) >> is called, the outer return can
+be replaced by a goto, and foo will return directly to the outer
+caller, saving (conservatively) 25% of perl's call&return cost, which
+is relatively higher than in C. The Scheme language is known to do
+this heavily. B::Concise provides good insight into where this
+optimization is possible, i.e. anywhere an entersub, leavesub op-sequence
+occurs.
+
+ perl -MO=Concise,-exec,a,b,-main -e 'sub a{ 1 }; sub b {a()}; b(2)'
+
+Bottom line on this is probably a new pp_tailcall function which
+combines the code in pp_entersub and pp_leavesub. This should probably
+be done first in XS, using B::Generate to patch the new OP into the
+optrees.
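
Perl already lets a programmer apply this transformation by hand with goto &sub. The sketch below is not the proposed pp_tailcall; it only illustrates the call frame that such an optimisation would save. The subroutine names are made up for the example:

```perl
use strict;
use warnings;

# Count the subroutine frames currently on the call stack.
sub depth {
    my $d = 0;
    $d++ while caller($d);
    return $d;
}

sub via_return { return depth() }    # ordinary tail call: adds a frame
sub via_goto   { goto &depth }       # goto &sub replaces the current frame

printf "frames seen via return: %d, via goto: %d\n",
    via_return(), via_goto();
```

The count reported through via_goto is one lower than through via_return, because depth returns directly to the outer caller.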
=head1 Big projects
Tasks that will get your name mentioned in the description of the "Highlights
-of 5.10"
+of 5.12"
=head2 make ithreads more robust
@@ -612,6 +1078,8 @@ will be greatly appreciated.
One bit would be to write the missing code in sv.c:Perl_dirp_dup.
+Fix Perl_sv_dup, et al. so that threads can return objects.
+
=head2 iCOW
Sarathy and Arthur have a proposal for an improved Copy On Write which
@@ -626,3 +1094,9 @@ Fix (or rewrite) the implementation of the C(?{...})/> closures.
This will allow the use of a regex from inside (?{ }), (??{ }) and
(?(?{ })|) constructs.
+
+=head2 Add class set operations to regexp engine
+
+Apparently these are quite useful. Anyway, Jeffrey Friedl wants them.
+
+demerphq has this on his todo list, but right at the bottom.