X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperltodo.pod;h=a16cf0d604adff14905f09dfc0f43ed39d2aacd2;hb=e706c0cd31a70bd2c97d4510f261613278a7e1f5;hp=a4e655cdcbba2b181e35d89989b877842e5ccdb6;hpb=46925299862a0e463c499a99799cb56d12e9b3a9;p=p5sagit%2Fp5-mst-13.2.git
diff --git a/pod/perltodo.pod b/pod/perltodo.pod
index a4e655c..a16cf0d 100644
--- a/pod/perltodo.pod
+++ b/pod/perltodo.pod
@@ -4,10 +4,14 @@ perltodo - Perl TO-DO List
=head1 DESCRIPTION
-This is a list of wishes for Perl. The tasks we think are smaller or easier
-are listed first. Anyone is welcome to work on any of these, but it's a good
-idea to first contact I to avoid duplication of
-effort. By all means contact a pumpking privately first if you prefer.
+This is a list of wishes for Perl. The most up to date version of this file
+is at http://perl5.git.perl.org/perl.git/blob_plain/HEAD:/pod/perltodo.pod
+
+The tasks we think are smaller or easier are listed first. Anyone is welcome
+to work on any of these, but it's a good idea to first contact
+I to avoid duplication of effort, and to learn from
+any previous attempts. By all means contact a pumpking privately first if you
+prefer.
Whilst patches to make the list shorter are most welcome, ideas to add to
the list are also encouraged. Check the perl5-porters archives for past
@@ -15,19 +19,74 @@ ideas, and any discussion about them. One set of archives may be found at:
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/
+What can we offer you in return? Fame, fortune, and everlasting glory? Maybe
+not, but if your patch is incorporated, then we'll add your name to the
+F file, which ships in the official distribution. How many other
+programming languages offer you 1 line of immortality?
+
+=head1 Tasks that only need Perl knowledge
+=head2 Improve Porting/cmpVERSION.pl to work from git tags
+See F for a bit more detail.
+=head2 Migrate t/ from custom TAP generation
-=head1 Tasks that only need Perl knowledge
+Many tests below F still generate TAP by "hand", rather than using library
+functions. As explained in L, tests in F are
+written in a particular way to test that more complex constructions actually
+work before using them routinely. Hence they don't use C, but
+instead there is an intentionally simpler library, F. However,
+quite a few tests in F have not been refactored to use it. Refactoring
+any of these tests, one at a time, is a useful thing TODO.
+
+The subdirectories F, F and F, that contain the most
+basic tests, should be excluded from this task.
+
+=head2 Test that regen.pl was run
+
+There are various generated files shipped with the perl distribution, for
+things like header files generated from data. The generation scripts are
+written in perl, and all can be run by F. However, because they're
+written in perl, we can't run them before we've built perl. We can't run them
+as part of the F, because changing files underneath F confuses
+it completely, and we don't want to run them automatically anyway, as they
+change files shipped by the distribution, something we seek not to do.
-=head2 common test code for timed bail out
+If someone changes the data, but forgets to re-run F then the
+generated files are out of sync. It would be good to have a test in
+F that checks that the generated files are in sync, and fails
+otherwise, to alert someone before they make a poor commit. I suspect that this
+would require adapting the scripts run from F to have dry-run
+options, and invoking them with these, or by refactoring them into a library
+that does the generation, which can be called by the scripts, and by the test.
-Write portable self destruct code for tests to stop them burning CPU in
-infinite loops. This needs to avoid using alarm, as some of the tests are
-testing alarm/sleep or timers.
+=head2 Automate perldelta generation
-=head2 POD -> HTML conversion in the core still sucks
+The perldelta file accompanying each release summarises the major changes.
+It's mostly manually generated currently, but some of that could be
+automated with a bit of perl, specifically the generation of
+
+=over
+
+=item Modules and Pragmata
+
+=item New Documentation
+
+=item New Tests
+
+=back
+
+See F for details.
+
+=head2 Remove duplication of test setup.
+
+Schwern notes that there's duplication of code - lots and lots of tests have
+some variation on the big block of C<$Is_Foo> checks. We can safely put this
+into a file, change it to build an C<%Is> hash and require it. Maybe just put
+it into F. Throw in the handy tainting subroutines.
+
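As a rough illustration of the C<%Is> idea above, here is a minimal sketch. The helper name C<os_flags> and the particular keys are hypothetical, chosen only for this example; the real block of C<$Is_Foo> checks in the tests covers more platforms and may be structured differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the perl test library): build a single
# hash of platform flags from an OS name, replacing separate $Is_Foo scalars.
sub os_flags {
    my ($os) = @_;
    # Illustrative keys only; start with every flag false.
    my %is = map { $_ => 0 } qw(VMS MSWin32 Cygwin NetWare);
    $is{VMS}     = 1 if $os eq 'VMS';
    $is{MSWin32} = 1 if $os eq 'MSWin32';
    $is{Cygwin}  = 1 if $os eq 'cygwin';
    $is{NetWare} = 1 if $os eq 'NetWare';
    return %is;
}

# A test file would then require the shared file and consult one hash.
our %Is = os_flags($^O);
print "Running on Windows\n" if $Is{MSWin32};
```

Keeping the flags in one hash means a new platform check is added in a single place rather than in every test that copies the block.
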
+=head2 POD -E<gt> HTML conversion in the core still sucks
Which is crazy given just how simple POD purports to be, and how simple HTML
can be. It's not actually I<as> simple as it sounds, particularly with the
@@ -36,19 +95,32 @@ visual appeal of the HTML generated, and to avoid it having any validation
errors. See also L, as the layout of installation tree
is needed to improve the cross-linking.
+The addition of C and its related modules may make this task
+easier to complete.
+
+=head2 Make ExtUtils::ParseXS use strict;
+
+F contains this line
+
+ # use strict; # One of these days...
+
+Simply uncomment it, and fix all the resulting issues :-)
+
+The more practical approach, to break the task down into manageable chunks, is
+to work your way through the code from bottom to top, or if necessary adding
+extra C<{ ... }> blocks, and turning on strict within them.
+
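The block-by-block approach relies on C<use strict> being lexically scoped. The following is a minimal sketch of that scoping only; the variable names are made up for illustration and are not taken from the ExtUtils::ParseXS source.

```perl
# Deliberately no file-wide "use strict" here, mirroring the legacy code.
$legacy_counter = 0;              # undeclared global: allowed outside strict

{
    use strict;                   # lexically scoped: ends at the closing brace
    my $total = 0;
    $total += $_ for 1 .. 3;
    # Inside the strict block, globals must be declared or fully qualified.
    $main::legacy_counter = $total;
}

# Back outside the block, the unqualified global is accepted again.
print "legacy_counter = $legacy_counter\n";
```

Each converted block can be checked on its own, and the blocks can later be merged until a single file-wide C<use strict> is possible.
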
=head2 Make Schwern poorer
-We should have for everything. When all the core's modules are tested,
+We should have tests for everything. When all the core's modules are tested,
Schwern has promised to donate $500 to TPF. We may need volunteers to
hold him upside down and shake vigorously in order to actually extract the
cash.
-See F for the 3 remaining modules that need tests.
-
=head2 Improve the coverage of the core tests
-Use Devel::Cover to ascertain the core's test coverage, then add tests that
-are currently missing.
+Use Devel::Cover to ascertain the core modules' test coverage, then add
+tests that are currently missing.
=head2 test B
@@ -56,7 +128,7 @@ A full test suite for the B module would be nice.
=head2 A decent benchmark
-perlbench seems impervious to any recent changes made to the perl core. It
+C seems impervious to any recent changes made to the perl core. It
would be useful to have a reasonable general benchmarking suite that roughly
represented what current perl programs do, and measurably reported whether
tweaks to the core improve, degrade or don't really affect performance, to
@@ -75,10 +147,16 @@ distribution needs to be dual lifed. Anything else can be too. Figure out what
changes would be needed to package that module and its tests up for CPAN, and
do so. Test it with older perl releases, and fix the problems you find.
-=head2 Improving C
+To make a minimal perl distribution, it's useful to look at
+F.
+
+=head2 Move dual-life pod/*.PL into ext
-Investigate whether C could share aggregates properly with
-only Perl level changes to shared.pm
+Nearly all the dual-life modules have been moved to F. However, we
+still need to move F into their respective directories
+in F. They're referenced by (at least) C in F
+and C in F and F, and listed
+explicitly in F, F and F
=head2 POSIX memory footprint
@@ -86,11 +164,48 @@ Ilya observed that use POSIX; eats memory like there's no tomorrow, and at
various times worked to cut it down. There is probably still fat to cut out -
for example POSIX passes Exporter some very memory hungry data structures.
+=head2 embed.pl/makedef.pl
+There is a script F that generates several header files to prefix
+all of Perl's symbols in a consistent way, to provide some semblance of
+namespace support in C. Functions are declared in F, variables
+in F. Quite a few of the functions and variables
+are conditionally declared there, using C<#ifdef>. However, F
+doesn't understand the C macros, so the rules about which symbols are present
+when are duplicated in F. Writing things twice is bad, m'kay.
+It would be good to teach C to understand the conditional
+compilation, and hence remove the duplication, and the mistakes it has caused.
+=head2 use strict; and AutoLoad
+Currently if you write
+ package Whack;
+ use AutoLoader 'AUTOLOAD';
+ use strict;
+ 1;
+ __END__
+ sub bloop {
+ print join (' ', No, strict, here), "!\n";
+ }
+then C
would be roughly equivalent to:
-Introduce a new special block, UNITCHECK, which is run at the end of a
-compilation unit (module, file, eval(STRING) block). This will correspond to
-the Perl 6 CHECK. Perl 5's CHECK cannot be changed or removed because the
-O.pm/B.pm backend framework depends on it.
+ do { local $"='|'; /\b(?:P)\b/ }
+
+See L
+for the discussion.
=head2 optional optimizer
@@ -446,49 +1087,183 @@ perl and XS subroutines. Subroutine implementations rarely change between
perl and XS at run time, so investigate using 2 ops to enter subs (one for
XS, one for perl) and swap between if a sub is redefined.
-=head2 Self ties
+=head2 Self-ties
-self ties are currently illegal because they caused too many segfaults. Maybe
-the causes of these could be tracked down and self-ties on all types re-
-instated.
+Self-ties are currently illegal because they caused too many segfaults. Maybe
+the causes of these could be tracked down and self-ties on all types
+reinstated.
=head2 Optimize away @_
The old perltodo notes "Look at the "reification" code in C".
-=head2 switch ops
-
-The old perltodo notes "Although we have C in core, Larry points to
-the dormant C and C ops in F; using these opcodes would
-be much faster."
-
-=head2 What hooks would assertions need?
-
-Assertions are in the core, and work. However, assertions needed to be added
-as a core patch, rather than an XS module in ext, or a CPAN module, because
-the core has no hooks in the necessary places. It would be useful to
-investigate what hooks would need to be added to make it possible to provide
-the full assertion support from a CPAN module, so that we aren't constraining
-the imagination of future CPAN authors.
-
+=head2 Virtualize operating system access
+
+Implement a set of "vtables" that virtualizes operating system access
+(open(), mkdir(), unlink(), readdir(), getenv(), etc.) At the very
+least these interfaces should take SVs as "name" arguments instead of
+bare char pointers; probably the most flexible and extensible way
+would be for the Perl-facing interfaces to accept HVs. The system
+needs to be per-operating-system and per-file-system
+hookable/filterable, preferably both from XS and Perl level
+(L is good reading at this point,
+in fact, all of L is.)
+
+This has actually already been implemented (but only for Win32),
+take a look at F and F. While all Win32
+variants go through a set of "vtables" for operating system access,
+non-Win32 systems currently go straight for the POSIX/Unix-style
+system/library call. A system similar to the Win32 one should be
+implemented for all platforms. The existing Win32 implementation
+probably does not need to survive alongside this proposed new
+implementation, the approaches could be merged.
+
+What would this give us? One often-asked-for feature this would
+enable is using Unicode for filenames, and other "names" like %ENV,
+usernames, hostnames, and so forth.
+(See L.)
+
+But this kind of virtualization would also allow for things like
+virtual filesystems, virtual networks, and "sandboxes" (though as long
+as dynamic loading of random object code is allowed, not very safe
+sandboxes since external code of course knows not of Perl's vtables).
+An example of a smaller "sandbox" is that this feature can be used to
+implement per-thread working directories: Win32 already does this.
+
+See also L</"Extend PerlIO and PerlIO::Scalar">.
+
+=head2 Investigate PADTMP hash pessimisation
+
+The peephole optimiser converts constants used for hash key lookups to shared
+hash key scalars. Under ithreads, something is undoing this work.
+See http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2007-09/msg00793.html
+
+=head2 Store the current pad in the OP slab allocator
+
+=for clarification
+I hope that I got that "current pad" part correct
+
+Currently we leak ops in various cases of parse failure. I suggested that we
+could solve this by always using the op slab allocator, and walking it to
+free ops. Dave comments that as some ops are already freed during optree
+creation one would have to mark which ops are freed, and not double free them
+when walking the slab. He notes that one problem with this is that for some ops
+you have to know which pad was current at the time of allocation, which does
+change. I suggested storing a pointer to the current pad in the memory allocated
+for the slab, and swapping to a new slab each time the pad changes. Dave thinks
+that this would work.
+
+=head2 repack the optree
+
+Repacking the optree after execution order is determined could allow
+removal of NULL ops, and optimal ordering of OPs with respect to cache-line
+filling. The slab allocator could be reused for this purpose. I think that
+the best way to do this is to make it an optional step just before the
+completed optree is attached to anything else, and to use the slab allocator
+unchanged, so that freeing ops is identical whether or not this step runs.
+Note that the slab allocator allocates ops downwards in memory, so one would
+have to actually "allocate" the ops in reverse-execution order to get them
+contiguous in memory in execution order.
+
+See http://www.nntp.perl.org/group/perl.perl5.porters/2007/12/msg131975.html
+
+Note that running this copy, and then freeing all the old location ops would
+cause their slabs to be freed, which would eliminate possible memory wastage if
+the previous suggestion is implemented, and we swap slabs more frequently.
+
+=head2 eliminate incorrect line numbers in warnings
+
+This code
+
+ use warnings;
+ my $undef;
+
+ if ($undef == 3) {
+ } elsif ($undef == 0) {
+ }
+used to produce this output:
+ Use of uninitialized value in numeric eq (==) at wrong.pl line 4.
+ Use of uninitialized value in numeric eq (==) at wrong.pl line 4.
+where the line of the second warning was misreported - it should be line 5.
+Rafael fixed this - the problem arose because there was no nextstate OP
+between the execution of the C and the C, hence C still
+reports that the currently executing line is line 4. The solution was to inject
+a nextstate OP for each C, although it turned out that the nextstate
+OP needed to be a nulled OP, rather than a live nextstate OP, else other line
+numbers became misreported. (Jenga!)
+The problem is more general than C (although the C case is the
+most common and the most confusing). Ideally this code
+ use warnings;
+ my $undef;
+
+ my $a = $undef + 1;
+ my $b
+ = $undef
+ + 1;
+
+would produce this output
+
+ Use of uninitialized value $undef in addition (+) at wrong.pl line 4.
+ Use of uninitialized value $undef in addition (+) at wrong.pl line 7.
+
+(rather than lines 4 and 5), but this would seem to require every OP to carry
+(at least) line number information.
+
+What might work is to have an optional line number in memory just before the
+BASEOP structure, with a flag bit in the op to say whether it's present.
+Initially during compile every OP would carry its line number. Then add a late
+pass to the optimiser (potentially combined with L) which
+looks at the two ops on every edge of the graph of the execution path. If
+the line number changes, flag the destination OP with this information.
+Once all paths are traced, replace every op with the flag with a
+nextstate-light op (that just updates C), which in turn then passes
+control on to the true op. All ops would then be replaced by variants that
+do not store the line number. (Which, logically, is why it would work best in
+conjunction with L, as that is already copying/reallocating
+all the OPs)
+
+(Although I should note that we're not certain that doing this for the general
+case is worth it)
+
+=head2 optimize tail-calls
+
+Tail-calls present an opportunity for broadly applicable optimization;
+anywhere that C<< return foo(...) >> is called, the outer return can
+be replaced by a goto, and foo will return directly to the outer
+caller, saving (conservatively) 25% of perl's call&return cost, which
+is relatively higher than in C. The Scheme language is known to do
+this heavily. B::Concise provides good insight into where this
+optimization is possible, i.e. anywhere an entersub,leavesub op-sequence
+occurs.
+
+ perl -MO=Concise,-exec,a,b,-main -e 'sub a{ 1 }; sub b {a()}; b(2)'
+
+Bottom line on this is probably a new pp_tailcall function which
+combines the code in pp_entersub, pp_leavesub. This should probably
+be done 1st in XS, and using B::Generate to patch the new OP into the
+optrees.
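
For readers unfamiliar with the effect being described, the existing C<goto &sub> form already shows the caller-visible behaviour at the Perl level: the current subroutine frame is replaced rather than nested. This is only a sketch of those semantics, not the proposed C<pp_tailcall> op, and C<goto &sub> is not claimed to deliver the cost saving discussed above. The subroutine names are invented for the example.

```perl
use strict;
use warnings;

# Count how many subroutine frames are currently on the call stack.
sub depth {
    my $d = 0;
    $d++ while caller($d);
    return $d;
}

sub inner       { return depth() }
sub plain_outer { return inner(@_) }   # ordinary call: outer frame stays live
sub tail_outer  { goto &inner }        # frame replaced: inner returns to our caller

printf "plain call depth: %d, goto depth: %d\n", plain_outer(), tail_outer();
```

The depth seen through C<tail_outer> is one frame shallower than through C<plain_outer>, which is the frame a real tail-call optimization would avoid creating.
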
=head1 Big projects
Tasks that will get your name mentioned in the description of the "Highlights
-of 5.10"
+of 5.12"
=head2 make ithreads more robust
-Generally make ithreads more robust. See also L
+Generally make ithreads more robust. See also L
This task is incremental - even a little bit of work on it will help, and
will be greatly appreciated.
+One bit would be to write the missing code in sv.c:Perl_dirp_dup.
+
+Fix Perl_sv_dup, et al so that threads can return objects.
+
=head2 iCOW
Sarathy and Arthur have a proposal for an improved Copy On Write which
@@ -503,3 +1278,30 @@ Fix (or rewrite) the implementation of the C(?{...})/> closures.
This will allow the use of a regex from inside (?{ }), (??{ }) and
(?(?{ })|) constructs.
+
+=head2 Add class set operations to regexp engine
+
+Apparently these are quite useful. Anyway, Jeffrey Friedl wants them.
+
+demerphq has this on his todo list, but right at the bottom.
+
+
+=head1 Tasks for microperl
+
+
+[ Each and every one of these may be obsolete, but they were listed
+ in the old Todo.micro file]
+
+
+=head2 make creating uconfig.sh automatic
+
+=head2 make creating Makefile.micro automatic
+
+=head2 do away with fork/exec/wait?
+
+(system, popen should be enough?)
+
+=head2 some of the uconfig.sh really needs to be probed (using cc) in buildtime:
+
+(uConfigure? :-) native datatype widths and endianness come to mind
+