Some thoughts on foreach reverse

[p5sagit/p5-mst-13.2.git] / pod / perlhack.pod
diff --git a/pod/perlhack.pod b/pod/perlhack.pod

index 2750e9e..0d7581c 100644 (file)
--- a/pod/perlhack.pod
+++ b/pod/perlhack.pod
@@ -215,8 +215,20 @@ changed.  The current state of the main trunk of repository, and patches
 that describe the individual changes that have happened since the last
 public release are available at this location:
 
+    http://public.activestate.com/gsar/APC/
     ftp://ftp.linux.activestate.com/pub/staff/gsar/APC/
 
+If you're looking for a particular change, or a change that affected
+a particular set of files, you may find the B<Perl Repository Browser>
+useful:
+
+    http://public.activestate.com/cgi-bin/perlbrowse
+
+You may also want to subscribe to the perl5-changes mailing list to
+receive a copy of each patch that gets submitted to the maintenance
+and development "branches" of the perl repository.  See
+http://lists.perl.org/ for subscription information.
+
 If you are a member of the perl5-porters mailing list, it is a good
 thing to keep in touch with the most recent changes. If not only to
 verify if what you would have posted as a bug report isn't already
@@ -368,22 +380,13 @@ from Andreas K
 Since you don't have to apply the patches yourself, you are sure all
 files in the source tree are in the right state.
 
-=item It's more recent
-
-According to Gurusamy Sarathy:
-
-   "... The rsync mirror is automatic and syncs with the repository
-    every five minutes.
-
-   "Updating the patch  area  still  requires  manual  intervention
-    (with all the goofiness that implies,  which you've noted)  and
-    is typically on a daily cycle.   Making this process  automatic
-    is on my tuit list, but don't ask me when."
-
 =item It's more reliable
 
-Well, since the patches are updated by hand, I don't have to say any
-more ... (see Sarathy's remark).
+While both the rsync-able source and patch areas are automatically
+updated every few minutes, keep in mind that applying patches may
+sometimes mean careful hand-holding, especially if your version of
+the C<patch> program does not understand how to deal with new files,
+files with 8-bit characters, or files without trailing newlines.
 
 =back
 
@@ -476,66 +479,111 @@ for reference.
 
 =back
 
+=head2 Working with the source
+
+Because you cannot use the Perforce client, you cannot easily generate
+diffs against the repository, nor will merges occur when you update
+via rsync.  If you edit a file locally and then rsync against the
+latest source, changes made in the remote copy will I<overwrite> your
+local versions!
+
+The best way to deal with this is to maintain a tree of symlinks to
+the rsync'd source.  Then, when you want to edit a file, you remove
+the symlink, copy the real file into the other tree, and edit it.  You
+can then diff your edited file against the original to generate a
+patch, and you can safely update the original tree.
+
+Perl's F<Configure> script can generate this tree of symlinks for you.
+The following example assumes that you have used rsync to pull a copy
+of the Perl source into the F<perl-rsync> directory.  In the directory
+above that one, you can execute the following commands:
+
+  mkdir perl-dev
+  cd perl-dev
+  ../perl-rsync/Configure -Dmksymlinks -Dusedevel -D"optimize=-g"
+
+This will start the Perl configuration process.  After a few prompts,
+you should see something like this:
+
+  Symbolic links are supported.
+
+  Checking how to test for symbolic links...
+  Your builtin 'test -h' may be broken.
+  Trying external '/usr/bin/test -h'.
+  You can test for symbolic links with '/usr/bin/test -h'.
+
+  Creating the symbolic links...
+  (First creating the subdirectories...)
+  (Then creating the symlinks...)
+
+The specifics may vary based on your operating system, of course.
+After you see this, you can abort the F<Configure> script, and you
+will see that the directory you are in has a tree of symlinks to the
+F<perl-rsync> directories and files.
+
+If you plan to do a lot of work with the Perl source, here are some
+Bourne shell script functions that can make your life easier:
+
+    function edit {
+       if [ -L $1 ]; then
+           mv $1 $1.orig
+               cp $1.orig $1
+               vi $1
+       else
+           /bin/vi $1
+               fi
+    }
 
-=head2 Perlbug remote interface
-
-=over 4
-
-There are three (3) remote administrative interfaces for modifying bug
-status, category, etc.  In all cases an admin must be first registered
-with the Perlbug database by sending an email request to
-richard@perl.org or bugmongers@perl.org.
-
-The main requirement is the willingness to classify, (with the
-emphasis on closing where possible :), outstanding bugs.  Further
-explanation can be garnered from the web at http://bugs.perl.org/ , or
-by asking on the admin mailing list at: bugmongers@perl.org
-
-For more info on the web see
-
-       http://bugs.perl.org/perlbug.cgi?req=spec
-
-
-B<The interfaces:>
-
-
-=item 1 http://bugs.perl.org
-
-Login via the web, (remove B<admin/> if only browsing), where interested Cc's, tests, patches and change-ids, etc. may be assigned.
-
-       http://bugs.perl.org/admin/index.html
-
-
-=item 2 bugdb@perl.org
-
-Where the subject line is used for commands:
-
-       To: bugdb@perl.org
-       Subject: -a close bugid1 bugid2 aix install
-
-       To: bugdb@perl.org
-       Subject: -h
+    function unedit {
+       if [ -L $1.orig ]; then
+           rm $1
+               mv $1.orig $1
+               fi
+    }
 
+Replace "vi" with your favorite flavor of editor.
+
+Here is another function which will quickly generate a patch for the
+files which have been edited in your symlink tree:
+
+    mkpatchorig() {
+       local diffopts
+           for f in `find . -name '*.orig' | sed s,^\./,,`
+               do
+                   case `echo $f | sed 's,.orig$,,;s,.*\.,,'` in
+                       c)   diffopts=-p ;;
+               pod) diffopts='-F^=' ;;
+               *)   diffopts= ;;
+               esac
+                   diff -du $diffopts $f `echo $f | sed 's,.orig$,,'`
+                   done
+    }
 
-=item 3 commands_and_bugdids@bugs.perl.org
+This function produces patches which include enough context to make
+your changes obvious.  This makes it easier for the Perl pumpking(s)
+to review them when you send them to the perl5-porters list, and that
+means they're more likely to get applied.
 
-Where the address itself is the source for the commands:
+This function assumed a GNU diff, and may require some tweaking for
+other diff variants.
 
-       To: close_bugid1_bugid2_aix@bugs.perl.org
+=head2 Perlbug administration
 
-       To: help@bugs.perl.org
+There is a single remote administrative interface for modifying bug status, 
+category, open issues etc. using the B<RT> I<bugtracker> system, maintained
+by I<Robert Spier>.  Become an administrator, and close any bugs you can get 
+your sticky mitts on:
 
+       http://rt.perl.org
 
-=item notes, patches, tests
+The bugtracker mechanism for B<perl5> bugs in particular is at:
 
-For patches and tests, the message body is assigned to the appropriate bug/s and forwarded to p5p for their attention.  
+       http://bugs6.perl.org/perlbug
 
-       To: test_<bugid1>_aix_close@bugs.perl.org
-       Subject: this is a test for the (now closed) aix bug
+To email the bug system administrators:
 
-       Test is the body of the mail
+       "perlbug-admin" <perlbug-admin@perl.org>
 
-=back
 
 =head2 Submitting patches
 
@@ -568,10 +616,10 @@ so whether it was considered a bug.  See above for the location of
 the searchable archives.
 
 The CPAN testers ( http://testers.cpan.org/ ) are a group of
-volunteers who test CPAN modules on a variety of platforms.  Perl Labs
-( http://labs.perl.org/ ) automatically tests Perl source releases on
-platforms and gives feedback to the CPAN testers mailing list.  Both
-efforts welcome volunteers.
+volunteers who test CPAN modules on a variety of platforms.  Perl
+Smokers ( http://archives.develooper.com/daily-build@perl.org/ )
+automatically tests Perl source releases on platforms with various
+configurations.  Both efforts welcome volunteers.
 
 It's a good idea to read and lurk for a while before chipping in.
 That way you'll get to see the dynamic of the conversations, learn the
@@ -620,12 +668,11 @@ wanting to go about Perl development.
 
 =item The perl5-porters FAQ
 
-This is posted to perl5-porters at the beginning on every month, and
-should be available from http://perlhacker.org/p5p-faq ; alternatively,
-you can get the FAQ emailed to you by sending mail to
-C<perl5-porters-faq@perl.org>. It contains hints on reading
-perl5-porters, information on how perl5-porters works and how Perl
-development in general works.
+This should be available from http://simon-cozens.org/writings/p5p-faq ;
+alternatively, you can get the FAQ emailed to you by sending mail to
+C<perl5-porters-faq@perl.org>. It contains hints on reading perl5-porters,
+information on how perl5-porters works and how Perl development in general
+works.
 
 =back
 
@@ -915,7 +962,7 @@ C<"\0">.
 
 Line 13 manipulates the flags; since we've changed the PV, any IV or NV
 values will no longer be valid: if we have C<$a=10; $a.="6";> we don't
-want to use the old IV of 10. C<SvPOK_only_utf8> is a special UTF8-aware
+want to use the old IV of 10. C<SvPOK_only_utf8> is a special UTF-8-aware
 version of C<SvPOK_only>, a macro which turns off the IOK and NOK flags
 and turns on POK. The final C<SvTAINT> is a macro which launders tainted
 data if taint mode is turned on.
@@ -945,7 +992,8 @@ operations in.
 
 The easiest way to examine the op tree is to stop Perl after it has
 finished parsing, and get it to dump out the tree. This is exactly what
-the compiler backends L<B::Terse|B::Terse> and L<B::Debug|B::Debug> do.
+the compiler backends L<B::Terse|B::Terse>, L<B::Concise|B::Concise>
+and L<B::Debug|B::Debug> do.
 
 Let's have a look at how Perl sees C<$a = $b + $c>:
 
@@ -1255,6 +1303,14 @@ important ones are explained in L<perlxs> as well. Pay special attention
 to L<perlguts/Background and PERL_IMPLICIT_CONTEXT> for information on
 the C<[pad]THX_?> macros.
 
+=head2 The .i Targets
+
+You can expand the macros in a F<foo.c> file by saying
+
+    make foo.i
+
+which will expand the macros using cpp.  Don't be scared by the results.
+
 =head2 Poking at Perl
 
 To really poke around with Perl, you'll probably want to build Perl for
@@ -1348,8 +1404,11 @@ blessing when stepping through miles of source code.
 =item print
 
 Execute the given C code and print its results. B<WARNING>: Perl makes
-heavy use of macros, and F<gdb> is not aware of macros. You'll have to
-substitute them yourself. So, for instance, you can't say
+heavy use of macros, and F<gdb> does not necessarily support macros
+(see later L</"gdb macro support">).  You'll have to substitute them
+yourself, or to invoke cpp on the source code files
+(see L</"The .i Targets">)
+So, for instance, you can't say
 
     print SvPV_nolen(sv)
 
@@ -1357,11 +1416,19 @@ but you have to say
 
     print Perl_sv_2pv_nolen(sv)
 
+=back
+
 You may find it helpful to have a "macro dictionary", which you can
 produce by saying C<cpp -dM perl.c | sort>. Even then, F<cpp> won't
-recursively apply the macros for you. 
+recursively apply those macros for you. 
 
-=back
+=head2 gdb macro support
+
+Recent versions of F<gdb> have fairly good macro support, but
+in order to use it you'll need to compile perl with macro definitions
+included in the debugging information.  Using F<gcc> version 3.1, this
+means configuring with C<-Doptimize=-g3>.  Other compilers might use a
+different switch (if they support debugging macros at all).
 
 =head2 Dumping Perl Data Structures
 
@@ -1459,7 +1526,7 @@ some things you'll need to know when fiddling with them. Let's now get
 on and create a simple patch. Here's something Larry suggested: if a
 C<U> is the first active format during a C<pack>, (for example, 
 C<pack "U3C8", @stuff>) then the resulting string should be treated as
-UTF8 encoded.
+UTF-8 encoded.
 
 How do we prepare to fix this up? First we locate the code in question -
 the C<pack> happens at runtime, so it's going to be in one of the F<pp>
@@ -1508,7 +1575,7 @@ of C<pat>:
     while (pat < patend) {
 
 Now if we see a C<U> which was at the start of the string, we turn on
-the UTF8 flag for the output SV, C<cat>:
+the C<UTF8> flag for the output SV, C<cat>:
 
  +  if (datumtype == 'U' && pat==patcopy+1)
  +      SvUTF8_on(cat);
@@ -1594,10 +1661,10 @@ this text in the description of C<pack>:
  =item *
 
  If the pattern begins with a C<U>, the resulting string will be treated
- as Unicode-encoded. You can force UTF8 encoding on in a string with an
- initial C<U0>, and the bytes that follow will be interpreted as Unicode
- characters. If you don't want this to happen, you can begin your pattern
- with C<C0> (or anything else) to force Perl not to UTF8 encode your
+ as UTF-8-encoded Unicode. You can force UTF-8 encoding on in a string
+ with an initial C<U0>, and the bytes that follow will be interpreted as
+ Unicode characters. If you don't want this to happen, you can begin your
+ pattern with C<C0> (or anything else) to force Perl not to UTF-8 encode your
  string, and then follow this with a C<U*> somewhere in your pattern.
 
 All done. Now let's create the patch. F<Porting/patching.pod> tells us
@@ -1758,6 +1825,18 @@ modules hanging around in here that need to be moved out into F<lib/>.
 Testing features of how perl actually runs, including exit codes and
 handling of PERL* environment variables.
 
+=item F<t/uni/>
+
+Tests for the core support of Unicode.
+
+=item F<t/win32/>
+
+Windows-specific tests.
+
+=item F<t/x2p>
+
+A test suite for the s2p converter.
+
 =back
 
 The core uses the same testing style as the rest of Perl, a simple
@@ -1814,17 +1893,28 @@ aliases.
 
 =item coretest
 
-Run F<perl> on all but F<lib/*> tests.
+Run F<perl> on all core tests (F<t/*> and F<lib/[a-z]*> pragma tests).
 
 =item test.deparse
 
-Run all the tests through the B::Deparse.  Not all tests will succeed.
+Run all the tests through B::Deparse.  Not all tests will succeed.
+
+=item test.taintwarn
+
+Run all tests with the B<-t> command-line switch.  Not all tests
+are expected to succeed (until they're specifically fixed, of course).
 
 =item minitest
 
 Run F<miniperl> on F<t/base>, F<t/comp>, F<t/cmd>, F<t/run>, F<t/io>,
 F<t/op>, and F<t/uni> tests.
 
+=item test.valgrind check.valgrind utest.valgrind ucheck.valgrind
+
+(Only in Linux) Run all the tests using the memory leak + naughty
+memory access tool "valgrind".  The log files will be named
+F<testname.valgrind>.
+
 =item test.third check.third utest.third ucheck.third
 
 (Only in Tru64)  Run all the tests using the memory leak + naughty
@@ -1834,7 +1924,7 @@ F<perl3.log.testname>.
 =item test.torture torturetest
 
 Run all the usual tests and some extra tests.  As of Perl 5.8.0 the
-only extra tests are Abigail's JAPHs, t/japh/abigail.t.
+only extra tests are Abigail's JAPHs, F<t/japh/abigail.t>.
 
 You can also run the torture test with F<t/harness> by giving
 C<-torture> argument to F<t/harness>.
@@ -1843,6 +1933,59 @@ C<-torture> argument to F<t/harness>.
 
 Run all the tests with -Mutf8.  Not all tests will succeed.
 
+=item test_harness
+
+Run the test suite with the F<t/harness> controlling program, instead of
+F<t/TEST>. F<t/harness> is more sophisticated, and uses the
+L<Test::Harness> module, thus using this test target supposes that perl
+mostly works. The main advantage for our purposes is that it prints a
+detailed summary of failed tests at the end. Also, unlike F<t/TEST>, it
+doesn't redirect stderr to stdout.
+
+=back
+
+=head2 Running tests by hand
+
+You can run part of the test suite by hand by using one the following
+commands from the F<t/> directory :
+
+    ./perl -I../lib TEST list-of-.t-files
+
+or
+
+    ./perl -I../lib harness list-of-.t-files
+
+(if you don't specify test scripts, the whole test suite will be run.)
+
+You can run an individual test by a command similar to
+
+    ./perl -I../lib patho/to/foo.t
+
+except that the harnesses set up some environment variables that may
+affect the execution of the test :
+
+=over 4 
+
+=item PERL_CORE=1
+
+indicates that we're running this test part of the perl core test suite.
+This is useful for modules that have a dual life on CPAN.
+
+=item PERL_DESTRUCT_LEVEL=2
+
+is set to 2 if it isn't set already (see L</PERL_DESTRUCT_LEVEL>)
+
+=item PERL
+
+(used only by F<t/TEST>) if set, overrides the path to the perl executable
+that should be used to run the tests (the default being F<./perl>).
+
+=item PERL_SKIP_TTY_TEST
+
+if set, tells to skip the tests that need a terminal. It's actually set
+automatically by the Makefile, but can also be forced artificially by
+running 'make test_notty'.
+
 =back
 
 =head1 EXTERNAL TOOLS FOR DEBUGGING PERL
@@ -1853,6 +1996,38 @@ some common testing and debugging tools with Perl.  This is
 meant as a guide to interfacing these tools with Perl, not
 as any kind of guide to the use of the tools themselves.
 
+B<NOTE 1>: Running under memory debuggers such as Purify, valgrind, or
+Third Degree greatly slows down the execution: seconds become minutes,
+minutes become hours.  For example as of Perl 5.8.1, the
+ext/Encode/t/Unicode.t takes extraordinarily long to complete under
+e.g. Purify, Third Degree, and valgrind.  Under valgrind it takes more
+than six hours, even on a snappy computer-- the said test must be
+doing something that is quite unfriendly for memory debuggers.  If you
+don't feel like waiting, that you can simply kill away the perl
+process.
+
+B<NOTE 2>: To minimize the number of memory leak false alarms (see
+L</PERL_DESTRUCT_LEVEL> for more information), you have to have
+environment variable PERL_DESTRUCT_LEVEL set to 2.  The F<TEST>
+and harness scripts do that automatically.  But if you are running
+some of the tests manually-- for csh-like shells:
+
+    setenv PERL_DESTRUCT_LEVEL 2
+
+and for Bourne-type shells:
+
+    PERL_DESTRUCT_LEVEL=2
+    export PERL_DESTRUCT_LEVEL
+
+or in UNIXy environments you can also use the C<env> command:
+
+    env PERL_DESTRUCT_LEVEL=2 valgrind ./perl -Ilib ...
+
+B<NOTE 3>: There are known memory leaks when there are compile-time
+errors within eval or require, seeing C<S_doeval> in the call stack
+is a good sign of these.  Fixing these leaks is non-trivial,
+unfortunately, but they must be fixed eventually.
+
 =head2 Rational Software's Purify
 
 Purify is a commercial tool that is helpful in identifying
@@ -1861,11 +2036,6 @@ badness.  Perl must be compiled in a specific way for
 optimal testing with Purify.  Purify is available under
 Windows NT, Solaris, HP-UX, SGI, and Siemens Unix.
 
-The only currently known leaks happen when there are
-compile-time errors within eval or require.  (Fixing these
-is non-trivial, unfortunately, but they must be fixed
-eventually.)
-
 =head2 Purify on Unix
 
 On Unix, Purify creates a new Perl binary.  To get the most
@@ -1914,17 +2084,6 @@ which creates a binary named 'pureperl' that has been Purify'ed.
 This binary is used in place of the standard 'perl' binary
 when you want to debug Perl memory problems.
 
-To minimize the number of memory leak false alarms
-(see L</PERL_DESTRUCT_LEVEL>), set environment variable
-PERL_DESTRUCT_LEVEL to 2.
-
-    setenv PERL_DESTRUCT_LEVEL 2
-
-In Bourne-type shells:
-
-    PERL_DESTRUCT_LEVEL=2
-    export PERL_DESTRUCT_LEVEL
-
 As an example, to show any memory leaks produced during the
 standard Perl testset you would create and run the Purify'ed
 perl as:
@@ -1950,12 +2109,12 @@ If you plan to use the "Viewer" windows, then you only need this option:
 
 In Bourne-type shells:
 
-    PURIFY_OPTIONS="..."
-    export PURIFY_OPTIONS
+    PURIFYOPTIONS="..."
+    export PURIFYOPTIONS
 
 or if you have the "env" utility:
 
-    env PURIFY_OPTIONS="..." ../pureperl ...
+    env PURIFYOPTIONS="..." ../pureperl ...
 
 =head2 Purify on NT
 
@@ -2007,7 +2166,24 @@ standard Perl testset you would create and run Purify as:
 which would instrument Perl in memory, run Perl on test.pl,
 then finally report any memory problems.
 
-=head2 Compaq's/Digital's Third Degree
+=head2 valgrind
+
+The excellent valgrind tool can be used to find out both memory leaks
+and illegal memory accesses.  As of August 2003 it unfortunately works
+only on x86 (ELF) Linux.  The special "test.valgrind" target can be used
+to run the tests under valgrind.  Found errors and memory leaks are
+logged in files named F<test.valgrind>.
+
+As system libraries (most notably glibc) are also triggering errors,
+valgrind allows to suppress such errors using suppression files. The
+default suppression file that comes with valgrind already catches a lot
+of them. Some additional suppressions are defined in F<t/perl.supp>.
+
+To get valgrind and for more information see
+
+    http://developer.kde.org/~sewardj/
+
+=head2 Compaq's/Digital's/HP's Third Degree
 
 Third Degree is a tool for memory leak detection and memory access checks.
 It is one of the many tools in the ATOM toolkit.  The toolkit is only
@@ -2027,33 +2203,23 @@ third for more information.  The most extensive Third Degree
 documentation is available in the Compaq "Tru64 UNIX Programmer's
 Guide", chapter "Debugging Programs with Third Degree".
 
-The "test.third" leaves a lot of files named F<perl.3log.*> in the t/
+The "test.third" leaves a lot of files named F<foo_bar.3log> in the t/
 subdirectory.  There is a problem with these files: Third Degree is so
 effective that it finds problems also in the system libraries.
-Therefore there are certain types of errors that you should ignore in
-your debugging.  Errors with stack traces matching
-
-    __actual_atof|__catgets|_doprnt|__exc_|__exec|_findio|__localtime|setlocale|__sia_|__strxfrm
-
-(all in libc.so) are known to be non-serious.  You can also
-ignore the combinations
-
-    Perl_gv_fetchfile() calling strcpy()
-    S_doopen_pmc() calling strcmp()
-
-causing "rih" (reading invalid heap) errors.
+Therefore you should used the Porting/thirdclean script to cleanup
+the F<*.3log> files.
 
 There are also leaks that for given certain definition of a leak,
 aren't.  See L</PERL_DESTRUCT_LEVEL> for more information.
 
 =head2 PERL_DESTRUCT_LEVEL
 
-If you want to run any of the tests yourself manually using the
-pureperl or perl.third executables, please note that by default
-perl B<does not> explicitly cleanup all the memory it has allocated
-(such as global memory arenas) but instead lets the exit() of
-the whole program "take care" of such allocations, also known
-as "global destruction of objects".
+If you want to run any of the tests yourself manually using e.g.
+valgrind, or the pureperl or perl.third executables, please note that
+by default perl B<does not> explicitly cleanup all the memory it has
+allocated (such as global memory arenas) but instead lets the exit()
+of the whole program "take care" of such allocations, also known as
+"global destruction of objects".
 
 There is a way to tell perl to do complete cleanup: set the
 environment variable PERL_DESTRUCT_LEVEL to a non-zero value.
@@ -2065,7 +2231,14 @@ For example, for "third-degreed" Perl:
 
 (Note: the mod_perl apache module uses also this environment variable
 for its own purposes and extended its semantics. Refer to the mod_perl
-documentation for more information.)
+documentation for more information. Also, spawned threads do the
+equivalent of setting this variable to the value 1.)
+
+If, at the end of a run you get the message I<N scalars leaked>, you can
+recompile with C<-DDEBUG_LEAKING_SCALARS>, which will cause
+the addresses of all those leaked SVs to be dumped; it also converts
+C<new_SV()> from a macro into a real function, so you can use your
+favourite debugger to discover where those pesky SVs were allocated.
 
 =head2 Profiling
 
@@ -2293,6 +2466,11 @@ Alternatively edit the init file interactively via:
 Note: you can define up to 20 conversion shortcuts in the gdb
 section.
 
+=item *
+
+If you see in a debugger a memory area mysteriously full of 0xabababab,
+you may be seeing the effect of the Poison() macro, see L<perlclib>.
+
 =back
 
 =head2 CONCLUSION