X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlsyn.pod;h=9cf39a3d5a6ce94fda48d9094de293426724fe08;hb=7cfe7857715f78206e6d7d6f7fd52983de4dec44;hp=3ddb493c8bd1d4e1d81da5dbfd96819931eb3ddb;hpb=a0d0e21ea6ea90a22318550944fe6cb09ae10cda;p=p5sagit%2Fp5-mst-13.2.git diff --git a/pod/perlsyn.pod b/pod/perlsyn.pod index 3ddb493..9cf39a3 100644 --- a/pod/perlsyn.pod +++ b/pod/perlsyn.pod @@ -19,35 +19,43 @@ which lines you look at. (Actually, I'm lying--it is possible to do an implicit loop with either the B<-n> or B<-p> switch. It's just not the mandatory default like it is in B<sed> and B<awk>.) +=head2 Declarations + Perl is, for the most part, a free-form language. (The only exception to this is format declarations, for obvious reasons.) Comments are indicated by the "#" character, and extend to the end of the line. If you attempt to use C</* */> C-style comments, it will be interpreted either as division or pattern matching, depending on the context, and C++ -C<//> comments just look like a null regular expression, So don't do +C<//> comments just look like a null regular expression, so don't do that. A declaration can be put anywhere a statement can, but has no effect on the execution of the primary sequence of statements--declarations all take effect at compile time. Typically all the declarations are put at -the beginning or the end of the script. +the beginning or the end of the script. However, if you're using +lexically-scoped private variables created with my(), you'll have to make sure +your format or subroutine definition is within the same block scope +as the my if you expect to be able to access those private variables. -As of Perl 5, declaring a subroutine allows a subroutine name to be used -as if it were a list operator from that point forward in the program. 
You -can declare a subroutine without defining it by saying just +Declaring a subroutine allows a subroutine name to be used as if it were a +list operator from that point forward in the program. You can declare a +subroutine (prototyped to take one scalar parameter) without defining it by saying just: - sub myname; + sub myname ($); $me = myname $0 or die "can't get myname"; -Note that it functions as a list operator though, not a unary +Note that it functions as a list operator though, not as a unary operator, so be careful to use C<or> instead of C<||> there. -Subroutines declarations can also be imported by a C<use> statement. +Subroutine declarations can also be loaded up with the C<require> statement +or both loaded and imported into your namespace with a C<use> statement. +See L<perlmod> for details on this. -Also as of Perl 5, a statement sequence may contain declarations of -lexically scoped variables, but apart from declaring a variable name, -the declaration acts like an ordinary statement, and is elaborated within -the sequence of statements as if it were an ordinary statement. +A statement sequence may contain declarations of lexically-scoped +variables, but apart from declaring a variable name, the declaration acts +like an ordinary statement, and is elaborated within the sequence of +statements as if it were an ordinary statement. That means it actually +has both compile-time and run-time effects. =head2 Simple statements @@ -55,11 +63,10 @@ The only kind of simple statement is an expression evaluated for its side effects. Every simple statement must be terminated with a semicolon, unless it is the final statement in a block, in which case the semicolon is optional. (A semicolon is still encouraged there if the -block takes up more than one line, since you may add another line.) +block takes up more than one line, because you may eventually add another line.) 
Note that there are some operators like C<eval {}> and C<do {}> that look like compound statements, but aren't (they're just TERMs in an expression), -and thus need an explicit termination -if used as the last item in a statement. +and thus need an explicit termination if used as the last item in a statement. Any simple statement may optionally be followed by a I<SINGLE> modifier, just before the terminating semicolon (or block ending). The possible @@ -79,14 +86,14 @@ executes once before the conditional is evaluated. This is so that you can write loops like: do { - $_ = <STDIN>; + $line = <STDIN>; ... - } until $_ eq ".\n"; + } until $line eq ".\n"; See L<perlfunc/do>. Note also that the loop control -statements described later will I<NOT> work in this construct, since +statements described later will I<NOT> work in this construct, because modifiers don't take loop labels. Sorry. You can always wrap -another block around it to do that sort of thing.) +another block around it to do that sort of thing. =head2 Compound statements @@ -106,7 +113,7 @@ The following compound statements may be used to control flow: LABEL while (EXPR) BLOCK LABEL while (EXPR) BLOCK continue BLOCK LABEL for (EXPR; EXPR; EXPR) BLOCK - LABEL foreach VAR (ARRAY) BLOCK + LABEL foreach VAR (LIST) BLOCK LABEL BLOCK continue BLOCK Note that, unlike C and Pascal, these are defined in terms of BLOCKs, @@ -121,21 +128,93 @@ all do the same thing: open(FOO) ? 'hi mom' : die "Can't open $FOO: $!"; # a bit exotic, that last one -The C<if> statement is straightforward. Since BLOCKs are always +The C<if> statement is straightforward. Because BLOCKs are always bounded by curly brackets, there is never any ambiguity about which C<if> an C<else> goes with. If you use C<unless> in place of C<if>, the sense of the test is reversed. The C<while> statement executes the block as long as the expression is true (does not evaluate to the null string or 0 or "0"). The LABEL is -optional, and if present, consists of an identifier followed by a -colon. 
The LABEL identifies the loop for the loop control statements -C<next>, C<last>, and C<redo> (see below). If there is a C<continue> -BLOCK, it is always executed just before the conditional is about to be -evaluated again, just like the third part of a C<for> loop in C. -Thus it can be used to increment a loop variable, even when the loop -has been continued via the C<next> statement (which is similar to the C -C<continue> statement). +optional, and if present, consists of an identifier followed by a colon. +The LABEL identifies the loop for the loop control statements C<next>, +C<last>, and C<redo>. If the LABEL is omitted, the loop control statement +refers to the innermost enclosing loop. This may include dynamically +looking back your call-stack at run time to find the LABEL. Such +desperate behavior triggers a warning if you use the B<-w> flag. + +If there is a C<continue> BLOCK, it is always executed just before the +conditional is about to be evaluated again, just like the third part of a +C<for> loop in C. Thus it can be used to increment a loop variable, even +when the loop has been continued via the C<next> statement (which is +similar to the C C<continue> statement). + +=head2 Loop Control + +The C<next> command is like the C<continue> statement in C; it starts +the next iteration of the loop: + + LINE: while (<STDIN>) { + next LINE if /^#/; # discard comments + ... + } + +The C<last> command is like the C<break> statement in C (as used in +loops); it immediately exits the loop in question. The +C<continue> block, if any, is not executed: + + LINE: while (<STDIN>) { + last LINE if /^$/; # exit when done with header + ... + } + +The C<redo> command restarts the loop block without evaluating the +conditional again. The C<continue> block, if any, is I<not> executed. +This command is normally used by programs that want to lie to themselves +about what was just input. + +For example, when processing a file like F</etc/termcap>. +If your input lines might end in backslashes to indicate continuation, you +want to skip ahead and get the next record. 
+ + while (<>) { + chomp; + if (s/\\$//) { + $_ .= <>; + redo unless eof(); + } + # now process $_ + } + +which is Perl short-hand for the more explicitly written version: + + LINE: while ($line = <ARGV>) { + chomp($line); + if ($line =~ s/\\$//) { + $line .= <ARGV>; + redo LINE unless eof(); # not eof(ARGV)! + } + # now process $line + } + +Or here's a simpleminded Pascal comment stripper (warning: assumes no { or } in strings). + + LINE: while (<STDIN>) { + while (s|({.*}.*){.*}|$1 |) {} + s|{.*}| |; + if (s|{.*| |) { + $front = $_; + while (<STDIN>) { + if (/}/) { # end of comment? + s|^|$front{|; + redo LINE; + } + } + } + print; + } + +Note that if there were a C<continue> block on the above code, it would get +executed even on discarded lines. If the word C<while> is replaced by the word C<until>, the sense of the test is reversed, but the conditional is still tested before the first @@ -143,17 +222,20 @@ iteration. In either the C<if> or the C<while> statement, you may replace "(EXPR)" with a BLOCK, and the conditional is true if the value of the last -statement in that block is true. (This feature continues to work in Perl -5 but is deprecated. Please change any occurrences of "if BLOCK" to -"if (do BLOCK)".) +statement in that block is true. While this "feature" continues to work in +version 5, it has been deprecated, so please change any occurrences of "if BLOCK" to +"if (do BLOCK)". + +=head2 For Loops -The C-style C<for> loop works exactly like the corresponding C<while> loop: +Perl's C-style C<for> loop works exactly like the corresponding C<while> loop; +that means that this: for ($i = 1; $i < 10; $i++) { ... } -is the same as +is the same as this: $i = 1; while ($i < 10) { @@ -162,36 +244,99 @@ is the same as $i++; } -The foreach loop iterates over a normal list value and sets the -variable VAR to be each element of the list in turn. The variable is -implicitly local to the loop (unless declared previously with C<my>), -and regains its former value upon exiting the loop. 
The C<foreach> -keyword is actually a synonym for the C<for> keyword, so you can use -C<foreach> for readability or C<for> for brevity. If VAR is omitted, $_ -is set to each value. If ARRAY is an actual array (as opposed to an -expression returning a list value), you can modify each element of the -array by modifying VAR inside the loop. Examples: - - for (@ary) { s/foo/bar/; } - - foreach $elem (@elements) { +(There is one minor difference: The first form implies a lexical scope +for variables declared with C<my> in the initialization expression.) + +Besides the normal array index looping, C<for> can lend itself +to many other interesting applications. Here's one that avoids the +problem you get into if you explicitly test for end-of-file on +an interactive file descriptor causing your program to appear to +hang. + + $on_a_tty = -t STDIN && -t STDOUT; + sub prompt { print "yes? " if $on_a_tty } + for ( prompt(); <STDIN>; prompt() ) { + # do something + } + +=head2 Foreach Loops + +The C<foreach> loop iterates over a normal list value and sets the +variable VAR to be each element of the list in turn. If the variable +is preceded with the keyword C<my>, then it is lexically scoped, and +is therefore visible only within the loop. Otherwise, the variable is +implicitly local to the loop and regains its former value upon exiting +the loop. If the variable was previously declared with C<my>, it uses +that variable instead of the global one, but it's still localized to +the loop. (Note that a lexically scoped variable can cause problems +if you have subroutine or format declarations.) + +The C<foreach> keyword is actually a synonym for the C<for> keyword, so +you can use C<foreach> for readability or C<for> for brevity. If VAR is +omitted, $_ is set to each value. If LIST is an actual array (as opposed +to an expression returning a list value), you can modify each element of +the array by modifying VAR inside the loop. That's because the C<foreach> +loop index variable is an implicit alias for each item in the list that +you're looping over. 
+ +Examples: + + for (@ary) { s/foo/bar/ } + + foreach my $elem (@elements) { $elem *= 2; } - for ((10,9,8,7,6,5,4,3,2,1,'BOOM')) { - print $_, "\n"; sleep(1); + for $count (10,9,8,7,6,5,4,3,2,1,'BOOM') { + print $count, "\n"; sleep(1); } for (1..15) { print "Merry Christmas\n"; } - foreach $item (split(/:[\\\n:]*/, $ENV{'TERMCAP'})) { + foreach $item (split(/:[\\\n:]*/, $ENV{TERMCAP})) { print "Item: $item\n"; } -A BLOCK by itself (labeled or not) is semantically equivalent to a loop -that executes once. Thus you can use any of the loop control -statements in it to leave or restart the block. The C<continue> block -is optional. This construct is particularly nice for doing case +Here's how a C programmer might code up a particular algorithm in Perl: + + for (my $i = 0; $i < @ary1; $i++) { + for (my $j = 0; $j < @ary2; $j++) { + if ($ary1[$i] > $ary2[$j]) { + last; # can't go to outer :-( + } + $ary1[$i] += $ary2[$j]; + } + # this is where that last takes me + } + +Whereas here's how a Perl programmer more comfortable with the idiom might +do it: + + OUTER: foreach my $wid (@ary1) { + INNER: foreach my $jet (@ary2) { + next OUTER if $wid > $jet; + $wid += $jet; + } + } + +See how much easier this is? It's cleaner, safer, and faster. It's +cleaner because it's less noisy. It's safer because if code gets added +between the inner and outer loops later on, the new code won't be +accidentally executed. The C<next> explicitly iterates the other loop +rather than merely terminating the inner one. And it's faster because +Perl executes a C<foreach> statement more rapidly than it would the +equivalent C<for> loop. + +=head2 Basic BLOCKs and Switch Statements + +A BLOCK by itself (labeled or not) is semantically equivalent to a +loop that executes once. Thus you can use any of the loop control +statements in it to leave or restart the block. (Note that this is +I<NOT> true in C<eval{}>, C<sub{}>, or contrary to popular belief +C<do{}> blocks, which do I<NOT> count as loops.) The C<continue> +block is optional. 
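The one-pass-loop behavior of a bare block can be sketched as follows. This is only an illustration: the label C<TRY>, the variables C<$attempts> and C<@log>, and the retry limit of 3 are made up here and do not come from perlsyn.

```perl
use strict;
use warnings;

my $attempts = 0;   # hypothetical counter, for illustration only
my @log;

TRY: {
    $attempts++;
    push @log, "attempt $attempts";

    # redo restarts the bare block from the top without leaving it
    redo TRY if $attempts < 3;

    # last leaves the block early, skipping whatever follows
    last TRY if $attempts == 3;

    push @log, "never reached";
}

print join(", ", @log), "\n";
```

Because the block counts as a loop, C<redo> and C<last> need no enclosing C<while> or C<for>; wrapping the same body in C<do {}> would not allow them.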
+ +The BLOCK construct is particularly nice for doing case structures. SWITCH: { @@ -212,7 +357,7 @@ above, you could write $nothing = 1; } -(That's actually not as strange as it looks one you realize that you can +(That's actually not as strange as it looks once you realize that you can use loop control "operators" within an expression, That's just the normal C comma operator.) @@ -265,3 +410,104 @@ or even, horrors, else { $nothing = 1 } + +A common idiom for a switch statement is to use C<foreach>'s aliasing to make +a temporary assignment to $_ for convenient matching: + + SWITCH: for ($where) { + /In Card Names/ && do { push @flags, '-e'; last; }; + /Anywhere/ && do { push @flags, '-h'; last; }; + /In Rulings/ && do { last; }; + die "unknown value for form variable where: `$where'"; + } + +Another interesting approach to a switch statement is to arrange +for a C<do> block to return the proper value: + + $amode = do { + if ($flag & O_RDONLY) { "r" } + elsif ($flag & O_WRONLY) { ($flag & O_APPEND) ? "a" : "w" } + elsif ($flag & O_RDWR) { + if ($flag & O_CREAT) { "w+" } + else { ($flag & O_APPEND) ? "a+" : "r+" } + } + }; + +=head2 Goto + +Although not for the faint of heart, Perl does support a C<goto> statement. +A loop's LABEL is not actually a valid target for a C<goto>; +it's just the name of the loop. There are three forms: goto-LABEL, +goto-EXPR, and goto-&NAME. + +The goto-LABEL form finds the statement labeled with LABEL and resumes +execution there. It may not be used to go into any construct that +requires initialization, such as a subroutine or a foreach loop. It +also can't be used to go into a construct that is optimized away. It +can be used to go almost anywhere else within the dynamic scope, +including out of subroutines, but it's usually better to use some other +construct such as last or die. The author of Perl has never felt the +need to use this form of goto (in Perl, that is--C is another matter). 
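As a rough sketch of the "use die instead" advice, the following leaves two nested loops and a subroutine in one step, with no goto involved. The subroutine name C<find_negative> and its sample data are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: scans rows of numbers and bails out from deep inside.
sub find_negative {
    my @rows = @_;
    for my $row (@rows) {
        for my $n (@$row) {
            # die unwinds both loops and the subroutine, up to the eval
            die "negative: $n\n" if $n < 0;
        }
    }
    return "none";
}

# eval {} catches the die; on failure it returns undef and sets $@
my $result = eval { find_negative([1, 2], [3, -4, 5], [6]) };
my $error  = $@;

print defined $result ? "result: $result\n" : "caught: $error";
```

The exit point stays explicit at the C<eval>, which is usually easier to follow than a label that may sit anywhere in the dynamic scope.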
+ +The goto-EXPR form expects a label name, whose scope will be resolved +dynamically. This allows for computed gotos per FORTRAN, but isn't +necessarily recommended if you're optimizing for maintainability: + + goto ("FOO", "BAR", "GLARCH")[$i]; + +The goto-&NAME form is highly magical, and substitutes a call to the +named subroutine for the currently running subroutine. This is used by +AUTOLOAD() subroutines that wish to load another subroutine and then +pretend that the other subroutine had been called in the first place +(except that any modifications to @_ in the current subroutine are +propagated to the other subroutine.) After the C<goto>, not even caller() +will be able to tell that this routine was called first. + +In almost all cases like this, it's usually a far, far better idea to use the +structured control flow mechanisms of C<next>, C<last>, or C<redo> instead of +resorting to a C<goto>. For certain applications, the catch and throw pair of +C<eval{}> and die() for exception processing can also be a prudent approach. + +=head2 PODs: Embedded Documentation + +Perl has a mechanism for intermixing documentation with source code. +While it's expecting the beginning of a new statement, if the compiler +encounters a line that begins with an equal sign and a word, like this + + =head1 Here There Be Pods! + +Then that text and all remaining text up through and including a line +beginning with C<=cut> will be ignored. The format of the intervening +text is described in L<perlpod>. + +This allows you to intermix your source code +and your documentation text freely, as in + + =item snazzle($) + + The snazzle() function will behave in the most spectacular + form that you can possibly imagine, not even excepting + cybernetic pyrotechnics. + + =cut back to the compiler, nuff of this pod stuff! + + sub snazzle($) { + my $thingie = shift; + ......... 
+ } + +Note that pod translators should look at only paragraphs beginning +with a pod directive (it makes parsing easier), whereas the compiler +actually knows to look for pod escapes even in the middle of a +paragraph. This means that the following secret stuff will be +ignored by both the compiler and the translators. + + $a=3; + =secret stuff + warn "Neither POD nor CODE!?" + =cut back + print "got $a\n"; + +You probably shouldn't rely upon the warn() being podded out forever. +Not all pod translators are well-behaved in this regard, and perhaps +the compiler will become pickier.
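A minimal, well-behaved use of embedded pod might look like the sketch below, which places the pod block where a new statement is expected so that both the compiler and pod translators agree on what is documentation. The variable C<$total> and the values are invented for this illustration.

```perl
use strict;
use warnings;

my $total = 0;   # hypothetical variable, for illustration only

=pod

This paragraph is documentation. The compiler skips everything from
the =pod line through the =cut line, so the indented assignment below
is never compiled or run.

    $total = 100;

=cut

$total += 1;
print "total is $total\n";
```

Keeping each pod directive at the start of its own paragraph, with blank lines around it, avoids relying on the compiler-only behavior that the warning above says may change.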