X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlsyn.pod;h=28bb824ada99bf497848e2193ec0fb33941d9a6f;hb=6e0733998eff7a098d2d21d5602f3eb2a7521e1f;hp=e41caee3ec1666472805c9c380a6526e7a0b531d;hpb=cb1a09d0194fed9b905df7b04a4bc031d354609d;p=p5sagit%2Fp5-mst-13.2.git diff --git a/pod/perlsyn.pod b/pod/perlsyn.pod index e41caee..28bb824 100644 --- a/pod/perlsyn.pod +++ b/pod/perlsyn.pod @@ -1,51 +1,80 @@ =head1 NAME +X perlsyn - Perl syntax =head1 DESCRIPTION -A Perl script consists of a sequence of declarations and statements. -The only things that need to be declared in Perl are report formats -and subroutines. See the sections below for more information on those -declarations. All uninitialized user-created objects are assumed to -start with a null or 0 value until they are defined by some explicit -operation such as assignment. (Though you can get warnings about the -use of undefined values if you like.) The sequence of statements is -executed just once, unlike in B and B scripts, where the -sequence of statements is executed for each input line. While this means -that you must explicitly loop over the lines of your input file (or -files), it also means you have much more control over which files and -which lines you look at. (Actually, I'm lying--it is possible to do an -implicit loop with either the B<-n> or B<-p> switch. It's just not the -mandatory default like it is in B and B.) +A Perl program consists of a sequence of declarations and statements +which run from the top to the bottom. Loops, subroutines and other +control structures allow you to jump around within the code. + +Perl is a B language, you can format and indent it however +you like. Whitespace mostly serves to separate tokens, unlike +languages like Python where it is an important part of the syntax. + +Many of Perl's syntactic elements are B. 
Rather than +requiring you to put parentheses around every function call and +declare every variable, you can often leave such explicit elements off +and Perl will figure out what you meant. This is known as B<Do What I Mean>, abbreviated B<DWIM>. It allows programmers to be B<lazy> and to +code in a style with which they are comfortable. + +Perl B<borrows syntax> and concepts from many languages: awk, sed, C, +Bourne Shell, Smalltalk, Lisp and even English. Other +languages have borrowed syntax from Perl, particularly its regular +expression extensions. So if you have programmed in another language +you will see familiar pieces in Perl. They often work the same, but +see L<perltrap> for information about how they differ. =head2 Declarations +X<declaration> X<undef> X<undefined> X<uninitialized> + +The only things you need to declare in Perl are report formats and +subroutines (and sometimes not even subroutines). A variable holds +the undefined value (C<undef>) until it has been assigned a defined +value, which is anything other than C<undef>. When used as a number, +C<undef> is treated as C<0>; when used as a string, it is treated as +the empty string, C<"">; and when used as a reference that isn't being +assigned to, it is treated as an error. If you enable warnings, +you'll be notified of an uninitialized value whenever you treat +C<undef> as a string or a number. Well, usually. Boolean contexts, +such as: -Perl is, for the most part, a free-form language. (The only -exception to this is format declarations, for obvious reasons.) Comments -are indicated by the "#" character, and extend to the end of the line. If -you attempt to use C</* */> C-style comments, it will be interpreted -either as division or pattern matching, depending on the context, and C++ -C<//> comments just look like a null regular expression, so don't do -that. + my $a; + if ($a) {} + +are exempt from warnings (because they care about truth rather than +definedness). 
Operators such as C<++>, C<-->, C<+=>, +C<-=>, and C<.=>, that operate on undefined left values such as: + + my $a; + $a++; + +are also always exempt from such warnings. A declaration can be put anywhere a statement can, but has no effect on the execution of the primary sequence of statements--declarations all take effect at compile time. Typically all the declarations are put at -the beginning or the end of the script. However, if you're using -lexically-scoped private variables created with my(), you'll have to make sure +the beginning or the end of the script. However, if you're using +lexically-scoped private variables created with C, you'll +have to make sure your format or subroutine definition is within the same block scope -as the my if you expect to to be able to access those private variables. +as the my if you expect to be able to access those private variables. Declaring a subroutine allows a subroutine name to be used as if it were a list operator from that point forward in the program. You can declare a -subroutine without defining it by saying just +subroutine without defining it by saying C, thus: +X sub myname; $me = myname $0 or die "can't get myname"; -Note that it functions as a list operator though, not as a unary -operator, so be careful to use C instead of C<||> there. +Note that myname() functions as a list operator, not as a unary operator; +so be careful to use C instead of C<||> in this case. However, if +you were to declare the subroutine as C, then +C would function as a unary operator, so either C or +C<||> would work. Subroutines declarations can also be loaded up with the C statement or both loaded and imported into your namespace with a C statement. @@ -57,16 +86,38 @@ like an ordinary statement, and is elaborated within the sequence of statements as if it were an ordinary statement. That means it actually has both compile-time and run-time effects. 
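The declaration rules in this section can be tried out with a short sketch. It assumes a hypothetical helper C<myname> that returns the base name of a path (that body is an illustration, not part of the patch) and shows a forward declaration used as a list operator together with the warning-exempt uses of C<undef>.

```perl
use strict;
use warnings;

# Forward declaration: from here on, "myname" parses as a list operator.
sub myname;

# With a list operator, "or" binds loosely enough to test the call's result.
# Writing "myname $0 || die ..." would instead parse as myname($0 || die ...).
my $me = myname $0 or die "can't get myname";

# The definition can appear later in the file (hypothetical body).
sub myname {
    my ($path) = @_;
    (my $base = $path) =~ s{.*/}{};   # strip any leading directories
    return $base;
}

# undef counts as 0 for "++" and as false in a boolean test,
# and neither use triggers an "uninitialized" warning.
my $count;
$count++;
my $flag;
my $seen = $flag ? "true" : "false";

print "$me $count $seen\n";
```

Declaring the subroutine with a C<($)> prototype instead would make it a unary operator, in which case C<||> would also be safe.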
-=head2 Simple statements +=head2 Comments +X<comment> X<#> + +Text from a C<"#"> character until the end of the line is a comment, +and is ignored. Exceptions include C<"#"> inside a string or regular +expression. + +=head2 Simple Statements +X<statement> X<semicolon> X<expression> X<;> The only kind of simple statement is an expression evaluated for its side effects. Every simple statement must be terminated with a semicolon, unless it is the final statement in a block, in which case -the semicolon is optional. (A semicolon is still encouraged there if the -block takes up more than one line, since you may eventually add another line.) -Note that there are some operators like C<eval {}> and C<do {}> that look -like compound statements, but aren't (they're just TERMs in an expression), -and thus need an explicit termination if used as the last item in a statement. +the semicolon is optional. (A semicolon is still encouraged if the +block takes up more than one line, because you may eventually add +another line.) Note that there are some operators like C<eval {}> and +C<do {}> that look like compound statements, but aren't (they're just +TERMs in an expression), and thus need an explicit termination if used +as the last item in a statement. + +=head2 Truth and Falsehood +X<truth> X<falsehood> X<true> X<false> X<!> X<not> X<negation> X<0> + +The number 0, the strings C<'0'> and C<''>, the empty list C<()>, and +C<undef> are all false in a boolean context. All other values are true. +Negation of a true value by C<!> or C<not> returns a special false value. +When evaluated as a string it is treated as C<''>, but as a number, it +is treated as 0. + +=head2 Statement Modifiers +X<statement modifier> X<modifier> X<if> X<unless> X<while> +X<until> X<when> X<foreach> X<for> Any simple statement may optionally be followed by a I<SINGLE> modifier, just before the terminating semicolon (or block ending). The possible @@ -76,26 +127,93 @@ modifiers are: unless EXPR while EXPR until EXPR + when EXPR + for LIST + foreach LIST + +The C<EXPR> following the modifier is referred to as the "condition". +Its truth or falsehood determines how the modifier will behave. 
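Because a modifier's behaviour hinges on the truth of its condition, it may help to check the truth rules directly. The following is a minimal sketch; the sample values and variable names are chosen here for illustration only.

```perl
use strict;
use warnings;

# Classify a handful of scalar values by their truth in boolean context.
my @samples = (0, '0', '', undef, '0.0', '00', ' ', 'a', 1, -1);
my @verdict = map { $_ ? 'true' : 'false' } @samples;

# Negating a true value yields the special false value:
# the empty string as a string, 0 as a number.
my $neg        = !1;
my $neg_string = "$neg";
my $neg_number = $neg + 0;   # no "isn't numeric" warning for this value

for my $i (0 .. $#samples) {
    my $shown = defined $samples[$i] ? "'$samples[$i]'" : 'undef';
    print "$shown is $verdict[$i]\n";
}
print "string form: '$neg_string', numeric form: $neg_number\n";
```

Only the first four samples are false; strings such as '0.0' and '00' are true even though they are numerically zero.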
+ +C<if> executes the statement once I<if> and only if the condition is +true. C<unless> is the opposite, it executes the statement I<unless> +the condition is true (i.e., if the condition is false). + + print "Basset hounds got long ears" if length $ear >= 10; + go_outside() and play() unless $is_raining; -The C<if> and C<unless> modifiers have the expected semantics, -presuming you're a speaker of English. The C<while> and C<until> -modifiers also have the usual "while loop" semantics (conditional -evaluated first), except when applied to a do-BLOCK (or to the -now-deprecated do-SUBROUTINE statement), in which case the block -executes once before the conditional is evaluated. This is so that you -can write loops like: +C<when> executes the statement I<when> C<$_> smart matches C<EXPR>, and +then either C<break>s out if it's enclosed in a C<given> scope or skips +to the C<next> element when it lies directly inside a C<for> loop. +See also L</"Switch statements">. + + given ($something) { + $abc = 1 when /^abc/; + $just_a = 1 when /^a/; + $other = 1; + } + + for (@names) { + admin($_) when [ qw/Alice Bob/ ]; + regular($_) when [ qw/Chris David Ellen/ ]; + } + +The C<foreach> modifier is an iterator: it executes the statement once +for each item in the LIST (with C<$_> aliased to each item in turn). + + print "Hello $_!\n" foreach qw(world Dolly nurse); + +C<while> repeats the statement I<while> the condition is true. +C<until> does the opposite, it repeats the statement I<until> the +condition is true (or while the condition is false): + + # Both of these count from 0 to 10. + print $i++ while $i <= 10; + print $j++ until $j > 10; + +The C<while> and C<until> modifiers have the usual "C<while> loop" +semantics (conditional evaluated first), except when applied to a +C<do>-BLOCK (or to the deprecated C<do>-SUBROUTINE statement), in +which case the block executes once before the conditional is +evaluated. This is so that you can write loops like: do { $line = <STDIN>; ... } until $line eq ".\n"; -See L<perlfunc/do>. Note also that the loop control -statements described later will I<NOT> work in this construct, since -modifiers don't take loop labels. Sorry. 
You can always wrap -another block around it to do that sort of thing. +See L<perlfunc/do>. Note also that the loop control statements described +later will I<NOT> work in this construct, because modifiers don't take +loop labels. Sorry. You can always put another block inside of it +(for C<next>) or around it (for C<last>) to do that sort of thing. +For C<next>, just double the braces: +X<next> X<last> X<redo> + + do {{ + next if $x == $y; + # do something here + }} until $x++ > $z; + +For C<last>, you have to be more elaborate: +X<last> + + LOOP: { + do { + last if $x = $y**2; + # do something here + } while $x++ <= $z; + } + +B<NOTE:> The behaviour of a C<my> statement modified with a statement +modifier conditional or loop construct (e.g. C<my $x if ...>) is +B<undefined>. The value of the C<my> variable may be C<undef>, any +previously assigned value, or possibly anything else. Don't rely on +it. Future versions of perl might do something different from the +version of perl you try it out on. Here be dragons. +X<my> -=head2 Compound statements +=head2 Compound Statements +X<statement, compound> X<block> X<bracket, curly> X<curly bracket> X<brace> +X<{> X<}> X<if> X<unless> X<while> X<until> X<foreach> X<for> X<continue> In Perl, a sequence of statements that defines a scope is called a block. Sometimes a block is delimited by the file containing it (in the case @@ -112,8 +230,11 @@ The following compound statements may be used to control flow: if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK LABEL while (EXPR) BLOCK LABEL while (EXPR) BLOCK continue BLOCK + LABEL until (EXPR) BLOCK + LABEL until (EXPR) BLOCK continue BLOCK LABEL for (EXPR; EXPR; EXPR) BLOCK LABEL foreach VAR (LIST) BLOCK + LABEL foreach VAR (LIST) BLOCK continue BLOCK LABEL BLOCK continue BLOCK Note that, unlike C and Pascal, these are defined in terms of BLOCKs, @@ -128,38 +249,40 @@ all do the same thing: open(FOO) ? 'hi mom' : die "Can't open $FOO: $!"; # a bit exotic, that last one -The C<if> statement is straightforward. Since BLOCKs are always +The C<if> statement is straightforward. Because BLOCKs are always bounded by curly brackets, there is never any ambiguity about which C<if> an C<else> goes with. 
If you use C<unless> in place of C<if>, the sense of the test is reversed. The C<while> statement executes the block as long as the expression is -true (does not evaluate to the null string or 0 or "0"). The LABEL is -optional, and if present, consists of an identifier followed by a colon. -The LABEL identifies the loop for the loop control statements C<next>, -C<last>, and C<redo>. If the LABEL is omitted, the loop control statement +L<true|/"Truth and Falsehood">. +The C<until> statement executes the block as long as the expression is +false. +The LABEL is optional, and if present, consists of an identifier followed +by a colon. The LABEL identifies the loop for the loop control +statements C<next>, C<last>, and C<redo>. +If the LABEL is omitted, the loop control statement refers to the innermost enclosing loop. This may include dynamically looking back your call-stack at run time to find the LABEL. Such -desperate behavior triggers a warning if you use the B<-w> flag. +desperate behavior triggers a warning if you use the C<use warnings> +pragma or the B<-w> flag. If there is a C<continue> BLOCK, it is always executed just before the -conditional is about to be evaluated again, just like the third part of a -C<for> loop in C. Thus it can be used to increment a loop variable, even -when the loop has been continued via the C<next> statement (which is -similar to the C C<continue> statement). +conditional is about to be evaluated again. Thus it can be used to +increment a loop variable, even when the loop has been continued via +the C<next> statement. =head2 Loop Control +X<loop control> X<loop, control> X<next> X<last> X<redo> X<continue> -The C<next> command is like the C<continue> statement in C; it starts -the next iteration of the loop: +The C<next> command starts the next iteration of the loop: LINE: while (<STDIN>) { next LINE if /^#/; # discard comments ... } -The C<last> command is like the C<break> statement in C (as used in -loops); it immediately exits the loop in question. The +The C<last> command immediately exits the loop in question. The C<continue> block, if any, is not executed: LINE: while (<STDIN>) { @@ -178,57 +301,63 @@ want to skip ahead and get the next record. 
while (<>) { chomp; - if (s/\\$//) { - $_ .= <>; + if (s/\\$//) { + $_ .= <>; redo unless eof(); } # now process $_ - } + } which is Perl short-hand for the more explicitly written version: - LINE: while ($line = ) { + LINE: while (defined($line = )) { chomp($line); - if ($line =~ s/\\$//) { - $line .= ; + if ($line =~ s/\\$//) { + $line .= ; redo LINE unless eof(); # not eof(ARGV)! } # now process $line - } + } -Or here's a a simpleminded Pascal comment stripper (warning: assumes no { or } in strings) +Note that if there were a C block on the above code, it would +get executed only on lines discarded by the regex (since redo skips the +continue block). A continue block is often used to reset line counters +or C one-time matches: - LINE: while () { - while (s|({.*}.*){.*}|$1 |) {} - s|{.*}| |; - if (s|{.*| |) { - $front = $_; - while () { - if (/}/) { # end of comment? - s|^|$front{|; - redo LINE; - } - } - } - print; + # inspired by :1,$g/fred/s//WILMA/ + while (<>) { + ?(fred)? && s//WILMA $1 WILMA/; + ?(barney)? && s//BETTY $1 BETTY/; + ?(homer)? && s//MARGE $1 MARGE/; + } continue { + print "$ARGV $.: $_"; + close ARGV if eof(); # reset $. + reset if eof(); # reset ?pat? } -Note that if there were a C block on the above code, it would get -executed even on discarded lines. - If the word C is replaced by the word C, the sense of the test is reversed, but the conditional is still tested before the first iteration. -In either the C or the C statement, you may replace "(EXPR)" -with a BLOCK, and the conditional is true if the value of the last -statement in that block is true. While this "feature" continues to work in -version 5, it has been deprecated, so please change any occurrences of "if BLOCK" to -"if (do BLOCK)". +The loop control statements don't work in an C or C, since +they aren't loops. You can double the braces to make them such, though. 
+ + if (/pattern/) {{ + last if /fred/; + next if /barney/; # same effect as "last", but doesn't document as well + # do something here + }} + +This is caused by the fact that a block by itself acts as a loop that +executes once, see L<"Basic BLOCKs">. + +The form C, available in Perl 4, is no longer +available. Replace any occurrence of C by C. =head2 For Loops +X X -Perl's C-style C loop works exactly like the corresponding C loop; +Perl's C-style C loop works like the corresponding C loop; that means that this: for ($i = 1; $i < 10; $i++) { @@ -244,41 +373,74 @@ is the same as this: $i++; } +There is one minor difference: if variables are declared with C +in the initialization section of the C, the lexical scope of +those variables is exactly the C loop (the body of the loop +and the control sections). +X + Besides the normal array index looping, C can lend itself to many other interesting applications. Here's one that avoids the -problem you get into if you explicitly test for end-of-file on -an interactive file descriptor causing your program to appear to +problem you get into if you explicitly test for end-of-file on +an interactive file descriptor causing your program to appear to hang. +X X X $on_a_tty = -t STDIN && -t STDOUT; sub prompt { print "yes? " if $on_a_tty } for ( prompt(); ; prompt() ) { # do something - } + } + +Using C (or the operator form, C<< >>) as the +conditional of a C loop is shorthand for the following. This +behaviour is the same as a C loop conditional. +X X<< <> >> + + for ( prompt(); defined( $_ = ); prompt() ) { + # do something + } =head2 Foreach Loops +X X The C loop iterates over a normal list value and sets the -variable VAR to be each element of the list in turn. The variable is -implicitly local to the loop and regains its former value upon exiting the -loop. If the variable was previously declared with C, it uses that -variable instead of the global one, but it's still localized to the loop. 
-This can cause problems if you have subroutine or format declarations -within that block's scope. +variable VAR to be each element of the list in turn. If the variable +is preceded with the keyword C, then it is lexically scoped, and +is therefore visible only within the loop. Otherwise, the variable is +implicitly local to the loop and regains its former value upon exiting +the loop. If the variable was previously declared with C, it uses +that variable instead of the global one, but it's still localized to +the loop. This implicit localisation occurs I in a C +loop. +X X The C keyword is actually a synonym for the C keyword, so -you can use C for readability or C for brevity. If VAR is -omitted, $_ is set to each value. If LIST is an actual array (as opposed -to an expression returning a list value), you can modify each element of -the array by modifying VAR inside the loop. That's because the C -loop index variable is an implicit alias for each item in the list that -you're looping over. +you can use C for readability or C for brevity. (Or because +the Bourne shell is more familiar to you than I, so writing C +comes more naturally.) If VAR is omitted, C<$_> is set to each value. +X<$_> + +If any element of LIST is an lvalue, you can modify it by modifying +VAR inside the loop. Conversely, if any element of LIST is NOT an +lvalue, any attempt to modify that element will fail. In other words, +the C loop index variable is an implicit alias for each item +in the list that you're looping over. +X + +If any part of LIST is an array, C will get very confused if +you add or remove elements within the loop body, for example with +C. So don't do that. +X + +C probably won't do what you expect if VAR is a tied or other +special variable. Don't do that either. 
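The aliasing and localisation rules above can be sketched as follows. The array contents and variable names are made up for the illustration.

```perl
use strict;
use warnings;

my @prices = (10, 20, 30);

# $p is an alias for each element, so assigning to it changes @prices.
for my $p (@prices) {
    $p *= 2;
}

# Elements that are not lvalues (here, literal constants) cannot be
# modified through the alias; the attempt dies at run time.
my $error = '';
eval {
    for my $n (1, 2, 3) {
        $n++;
    }
    1;
} or $error = $@;

# A previously declared "my" variable used as the loop variable is
# localized to the loop and gets its old value back afterwards.
my $item = 'before';
for $item (qw(a b c)) { }

print "@prices\n";
print "error: $error";
print "item: $item\n";
```

Wrapping the second loop in C<eval> is only there so the sketch can report the failure instead of stopping at it.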
Examples: for (@ary) { s/foo/bar/ } - foreach $elem (@elements) { + for my $elem (@elements) { $elem *= 2; } @@ -294,8 +456,8 @@ Examples: Here's how a C programmer might code up a particular algorithm in Perl: - for ($i = 0; $i < @ary1; $i++) { - for ($j = 0; $j < @ary2; $j++) { + for (my $i = 0; $i < @ary1; $i++) { + for (my $j = 0; $j < @ary2; $j++) { if ($ary1[$i] > $ary2[$j]) { last; # can't go to outer :-( } @@ -304,33 +466,35 @@ Here's how a C programmer might code up a particular algorithm in Perl: # this is where that last takes me } -Whereas here's how a Perl programmer more confortable with the idiom might +Whereas here's how a Perl programmer more comfortable with the idiom might do it: - OUTER: foreach $wid (@ary1) { - INNER: foreach $jet (@ary2) { + OUTER: for my $wid (@ary1) { + INNER: for my $jet (@ary2) { next OUTER if $wid > $jet; $wid += $jet; - } - } + } + } See how much easier this is? It's cleaner, safer, and faster. It's cleaner because it's less noisy. It's safer because if code gets added -between the inner and outer loops later, you won't accidentally excecute -it because you've explicitly asked to iterate the other loop rather than -merely terminating the inner one. And it's faster because Perl executes a -C statement more rapidly than it would the equivalent C -loop. +between the inner and outer loops later on, the new code won't be +accidentally executed. The C explicitly iterates the other loop +rather than merely terminating the inner one. And it's faster because +Perl executes a C statement more rapidly than it would the +equivalent C loop. -=head2 Basic BLOCKs and Switch Statements +=head2 Basic BLOCKs +X -A BLOCK by itself (labeled or not) is semantically equivalent to a loop -that executes once. Thus you can use any of the loop control -statements in it to leave or restart the block. The C block -is optional. +A BLOCK by itself (labeled or not) is semantically equivalent to a +loop that executes once. 
Thus you can use any of the loop control +statements in it to leave or restart the block. (Note that this is +I true in C, C, or contrary to popular belief +C blocks, which do I count as loops.) The C +block is optional. -The BLOCK construct is particularly nice for doing case -structures. +The BLOCK construct can be used to emulate case structures. SWITCH: { if (/^abc/) { $abc = 1; last SWITCH; } @@ -339,146 +503,316 @@ structures. $nothing = 1; } -There is no official switch statement in Perl, because there are -already several ways to write the equivalent. In addition to the -above, you could write +Such constructs are quite frequently used, because older versions +of Perl had no official C statement. - SWITCH: { - $abc = 1, last SWITCH if /^abc/; - $def = 1, last SWITCH if /^def/; - $xyz = 1, last SWITCH if /^xyz/; - $nothing = 1; - } +=head2 Switch statements +X X X X X -(That's actually not as strange as it looks once you realize that you can -use loop control "operators" within an expression, That's just the normal -C comma operator.) +Starting from Perl 5.10, you can say -or + use feature "switch"; - SWITCH: { - /^abc/ && do { $abc = 1; last SWITCH; }; - /^def/ && do { $def = 1; last SWITCH; }; - /^xyz/ && do { $xyz = 1; last SWITCH; }; - $nothing = 1; +which enables a switch feature that is closely based on the +Perl 6 proposal. + +The keywords C and C are analogous +to C and C in other languages, so the code +above could be written as + + given($_) { + when (/^abc/) { $abc = 1; } + when (/^def/) { $def = 1; } + when (/^xyz/) { $xyz = 1; } + default { $nothing = 1; } } -or formatted so it stands out more as a "proper" switch statement: +This construct is very flexible and powerful. 
For example: - SWITCH: { - /^abc/ && do { - $abc = 1; - last SWITCH; - }; - - /^def/ && do { - $def = 1; - last SWITCH; - }; - - /^xyz/ && do { - $xyz = 1; - last SWITCH; - }; - $nothing = 1; + use feature ":5.10"; + given($foo) { + when (undef) { + say '$foo is undefined'; + } + + when ("foo") { + say '$foo is the string "foo"'; + } + + when ([1,3,5,7,9]) { + say '$foo is an odd digit'; + continue; # Fall through + } + + when ($_ < 100) { + say '$foo is numerically less than 100'; + } + + when (\&complicated_check) { + say 'complicated_check($foo) is true'; + } + + default { + die q(I don't know what to do with $foo); + } } -or +C will assign the value of EXPR to C<$_> +within the lexical scope of the block, so it's similar to - SWITCH: { - /^abc/ and $abc = 1, last SWITCH; - /^def/ and $def = 1, last SWITCH; - /^xyz/ and $xyz = 1, last SWITCH; - $nothing = 1; + do { my $_ = EXPR; ... } + +except that the block is automatically broken out of by a +successful C or an explicit C. + +Most of the power comes from implicit smart matching: + + when($foo) + +is exactly equivalent to + + when($_ ~~ $foo) + +In fact C is treated as an implicit smart match most of the +time. The exceptions are that when EXPR is: + +=over 4 + +=item * + +a subroutine or method call + +=item * + +a regular expression match, i.e. C or C<$foo =~ /REGEX/>, +or a negated regular expression match C<$foo !~ /REGEX/>. + +=item * + +a comparison such as C<$_ E 10> or C<$x eq "abc"> +(or of course C<$_ ~~ $c>) + +=item * + +C, C, or C + +=item * + +A negated expression C or C, or a logical +exclusive-or C<(...) xor (...)>. + +=back + +then the value of EXPR is used directly as a boolean. +Furthermore: + +=over 4 + +=item o + +If EXPR is C<... && ...> or C<... and ...>, the test +is applied recursively to both arguments. If I +arguments pass the test, then the argument is treated +as boolean. + +=item o + +If EXPR is C<... || ...> or C<... 
or ...>, the test +is applied recursively to the first argument. + +=back + +These rules look complicated, but usually they will do what +you want. For example you could write: + + when (/^\d+$/ && $_ < 75) { ... } + +Another useful shortcut is that, if you use a literal array +or hash as the argument to C, it is turned into a +reference. So C is the same as C, +for example. + +C behaves exactly like C, which is +to say that it always matches. + +See L for more information +on smart matching. + +=head3 Breaking out + +You can use the C keyword to break out of the enclosing +C block. Every C block is implicitly ended with +a C. + +=head3 Fall-through + +You can use the C keyword to fall through from one +case to the next: + + given($foo) { + when (/x/) { say '$foo contains an x'; continue } + when (/y/) { say '$foo contains a y' } + default { say '$foo does not contain a y' } } -or even, horrors, - - if (/^abc/) - { $abc = 1 } - elsif (/^def/) - { $def = 1 } - elsif (/^xyz/) - { $xyz = 1 } - else - { $nothing = 1 } - - -A common idiom for a switch statement is to use C's aliasing to make -a temporary assignment to $_ for convenient matching: - - SWITCH: for ($where) { - /In Card Names/ && do { push @flags, '-e'; last; }; - /Anywhere/ && do { push @flags, '-h'; last; }; - /In Rulings/ && do { last; }; - die "unknown value for form variable where: `$where'"; - } - -Another interesting approach to a switch statement is arrange -for a C block to return the proper value: - - $amode = do { - if ($flag & O_RDONLY) { "r" } - elsif ($flag & O_WRONLY) { ($flag & O_APPEND) ? "w" : "a" } - elsif ($flag & O_RDWR) { - if ($flag & O_CREAT) { "w+" } - else { ($flag & O_APPEND) ? "r+" : "a+" } - } - }; +=head3 Switching in a loop + +Instead of using C, you can use a C loop. 
+For example, here's one way to count how many times a particular +string occurs in an array: + + my $count = 0; + for (@array) { + when ("foo") { ++$count } + } + print "\@array contains $count copies of 'foo'\n"; + +On exit from the C block, there is an implicit C. +You can override that with an explicit C if you're only +interested in the first match. + +This doesn't work if you explicitly specify a loop variable, +as in C. You have to use the default +variable C<$_>. (You can use C.) + +=head3 Smart matching in detail + +The behaviour of a smart match depends on what type of thing +its arguments are. It is always commutative, i.e. C<$a ~~ $b> +behaves the same as C<$b ~~ $a>. The behaviour is determined +by the following table: the first row that applies, in either +order, determines the match behaviour. + + + $a $b Type of Match Implied Matching Code + ====== ===== ===================== ============= + (overloading trumps everything) + + Code[+] Code[+] referential equality $a == $b + Any Code[+] scalar sub truth $b->($a) + + Hash Hash hash keys identical [sort keys %$a]~~[sort keys %$b] + Hash Array hash slice existence @$b == grep {exists $a->{$_}} @$b + Hash Regex hash key grep grep /$b/, keys %$a + Hash Any hash entry existence exists $a->{$b} + + Array Array arrays are identical[*] + Array Regex array grep grep /$b/, @$a + Array Num array contains number grep $_ == $b, @$a + Array Any array contains string grep $_ eq $b, @$a + + Any undef undefined !defined $a + Any Regex pattern match $a =~ /$b/ + Code() Code() results are equal $a->() eq $b->() + Any Code() simple closure truth $b->() # ignoring $a + Num numish[!] 
numeric equality $a == $b + Any Str string equality $a eq $b + Any Num numeric equality $a == $b + + Any Any string equality $a eq $b + + + + - this must be a code reference whose prototype (if present) is not "" + (subs with a "" prototype are dealt with by the 'Code()' entry lower down) + * - that is, each element matches the element of same index in the other + array. If a circular reference is found, we fall back to referential + equality. + ! - either a real number, or a string that looks like a number + +The "matching code" doesn't represent the I matching code, +of course: it's just there to explain the intended meaning. Unlike +C, the smart match operator will short-circuit whenever it can. + +=head3 Custom matching via overloading + +You can change the way that an object is matched by overloading +the C<~~> operator. This trumps the usual smart match semantics. +See L. + +=head3 Differences from Perl 6 + +The Perl 5 smart match and C/C constructs are not +absolutely identical to their Perl 6 analogues. The most visible +difference is that, in Perl 5, parentheses are required around +the argument to C and C (except when this last +one is used as a statement modifier). Parentheses in Perl 6 +are always optional in a control construct such as C, +C, or C; they can't be made optional in Perl +5 without a great deal of potential confusion, because Perl 5 +would parse the expression + + given $foo { + ... + } + +as though the argument to C were an element of the hash +C<%foo>, interpreting the braces as hash-element syntax. + +The table of smart matches is not identical to that proposed by the +Perl 6 specification, mainly due to the differences between Perl 6's +and Perl 5's data models. + +In Perl 6, C will always do an implicit smart match +with its argument, whilst it is convenient in Perl 5 to +suppress this implicit smart match in certain situations, +as documented above. 
(The difference is largely because Perl 5 +does not, even internally, have a boolean type.) =head2 Goto +X -Although not for the faint of heart, Perl does support a C statement. -A loop's LABEL is not actually a valid target for a C; -it's just the name of the loop. There are three forms: goto-LABEL, -goto-EXPR, and goto-&NAME. +Although not for the faint of heart, Perl does support a C +statement. There are three forms: C-LABEL, C-EXPR, and +C-&NAME. A loop's LABEL is not actually a valid target for +a C; it's just the name of the loop. -The goto-LABEL form finds the statement labeled with LABEL and resumes +The C-LABEL form finds the statement labeled with LABEL and resumes execution there. It may not be used to go into any construct that -requires initialization, such as a subroutine or a foreach loop. It +requires initialization, such as a subroutine or a C loop. It also can't be used to go into a construct that is optimized away. It can be used to go almost anywhere else within the dynamic scope, including out of subroutines, but it's usually better to use some other -construct such as last or die. The author of Perl has never felt the -need to use this form of goto (in Perl, that is--C is another matter). +construct such as C or C. The author of Perl has never felt the +need to use this form of C (in Perl, that is--C is another matter). -The goto-EXPR form expects a label name, whose scope will be resolved -dynamically. This allows for computed gotos per FORTRAN, but isn't +The C-EXPR form expects a label name, whose scope will be resolved +dynamically. This allows for computed Cs per FORTRAN, but isn't necessarily recommended if you're optimizing for maintainability: - goto ("FOO", "BAR", "GLARCH")[$i]; + goto(("FOO", "BAR", "GLARCH")[$i]); -The goto-&NAME form is highly magical, and substitutes a call to the +The C-&NAME form is highly magical, and substitutes a call to the named subroutine for the currently running subroutine. 
This is used by
-AUTOLOAD() subroutines that wish to load another subroutine and then
+C<AUTOLOAD()> subroutines that wish to load another subroutine and then
 pretend that the other subroutine had been called in the first place
-(except that any modifications to @_ in the current subroutine are
-propagated to the other subroutine.) After the C<goto>, not even caller()
+(except that any modifications to C<@_> in the current subroutine are
+propagated to the other subroutine.) After the C<goto>, not even C<caller()>
 will be able to tell that this routine was called first.
 
-In almost cases like this, it's usually a far, far better idea to use the
-structured control flow mechanisms of C<next>, C<last>, or C<redo> insetad
+In almost all cases like this, it's usually a far, far better idea to use the
+structured control flow mechanisms of C<next>, C<last>, or C<redo> instead
 of resorting to a C<goto>. For certain applications, the catch and throw pair of
 C<eval{}> and die() for exception processing can also be a prudent approach.
 
 =head2 PODs: Embedded Documentation
+X<POD> X<documentation>
 
 Perl has a mechanism for intermixing documentation with source code.
-If while expecting the beginning of a new statement, the compiler
+While it's expecting the beginning of a new statement, if the compiler
 encounters a line that begins with an equal sign and a word, like this
 
     =head1 Here There Be Pods!
 
 Then that text and all remaining text up through and including a line
 beginning with C<=cut> will be ignored. The format of the intervening
-text is described in L<perlpod>.
+text is described in L<perlpod>.
 
 This allows you to intermix your source code
 and your documentation text freely, as in
 
     =item snazzle($)
 
-    The snazzle() function will behave in the most spectacular
+    The snazzle() function will behave in the most spectacular
     form that you can possibly imagine, not even excepting
     cybernetic pyrotechnics.
 
@@ -487,11 +821,11 @@ and your documentation text freely, as in
     sub snazzle($) {
         my $thingie = shift;
         .........
-    }
+    }
 
-Note that pod translators should only look at paragraphs beginning
-with a pod diretive (it makes parsing easier), whereas the compiler
-actually knows to look for pod escapes even in the middle of a
+Note that pod translators should look at only paragraphs beginning
+with a pod directive (it makes parsing easier), whereas the compiler
+actually knows to look for pod escapes even in the middle of a
 paragraph. This means that the following secret stuff will be
 ignored by both the compiler and the translators.
 
@@ -501,6 +835,62 @@ ignored by both the compiler and the translators.
     =cut back
     print "got $a\n";
 
-You probably shouldn't rely upon the warn() being podded out forever.
+You probably shouldn't rely upon the C<warn()> being podded out forever.
 Not all pod translators are well-behaved in this regard, and perhaps
 the compiler will become pickier.
+
+One may also use pod directives to quickly comment out a section
+of code.
+
+=head2 Plain Old Comments (Not!)
+X<comment> X<line> X<#> X<preprocessor> X<eval>
+
+Perl can process line directives, much like the C preprocessor. Using
+this, one can control Perl's idea of filenames and line numbers in
+error or warning messages (especially for strings that are processed
+with C<eval()>). The syntax for this mechanism is the same as for most
+C preprocessors: it matches the regular expression
+
+    # example: '# line 42 "new_filename.plx"'
+    /^\#   \s*
+      line \s+ (\d+)   \s*
+      (?:\s("?)([^"]+)\2)? \s*
+     $/x
+
+with C<$1> being the line number for the next line, and C<$3> being
+the optional filename (specified with or without quotes).
+
+There is a fairly obvious gotcha included with the line directive:
+Debuggers and profilers will only show the last source line to appear
+at a particular line number in a given file. Care should be taken not
+to cause line number collisions in code you'd like to debug later.
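The directive pattern quoted above can be tried out directly. The following is a minimal sketch, separate from the patch text, that applies that same regular expression to a few sample strings and reports what lands in C<$1> and C<$3>. The variable names and the sample inputs are assumptions made for illustration, and the sketch only exercises the documented pattern; it does not claim to reproduce how the perl tokenizer itself recognizes the directive.

```perl
use strict;
use warnings;

# The directive pattern as quoted in the text, stored for reuse.
my $line_directive = qr/^\#   \s*
                         line \s+ (\d+)   \s*
                         (?:\s("?)([^"]+)\2)? \s*
                        $/x;

# Hypothetical sample inputs, chosen only for illustration.
my @samples = (
    '# line 42 "new_filename.plx"',
    '#line 7',
    '# not a directive',
);

my @parsed;
for my $text (@samples) {
    if ($text =~ $line_directive) {
        # $1 is the line number, $3 the optional filename.
        my ($line, $file) = ($1, $3);
        push @parsed, [ $line, $file ];
        printf "%-32s line=%s file=%s\n",
            $text, $line, defined $file ? $file : '(unchanged)';
    }
    else {
        push @parsed, undef;
        printf "%-32s no match\n", $text;
    }
}
```

When the filename part is absent, C<$3> is undefined, which corresponds to the directive changing only the line number.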
+
+Here are some examples that you should be able to type into your command
+shell:
+
+    % perl
+    # line 200 "bzzzt"
+    # the `#' on the previous line must be the first char on line
+    die 'foo';
+    __END__
+    foo at bzzzt line 201.
+
+    % perl
+    # line 200 "bzzzt"
+    eval qq[\n#line 2001 ""\ndie 'foo']; print $@;
+    __END__
+    foo at - line 2001.
+
+    % perl
+    eval qq[\n#line 200 "foo bar"\ndie 'foo']; print $@;
+    __END__
+    foo at foo bar line 200.
+
+    % perl
+    # line 345 "goop"
+    eval "\n#line " . __LINE__ . ' "' . __FILE__ ."\"\ndie 'foo'";
+    print $@;
+    __END__
+    foo at goop line 345.
+
+=cut
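The Goto section of this patch describes the goto-&NAME form as the way an AUTOLOAD subroutine can hand control to another subroutine so that caller() cannot tell AUTOLOAD ran first. A minimal sketch of that idea follows, again separate from the patch text. The package name `Greeter`, the autoloaded name `hello`, and the helper `run` are hypothetical, invented only for this illustration.

```perl
use strict;
use warnings;

package Greeter;
our $AUTOLOAD;

# Hypothetical AUTOLOAD: installs the missing sub on first use, then
# replaces its own call frame with a call to that sub via goto &NAME.
sub AUTOLOAD {
    my $full = $AUTOLOAD;                  # e.g. "Greeter::hello"
    (my $name = $full) =~ s/.*:://;        # strip the package qualifier
    return if $name eq 'DESTROY';          # never autoload destructors

    no strict 'refs';
    *{$full} = sub {
        # Frame 1 is whoever called us; AUTOLOAD is no longer on the stack.
        my $caller = (caller(1))[3];
        $caller = '(top level)' unless defined $caller;
        return $name . '(' . join(', ', @_) . ') called from ' . $caller;
    };
    goto &{$full};                         # @_ is passed along as-is
}

package main;

sub run { return Greeter::hello("world") }

my $message = run();
print "$message\n";
```

Had AUTOLOAD invoked the new subroutine with an ordinary call instead of the goto, the reported caller would be `Greeter::AUTOLOAD` rather than `main::run`.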