From: David Landgren
Date: Fri, 28 Sep 2007 20:42:56 +0000 (+0200)
Subject: POD cleanups
X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=commitdiff_plain;h=353c650532037e4006fbdb2176350717f320f7c3;p=p5sagit%2Fp5-mst-13.2.git

POD cleanups
Message-ID: <46FD4B30.9070802@landgren.net>
p4raw-id: //depot/perl@32026
---
diff --git a/pod/perl5005delta.pod b/pod/perl5005delta.pod
index 91d9a82..39646b6 100644
--- a/pod/perl5005delta.pod
+++ b/pod/perl5005delta.pod
@@ -572,7 +572,7 @@ in perl. =item Test
-A framework for writing testsuites.
+A framework for writing test suites.
 =item Tie::Array
diff --git a/pod/perl561delta.pod b/pod/perl561delta.pod
index ab6067c..4aa9474 100644
--- a/pod/perl561delta.pod
+++ b/pod/perl561delta.pod
@@ -1409,7 +1409,7 @@ See L. =item B The Perl Compiler suite has been extensively reworked for this
-release. More of the standard Perl testsuite passes when run
+release. More of the standard Perl test suite passes when run
 under the Compiler, but there is still a significant way to go to achieve production quality compiled executables.
diff --git a/pod/perl56delta.pod b/pod/perl56delta.pod
index 89d6237..0ee586f 100644
--- a/pod/perl56delta.pod
+++ b/pod/perl56delta.pod
@@ -811,7 +811,7 @@ See L. =item B The Perl Compiler suite has been extensively reworked for this
-release. More of the standard Perl testsuite passes when run
+release. More of the standard Perl test suite passes when run
 under the Compiler, but there is still a significant way to go to achieve production quality compiled executables.
diff --git a/pod/perl571delta.pod b/pod/perl571delta.pod
index aff02e5..56eb74f 100644
--- a/pod/perl571delta.pod
+++ b/pod/perl571delta.pod
@@ -901,7 +901,7 @@ is made, a warning is given. =item * C and C (with no values to push or unshift)
-now give a warning. This may be a problem for generated and evaled
+now give a warning. This may be a problem for generated and eval'ed
 code.
 =back
diff --git a/pod/perl581delta.pod b/pod/perl581delta.pod
index 4589392..ecefbf7 100644
--- a/pod/perl581delta.pod
+++ b/pod/perl581delta.pod
@@ -474,7 +474,7 @@ The Perl debugger (F) has now been extensively documented and bugs found while documenting have been fixed. C has been rewritten from scratch to be more robust and
-featureful.
+feature rich.
 C works now at least somewhat better, while C is rather more broken. (The Perl compiler suite as a whole continues
@@ -812,7 +812,7 @@ know about or hack Perl internals (using Devel::Peek or any of the C modules counts), or like to run Perl with the C<-D> option. The embedding examples of L have been reviewed to be
-uptodate and consistent: for example, the correct use of
+up to date and consistent: for example, the correct use of
 PERL_SYS_INIT3() and PERL_SYS_TERM(). Extensive reworking of the pad code (the code responsible
diff --git a/pod/perl58delta.pod b/pod/perl58delta.pod
index 04a0374..6bfdfb6 100644
--- a/pod/perl58delta.pod
+++ b/pod/perl58delta.pod
@@ -2929,7 +2929,7 @@ is made, a warning is given. =item * C and C (with no values to push or unshift)
-now give a warning. This may be a problem for generated and evaled
+now give a warning. This may be a problem for generated and eval'ed
 code. =item *
diff --git a/pod/perl590delta.pod b/pod/perl590delta.pod
index 389105e..9a2797e 100644
--- a/pod/perl590delta.pod
+++ b/pod/perl590delta.pod
@@ -454,7 +454,7 @@ The Perl debugger (F) has now been extensively documented and bugs found while documenting have been fixed. C has been rewritten from scratch to be more robust and
-featureful.
+feature rich.
 C works now at least somewhat better, while C is rather more broken. (The Perl compiler suite as a whole continues
@@ -787,7 +787,7 @@ know about or hack Perl internals (using Devel::Peek or any of the C modules counts), or like to run Perl with the C<-D> option.
 The embedding examples of L have been reviewed to be
-uptodate and consistent: for example, the correct use of
+up to date and consistent: for example, the correct use of
 PERL_SYS_INIT3() and PERL_SYS_TERM(). Extensive reworking of the pad code (the code responsible
diff --git a/pod/perl592delta.pod b/pod/perl592delta.pod
index 12e27eb..c1cc7db 100644
--- a/pod/perl592delta.pod
+++ b/pod/perl592delta.pod
@@ -139,7 +139,7 @@ by perl at startup. =item *
-C, by Autrijus Tang, is a module to emit warnings
+C, by Audrey Tang, is a module to emit warnings
 whenever an ASCII character string containing high-bit bytes is implicitly converted into UTF-8.
diff --git a/pod/perl595delta.pod b/pod/perl595delta.pod
index cbe033a..aaf8618 100644
--- a/pod/perl595delta.pod
+++ b/pod/perl595delta.pod
@@ -243,12 +243,12 @@ to, 5.9.5. A new pragma, C (for Method Resolution Order) has been added. It permits to switch, on a per-class basis, the algorithm that perl uses to
-find inherited methods in case of a mutiple inheritance hierachy. The
+find inherited methods in case of a multiple inheritance hierarchy. The
 default MRO hasn't changed (DFS, for Depth First Search). Another MRO is available: the C3 algorithm. See L for more information. (Brandon Black)
-Note that, due to changes in the implentation of class hierarchy search,
+Note that, due to changes in the implementation of class hierarchy search,
 code that used to undef the C<*ISA> glob will most probably break. Anyway, undef'ing C<*ISA> had the side-effect of removing the magic on the @ISA array and should not have been done in the first place.
diff --git a/pod/perlapi.pod b/pod/perlapi.pod
index 13f83df..e74fb85 100644
--- a/pod/perlapi.pod
+++ b/pod/perlapi.pod
@@ -5199,7 +5199,7 @@ Found in file sv.c =item newSV_type X
-Creates a new SV, of the type specificied. The reference count for the new SV
+Creates a new SV, of the type specified. The reference count for the new SV
 is set to 1.
 SV* newSV_type(svtype type)
diff --git a/pod/perldata.pod b/pod/perldata.pod
index c960a0e..29004f0 100644
--- a/pod/perldata.pod
+++ b/pod/perldata.pod
@@ -425,7 +425,7 @@ token was encountered. The filehandle is left open pointing to the contents after __DATA__. It is the program's responsibility to C when it is done reading from it. For compatibility with older scripts written before __DATA__ was introduced, __END__ behaves
-like __DATA__ in the toplevel script (but not in files loaded with
+like __DATA__ in the top level script (but not in files loaded with
 C or C) and leaves the remaining contents of the file accessible via C.
diff --git a/pod/perldebug.pod b/pod/perldebug.pod
index 390eb96..8c6e940 100644
--- a/pod/perldebug.pod
+++ b/pod/perldebug.pod
@@ -430,7 +430,7 @@ X<< debugger command, > >> Set an action (Perl command) to happen after the prompt when you've just given a command to return to executing the script. A multi-line command may be entered by backslashing the newlines (we bet you
-couldn't've guessed this by now).
+couldn't have guessed this by now).
 =item > * X<< debugger command, > >>
@@ -638,7 +638,7 @@ of warning (this is often annoying) or exception (this is often valuable). Unfortunately, the debugger cannot discern fatal exceptions from non-fatal ones. If C is even 1, then your non-fatal exceptions are also traced and unceremoniously altered if they
-came from C strings or from any kind of C within modules
+came from C strings or from any kind of C within modules
 you're attempting to load. If C is 2, the debugger doesn't care where they came from: It usurps your exception handler and prints out a trace, then modifies all exceptions with its own embellishments.
diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index 1d2650f..1c5128a 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -3574,7 +3574,7 @@ a reference count of other than 1. (F) You used C<\g0> or similar in a regular expression.
 You may refer to capturing parentheses only with strictly positive integers (normal
-backreferences) or with stricly negative integers (relative
+backreferences) or with strictly negative integers (relative
 backreferences), but using 0 does not make sense. =item Reference to nonexistent group in regex; marked by <-- HERE in m/%s/
@@ -4765,7 +4765,7 @@ to be huge numbers, and so usually indicates programmer error. If you really do mean it, explicitly numify your reference, like so: C<$array[0+$ref]>. This warning is not given for overloaded objects, either, because you can overload the numification and stringification
-operators and then you assumedly know what you are doing.
+operators and then you assumably know what you are doing.
 =item Use of reserved word "%s" is deprecated
diff --git a/pod/perlembed.pod b/pod/perlembed.pod
index 41028f8..f4b13a3 100644
--- a/pod/perlembed.pod
+++ b/pod/perlembed.pod
@@ -173,7 +173,7 @@ information you may find useful. In a sense, perl (the C program) is a good example of embedding Perl (the language), so I'll demonstrate embedding with I,
-included in the source distribution. Here's a bastardized, nonportable
+included in the source distribution. Here's a bastardized, non-portable
 version of I containing the essentials of embedding: #include /* from the Perl distribution */
@@ -352,7 +352,7 @@ I to create a string: a = Just Another Perl Hacker In the example above, we've created a global variable to temporarily
-store the computed value of our eval'd expression. It is also
+store the computed value of our eval'ed expression. It is also
 possible and in most cases a better strategy to fetch the return value from I instead.
 Example:
diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod
index 9184a8a..e6f6cc4 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -188,7 +188,7 @@ X C, C, C, C, C, C
-=item Keywords related to classes and object-orientedness
+=item Keywords related to classes and object-orientation
 X X X C, C, C, C, C, C, C,
diff --git a/pod/perlglossary.pod b/pod/perlglossary.pod
index 69692b3..d22e2ac 100644
--- a/pod/perlglossary.pod
+++ b/pod/perlglossary.pod
@@ -798,7 +798,7 @@ enclose" (like these parentheses are doing). Deprecated modules and features are those which were part of a stable release, but later found to be subtly flawed, and which should be avoided. They are subject to removal and/or bug-incompatible reimplementation in
-the next major release (but they will be preserved through maintainance
+the next major release (but they will be preserved through maintenance
 releases). Deprecation warnings are issued under B<-w> or C, and notices are found in Ls, as well as various other PODs. Coding practices that misuse features, such as C. "Just Another Perl Hacker," a clever but cryptic bit of Perl code that when executed, evaluates to that string. Often used to illustrate a
-particular Perl feature, and something of an ungoing Obfuscated Perl
+particular Perl feature, and something of an ongoing Obfuscated Perl
 Contest seen in Usenix signatures. =back
diff --git a/pod/perlhack.pod b/pod/perlhack.pod
index 1e5f02f..3acee30 100644
--- a/pod/perlhack.pod
+++ b/pod/perlhack.pod
@@ -156,7 +156,7 @@ altogether without further notice. =item Is the implementation generic enough to be portable? The worst patches make use of a system-specific features. It's highly
-unlikely that nonportable additions to the Perl language will be
+unlikely that non-portable additions to the Perl language will be
 accepted. =item Is the implementation tested?
@@ -2829,7 +2829,7 @@ not perfect, because the below is a compile-time check): #endif How does the HAS_QUUX become defined where it needs to be? Well, if
-Foonix happens to be UNIXy enought to be able to run the Configure
+Foonix happens to be UNIXy enough to be able to run the Configure
 script, and Configure has been taught about detecting and testing quux(), the HAS_QUUX will be correctly defined. In other platforms, the corresponding configuration step will hopefully do the same.
@@ -2858,7 +2858,7 @@ But in any case, try to keep the features and operating systems separate. =item *
-malloc(0), realloc(0), calloc(0, 0) are nonportable. To be portable
+malloc(0), realloc(0), calloc(0, 0) are non-portable. To be portable
 allocate at least one byte. (In general you should rarely need to work at this low level, but instead use the various malloc wrappers.)
diff --git a/pod/perliol.pod b/pod/perliol.pod
index 3798a97..136faa6 100644
--- a/pod/perliol.pod
+++ b/pod/perliol.pod
@@ -246,7 +246,7 @@ representing open (allocated) handles. For example the first three slots in the table correspond to C,C and C. The table in turn points to the current "top" layer for the handle - in this case an instance of the generic buffering layer "perlio". That layer in turn
-points to the next layer down - in this case the lowlevel "unix" layer.
+points to the next layer down - in this case the low-level "unix" layer.
 The above is roughly equivalent to a "stdio" buffered stream, but with much more flexibility:
diff --git a/pod/perlipc.pod b/pod/perlipc.pod
index f027d23..f0722f7 100644
--- a/pod/perlipc.pod
+++ b/pod/perlipc.pod
@@ -119,7 +119,7 @@ handlers: But that will be problematic for the more complicated handlers that need to reinstall themselves.
 Because Perl's signal mechanism is currently based on the signal(3) function from the C library, you may sometimes be so
-misfortunate as to run on systems where that function is "broken", that
+unfortunate as to run on systems where that function is "broken", that
 is, it behaves in the old unreliable SysV way rather than the newer, more reasonable BSD and POSIX fashion. So you'll see defensive people writing signal handlers like this:
diff --git a/pod/perlmodlib.pod b/pod/perlmodlib.pod
index 450d36c..cd8bf09 100644
--- a/pod/perlmodlib.pod
+++ b/pod/perlmodlib.pod
@@ -1667,7 +1667,7 @@ Base class for test modules =item Test::Builder::Tester
-Test testsuites that have been built with
+Test test suites that have been built with
 =item Test::Builder::Tester::Color
diff --git a/pod/perlop.pod b/pod/perlop.pod
index 9ef1aec..47184f3 100644
--- a/pod/perlop.pod
+++ b/pod/perlop.pod
@@ -200,7 +200,7 @@ concatenated with the identifier is returned. Otherwise, if the string starts with a plus or minus, a string starting with the opposite sign is returned. One effect of these rules is that -bareword is equivalent to the string "-bareword". If, however, the string begins with a
-non-alphabetic character (exluding "+" or "-"), Perl will attempt to convert
+non-alphabetic character (excluding "+" or "-"), Perl will attempt to convert
 the string to a numeric and the arithmetic negation is performed. If the string cannot be cleanly converted to a numeric, Perl will give the warning B.
diff --git a/pod/perlpod.pod b/pod/perlpod.pod
index 1251ea5..9fc7bed 100644
--- a/pod/perlpod.pod
+++ b/pod/perlpod.pod
@@ -214,7 +214,7 @@ formatter that can use that format will use the region, otherwise it will be completely ignored. A command "=begin I", some paragraphs, and a
-command "=end I", mean that the text/data inbetween
+command "=end I", mean that the text/data in between
 is meant for formatters that understand the special format called I.
 For example,
diff --git a/pod/perlpodspec.pod b/pod/perlpodspec.pod
index c33b68f..5e38e2c 100644
--- a/pod/perlpodspec.pod
+++ b/pod/perlpodspec.pod
@@ -65,7 +65,7 @@ directly formatting it). A B (or B) is a module or program that converts Pod to some other format (HTML, plaintext, TeX, PostScript, RTF). A B might be a formatter or translator, or might be a program that does something
-else with the Pod (like wordcounting it, scanning for index points,
+else with the Pod (like counting words, scanning for index points,
 etc.). Pod content is contained in B. A Pod block starts with a
@@ -724,7 +724,7 @@ period-space-space or period-newline sequences). Pod parsers should not, by default, try to coerce apostrophe (') and quote (") into smart quotes (little 9's, 66's, 99's, etc), nor try to turn backtick (`) into anything else but a single backtick character
-(distinct from an openquote character!), nor "--" into anything but
+(distinct from an open quote character!), nor "--" into anything but
 two minus signs. They I do any of those things to text in CE...> formatting codes, and never I to text in verbatim paragraphs.
@@ -959,7 +959,7 @@ for idiosyncratic mappings of Unicode-to-I. =item *
-It is up to individual Pod formatter to display good judgment when
+It is up to individual Pod formatter to display good judgement when
 confronted with an unrenderable character (which is distinct from an unknown EEthing> sequence that the parser couldn't resolve to anything, renderable or not). It is good practice to map Latin letters
@@ -1541,7 +1541,7 @@ probably want to format it like so: Ut Enim
-But (for the forseeable future), Pod does not provide any way for Pod
+But (for the foreseeable future), Pod does not provide any way for Pod
 authors to distinguish which grouping is meant by the above "=item"-cluster structure.
 So formatters should format it like so:
diff --git a/pod/perlre.pod b/pod/perlre.pod
index 96ed872..8bd517e 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -389,7 +389,7 @@ If the C pragma is not used but the C pragma is, the classes correlate with the usual isalpha(3) interface (except for "word" and "blank").
-The assumedly non-obviously named classes are:
+The other named classes are:
 =over 4
@@ -1386,7 +1386,7 @@ If we add a C<(*PRUNE)> before the count like the following print "Count=$count\n"; we prevent backtracking and find the count of the longest matching
-at each matching startpoint like so:
+at each matching starting point like so:
 aaab aab
@@ -1432,7 +1432,7 @@ outputs Count=2 Once the 'aaab' at the start of the string has matched, and the C<(*SKIP)>
-executed, the next startpoint will be where the cursor was when the
+executed, the next starting point will be where the cursor was when the
 C<(*SKIP)> was executed. =item C<(*MARK:NAME)> C<(*:NAME)>
diff --git a/pod/perlrebackslash.pod b/pod/perlrebackslash.pod
index bb32d06..e1fdebb 100644
--- a/pod/perlrebackslash.pod
+++ b/pod/perlrebackslash.pod
@@ -202,7 +202,7 @@ the following rules: =item 1
-If the backslash is followed by a single digit, it's a backrefence.
+If the backslash is followed by a single digit, it's a backreference.
 =item 2
diff --git a/pod/perlretut.pod b/pod/perlretut.pod
index 360ee73..67e0670 100644
--- a/pod/perlretut.pod
+++ b/pod/perlretut.pod
@@ -320,7 +320,7 @@ backslash C<\> to represent themselves. The same is true in a character class, but the sets of ordinary and special characters inside a character class are different than those outside a character class. The special characters for a character class are C<-]\^$> (and
-the pattern delimiter, whatever it is).
+the pattern delimiter, whatever it is).
 C<]> is special because it denotes the end of a character class. C<$> is special because it denotes a scalar variable.
 C<\> is special because it is used in escape sequences, just like above. Here is how the
@@ -332,7 +332,7 @@ special characters C<]$\> are handled: /[\$x]at/; # matches '$at' or 'xat' /[\\$x]at/; # matches '\at', 'bat, 'cat', or 'rat'
-The last two are a little tricky. in C<[\$x]>, the backslash protects
+The last two are a little tricky. In C<[\$x]>, the backslash protects
 the dollar sign, so the character class has two members C<$> and C. In C<[\\$x]>, the backslash is protected, so C<$x> is treated as a variable and substituted in double quote fashion.
@@ -681,8 +681,8 @@ possible character positions have been exhausted does Perl give up and declare S> to be false. Even with all this work, regexp matching happens remarkably fast. To
-speed things up, Perl compiles the regexp into a compact sequence of
-opcodes that can often fit inside a processor cache. When the code is
+speed things up, Perl compiles the regexp into a compact sequence of
+opcodes that can often fit inside a processor cache. When the code is
 executed, these opcodes can then run at full throttle and search very quickly.
@@ -765,7 +765,7 @@ so may lead to surprising and unsatisfactory results. =head2 Relative backreferences Counting the opening parentheses to get the correct number for a
-backreference is errorprone as soon as there is more than one
+backreference is errorprone as soon as there is more than one
 capturing group. A more convenient technique became available with Perl 5.10: relative backreferences. To refer to the immediately preceding capture group one now may write C<\g{-1}>, the next but
@@ -775,7 +775,7 @@ Another good reason in addition to readability and maintainability for using relative backreferences is illustrated by the following example, where a simple pattern for matching peculiar strings is used:
- $a99a = '([a-z])(\d)\2\1'; # matches a11a, g22g, x33x, etc.
+ $a99a = '([a-z])(\d)\2\1'; # matches a11a, g22g, x33x, etc.
 Now that we have this pattern stored as a handy string, we might feel tempted to use it as a part of some other pattern:
@@ -807,9 +807,9 @@ same name to more than one group, but then only the leftmost one of the eponymous set can be referenced. Outside of the pattern a named capture buffer is accessible through the C<%+> hash.
-Assuming that we have to match calendar dates which may be given in one
+Assuming that we have to match calendar dates which may be given in one
 of the three formats yyyy-mm-dd, mm/dd/yyyy or dd.mm.yyyy, we can write
-three suitable patterns where we use 'd', 'm' and 'y' respectively as the
+three suitable patterns where we use 'd', 'm' and 'y' respectively as the
 names of the buffers capturing the pertaining components of a date. The matching operation combines the three patterns as alternatives:
@@ -837,9 +837,9 @@ Consider a pattern for matching a time of the day, civil or military style: } Processing the results requires an additional if statement to determine
-whether C<$1> and C<$2> or C<$3> and C<$4> contain the goodies. It would
+whether C<$1> and C<$2> or C<$3> and C<$4> contain the goodies. It would
 be easier if we could use buffer numbers 1 and 2 in second alternative as
-well, and this is exactly what the parenthesized construct C<(?|...)>,
+well, and this is exactly what the parenthesized construct C<(?|...)>,
 set around an alternative achieves. Here is an extended version of the previous pattern:
@@ -849,8 +849,7 @@ previous pattern: Within the alternative numbering group, buffer numbers start at the same position for each alternative. After the group, numbering continues
-with one higher than the maximum reached across all the alteratives.
-
+with one higher than the maximum reached across all the alternatives.
 =head2 Position information
@@ -900,11 +899,11 @@ C<@+> instead: =head2 Non-capturing groupings
-A group that is required to bundle a set of alternatives may or may not be
+A group that is required to bundle a set of alternatives may or may not be
 useful as a capturing group. If it isn't, it just creates a superfluous addition to the set of available capture buffer values, inside as well as outside the regexp. Non-capturing groupings, denoted by C<(?:regexp)>,
-still allow the regexp to be treated as a single unit, but don't establish
+still allow the regexp to be treated as a single unit, but don't establish
 a capturing buffer at the same time. Both capturing and non-capturing groupings are allowed to co-exist in the same regexp. Because there is no extraction, non-capturing groupings are faster than capturing
@@ -1288,28 +1287,30 @@ the simple pattern Whenever this is applied to a string which doesn't quite meet the pattern's expectations such as S> or S>,
-the regex engine will backtrack, approximately once for each character
-in the string. But we know that there is no way around taking I
-of the inital word characters to match the first repetition, that I
+the regex engine will backtrack, approximately once for each character
+in the string. But we know that there is no way around taking I
+of the initial word characters to match the first repetition, that I
 spaces must be eaten by the middle part, and the same goes for the second
-word. With the introduction of the I in
-Perl 5.10 we have a way of instructing the regexp engine not to backtrack,
-with the usual quantifiers with a C<+> appended to them. This makes them
-greedy as well as stingy; once they succeed they won't give anything back
-to permit another solution. They have the following meanings:
+word.
+
+With the introduction of the I in Perl 5.10, we
+have a way of instructing the regex engine not to backtrack, with the
+usual quantifiers with a C<+> appended to them.
This makes them greedy as
+well as stingy; once they succeed they won't give anything back to permit
+another solution. They have the following meanings:
 =over 4 =item *
-C means: match at least C times, not more than C times,
-as many times as possible, and don't give anything up. C is short
+C means: match at least C times, not more than C times,
+as many times as possible, and don't give anything up. C is short
 for C =item * C means: match at least C times, but as many times as possible,
-and don't give anything up. C is short for C and C is
+and don't give anything up. C is short for C and C is
 short for C. =item *
@@ -1319,15 +1320,15 @@ notational consistency. =back
-These possessive quantifiers represent a special case of a more general
-concept, the I, see below.
+These possessive quantifiers represent a special case of a more general
+concept, the I, see below.
 As an example where a possessive quantifier is suitable we consider matching a quoted string, as it appears in several programming languages. The backslash is used as an escape character that indicates that the next character is to be taken literally, as another character for the string. Therefore, after the opening quote, we expect a (possibly
-empty) sequence of alternatives: either some character except an
+empty) sequence of alternatives: either some character except an
 unescaped quote or backslash or an escaped character. /"(?:[^"\\]++|\\.)*+"/;
@@ -1492,12 +1493,12 @@ C and arbitrary delimiter C forms. We have used the binding operator C<=~> and its negation C to test for string matches. Associated with the matching operator, we have discussed the single line C, multi-line C, case-insensitive C and
-extended C modifiers. There are a few more things you might
-want to know about matching operators.
+extended C modifiers. There are a few more things you might
+want to know about matching operators.
 =head3 Optimizing pattern evaluation
-We pointed out earlier that variables in regexps are substituted
+We pointed out earlier that variables in regexps are substituted
 before the regexp is evaluated: $pattern = 'Seuss';
@@ -1531,7 +1532,7 @@ special delimiter C: print if m'@pattern'; # matches literal '@pattern', not 'Seuss' }
-Similar to strings, C acts like apostrophes on a regexp; all other
+Similar to strings, C acts like apostrophes on a regexp; all other
 C delimiters act like quotes. If the regexp evaluates to the empty string, the regexp in the I is used instead. So we have
@@ -1747,10 +1748,10 @@ matches. =head3 The split function The C function is another place where a regexp is used.
-C separates the C operand into
-a list of substrings and returns that list. The regexp must be designed
+C separates the C operand into
+a list of substrings and returns that list. The regexp must be designed
 to match whatever constitutes the separators for the desired substrings.
-The C, if present, constrains splitting into no more than C
+The C, if present, constrains splitting into no more than C
 number of strings. For example, to split a string into words, use $x = "Calvin and Hobbes";
@@ -1806,7 +1807,7 @@ haven't covered yet. There are several escape sequences that convert characters or strings between upper and lower case, and they are also available within
-patterns. C<\l> and C<\u> convert the next character to lower or
+patterns. C<\l> and C<\u> convert the next character to lower or
 upper case, respectively: $x = "perl";
@@ -1940,7 +1941,7 @@ For the full list see L. The Unicode has also been separated into various sets of characters which you can test with C<\p{...}> (in) and C<\P{...}> (not in). To test whether a character is (or is not) an element of a script
-you would use the script name, for example C<\p{Latin}>, C<\p{Greek}>,
+you would use the script name, for example C<\p{Latin}>, C<\p{Greek}>,
 or C<\P{Katakana}>.
 Other sets are the Unicode blocks, the names of which begin with "In". One such block is dedicated to mathematical operators, and its pattern formula is }>.
@@ -2048,10 +2049,10 @@ flexibility without sacrificing speed. Backtracking is more efficient than repeated tries with different regular expressions. If there are several regular expressions and a match with
-any of them is acceptable, then it is possible to combine them into a set
+any of them is acceptable, then it is possible to combine them into a set
 of alternatives. If the individual expressions are input data, this
-can be done by programming a join operation. We'll exploit this idea in
-an improved version of the C program: a program that matches
+can be done by programming a join operation. We'll exploit this idea in
+an improved version of the C program: a program that matches
 multiple patterns: % cat > multi_grep
@@ -2075,9 +2076,9 @@ multiple patterns: Sometimes it is advantageous to construct a pattern from the I that is to be analyzed and use the permissible values on the left hand side of the matching operations. As an example for this somewhat
-paradoxical situation, let's assume that our input contains a command
+paradoxical situation, let's assume that our input contains a command
 verb which should match one out of a set of available command verbs,
-with the additional twist that commands may be abbreviated as long as
+with the additional twist that commands may be abbreviated as long as
 the given string is unique. The program below demonstrates the basic algorithm.
@@ -2106,12 +2107,11 @@ algorithm. Rather than trying to match the input against the keywords, we match the combined set of keywords against the input. The pattern matching
-operation S> does several things at the
-same time. It makes sure that the given command begins where a keyword
-begins (C<\b>). It tolerates abbreviations due to the added C<\w*>.
It
-tells us the number of matches (C) and all the keywords
+operation S> does several things at the
+same time. It makes sure that the given command begins where a keyword
+begins (C<\b>). It tolerates abbreviations due to the added C<\w*>. It
+tells us the number of matches (C) and all the keywords
 that were actually matched. You could hardly ask for more.
-
 =head2 Embedding comments and modifiers in a regular expression
@@ -2133,7 +2133,7 @@ example is This style of commenting has been largely superseded by the raw, freeform commenting that is allowed with the C modifier.
-The modifiers C, C, C, C and C (or any
+The modifiers C, C, C, C and C (or any
 combination thereof) can also embedded in a regexp using C<(?i)>, C<(?m)>, C<(?s)>, and C<(?x)>. For instance,
@@ -2190,7 +2190,7 @@ characters (advance the character position) if they match. The examples we have seen so far are the anchors. The anchor C<^> matches the beginning of the line, but doesn't eat any characters. Similarly, the word boundary anchor C<\b> matches wherever a character matching C<\w>
-is next to a character that doesn't, but it doesn't eat up any
+is next to a character that doesn't, but it doesn't eat up any
 characters itself. Anchors are examples of I. Zero-width, because they consume no characters, and assertions, because they test some property of the
@@ -2340,7 +2340,7 @@ integer in parentheses C<(integer)>. It is true if the corresponding backreference C<\integer> matched earlier in the regexp. The same thing can be done with a name associated with a capture buffer, written as C<< () >> or C<< ('name') >>. The second form is a bare
-zero width assertion C<(?...)>, either a lookahead, a lookbehind, or a
+zero width assertion C<(?...)>, either a lookahead, a lookbehind, or a
 code assertion (discussed in the next section).
The third set of forms provides tests that return true if the expression is executed within a recursion (C<(R)>) or is being called from some capturing group, @@ -2391,7 +2391,7 @@ group at the end of the pattern contains their definition. Notice that the decimal fraction pattern is the first place where we can reuse the integer pattern. - /^ (?&osg)\ * ( (?&int)(?&dec)? | (?&dec) ) + /^ (?&osg)\ * ( (?&int)(?&dec)? | (?&dec) ) (?: [eE](?&osg)(?&int) )? $ (?(DEFINE) @@ -2406,7 +2406,7 @@ reuse the integer pattern. This feature (introduced in Perl 5.10) significantly extends the power of Perl's pattern matching. By referring to some other capture group anywhere in the pattern with the construct -C<(?group-ref)>, the I within the referenced group is used +C<(?group-ref)>, the I within the referenced group is used as an independent subpattern in place of the group reference itself. Because the group reference may be contained I the group it refers to, it is now possible to apply pattern matching to tasks that @@ -2444,7 +2444,7 @@ arbitrary Perl code to be a part of a regexp. A code evaluation expression is denoted C<(?{code})>, with I a string of Perl statements. -Be warned that this feature is considered experimental, and may be +Be warned that this feature is considered experimental, and may be changed without notice. Code expressions are zero-width assertions, and the value they return @@ -2658,7 +2658,7 @@ The regexp without the C modifier is /^1(?:((??{ $z0 }))1(?{ $z0 = $z1; $z1 .= $^N; }))+$/ which shows that spaces are still possible in the code parts. Nevertheless, -when working with code and conditional expressions, the extended form of +when working with code and conditional expressions, the extended form of regexps is almost necessary in creating and debugging regexps. @@ -2676,9 +2676,9 @@ Below is just one example, illustrating the control verb C<(*FAIL)>, which may be abbreviated as C<(*F)>. 
If this is inserted in a regexp it will cause to fail, just like at some mismatch between the pattern and the string. Processing of the regexp continues like after any "normal" -failure, so that, for instance, the next position in the string or another -alternative will be tried. As failing to match doesn't preserve capture -buffers or produce results, it may be necessary to use this in +failure, so that, for instance, the next position in the string or another +alternative will be tried. As failing to match doesn't preserve capture +buffers or produce results, it may be necessary to use this in combination with embedded code. %count = (); @@ -2686,11 +2686,11 @@ combination with embedded code. /([aeiou])(?{ $count{$1}++; })(*FAIL)/oi; printf "%3d '%s'\n", $count{$_}, $_ for (sort keys %count); -The pattern begins with a class matching a subset of letters. Whenever -this matches, a statement like C<$count{'a'}++;> is executed, incrementing -the letter's counter. Then C<(*FAIL)> does what it says, and -the regexp engine proceeds according to the book: as long as the end of -the string hasn't been reached, the position is advanced before looking +The pattern begins with a class matching a subset of letters. Whenever +this matches, a statement like C<$count{'a'}++;> is executed, incrementing +the letter's counter. Then C<(*FAIL)> does what it says, and +the regexp engine proceeds according to the book: as long as the end of +the string hasn't been reached, the position is advanced before looking for another vowel. Thus, match or no match makes no difference, and the regexp engine proceeds until the the entire string has been inspected. 
(It's remarkable that an alternative solution using something like diff --git a/pod/perlrun.pod b/pod/perlrun.pod index 7880135..c8bc669 100644 --- a/pod/perlrun.pod +++ b/pod/perlrun.pod @@ -977,7 +977,7 @@ C<__END__> if there is trailing garbage to be ignored (the program can process any or all of the trailing garbage via the DATA filehandle if desired). -The directory, if specified, must appear immedately following the B<-x> +The directory, if specified, must appear immediately following the B<-x> with no intervening whitespace. =back diff --git a/pod/perlxs.pod b/pod/perlxs.pod index 966cdc8..facfcf9 100644 --- a/pod/perlxs.pod +++ b/pod/perlxs.pod @@ -554,7 +554,7 @@ exception happens if this C<;> terminates the line, then this C<;> is quietly ignored. The following code demonstrates how to supply initialization code for -function parameters. The initialization code is eval'd within double +function parameters. The initialization code is eval'ed within double quotes by the compiler before it is added to the output so anything which should be interpreted literally [mainly C<$>, C<@>, or C<\\>] must be protected with backslashes. The variables $var, $arg, @@ -1313,7 +1313,7 @@ this: In this case, the function will overload both of the three way comparison operators. For all overload operations using non-alpha -characters, you must type the parameter without quoting, seperating +characters, you must type the parameter without quoting, separating multiple overloads with whitespace. Note that "" (the stringify overload) should be entered as \"\" (i.e. escaped). diff --git a/pod/perlxstut.pod b/pod/perlxstut.pod index 4f8bbc1..2446cc4 100644 --- a/pod/perlxstut.pod +++ b/pod/perlxstut.pod @@ -333,7 +333,7 @@ the .pm or .xs files, you should increment the value of this variable. =head2 Writing good test scripts -The importance of writing good test scripts cannot be overemphasized. You +The importance of writing good test scripts cannot be over-emphasized. 
You should closely follow the "ok/not ok" style that Perl itself uses, so that it is very easy and unambiguous to determine the outcome of each test case. When you find and fix a bug, make sure you add a test case for it.