X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=pod%2Fperlfaq4.pod;h=f671b626bf762f3ab911c9884554a57e72723dac;hb=ba336be1bea3ce6a079a831a7a58533d4e92ecc9;hp=e9c4ab316a618990aafa0d7da03b16f8640b5669;hpb=ac9dac7f0e1dffa602850506b980a255334a4f40;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/pod/perlfaq4.pod b/pod/perlfaq4.pod
index e9c4ab3..f671b62 100644
--- a/pod/perlfaq4.pod
+++ b/pod/perlfaq4.pod
@@ -1,6 +1,6 @@
 =head1 NAME
 
-perlfaq4 - Data Manipulation ($Revision: 6816 $)
+perlfaq4 - Data Manipulation
 
 =head1 DESCRIPTION
 
@@ -17,11 +17,11 @@ exactly. Some real numbers lose precision in the process. This is a
 problem with how computers store numbers and affects all computer
 languages, not just Perl.
 
-L show the gory details of number representations and
+L shows the gory details of number representations and
 conversions.
 
 To limit the number of decimal places in your numbers, you can use the
-printf or sprintf function. See the L<"Floating Point
+C<printf> or C<sprintf> function. See the L<"Floating Point
 Arithmetic"|perlop> for more details.
 
     printf "%.2f", 10/3;
@@ -48,35 +48,45 @@ numbers. What you think in the above as 'three' is really more like
 
 =head2 Why isn't my octal data interpreted correctly?
 
-Perl only understands octal and hex numbers as such when they occur as
-literals in your program. Octal literals in perl must start with a
-leading C<0> and hexadecimal literals must start with a leading C<0x>.
-If they are read in from somewhere and assigned, no automatic
-conversion takes place. You must explicitly use C or C if you
-want the values converted to decimal. C interprets hexadecimal (C<0x350>),
-octal (C<0350> or even without the leading C<0>, like C<377>) and binary
-(C<0b1010>) numbers, while C only converts hexadecimal ones, with
-or without a leading C<0x>, such as C<0x255>, C<3A>, C, or C.
-The inverse mapping from decimal to octal can be done with either the
-<%o> or C<%O> C formats.
+(contributed by brian d foy)
+
+You're probably trying to convert a string to a number, which Perl only
+converts as a decimal number. When Perl converts a string to a number, it
+ignores leading spaces and zeroes, then assumes the rest of the digits
+are in base 10:
+
+    my $string = '0644';
+
+    print $string + 0; # prints 644
+
+    print $string + 44; # prints 688, certainly not octal!
+
+This problem usually involves one of the Perl built-ins that has the
+same name as a Unix command that uses octal numbers as arguments on the
+command line. In this example, C<chmod> on the command line knows that
+its first argument is octal because that's what it does:
 
-This problem shows up most often when people try using C,
-C, C, or C, which by widespread tradition
-typically take permissions in octal.
+    %prompt> chmod 644 file
 
-    chmod(644, $file); # WRONG
-    chmod(0644, $file); # right
+If you want to use the same literal digits (644) in Perl, you have to tell
+Perl to treat them as octal numbers either by prefixing the digits with
+a C<0> or using C<oct>:
 
-Note the mistake in the first line was specifying the decimal literal
-C<644>, rather than the intended octal literal C<0644>. The problem can
-be seen with:
+    chmod( 0644, $file); # right, has leading zero
+    chmod( oct(644), $file ); # also correct
 
-    printf("%#o",644); # prints 01204
+The problem comes in when you take your numbers from something that Perl
+thinks is a string, such as a command line argument in C<@ARGV>:
 
-Surely you had not intended C - did you? If you
-want to use numeric literals as arguments to chmod() et al. then please
-try to express them as octal constants, that is with a leading zero and
-with the following digits restricted to the set C<0..7>.
+    chmod( $ARGV[0], $file); # wrong, even if "0644"
+
+    chmod( oct($ARGV[0]), $file ); # correct, treat string as octal
+
+You can always check the value you're using by printing it in octal
+notation to ensure it matches what you think it should be.
Print it
+in octal and decimal format:
+
+    printf "0%o %d", $number, $number;
 
 =head2 Does Perl have a round() function? What about ceil() and floor()? Trig functions?
 
@@ -362,19 +372,18 @@ pseudorandom generator than comes with your operating system, look at
 
 =head2 How do I get a random number between X and Y?
 
-To get a random number between two values, you can use the
-C<rand> builtin to get a random number between 0 and
+To get a random number between two values, you can use the C<rand>
+built-in to get a random number between 0 and 1. From there, you shift
+that into the range that you want.
 
-C<rand($x)> returns a number such that
-C<< 0 <= rand($x) < $x >>. Thus what you want to have perl
-figure out is a random number in the range from 0 to the
-difference between your I<X> and I<Y>.
+C<rand($x)> returns a number such that C<< 0 <= rand($x) < $x >>. Thus
+what you want to have perl figure out is a random number in the range
+from 0 to the difference between your I<X> and I<Y>.
 
-That is, to get a number between 10 and 15, inclusive, you
-want a random number between 0 and 5 that you can then add
-to 10.
+That is, to get a number between 10 and 15, inclusive, you want a
+random number between 0 and 5 that you can then add to 10.
 
-    my $number = 10 + int rand( 15-10+1 );
+    my $number = 10 + int rand( 15-10+1 ); # ( 10,11,12,13,14, or 15 )
 
 Hence you derive the following simple function to abstract
 that. It selects a random integer between the two given
@@ -479,6 +488,9 @@ Julian day)
     31
 
 =head2 How do I find yesterday's date?
+X X X X X
+X X X X
+X
 
 (contributed by brian d foy)
 
@@ -491,49 +503,60 @@ give you the same time of day, only the day before.
 
     print "Yesterday was $yesterday\n";
 
-You can also use the C<Date::Calc> module using its Today_and_Now
+You can also use the C<Date::Calc> module using its C<Today_and_Now>
 function.
 
     use Date::Calc qw( Today_and_Now Add_Delta_DHMS );
 
     my @date_time = Add_Delta_DHMS( Today_and_Now(), -1, 0, 0, 0 );
 
-    print "@date\n";
+    print "@date_time\n";
 
 Most people try to use the time rather than the calendar to figure out
 dates, but that assumes that days are twenty-four hours each. For
 most people, there are two days a year when they aren't: the switch to
 and from summer time throws this off. Let the modules do the work.
 
-=head2 Does Perl have a Year 2000 problem? Is Perl Y2K compliant?
+If you absolutely must do it yourself (or can't use one of the
+modules), here's a solution using C<Time::Local>, which comes with
+Perl:
 
-Short answer: No, Perl does not have a Year 2000 problem. Yes, Perl is
-Y2K compliant (whatever that means). The programmers you've hired to
-use it, however, probably are not.
+    # contributed by Gunnar Hjalmarsson
+    use Time::Local;
+    my $today = timelocal 0, 0, 12, ( localtime )[3..5];
+    my ($d, $m, $y) = ( localtime $today-86400 )[3..5];
+    printf "Yesterday: %d-%02d-%02d\n", $y+1900, $m+1, $d;
 
-Long answer: The question belies a true understanding of the issue.
-Perl is just as Y2K compliant as your pencil--no more, and no less.
-Can you use your pencil to write a non-Y2K-compliant memo? Of course
-you can. Is that the pencil's fault? Of course it isn't.
+In this case, you measure the day starting at noon, and subtract 24
+hours. Even if the length of the calendar day is 23 or 25 hours,
+you'll still end up on the previous calendar day, although not at
+noon. Since you don't care about the time, the one hour difference
+doesn't matter and you end up with the previous date.
+
+=head2 Does Perl have a Year 2000 or 2038 problem? Is Perl Y2K compliant?
+
+(contributed by brian d foy)
 
-The date and time functions supplied with Perl (gmtime and localtime)
-supply adequate information to determine the year well beyond 2000
-(2038 is when trouble strikes for 32-bit machines).
The year returned
-by these functions when used in a list context is the year minus 1900.
-For years between 1910 and 1999 this I to be a 2-digit decimal
-number. To avoid the year 2000 problem simply do not treat the year as
-a 2-digit number. It isn't.
+Perl itself never had a Y2K problem, although that never stopped people
+from creating Y2K problems on their own. See the documentation for
+C for its proper use.
 
-When gmtime() and localtime() are used in scalar context they return
-a timestamp string that contains a fully-expanded year. For example,
-C<$timestamp = gmtime(1005613200)> sets $timestamp to "Tue Nov 13 01:00:00
-2001". There's no year 2000 problem here.
+Starting with Perl 5.11, C<localtime> and C<gmtime> can handle dates past
+03:14:08 January 19, 2038, when a 32-bit based time would overflow. You
+still might get a warning on a 32-bit C<perl>:
 
-That doesn't mean that Perl can't be used to create non-Y2K compliant
-programs. It can. But so can your pencil. It's the fault of the user,
-not the language. At the risk of inflaming the NRA: "Perl doesn't
-break Y2K, people do." See http://www.perl.org/about/y2k.html for
-a longer exposition.
+    % perl5.11.2 -E 'say scalar localtime( 0x9FFF_FFFFFFFF )'
+    Integer overflow in hexadecimal number at -e line 1.
+    Wed Nov  1 19:42:39 5576711
+
+On a 64-bit C<perl>, you can get even larger dates for those really long
+running projects:
+
+    % perl5.11.2 -E 'say scalar gmtime( 0x9FFF_FFFFFFFF )'
+    Thu Nov  2 00:42:39 5576711
+
+You're still out of luck if you need to keep track of decaying protons
+though.
 
 =head1 Data: Strings
 
@@ -600,7 +623,9 @@ anonymous array. In this case, we call the function in list context.
 If we want to call the function in scalar context, we have to do a bit
 more work. We can really have any code we like inside the braces, so
 we simply have to end with the scalar reference, although how you do
-that is up to you, and you can use code inside the braces.
+that is up to you, and you can use code inside the braces.
Note that
+the use of parens creates a list context, so we need C<scalar> to
+force the scalar context on the function:
 
     print "The time is ${\(scalar localtime)}.\n"
 
@@ -780,14 +805,33 @@ result to a scalar, producing a count of the number of matches.
     $count = () = $string =~ /-\d+/g;
 
 =head2 How do I capitalize all the words on one line?
+X X X X
 
-To make the first letter of each word upper case:
+(contributed by brian d foy)
 
-    $line =~ s/\b(\w)/\U$1/g;
+Damian Conway's L<Text::Autoformat> handles all of the thinking
+for you.
 
-This has the strange effect of turning "C" into "C". Sometimes you might want this. Other times you might need a
-more thorough solution (Suggested by brian d foy):
+    use Text::Autoformat;
+    my $x = "Dr. Strangelove or: How I Learned to Stop ".
+      "Worrying and Love the Bomb";
+
+    print $x, "\n";
+    for my $style (qw( sentence title highlight )) {
+        print autoformat($x, { case => $style }), "\n";
+    }
+
+How do you want to capitalize those words?
+
+    FRED AND BARNEY'S LODGE # all uppercase
+    Fred And Barney's Lodge # title case
+    Fred and Barney's Lodge # highlight case
+
+It's not as easy a problem as it looks. How many words do you think
+are in there? Wait for it... wait for it.... If you answered 5
+you're right. Perl words are groups of C<\w+>, but that's not what
+you want to capitalize. How is Perl supposed to know not to capitalize
+that C<s> after the apostrophe? You could try a regular expression:
 
     $string =~ s/ (
         (^\w) #at the beginning of the line
@@ -798,34 +842,8 @@ more thorough solution (Suggested by brian d foy):
 
     $string =~ s/([\w']+)/\u\L$1/g;
 
-To make the whole line upper case:
-
-    $line = uc($line);
-
-To force each word to be lower case, with the first letter upper case:
-
-    $line =~ s/(\w+)/\u\L$1/g;
-
-You can (and probably should) enable locale awareness of those
-characters by placing a C pragma in your program.
-See L for endless details on locales.
-
-This is sometimes referred to as putting something into "title
-case", but that's not quite accurate. Consider the proper
-capitalization of the movie I, for example.
-
-Damian Conway's L<Text::Autoformat> module provides some smart
-case transformations:
-
-    use Text::Autoformat;
-    my $x = "Dr. Strangelove or: How I Learned to Stop ".
-      "Worrying and Love the Bomb";
-
-    print $x, "\n";
-    for my $style (qw( sentence title highlight )) {
-        print autoformat($x, { case => $style }), "\n";
-    }
+Now, what if you don't want to capitalize that "and"? Just use
+L<Text::Autoformat> and get on with the next problem. :)
 
 =head2 How can I split a [character] delimited string except when inside [character]?
 
@@ -958,25 +976,39 @@ Left and right padding with any character, modifying C<$text> directly:
 
 =head2 How do I extract selected columns from a string?
 
-Use C or C, both documented in L.
-If you prefer thinking in terms of columns instead of widths,
-you can use this kind of thing:
-
-    # determine the unpack format needed to split Linux ps output
-    # arguments are cut columns
-    my $fmt = cut2fmt(8, 14, 20, 26, 30, 34, 41, 47, 59, 63, 67, 72);
-
-    sub cut2fmt {
-        my(@positions) = @_;
-        my $template = '';
-        my $lastpos = 1;
-        for my $place (@positions) {
-            $template .= "A" . ($place - $lastpos) . " ";
-            $lastpos = $place;
-        }
-        $template .= "A*";
-        return $template;
-    }
+(contributed by brian d foy)
+
+If you know the columns that contain the data, you can
+use C<substr> to extract a single column.
+
+    my $column = substr( $line, $start_column, $length );
+
+You can use C<split> if the columns are separated by whitespace or
+some other delimiter, as long as whitespace or the delimiter cannot
+appear as part of the data.
+
+    my $line = ' fred barney betty ';
+    my @columns = split /\s+/, $line;
+    # ( '', 'fred', 'barney', 'betty' );
+
+    my $line = 'fred||barney||betty';
+    my @columns = split /\|/, $line;
+    # ( 'fred', '', 'barney', '', 'betty' );
+
+If you want to work with comma-separated values, don't do this since
+that format is a bit more complicated. Use one of the modules that
+handle that format, such as C, C, or
+C.
+
+If you want to break apart an entire line of fixed columns, you can use
+C<unpack> with the A (ASCII) format. By using a number after the format
+specifier, you can denote the column width. See the C and C
+entries in L for more details.
+
+    my @fields = unpack( "A8 A8 A8 A16 A4", $line );
+
+Note that spaces in the format argument to C<unpack> do not denote literal
+spaces. If you have space separated data, you may want C<split> instead.
 
 =head2 How do I find the soundex value of a string?
 
@@ -988,37 +1020,64 @@ C, and C modules.
 
 =head2 How can I expand variables in text strings?
 
-Let's assume that you have a string that contains placeholder
-variables.
+(contributed by brian d foy)
+
+If you can avoid it, don't, or if you can use a templating system,
+such as C or C