=head1 NAME
-perlfaq4 - Data Manipulation ($Revision: 1.49 $, $Date: 1999/05/23 20:37:49 $)
+perlfaq4 - Data Manipulation ($Revision: 1.19 $, $Date: 2002/03/11 22:15:19 $)
=head1 DESCRIPTION
-The section of the FAQ answers questions related to the manipulation
-of data as numbers, dates, strings, arrays, hashes, and miscellaneous
-data issues.
+This section of the FAQ answers questions related to manipulating
+numbers, dates, strings, arrays, hashes, and miscellaneous data issues.
=head1 Data: Numbers
machines) will work pretty much like mathematical integers. Other numbers
are not guaranteed.
-=head2 How do I convert bits into ints?
+=head2 How do I convert between numeric representations?
-To turn a string of 1s and 0s like C<10110110> into a scalar containing
-its binary value, use the pack() and unpack() functions (documented in
-L<perlfunc/"pack"> and L<perlfunc/"unpack">):
+As always with Perl, there is more than one way to do it. Below
+are a few examples of approaches to making common conversions
+between number representations. This is intended to be representative
+rather than exhaustive.
- $decimal = unpack('c', pack('B8', '10110110'));
+Some of the examples below use the Bit::Vector module from CPAN.
+You might choose Bit::Vector over Perl's built-in functions because
+it works with numbers of ANY size, it is optimized for speed on some
+operations, and its notation may be familiar to at least some
+programmers.
-This packs the string C<10110110> into an eight bit binary structure.
-This is then unpacked as a character, which returns its ordinal value.
+=item B<How do I convert from hexadecimal to decimal:>
-This does the same thing:
+Using Perl's built-in conversion of 0x notation:
+
+ $int = 0xDEADBEEF;
+ $dec = sprintf("%d", $int);
+
+Using the hex function:
+
+ $int = hex("DEADBEEF");
+ $dec = sprintf("%d", $int);
+
+Using pack:
+
+ $int = unpack("N", pack("H8", substr("0" x 8 . "DEADBEEF", -8)));
+ $dec = sprintf("%d", $int);
+
+Using the CPAN module Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Hex(32, "DEADBEEF");
+ $dec = $vec->to_Dec();
+
+=item B<How do I convert from decimal to hexadecimal:>
+
+Using sprintf:
+
+ $hex = sprintf("%X", 3735928559);
+
+Using unpack:
+
+ $hex = unpack("H*", pack("N", 3735928559));
+
+Using Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Dec(32, -559038737);
+ $hex = $vec->to_Hex();
+
+Bit::Vector also supports odd bit counts:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Dec(33, 3735928559);
+ $vec->Resize(32); # suppress leading 0 if unwanted
+ $hex = $vec->to_Hex();
+
+=item B<How do I convert from octal to decimal:>
+
+Using Perl's built-in conversion of numbers with leading zeros:
+
+ $int = 033653337357; # note the leading 0!
+ $dec = sprintf("%d", $int);
+
+Using the oct function:
+
+ $int = oct("33653337357");
+ $dec = sprintf("%d", $int);
+
+Using Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new(32);
+ $vec->Chunk_List_Store(3, split(//, reverse "33653337357"));
+ $dec = $vec->to_Dec();
+
+=item B<How do I convert from decimal to octal:>
+
+Using sprintf:
+
+ $oct = sprintf("%o", 3735928559);
+
+Using Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Dec(32, -559038737);
+ $oct = reverse join('', $vec->Chunk_List_Read(3));
+
+=item B<How do I convert from binary to decimal:>
+
+Using pack and ord:
$decimal = ord(pack('B8', '10110110'));
-Here's an example of going the other way:
+Using pack and unpack for larger strings:
+
+ $int = unpack("N", pack("B32",
+ substr("0" x 32 . "11110101011011011111011101111", -32)));
+ $dec = sprintf("%d", $int);
+
+ # substr() is used to left pad a 32 character string with zeros.
+
+Using Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Bin(32, "11011110101011011011111011101111");
+ $dec = $vec->to_Dec();
+
+=item B<How do I convert from decimal to binary:>
+
+Using unpack:
+
+ $bin = unpack("B*", pack("N", 3735928559));
+
+Using Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Dec(32, -559038737);
+ $bin = $vec->to_Bin();
+
+The remaining transformations (e.g. hex -> oct, bin -> hex, etc.)
+are left as an exercise to the inclined reader.
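+
+As a starting point, here is a minimal sketch of three of them (our
+own illustration, not an exhaustive answer). It uses only the built-in
+functions shown above, goes through a native integer as the
+intermediate form, and therefore assumes the value fits in one:
+
+    $oct = sprintf("%o", hex("DEADBEEF"));    # hex -> oct
+    $hex = sprintf("%X", oct("0b10110110"));  # bin -> hex, gives "B6"
+    $bin = sprintf("%b", oct("0266"));        # oct -> bin, gives "10110110"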
- $binary_string = unpack('B*', "\x29");
=head2 Why doesn't & work the way I want it to?
=head2 How can I output Roman numerals?
-Get the http://www.perl.com/CPAN/modules/by-module/Roman module.
+Get the http://www.cpan.org/modules/by-module/Roman module.
=head2 Why aren't my random numbers random?
than more.
Computers are good at being predictable and bad at being random
-(despite appearances caused by bugs in your programs :-).
-http://www.perl.com/CPAN/doc/FMTEYEWTK/random , courtesy of Tom
-Phoenix, talks more about this. John von Neumann said, ``Anyone who
-attempts to generate random numbers by deterministic means is, of
+(despite appearances caused by bugs in your programs :-). The
+F<random> article in the "Far More Than You Ever Wanted To Know"
+collection in http://www.cpan.org/olddoc/FMTEYEWTK.tgz , courtesy of
+Tom Phoenix, talks more about this. John von Neumann said, ``Anyone
+who attempts to generate random numbers by deterministic means is, of
course, living in a state of sin.''
If you want numbers that are more random than C<rand> with C<srand>
pseudorandom generator than comes with your operating system, look at
``Numerical Recipes in C'' at http://www.nr.com/ .
+=head2 How do I get a random number between X and Y?
+
+Use the following simple function. It selects a random integer between
+(and possibly including!) the two given integers, e.g.,
+C<random_int_in(50,120)>.
+
+ sub random_int_in ($$) {
+ my($min, $max) = @_;
+ # Assumes that the two arguments are integers themselves!
+ return $min if $min == $max;
+ ($min, $max) = ($max, $min) if $min > $max;
+ return $min + int rand(1 + $max - $min);
+ }
+
=head1 Data: Dates
=head2 How do I find the week-of-the-year/day-of-the-year?
@( = ('(','');
@) = (')','');
($re=$_)=~s/((BEGIN)|(END)|.)/$)[!$3]\Q$1\E$([!$2]/gs;
- @$ = (eval{/$re/},$@!~/unmatched/);
+ @$ = (eval{/$re/},$@!~/unmatched/i);
print join("\n",@$[0..$#$]) if( $$[-1] );
=head2 How do I reverse a string?
The paragraphs you give to Text::Wrap should not contain embedded
newlines. Text::Wrap doesn't justify the lines (flush-right).
+Or use the CPAN module Text::Autoformat. Formatting files can be easily
+done by making a shell alias, like so:
+
+ alias fmt="perl -i -MText::Autoformat -n0777 \
+ -e 'print autoformat $_, {all=>1}' $*"
+
+See the documentation for Text::Autoformat to appreciate its many
+capabilities.
+
=head2 How can I access/change the first N letters of a string?
There are many ways. If you just want to grab a copy, use
while ($string =~ /-\d+/g) { $count++ }
print "There are $count negative numbers in the string";
+Another version uses a global match in list context, then assigns the
+result to a scalar, producing a count of the number of matches.
+
+ $count = () = $string =~ /-\d+/g;
+
=head2 How do I capitalize all the words on one line?
To make the first letter of each word upper case:
This has the strange effect of turning "C<don't do it>" into "C<Don'T
Do It>". Sometimes you might want this. Other times you might need a
-more thorough solution (Suggested by brian d. foy):
+more thorough solution (suggested by brian d foy):
$string =~ s/ (
(^\w) #at the beginning of the line
would deliver us. You are a liar, Saruman, and a corrupter
of men's hearts. --Theoden in /usr/src/perl/taint.c
FINIS
- $quote =~ s/\s*--/\n--/;
+ $quote =~ s/\s+--/\n--/;
A nice general-purpose fixer-upper function for indented here documents
follows. It expects to be called with a here document as its argument.
That being said, there are several ways to approach this. If you
are going to make this query many times over arbitrary string values,
-the fastest way is probably to invert the original array and keep an
-associative array lying about whose keys are the first array's values.
+the fastest way is probably to invert the original array and maintain a
+hash whose keys are the first array's values.
@blues = qw/azure cerulean teal turquoise lapis-lazuli/;
- undef %is_blue;
+ %is_blue = ();
for (@blues) { $is_blue{$_} = 1 }
Now you can check whether $is_blue{$some_color}. It might have been a
array. This kind of an array will take up less space:
@primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31);
- undef @is_tiny_prime;
+ @is_tiny_prime = ();
for (@primes) { $is_tiny_prime[$_] = 1 }
# or simply @istiny_prime[@primes] = (1) x @primes;
If not, you can use this:
- # fisher_yates_shuffle( \@array ) :
- # generate a random permutation of @array in place
+ # fisher_yates_shuffle
+ # generate a random permutation of an array in place
+ # As in shuffling a deck of cards
+ #
sub fisher_yates_shuffle {
- my $array = shift;
- my $i;
- for ($i = @$array; --$i; ) {
+ my $deck = shift; # $deck is a reference to an array
+ return unless @$deck; # nothing to do for an empty array
+ my $i = @$deck;
+ while (--$i) {
my $j = int rand ($i+1);
- @$array[$i,$j] = @$array[$j,$i];
+ @$deck[$i,$j] = @$deck[$j,$i];
}
}
- fisher_yates_shuffle( \@array ); # permutes @array in place
+And here is an example of using it:
+
+ #
+ # shuffle my mpeg collection
+ #
+ my @mpeg = <audio/*/*.mp3>;
+ fisher_yates_shuffle( \@mpeg ); # randomize @mpeg in place
+ print @mpeg;
Note that the above implementation shuffles an array in place,
unlike the List::Util::shuffle() which takes a list and returns
This can be conveniently combined with precalculation of keys as given
above.
-See http://www.perl.com/CPAN/doc/FMTEYEWTK/sort.html for more about
-this approach.
+See the F<sort> article in the "Far More Than You Ever Wanted
+To Know" collection in http://www.cpan.org/olddoc/FMTEYEWTK.tgz for
+more about this approach.
See also the question below on sorting hashes.
$vec = '';
foreach(@ints) { vec($vec,$_,1) = 1 }
-And here's how, given a vector in $vec, you can
+Here's how, given a vector in $vec, you can
get those bits into your @ints array:
sub bitvec_to_list {
This method gets faster the more sparse the bit vector is.
(Courtesy of Tim Bunce and Winfried Koenig.)
-Here's a demo on how to use vec():
+Or use the CPAN module Bit::Vector:
+
+ $vector = Bit::Vector->new($num_of_bits);
+ $vector->Index_List_Store(@ints);
+ @ints = $vector->Index_List_Read();
+
+Bit::Vector provides efficient methods for bit vectors, sets of small
+integers, and "big int" math.
+
+Here's a more extensive illustration using vec():
# vec demo
$vector = "\xff\x0f\xef\xfe";
=head2 How can I know how many entries are in a hash?
If you mean how many keys, then all you have to do is
-take the scalar sense of the keys() function:
+use the keys() function in a scalar context:
- $num_keys = scalar keys %hash;
+ $num_keys = keys %hash;
-The keys() function also resets the iterator, which in void context is
-faster for tied hashes than would be iterating through the whole
-hash, one key-value pair at a time.
+The keys() function also resets the iterator, which means that you may
+see strange results if you use this between uses of other hash operators
+such as each().
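+
+For instance, in the following sketch (the hash contents and the pass
+counter are made up for illustration), calling keys() inside an each()
+loop resets the iterator on every pass, so each() keeps returning the
+first pair and the loop only ends because of the guard:
+
+    %hash = (a => 1, b => 2, c => 3);
+    $passes = 0;
+    while (($k, $v) = each %hash) {
+        $n = keys %hash;            # resets the each() iterator
+        last if ++$passes > 10;     # guard: otherwise this never ends
+    }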
=head2 How do I sort a hash (optionally by value instead of key)?
=head2 How can I use a reference as a hash key?
-You can't do this directly, but you could use the standard Tie::Refhash
+You can't do this directly, but you could use the standard Tie::RefHash
module distributed with Perl.
=head1 Data: Misc
if (/^-?\d+$/) { print "is an integer\n" }
if (/^[+-]?\d+$/) { print "is a +/- integer\n" }
if (/^-?\d+\.?\d*$/) { print "is a real number\n" }
- if (/^-?(?:\d+(?:\.\d*)?|\.\d+)$/) { print "is a decimal number" }
+ if (/^-?(?:\d+(?:\.\d*)?|\.\d+)$/) { print "is a decimal number\n" }
if (/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/)
- { print "a C float" }
+ { print "a C float\n" }
If you're on a POSIX system, Perl's supports the C<POSIX::strtod>
function. Its semantics are somewhat cumbersome, so here's a C<getnum>
=head2 How do I keep persistent data across program calls?
For some specific applications, you can use one of the DBM modules.
-See L<AnyDBM_File>. More generically, you should consult the FreezeThaw,
-Storable, or Class::Eroot modules from CPAN. Starting from Perl 5.8
-Storable is part of the standard distribution. Here's one example using
-Storable's C<store> and C<retrieve> functions:
+See L<AnyDBM_File>. More generically, you should consult the FreezeThaw
+or Storable modules from CPAN. Starting from Perl 5.8 Storable is part
+of the standard distribution. Here's one example using Storable's C<store>
+and C<retrieve> functions:
use Storable;
store(\%hash, "filename");
=head1 AUTHOR AND COPYRIGHT
-Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington.
+Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington.
All rights reserved.
This documentation is free; you can redistribute it and/or modify it