=head1 NAME
-perlfaq4 - Data Manipulation ($Revision: 1.49 $, $Date: 1999/05/23 20:37:49 $)
+perlfaq4 - Data Manipulation ($Revision: 1.20 $, $Date: 2002/04/07 18:46:13 $)
=head1 DESCRIPTION
-The section of the FAQ answers questions related to the manipulation
-of data as numbers, dates, strings, arrays, hashes, and miscellaneous
-data issues.
+This section of the FAQ answers questions related to manipulating
+numbers, dates, strings, arrays, hashes, and miscellaneous data issues.
=head1 Data: Numbers
machines) will work pretty much like mathematical integers. Other numbers
are not guaranteed.
-=head2 How do I convert bits into ints?
+=head2 How do I convert between numeric representations?
-To turn a string of 1s and 0s like C<10110110> into a scalar containing
-its binary value, use the pack() and unpack() functions (documented in
-L<perlfunc/"pack"> and L<perlfunc/"unpack">):
+As always with Perl there is more than one way to do it. Below
+are a few examples of approaches to making common conversions
+between number representations. This is intended to be representative
+rather than exhaustive.
- $decimal = unpack('c', pack('B8', '10110110'));
+Some of the examples below use the Bit::Vector module from CPAN.
+You might choose Bit::Vector over Perl's built-in functions because
+it works with numbers of ANY size, it is optimized for speed on some
+operations, and, for at least some programmers, the notation might
+be familiar.
-This packs the string C<10110110> into an eight bit binary structure.
-This is then unpacked as a character, which returns its ordinal value.
+=item B<How do I convert from hexadecimal to decimal:>
-This does the same thing:
+Using Perl's built-in conversion of 0x notation:
+
+ $int = 0xDEADBEEF;
+ $dec = sprintf("%d", $int);
+
+Using the hex function:
+
+ $int = hex("DEADBEEF");
+ $dec = sprintf("%d", $int);
+
+Using pack:
+
+ $int = unpack("N", pack("H8", substr("0" x 8 . "DEADBEEF", -8)));
+ $dec = sprintf("%d", $int);
+
+Using the CPAN module Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Hex(32, "DEADBEEF");
+ $dec = $vec->to_Dec();
+
+=item B<How do I convert from decimal to hexadecimal:>
+
+Using sprintf:
+
+ $hex = sprintf("%X", 3735928559);
+
+Using unpack:
+
+ $hex = unpack("H*", pack("N", 3735928559));
+
+Using Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Dec(32, -559038737);
+ $hex = $vec->to_Hex();
+
+And Bit::Vector supports odd bit counts:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Dec(33, 3735928559);
+ $vec->Resize(32); # suppress leading 0 if unwanted
+ $hex = $vec->to_Hex();
+
+=item B<How do I convert from octal to decimal:>
+
+Using Perl's built-in conversion of numbers with leading zeros:
+
+ $int = 033653337357; # note the leading 0!
+ $dec = sprintf("%d", $int);
+
+Using the oct function:
+
+ $int = oct("33653337357");
+ $dec = sprintf("%d", $int);
+
+Using Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new(32);
+ $vec->Chunk_List_Store(3, split(//, reverse "33653337357"));
+ $dec = $vec->to_Dec();
+
+=item B<How do I convert from decimal to octal:>
+
+Using sprintf:
+
+ $oct = sprintf("%o", 3735928559);
+
+Using Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Dec(32, -559038737);
+ $oct = reverse join('', $vec->Chunk_List_Read(3));
+
+=item B<How do I convert from binary to decimal:>
+
+Using pack and ord:
$decimal = ord(pack('B8', '10110110'));
-Here's an example of going the other way:
+Using pack and unpack for larger strings:
+
+ $int = unpack("N", pack("B32",
+ substr("0" x 32 . "11110101011011011111011101111", -32)));
+ $dec = sprintf("%d", $int);
+
+ # substr() is used to left pad a 32 character string with zeros.
+
+Using Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Bin(32, "11011110101011011011111011101111");
+ $dec = $vec->to_Dec();
+
+=item B<How do I convert from decimal to binary:>
+
+Using unpack:
+
+ $bin = unpack("B*", pack("N", 3735928559));
+
+Using Bit::Vector:
+
+ use Bit::Vector;
+ $vec = Bit::Vector->new_Dec(32, -559038737);
+ $bin = $vec->to_Bin();
+
+The remaining transformations (e.g. hex -> oct, bin -> hex, etc.)
+are left as an exercise to the inclined reader.
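As a hint for those remaining transformations, here is a minimal sketch that goes through a native integer using only built-ins (C<hex>, C<oct> with a C<0b> prefix, and C<sprintf>). The variable names are illustrative, and the approach assumes the value fits in a native integer on your platform; for larger values something like Bit::Vector would be needed.

```perl
use strict;
use warnings;

# Parse the source representation into an integer, then format that
# integer in the target base. Assumes the value fits a native integer.
my $hex_from_bin = sprintf("%x", oct("0b" . "11011110101011011011111011101111")); # bin -> hex
my $oct_from_hex = sprintf("%o", hex("DEADBEEF"));                                # hex -> oct
my $bin_from_oct = sprintf("%b", oct("33653337357"));                             # oct -> bin

print "$hex_from_bin\n";   # deadbeef
print "$oct_from_hex\n";   # 33653337357
print "$bin_from_oct\n";   # 11011110101011011011111011101111
```

Note that C<sprintf("%x", ...)> produces lowercase digits; use C<%X> if uppercase is wanted.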
- $binary_string = unpack('B*', "\x29");
=head2 Why doesn't & work the way I want it to?
=head2 How can I output Roman numerals?
-Get the http://www.perl.com/CPAN/modules/by-module/Roman module.
+Get the http://www.cpan.org/modules/by-module/Roman module.
=head2 Why aren't my random numbers random?
than more.
Computers are good at being predictable and bad at being random
-(despite appearances caused by bugs in your programs :-).
-http://www.perl.com/CPAN/doc/FMTEYEWTK/random , courtesy of Tom
-Phoenix, talks more about this. John von Neumann said, ``Anyone who
-attempts to generate random numbers by deterministic means is, of
+(despite appearances caused by bugs in your programs :-). The
+F<random> article in the "Far More Than You Ever Wanted To Know"
+collection in http://www.cpan.org/olddoc/FMTEYEWTK.tgz , courtesy of
+Tom Phoenix, talks more about this. John von Neumann said, ``Anyone
+who attempts to generate random numbers by deterministic means is, of
course, living in a state of sin.''
If you want numbers that are more random than C<rand> with C<srand>
pseudorandom generator than comes with your operating system, look at
``Numerical Recipes in C'' at http://www.nr.com/ .
+=head2 How do I get a random number between X and Y?
+
+Use the following simple function. It selects a random integer between
+(and possibly including!) the two given integers, e.g.,
+C<random_int_in(50,120)>.
+
+ sub random_int_in ($$) {
+ my($min, $max) = @_;
+ # Assumes that the two arguments are integers themselves!
+ return $min if $min == $max;
+ ($min, $max) = ($max, $min) if $min > $max;
+ return $min + int rand(1 + $max - $min);
+ }
+
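A quick way to convince yourself that both endpoints can be returned is to tally many draws. The following is only a sketch: it repeats the function (without the prototype) so that it runs on its own, and the die-rolling scenario and the C<%seen> hash are illustrative.

```perl
use strict;
use warnings;

# Copy of the function above, so that this example runs on its own.
sub random_int_in {
    my ($min, $max) = @_;
    ($min, $max) = ($max, $min) if $min > $max;
    return $min + int rand(1 + $max - $min);
}

# Roll a six-sided die 1000 times and tally the faces that come up.
my %seen;
$seen{ random_int_in(1, 6) }++ for 1 .. 1000;

# Every result should lie in 1..6, endpoints included.
my @out_of_range = grep { $_ < 1 || $_ > 6 } keys %seen;

print "face $_: $seen{$_}\n" for sort { $a <=> $b } keys %seen;
print "out of range: ", scalar(@out_of_range), "\n";
```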
=head1 Data: Dates
=head2 How do I find the week-of-the-year/day-of-the-year?
@( = ('(','');
@) = (')','');
($re=$_)=~s/((BEGIN)|(END)|.)/$)[!$3]\Q$1\E$([!$2]/gs;
- @$ = (eval{/$re/},$@!~/unmatched/);
+ @$ = (eval{/$re/},$@!~/unmatched/i);
print join("\n",@$[0..$#$]) if( $$[-1] );
=head2 How do I reverse a string?
The paragraphs you give to Text::Wrap should not contain embedded
newlines. Text::Wrap doesn't justify the lines (flush-right).
+Or use the CPAN module Text::Autoformat. Formatting files can be easily
+done by making a shell alias, like so:
+
+ alias fmt="perl -i -MText::Autoformat -n0777 \
+ -e 'print autoformat $_, {all=>1}' $*"
+
+See the documentation for Text::Autoformat to appreciate its many
+capabilities.
+
=head2 How can I access/change the first N letters of a string?
There are many ways. If you just want to grab a copy, use
while ($string =~ /-\d+/g) { $count++ }
print "There are $count negative numbers in the string";
+Another version uses a global match in list context, then assigns the
+result to a scalar, producing a count of the number of matches.
+
+ $count = () = $string =~ /-\d+/g;
+
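The idiom may be easier to follow next to a longhand version. In this sketch the sample string and the C<@matches> array are illustrative only.

```perl
use strict;
use warnings;

my $string = "-9 55 48 -2 23 -76 4 14 -44";

# Longhand: collect the matches in an array, then count its elements.
my @matches = $string =~ /-\d+/g;
my $count_via_array = @matches;

# Shorthand: the empty list forces list context on the match, and a
# list assignment evaluated in scalar context yields the number of
# elements on its right-hand side.
my $count = () = $string =~ /-\d+/g;

print "$count_via_array $count\n";   # 4 4
```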
=head2 How do I capitalize all the words on one line?
To make the first letter of each word upper case:
This has the strange effect of turning "C<don't do it>" into "C<Don'T
Do It>". Sometimes you might want this. Other times you might need a
-more thorough solution (Suggested by brian d. foy):
+more thorough solution (Suggested by brian d foy):
$string =~ s/ (
(^\w) #at the beginning of the line
would deliver us. You are a liar, Saruman, and a corrupter
of men's hearts. --Theoden in /usr/src/perl/taint.c
FINIS
- $quote =~ s/\s*--/\n--/;
+ $quote =~ s/\s+--/\n--/;
A nice general-purpose fixer-upper function for indented here documents
follows. It expects to be called with a here document as its argument.
That being said, there are several ways to approach this. If you
are going to make this query many times over arbitrary string values,
-the fastest way is probably to invert the original array and keep an
-associative array lying about whose keys are the first array's values.
+the fastest way is probably to invert the original array and maintain a
+hash whose keys are the first array's values.
@blues = qw/azure cerulean teal turquoise lapis-lazuli/;
- undef %is_blue;
+ %is_blue = ();
for (@blues) { $is_blue{$_} = 1 }
Now you can check whether $is_blue{$some_color}. It might have been a
array. This kind of an array will take up less space:
@primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31);
- undef @is_tiny_prime;
+ @is_tiny_prime = ();
for (@primes) { $is_tiny_prime[$_] = 1 }
# or simply @istiny_prime[@primes] = (1) x @primes;
=head2 How do I shuffle an array randomly?
-Use this:
+If you have either Perl 5.8.0 or later, or Scalar-List-Utils 1.03 or
+later, installed, you can say:
+
+ use List::Util 'shuffle';
+
+ @shuffled = shuffle(@list);
+
+If not, you can use a Fisher-Yates shuffle.
- # fisher_yates_shuffle( \@array ) :
- # generate a random permutation of @array in place
sub fisher_yates_shuffle {
- my $array = shift;
- my $i;
- for ($i = @$array; --$i; ) {
+ my $deck = shift; # $deck is a reference to an array
+ my $i = @$deck;
+ while ($i--) {
my $j = int rand ($i+1);
- @$array[$i,$j] = @$array[$j,$i];
+ @$deck[$i,$j] = @$deck[$j,$i];
}
}
- fisher_yates_shuffle( \@array ); # permutes @array in place
+ # shuffle my mpeg collection
+ #
+ my @mpeg = <audio/*/*.mp3>;
+ fisher_yates_shuffle( \@mpeg ); # randomize @mpeg in place
+ print @mpeg;
+
+Note that the above implementation shuffles an array in place,
+unlike List::Util::shuffle(), which takes a list and returns
+a new shuffled list.
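The difference between the two calling styles can be sketched as follows. The routine is repeated so that the example runs on its own, List::Util is assumed to be installed, and the variable names are illustrative.

```perl
use strict;
use warnings;
use List::Util 'shuffle';

# Copy of the in-place routine above, so this example runs on its own.
sub fisher_yates_shuffle {
    my $deck = shift;               # reference to the array to permute
    my $i = @$deck;
    while ($i--) {
        my $j = int rand($i + 1);
        @$deck[$i, $j] = @$deck[$j, $i];
    }
}

my @list = (1 .. 10);

my @copy = shuffle(@list);          # returns a new list; @list keeps its order
my $list_untouched = "@list" eq "1 2 3 4 5 6 7 8 9 10";

fisher_yates_shuffle(\@list);       # reorders @list itself

# Either way, no element is lost or duplicated.
my $copy_sorted = join ",", sort { $a <=> $b } @copy;
my $list_sorted = join ",", sort { $a <=> $b } @list;

print "untouched by shuffle(): ", ($list_untouched ? "yes" : "no"), "\n";
print "same elements: ", ($copy_sorted eq $list_sorted ? "yes" : "no"), "\n";
```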
You've probably seen shuffling algorithms that work using splice,
randomly picking another element to swap the current element with
This can be conveniently combined with precalculation of keys as given
above.
-See http://www.perl.com/CPAN/doc/FMTEYEWTK/sort.html for more about
-this approach.
+See the F<sort> article in the "Far More Than You Ever Wanted
+To Know" collection in http://www.cpan.org/olddoc/FMTEYEWTK.tgz for
+more about this approach.
See also the question below on sorting hashes.
$vec = '';
foreach(@ints) { vec($vec,$_,1) = 1 }
-And here's how, given a vector in $vec, you can
+Here's how, given a vector in $vec, you can
get those bits into your @ints array:
sub bitvec_to_list {
This method gets faster the more sparse the bit vector is.
(Courtesy of Tim Bunce and Winfried Koenig.)
-Here's a demo on how to use vec():
+Or use the CPAN module Bit::Vector:
+
+ $vector = Bit::Vector->new($num_of_bits);
+ $vector->Index_List_Store(@ints);
+ @ints = $vector->Index_List_Read();
+
+Bit::Vector provides efficient methods for bit vectors, sets of small
+integers, and "big int" math.
+
+Here's a more extensive illustration using vec():
# vec demo
$vector = "\xff\x0f\xef\xfe";
=head2 How can I know how many entries are in a hash?
If you mean how many keys, then all you have to do is
-take the scalar sense of the keys() function:
+use the keys() function in a scalar context:
- $num_keys = scalar keys %hash;
+ $num_keys = keys %hash;
-The keys() function also resets the iterator, which in void context is
-faster for tied hashes than would be iterating through the whole
-hash, one key-value pair at a time.
+The keys() function also resets the iterator, which means that you may
+see strange results if you use this between uses of other hash operators
+such as each().
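The iterator reset can be observed directly. This is a small sketch with an illustrative three-element hash; it relies only on the fact that an unmodified hash is traversed in the same order each time.

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 2, c => 3);

my ($first) = each %hash;     # advance the iterator by one pair
my $num_keys = keys %hash;    # counts the keys AND resets the iterator
my ($again) = each %hash;     # starts over from the first pair

print "count: $num_keys\n";
print "same key returned twice: ", ($first eq $again ? "yes" : "no"), "\n";
```

Inside a C<while (my ($k, $v) = each %hash)> loop, the same reset would make the loop start over, so it is safer to take the count before the loop begins.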
=head2 How do I sort a hash (optionally by value instead of key)?
=head2 How can I use a reference as a hash key?
-You can't do this directly, but you could use the standard Tie::Refhash
+You can't do this directly, but you could use the standard Tie::RefHash
module distributed with Perl.
=head1 Data: Misc
if (/^-?\d+$/) { print "is an integer\n" }
if (/^[+-]?\d+$/) { print "is a +/- integer\n" }
if (/^-?\d+\.?\d*$/) { print "is a real number\n" }
- if (/^-?(?:\d+(?:\.\d*)?|\.\d+)$/) { print "is a decimal number" }
+ if (/^-?(?:\d+(?:\.\d*)?|\.\d+)$/) { print "is a decimal number\n" }
if (/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/)
- { print "a C float" }
+ { print "a C float\n" }
If you're on a POSIX system, Perl's supports the C<POSIX::strtod>
function. Its semantics are somewhat cumbersome, so here's a C<getnum>
=head2 How do I keep persistent data across program calls?
For some specific applications, you can use one of the DBM modules.
-See L<AnyDBM_File>. More generically, you should consult the FreezeThaw,
-Storable, or Class::Eroot modules from CPAN. Starting from Perl 5.8
-Storable is part of the standard distribution. Here's one example using
-Storable's C<store> and C<retrieve> functions:
+See L<AnyDBM_File>. More generically, you should consult the FreezeThaw
+or Storable modules from CPAN. Starting from Perl 5.8 Storable is part
+of the standard distribution. Here's one example using Storable's C<store>
+and C<retrieve> functions:
use Storable;
store(\%hash, "filename");
=head1 AUTHOR AND COPYRIGHT
-Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington.
+Copyright (c) 1997-2002 Tom Christiansen and Nathan Torkington.
All rights reserved.
This documentation is free; you can redistribute it and/or modify it