an entire pasture:
# Cow::speak, Horse::speak, Sheep::speak as before
- @pasture = qw(Cow Cow Horse Sheep Sheep);
- foreach $animal (@pasture) {
+ my @pasture = qw(Cow Cow Horse Sheep Sheep);
+ foreach my $animal (@pasture) {
&{$animal."::speak"};
}
a Sheep goes baaaah!
Wow. That symbolic coderef de-referencing there is pretty nasty.
-We're counting on C<no strict subs> mode, certainly not recommended
-for larger programs. And why was that necessary? Because the name of
-the package seems to be inseparable from the name of the subroutine we
-want to invoke within that package.
+We're counting on L<strict|C<no strict refs>> mode, certainly not
+recommended for larger programs. And why was that necessary? Because
+the name of the package seems to be inseparable from the name of the
+subroutine we want to invoke within that package.
Or is it?
That's not fun yet. Same number of characters, all constant, no
variables. But yet, the parts are separable now. Watch:
- $a = "Cow";
+ my $a = "Cow";
$a->speak; # invokes Cow->speak
Ahh! Now that the package name has been parted from the subroutine
name, we can use a variable package name. And this time, we've got
-something that works even when C<use strict refs> is enabled.
+something that works even when L<strict|C<use strict refs>> is
+enabled.
=head2 Invoking a barnyard
print "a Sheep goes baaaah!\n"
}
- @pasture = qw(Cow Cow Horse Sheep Sheep);
- foreach $animal (@pasture) {
+ my @pasture = qw(Cow Cow Horse Sheep Sheep);
+ foreach my $animal (@pasture) {
$animal->speak;
}
Let's call out from C<speak> to a helper method called C<sound>.
This method provides the constant text for the sound itself.
- { package Cow;
+ {
+ package Cow;
+
sub sound { "moooo" }
+
sub speak {
- my $class = shift;
- print "a $class goes ", $class->sound, "!\n"
+ my $class = shift;
+ print "a $class goes ", $class->sound, "!\n"
}
}
C<speak>. This in turn selects the C<< Cow->sound >> method, which
returns C<moooo>. But how different would this be for the C<Horse>?
- { package Horse;
+ {
+ package Horse;
+
sub sound { "neigh" }
+
sub speak {
my $class = shift;
print "a $class goes ", $class->sound, "!\n"
We'll define a common subroutine package called C<Animal>, with the
definition for C<speak>:
- { package Animal;
+ {
+ package Animal;
+
sub speak {
my $class = shift;
print "a $class goes ", $class->sound, "!\n"
Then, for each animal, we say it "inherits" from C<Animal>, along
with the animal-specific sound:
- { package Cow;
+ {
+ package Cow;
+
+ # Not safe under `use strict', see below
@ISA = qw(Animal);
+
sub sound { "moooo" }
}
Or allow it as an implicitly named package variable:
package Cow;
- use vars qw(@ISA);
- @ISA = qw(Animal);
+ our @ISA = qw(Animal);
If you're bringing in the class from outside, via an object-oriented
module, you change:
package Cow;
use Animal;
- use vars qw(@ISA);
- @ISA = qw(Animal);
+ our @ISA = qw(Animal);
into just:
package Cow;
use base qw(Animal);
-And that's pretty darn compact.
+And that's pretty darn compact. Read about the L<base|base> pragma.
=head2 Overriding the methods
Let's add a mouse, which can barely be heard:
- # Animal package from before
- { package Mouse;
- @ISA = qw(Animal);
+ # Animal package that we wrote before, goes here
+ {
+ package Mouse;
+
+ our @ISA = qw(Animal);
+
sub sound { "squeak" }
+
sub speak {
my $class = shift;
print "a $class goes ", $class->sound, "!\n";
First, we can invoke the C<Animal::speak> method directly:
- # Animal package from before
- { package Mouse;
- @ISA = qw(Animal);
+ # Animal package that we wrote before, goes here
+ {
+ package Mouse;
+
+ our @ISA = qw(Animal);
+
sub sound { "squeak" }
+
sub speak {
my $class = shift;
Animal::speak($class);
in the inheritance chain:
# same Animal as before
- { package Mouse;
+ {
+ package Mouse;
+
# same @ISA, &sound as before
+
sub speak {
my $class = shift;
$class->Animal::speak;
listed in C<@ISA>) automatically:
# same Animal as before
- { package Mouse;
+ {
+ package Mouse;
+
# same @ISA, &sound as before
+
sub speak {
my $class = shift;
$class->SUPER::speak;
or the equivalent:
- $a = "Class";
+ my $a = "Class";
$a->method(@args);
which constructs an argument list of:
Let's start with the code for the C<Animal> class
and the C<Horse> class:
- { package Animal;
+ {
+ package Animal;
+
sub speak {
my $class = shift;
print "a $class goes ", $class->sound, "!\n"
}
}
- { package Horse;
- @ISA = qw(Animal);
+
+ {
+ package Horse;
+
+ our @ISA = qw(Animal);
+
sub sound { "neigh" }
}
can be an instance, so let's start with the simplest reference
that can hold a horse's name: a scalar reference.
- my $name = "Mr. Ed";
+ my $name = "Mr. Ed";
my $talking = \$name;
So now C<$talking> is a reference to what will be the instance-specific
the instance-specific data. In this case, let's add a way to get at
the name:
- { package Horse;
- @ISA = qw(Animal);
+ {
+ package Horse;
+
+ our @ISA = qw(Animal);
+
sub sound { "neigh" }
+
sub name {
my $self = shift;
$$self;
but not if you just like to own horses. So, let's let the Horse class
build a new horse:
- { package Horse;
- @ISA = qw(Animal);
+ {
+ package Horse;
+
+ our @ISA = qw(Animal);
+
sub sound { "neigh" }
+
sub name {
my $self = shift;
$$self;
}
+
sub named {
my $class = shift;
- my $name = shift;
+ my $name = shift;
bless \$name, $class;
}
}
it's also the same recipe for building anything else that inherited from
C<Animal>, so let's put it there:
- { package Animal;
+ {
+ package Animal;
+
sub speak {
my $class = shift;
print "a $class goes ", $class->sound, "!\n"
}
+
sub name {
my $self = shift;
$$self;
}
+
sub named {
my $class = shift;
- my $name = shift;
+ my $name = shift;
bless \$name, $class;
}
}
- { package Horse;
- @ISA = qw(Animal);
+
+ {
+ package Horse;
+
+ our @ISA = qw(Animal);
+
sub sound { "neigh" }
}
sub name {
my $either = shift;
ref $either
- ? $$either # it's an instance, return name
+ ? $$either # it's an instance, return name
: "an unnamed $either"; # it's a class, return generic
}
holder to C<$either> to show that this is intended:
my $talking = Horse->named("Mr. Ed");
- print Horse->name, "\n"; # prints "an unnamed Horse\n"
+
+ print Horse->name, "\n"; # prints "an unnamed Horse\n"
print $talking->name, "\n"; # prints "Mr Ed.\n"
and now we'll fix C<speak> to use this:
Let's train our animals to eat:
- { package Animal;
+ {
+ package Animal;
sub named {
my $class = shift;
- my $name = shift;
+ my $name = shift;
bless \$name, $class;
}
+
sub name {
my $either = shift;
ref $either
- ? $$either # it's an instance, return name
+ ? $$either # it's an instance, return name
: "an unnamed $either"; # it's a class, return generic
}
+
sub speak {
my $either = shift;
print $either->name, " goes ", $either->sound, "\n";
}
+
sub eat {
my $either = shift;
- my $food = shift;
+ my $food = shift;
print $either->name, " eats $food.\n";
}
}
- { package Horse;
- @ISA = qw(Animal);
+
+ {
+ package Horse;
+
+ our @ISA = qw(Animal);
+
sub sound { "neigh" }
}
- { package Sheep;
- @ISA = qw(Animal);
+
+ {
+ package Sheep;
+
+ our @ISA = qw(Animal);
+
sub sound { "baaaah" }
}
my $talking = Horse->named("Mr. Ed");
$talking->eat("hay");
+
Sheep->eat("grass");
which prints:
Let's make a sheep that has a name and a color:
- my $bad = bless { Name => "Evil", Color => "black" }, Sheep;
+ my $data = { Name => "Evil", Color => "black" };
+ my $bad = bless $data, Sheep;
so C<< $bad->{Name} >> has C<Evil>, and C<< $bad->{Color} >> has
C<black>. But we want to make C<< $bad->name >> access the name, and
## in Animal
sub named {
my $class = shift;
- my $name = shift;
- my $self = { Name => $name, Color => $class->default_color };
+ my $name = shift;
+ my $self = { Name => $name, Color => $class->default_color };
+
bless $self, $class;
}
sub color {
$_[0]->{Color}
}
+
sub set_color {
$_[0]->{Color} = $_[1];
}
package Foo;
sub new {
- my $type = shift;
+ my $type = shift;
my %params = @_;
- my $self = {};
- $self->{'High'} = $params{'High'};
- $self->{'Low'} = $params{'Low'};
+ my $self = {};
+
+ $self->{High} = $params{High};
+ $self->{Low} = $params{Low};
+
bless $self, $type;
}
package Bar;
sub new {
- my $type = shift;
+ my $type = shift;
my %params = @_;
- my $self = [];
- $self->[0] = $params{'Left'};
- $self->[1] = $params{'Right'};
+ my $self = [];
+
+ $self->[0] = $params{Left};
+ $self->[1] = $params{Right};
+
bless $self, $type;
}
package main;
- $a = Foo->new( 'High' => 42, 'Low' => 11 );
- print "High=$a->{'High'}\n";
- print "Low=$a->{'Low'}\n";
+ my $a = Foo->new( High => 42, Low => 11 );
+ print "High = $a->{High}\n";
+ print "Low = $a->{Low}\n";
- $b = Bar->new( 'Left' => 78, 'Right' => 40 );
- print "Left=$b->[0]\n";
- print "Right=$b->[1]\n";
+ my $b = Bar->new( Left => 78, Right => 40 );
+ print "Left = $b->[0]\n";
+ print "Right = $b->[1]\n";
=head1 SCALAR INSTANCE VARIABLES
sub new {
my $type = shift;
- my $self;
- $self = shift;
+ my $self = shift;
+
bless \$self, $type;
}
package main;
- $a = Foo->new( 42 );
- print "a=$$a\n";
+ my $a = Foo->new( 42 );
+ print "a = $$a\n";
=head1 INSTANCE VARIABLE INHERITANCE
sub new {
my $type = shift;
my $self = {};
- $self->{'buz'} = 42;
+
+ $self->{buz} = 42;
+
bless $self, $type;
}
package Foo;
- @ISA = qw( Bar );
+ our @ISA = qw( Bar );
sub new {
my $type = shift;
my $self = Bar->new;
- $self->{'biz'} = 11;
+
+ $self->{biz} = 11;
+
bless $self, $type;
}
package main;
- $a = Foo->new;
- print "buz = ", $a->{'buz'}, "\n";
- print "biz = ", $a->{'biz'}, "\n";
+ my $a = Foo->new;
+ print "buz = $a->{buz}\n";
+ print "biz = $a->{biz}\n";
sub new {
my $type = shift;
my $self = {};
- $self->{'buz'} = 42;
+
+ $self->{buz} = 42;
+
bless $self, $type;
}
sub new {
my $type = shift;
my $self = {};
- $self->{'Bar'} = Bar->new;
- $self->{'biz'} = 11;
+
+ $self->{Bar} = Bar->new;
+ $self->{biz} = 11;
+
bless $self, $type;
}
package main;
- $a = Foo->new;
- print "buz = ", $a->{'Bar'}->{'buz'}, "\n";
- print "biz = ", $a->{'biz'}, "\n";
+ my $a = Foo->new;
+ print "buz = $a->{Bar}->{buz}\n";
+ print "biz = $a->{biz}\n";
package Buz;
sub goo { print "here's the goo\n" }
- package Bar; @ISA = qw( Buz );
+
+ package Bar;
+ our @ISA = qw( Buz );
sub google { print "google here\n" }
+
package Baz;
sub mumble { print "mumbling\n" }
package Foo;
- @ISA = qw( Bar Baz );
+ our @ISA = qw( Bar Baz );
sub new {
my $type = shift;
package main;
- $foo = Foo->new;
+ my $foo = Foo->new;
$foo->mumble;
$foo->grr;
$foo->goo;
package Mydbm;
- require SDBM_File;
- require Tie::Hash;
- @ISA = qw( Tie::Hash );
+ use SDBM_File;
+ use Tie::Hash;
+
+ our @ISA = qw( Tie::Hash );
sub TIEHASH {
my $type = shift;
my $ref = SDBM_File->new(@_);
- bless {'dbm' => $ref}, $type;
+ bless { dbm => $ref }, $type;
}
+
sub FETCH {
my $self = shift;
- my $ref = $self->{'dbm'};
+ my $ref = $self->{dbm};
$ref->FETCH(@_);
}
+
sub STORE {
my $self = shift;
- if (defined $_[0]){
- my $ref = $self->{'dbm'};
+
+ if ( defined $_[0] ) {
+ my $ref = $self->{dbm};
$ref->STORE(@_);
} else {
die "Cannot STORE an undefined key in Mydbm\n";
package main;
use Fcntl qw( O_RDWR O_CREAT );
- tie %foo, "Mydbm", "Sdbm", O_RDWR|O_CREAT, 0640;
- $foo{'bar'} = 123;
- print "foo-bar = $foo{'bar'}\n";
+ tie my %foo, 'Mydbm', 'Sdbm', O_RDWR|O_CREAT, 0640;
+ $foo{bar} = 123;
+ print "foo-bar = $foo{bar}\n";
- tie %bar, "Mydbm", "Sdbm2", O_RDWR|O_CREAT, 0640;
- $bar{'Cathy'} = 456;
- print "bar-Cathy = $bar{'Cathy'}\n";
+ tie my %bar, 'Mydbm', 'Sdbm2', O_RDWR|O_CREAT, 0640;
+ $bar{Cathy} = 456;
+ print "bar-Cathy = $bar{Cathy}\n";
=head1 THINKING OF CODE REUSE
my $type = shift;
bless {}, $type;
}
+
sub bar {
my $self = shift;
$self->FOO::private::BAZ;
package main;
- $a = FOO->new;
+ my $a = FOO->new;
$a->bar;
Now we try to override the BAZ() method. We would like FOO::bar() to call
my $type = shift;
bless {}, $type;
}
+
sub bar {
my $self = shift;
$self->FOO::private::BAZ;
}
package GOOP;
- @ISA = qw( FOO );
+
+ our @ISA = qw( FOO );
+
sub new {
my $type = shift;
bless {}, $type;
package main;
- $a = GOOP->new;
+ my $a = GOOP->new;
$a->bar;
To create reusable code we must modify class FOO, flattening class
my $type = shift;
bless {}, $type;
}
+
sub bar {
my $self = shift;
$self->BAZ;
}
package GOOP;
- @ISA = qw( FOO );
+
+ our @ISA = qw( FOO );
sub new {
my $type = shift;
bless {}, $type;
}
+
sub BAZ {
print "in GOOP::BAZ\n";
}
package main;
- $a = GOOP->new;
+ my $a = GOOP->new;
$a->bar;
=head1 CLASS CONTEXT AND THE OBJECT
package Bar;
- %fizzle = ( 'Password' => 'XYZZY' );
+ my %fizzle = ( Password => 'XYZZY' );
sub new {
my $type = shift;
my $self = {};
- $self->{'fizzle'} = \%fizzle;
+ $self->{fizzle} = \%fizzle;
bless $self, $type;
}
# or %Foo::fizzle. The object already knows which
# we should use, so just ask it.
#
- my $fizzle = $self->{'fizzle'};
+ my $fizzle = $self->{fizzle};
- print "The word is ", $fizzle->{'Password'}, "\n";
+ print "The word is $fizzle->{Password}\n";
}
package Foo;
- @ISA = qw( Bar );
- %fizzle = ( 'Password' => 'Rumple' );
+ our @ISA = qw( Bar );
+
+ my %fizzle = ( Password => 'Rumple' );
sub new {
my $type = shift;
my $self = Bar->new;
- $self->{'fizzle'} = \%fizzle;
+ $self->{fizzle} = \%fizzle;
bless $self, $type;
}
package main;
- $a = Bar->new;
- $b = Foo->new;
+ my $a = Bar->new;
+ my $b = Foo->new;
+
$a->enter;
$b->enter;
}
package BAR;
- @ISA = qw(FOO);
+
+ our @ISA = qw(FOO);
sub baz {
print "in BAR::baz()\n";
package main;
- $a = BAR->new;
+ my $a = BAR->new;
$a->baz;
=head1 DELEGATION
package Mydbm;
- require SDBM_File;
- require Tie::Hash;
- @ISA = qw(Tie::Hash);
+ use SDBM_File;
+ use Tie::Hash;
+
+ our @ISA = qw( Tie::Hash );
+ our $AUTOLOAD;
sub TIEHASH {
my $type = shift;
- my $ref = SDBM_File->new(@_);
- bless {'delegate' => $ref};
+ my $ref = SDBM_File->new(@_);
+ bless { delegate => $ref };
}
sub AUTOLOAD {
$AUTOLOAD =~ s/^Mydbm:://;
# Pass the message to the delegate.
- $self->{'delegate'}->$AUTOLOAD(@_);
+ $self->{delegate}->$AUTOLOAD(@_);
}
package main;
use Fcntl qw( O_RDWR O_CREAT );
- tie %foo, "Mydbm", "adbm", O_RDWR|O_CREAT, 0640;
- $foo{'bar'} = 123;
- print "foo-bar = $foo{'bar'}\n";
+ tie my %foo, 'Mydbm', 'adbm', O_RDWR|O_CREAT, 0640;
+ $foo{bar} = 123;
+ print "foo-bar = $foo{bar}\n";
itself been called from another Perl subroutine. The code below
illustrates this
- sub fred
- { print "@_\n" }
+ sub fred {
+ print "@_\n";
+ }
- sub joe
- { &fred }
+ sub joe {
+ &fred;
+ }
- &joe(1,2,3) ;
+ &joe(1,2,3);
This will print
For example, say you want to call this Perl sub
- sub fred
- {
- eval { die "Fatal Error" ; }
- print "Trapped error: $@\n"
- if $@ ;
+ sub fred {
+ eval { die "Fatal Error" }
+ print "Trapped error: $@\n" if $@;
}
via this XSUB
This first trivial example will call a Perl subroutine, I<PrintUID>, to
print out the UID of the process.
- sub PrintUID
- {
- print "UID is $<\n" ;
+ sub PrintUID {
+ print "UID is $<\n";
}
and here is a C function to call it
So the Perl subroutine would look like this
- sub LeftString
- {
- my($s, $n) = @_ ;
- print substr($s, 0, $n), "\n" ;
+ sub LeftString {
+ my($s, $n) = @_ ;
+ print substr($s, 0, $n), "\n";
}
The C function required to call I<LeftString> would look like this.
Here is a Perl subroutine, I<Adder>, that takes 2 integer parameters
and simply returns their sum.
- sub Adder
- {
- my($a, $b) = @_ ;
- $a + $b ;
+ sub Adder {
+ my($a, $b) = @_;
+ $a + $b;
}
Because we are now concerned with the return value from I<Adder>, the C
Here is the Perl subroutine
- sub AddSubtract
- {
- my($a, $b) = @_ ;
- ($a+$b, $a-$b) ;
+ sub AddSubtract {
+ my($a, $b) = @_;
+ ($a+$b, $a-$b);
}
and this is the C function
The Perl subroutine, I<Inc>, below takes 2 parameters and increments
each directly.
- sub Inc
- {
- ++ $_[0] ;
- ++ $_[1] ;
+ sub Inc {
+ ++$_[0];
+ ++$_[1];
}
and here is a C function to call it.
the difference of its 2 parameters. If this would result in a negative
result, the subroutine calls I<die>.
- sub Subtract
- {
- my ($a, $b) = @_ ;
+ sub Subtract {
+ my ($a, $b) = @_;
- die "death can be fatal\n" if $a < $b ;
+ die "death can be fatal\n" if $a < $b;
- $a - $b ;
+ $a - $b;
}
and some C to call it
version of the call_Subtract example above inside a destructor:
package Foo;
- sub new { bless {}, $_[0] }
+
+ sub new { bless {}, shift }
+
sub Subtract {
- my($a,$b) = @_;
- die "death can be fatal" if $a < $b ;
- $a - $b;
+ my($a,$b) = @_;
+ die "death can be fatal" if $a < $b;
+ $a - $b;
}
- sub DESTROY { call_Subtract(5, 4); }
- sub foo { die "foo dies"; }
+
+ sub DESTROY { call_Subtract(5, 4) }
+ sub foo { die "foo dies" }
+
package main;
+
eval { Foo->new->foo };
print "Saw: $@" if $@; # should be, but isn't
Consider the Perl code below
- sub fred
- {
- print "Hello there\n" ;
+ sub fred {
+ print "Hello there\n";
}
- CallSubPV("fred") ;
+ CallSubPV("fred");
Here is a snippet of XSUB which defines I<CallSubPV>.
Because we are using an SV to call I<fred> the following can all be used
- CallSubSV("fred") ;
- CallSubSV(\&fred) ;
- $ref = \&fred ;
- CallSubSV($ref) ;
- CallSubSV( sub { print "Hello there\n" } ) ;
+ CallSubSV("fred");
+ CallSubSV(\&fred);
+
+ my $ref = \&fred;
+ CallSubSV($ref);
+ CallSubSV( sub { print "Hello there\n" } );
As you can see, I<call_sv> gives you much greater flexibility in
how you can specify the Perl subroutine.
to the Perl subroutine that was recorded in C<SaveSub1>. This is
particularly true for these cases
- SaveSub1(\&fred) ;
- CallSavedSub1() ;
+ SaveSub1(\&fred);
+ CallSavedSub1();
- SaveSub1( sub { print "Hello there\n" } ) ;
- CallSavedSub1() ;
+ SaveSub1( sub { print "Hello there\n" } );
+ CallSavedSub1();
By the time each of the C<SaveSub1> statements above have been executed,
the SV*s which corresponded to the parameters will no longer exist.
Similarly, with this code
- $ref = \&fred ;
- SaveSub1($ref) ;
- $ref = 47 ;
- CallSavedSub1() ;
+ my $ref = \&fred;
+ SaveSub1($ref);
+
+ $ref = 47;
+ CallSavedSub1();
you can expect one of these messages (which you actually get is dependent on
the version of Perl you are using)
A similar but more subtle problem is illustrated with this code
- $ref = \&fred ;
- SaveSub1($ref) ;
- $ref = \&joe ;
- CallSavedSub1() ;
+ my $ref = \&fred;
+ SaveSub1($ref);
+
+ $ref = \&joe;
+ CallSavedSub1();
This time whenever C<CallSavedSub1> get called it will execute the Perl
subroutine C<joe> (assuming it exists) rather than C<fred> as was
Here is a Perl subroutine which prints whatever parameters are passed
to it.
- sub PrintList
- {
- my(@list) = @_ ;
+ sub PrintList {
+ my @list = @_;
- foreach (@list) { print "$_\n" }
+ foreach (@list) {
+ print "$_\n";
+ }
}
and here is an example of I<call_argv> which will call
Consider the following Perl code
{
- package Mine ;
-
- sub new
- {
- my($type) = shift ;
- bless [@_]
- }
-
- sub Display
- {
- my ($self, $index) = @_ ;
- print "$index: $$self[$index]\n" ;
- }
-
- sub PrintID
- {
- my($class) = @_ ;
- print "This is Class $class version 1.0\n" ;
- }
+ package Mine ;
+
+ sub new {
+ my $type = shift;
+ bless [@_], $type;
+ }
+
+ sub Display {
+ my ($self, $index) = @_;
+ print "$index: $self->[$index]\n";
+ }
+
+ sub PrintID {
+ my $class = shift;
+ print "This is Class $class version 1.0\n";
+ }
}
It implements just a very simple class to manage an array. Apart from
name and a version number. The virtual method, C<Display>, prints out a
single element of the array. Here is an all Perl example of using it.
- $a = new Mine ('red', 'green', 'blue') ;
- $a->Display(1) ;
- PrintID Mine;
+ my $a = Mine->new('red', 'green', 'blue');
+ $a->Display(1);
+
+ Mine->PrintID;
will print
So the methods C<PrintID> and C<Display> can be invoked like this
- $a = new Mine ('red', 'green', 'blue') ;
- call_Method($a, 'Display', 1) ;
- call_PrintID('Mine', 'PrintID') ;
+ my $a = Mine->new('red', 'green', 'blue');
+ call_Method($a, 'Display', 1);
+ call_PrintID('Mine', 'PrintID');
The only thing to note is that in both the static and virtual methods,
the method name is not passed via the stack--it is used as the first
and here is some Perl to test it
PrintContext ;
- $a = PrintContext ;
- @a = PrintContext ;
+ my $a = PrintContext;
+ my @a = PrintContext;
The output from that will be
# Register the sub pcb1
register_fatal(\&pcb1) ;
- sub pcb1
- {
- die "I'm dying...\n" ;
+ sub pcb1 {
+ die "I'm dying...\n";
}
The mapping between the C callback and the Perl equivalent is stored in
So the Perl interface would look like this
- sub callback1
- {
- my($handle, $buffer) = @_ ;
+ sub callback1 {
+ my($handle, $buffer) = @_;
}
# Register the Perl callback
- asynch_read($fh, \&callback1) ;
+ asynch_read($fh, \&callback1);
- asynch_close($fh) ;
+ asynch_close($fh);
The mapping between the C callback and Perl is stored in the global
hash C<Mapping> this time. Using a hash has the distinct advantage that
can truncate an array down to nothing by assigning the null list
() to it. The following are equivalent:
- @whatever = ();
+ my @whatever = ();
$#whatever = -1;
If you evaluate an array in scalar context, it returns the length
which return whatever they feel like returning.) The following is
always true:
- scalar(@whatever) == $#whatever - $[ + 1;
-
-Version 5 of Perl changed the semantics of C<$[>: files that don't set
-the value of C<$[> no longer need to worry about whether another
-file changed its value. (In other words, use of C<$[> is deprecated.)
-So in general you can assume that
-
scalar(@whatever) == $#whatever + 1;
Some programmers choose to use an explicit conversion so as to
You can preallocate space for a hash by assigning to the keys() function.
This rounds up the allocated buckets to the next power of two:
+ my %users = ();
keys(%users) = 1000; # allocate 1024 buckets
=head2 Scalar value constructors
expression as a subscript.) The following code segment prints out "The
price is $Z<>100."
- $Price = '$100'; # not interpreted
- print "The price is $Price.\n"; # interpreted
+ my $Price = '$100'; # not interpolated
+ print "The price is $Price.\n"; # interpolated
As in some shells, you can enclose the variable name in braces to
disambiguate it from following alphanumerics (and underscores).
variable name from a following double-colon or an apostrophe, since
these would be otherwise treated as a package separator:
- $who = "Larry";
+ my $who = "Larry";
print PASSWD "${who}::0:0:Superuser:/:/bin/perl\n";
print "We use ${who}speak when ${who}'s here.\n";
variable (C<$LIST_SEPARATOR> in English), space by default. The
following are equivalent:
- $temp = join($", @ARGV);
+ my $temp = join($", @ARGV);
system "echo $temp";
system "echo @ARGV";
rest of the code, you'll need to remove leading whitespace
from each line manually:
- ($quote = <<'FINIS') =~ s/^\s+//gm;
+ (my $quote = <<'FINIS') =~ s/^\s+//gm;
The Road goes ever on and on,
down from the door where it began.
FINIS
to be a list literal is simply the value of the final element, as
with the C comma operator. For example,
- @foo = ('cc', '-E', $bar);
+ my @foo = ('cc', '-E', $bar);
assigns the entire list value to array @foo, but
- $foo = ('cc', '-E', $bar);
+ my $foo = ('cc', '-E', $bar);
assigns the value of variable $bar to the scalar variable $foo.
Note that the value of an actual array in scalar context is the
length of the array; the following assigns the value 3 to $foo:
- @foo = ('cc', '-E', $bar);
- $foo = @foo; # $foo gets 3
+ my @foo = ('cc', '-E', $bar);
+ my $foo = @foo; # $foo gets 3
You may have an optional comma before the closing parenthesis of a
list literal, so that you can say:
- @foo = (
+ my @foo = (
1,
2,
3,
To use a here-document to assign an array, one line per element,
you might use an approach like this:
- @sauces = <<End_Lines =~ m/(\S.*\S)/g;
+ my @sauces = <<End_Lines =~ m/(\S.*\S)/g;
normal tomato
spicy tomato
green chile
put the list in parentheses to avoid ambiguity. For example:
# Stat returns list value.
- $time = (stat($file))[8];
+ my $time = (stat($file))[8];
# SYNTAX ERROR HERE.
- $time = stat($file)[8]; # OOPS, FORGOT PARENTHESES
+ my $time = stat($file)[8]; # OOPS, FORGOT PARENTHESES
# Find a hex digit.
- $hexdigit = ('a','b','c','d','e','f')[$digit-10];
+ my $hexdigit = ('a','b','c','d','e','f')[$digit-10];
# A "reverse comma operator".
return (pop(@foo),pop(@foo))[0];
Lists may be assigned to only when each element of the list
is itself legal to assign to:
- ($a, $b, $c) = (1, 2, 3);
+ my($a, $b, $c) = (1, 2, 3);
- ($map{'red'}, $map{'blue'}, $map{'green'}) = (0x00f, 0x0f0, 0xf00);
+ ($map{red}, $map{blue}, $map{green}) = (0x00f, 0x0f0, 0xf00);
An exception to this is that you may assign to C<undef> in a list.
This is useful for throwing away some of the return values of a
function:
- ($dev, $ino, undef, undef, $uid, $gid) = stat($file);
+ my($dev, $ino, undef, undef, $uid, $gid) = stat($file);
List assignment in scalar context returns the number of elements
produced by the expression on the right side of the assignment:
- $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
- $x = (($foo,$bar) = f()); # set $x to f()'s return count
+ my $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
+ my $x = (($foo,$bar) = f()); # set $x to f()'s return count
This is handy when you want to do a list assignment in a Boolean
context, because most list functions return a null list when finished,
return values, by assigning to an empty list and then using that
assignment in scalar context. For example, this code:
- $count = () = $string =~ /\d+/g;
+ my $count = () = $string =~ /\d+/g;
will place into $count the number of digit groups found in $string.
This happens because the pattern match is in list context (since it
number of times the pattern matched) and assign that to $count. Note
that simply using
- $count = $string =~ /\d+/g;
+ my $count = $string =~ /\d+/g;
would not have worked, since a pattern match in scalar context will
only return true or false, rather than a count of matches.
The final element of a list assignment may be an array or a hash:
- ($a, $b, @rest) = split;
+ my($a, $b, @rest) = split;
+ # or
my($a, $b, %rest) = @_;
You can actually put an array or hash anywhere in the list, but the first one
items to be interpreted as a key and a value:
# same as map assignment above
- %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
+ my %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
While literal lists and named arrays are often interchangeable, that's
not the case for hashes. Just because you can subscript a list value like
interpreted as a string--if it's a bareword that would be a legal identifier.
This makes it nice for initializing hashes:
- %map = (
+ my %map = (
red => 0x00f,
blue => 0x0f0,
green => 0xf00,
or for initializing hash references to be used as records:
- $rec = {
+ my $rec = {
witch => 'Mable the Merciless',
cat => 'Fluffy the Ferocious',
date => '10/31/1776',
or for using call-by-named-parameter to complicated functions:
- $field = $query->radio_group(
+ use CGI;
+ my $query = CGI->new;
+ my $field = $query->radio_group(
name => 'group_name',
values => ['eenie','meenie','minie'],
default => 'meenie',
linebreak => 'true',
- labels => \%labels
+ labels => \%labels,
);
Note that just because a hash is initialized in that order doesn't
A common way to access an array or a hash is one scalar element at a
time. You can also subscript a list to get a single element from it.
- $whoami = $ENV{"USER"}; # one element from the hash
- $parent = $ISA[0]; # one element from the array
- $dir = (getpwnam("daemon"))[7]; # likewise, but with list
+ my $whoami = $ENV{"USER"}; # one element from the hash
+ my $parent = $ISA[0]; # one element from the array
+ my $dir = (getpwnam("daemon"))[7]; # likewise, but with list
A slice accesses several elements of a list, an array, or a hash
simultaneously using a list of subscripts. It's more convenient
than writing out the individual elements as a list of separate
scalar values.
- ($him, $her) = @folks[0,-1]; # array slice
- @them = @folks[0 .. 3]; # array slice
- ($who, $home) = @ENV{"USER", "HOME"}; # hash slice
- ($uid, $dir) = (getpwnam("daemon"))[2,7]; # list slice
+ my($him, $her) = @folks[0,-1]; # array slice
+ my @them = @folks[0 .. 3]; # array slice
+ my($who, $home) = @ENV{"USER", "HOME"}; # hash slice
+ my($uid, $dir) = (getpwnam("daemon"))[2,7]; # list slice
Since you can assign to a list of variables, you can also assign to
an array or hash slice.
- @days[3..5] = qw/Wed Thu Fri/;
+ my( @days, %colors, @folks );
+ @days[3..5] = qw(Wed Thu Fri);
@colors{'red','blue','green'}
= (0xff0000, 0x0000ff, 0x00ff00);
@folks[0, -1] = @folks[-1, 0];
The previous assignments are exactly equivalent to
- ($days[3], $days[4], $days[5]) = qw/Wed Thu Fri/;
- ($colors{'red'}, $colors{'blue'}, $colors{'green'})
+ my( @days, %colors, @folks );
+ ($days[3], $days[4], $days[5]) = qw(Wed Thu Fri);
+ ($colors{red}, $colors{blue}, $colors{green})
= (0xff0000, 0x0000ff, 0x00ff00);
- ($folks[0], $folks[-1]) = ($folks[0], $folks[-1]);
+ ($folks[0], $folks[-1]) = ($folks[-1], $folks[0]);
Since changing a slice changes the original array or hash that it's
slicing, a C<foreach> construct will alter some--or even all--of the
A slice of an empty list is still an empty list. Thus:
- @a = ()[1,0]; # @a has no elements
- @b = (@a)[0,1]; # @b has no elements
- @c = (0,1)[2,3]; # @c has no elements
+ my @a = ()[1,0]; # @a has no elements
+ my @b = (@a)[0,1]; # @b has no elements
+ my @c = (0,1)[2,3]; # @c has no elements
But:
- @a = (1)[1,0]; # @a has two elements
- @b = (1,undef)[1,0,2]; # @b has three elements
+ my @a = (1)[1,0]; # @a has two elements
+ my @b = (1,undef)[1,0,2]; # @b has three elements
This makes it easy to write loops that terminate when a null list
is returned:
- while ( ($home, $user) = (getpwent)[7,0]) {
+ while ( my($home, $user) = (getpwent)[7,0] ) {
printf "%-8s %s\n", $user, $home;
}
The main use of typeglobs in modern Perl is create symbol table aliases.
This assignment:
+ {
+
*this = *that;
makes $this an alias for $that, @this an alias for @that, %this an alias
make @Here::blue an alias for @There::green, or %Here::blue an alias for
%There::green, etc. See L<perlmod/"Symbol Tables"> for more examples
of this. Strange though this may seem, this is the basis for the whole
-module import/export system.
+module import/export system. And none of it works under
+C<use strict 'vars'>.
Another use for typeglobs is to pass filehandles into a function or
to create new filehandles. If you need to use a typeglob to save away
a filehandle, do it this way:
- $fh = *STDOUT;
+ my $fh = *STDOUT;
or perhaps as a real reference, like this:
- $fh = \*STDOUT;
+ my $fh = \*STDOUT;
See L<perlsub> for examples of using these as indirect filehandles
in functions.
open (FH, $path) or return undef;
return *FH;
}
- $fh = newopen('/etc/passwd');
+ my $fh = newopen('/etc/passwd');
Now that we have the C<*foo{THING}> notation, typeglobs aren't used as much
for filehandle manipulations, although they're still needed to pass brand
that must be passed around, as in the following example:
sub myopen {
- open my $fh, "@_"
- or die "Can't open '@_': $!";
+ my $filename = shift;
+ open my $fh, $filename
+ or die "Can't open '$filename': $!";
return $fh;
}
=head1 SYNOPSIS
- $db = tie %hash, 'DBM', ...
+ my $db = tie my %hash, 'DBM', ...;
- $old_filter = $db->filter_store_key ( sub { ... } ) ;
- $old_filter = $db->filter_store_value( sub { ... } ) ;
- $old_filter = $db->filter_fetch_key ( sub { ... } ) ;
- $old_filter = $db->filter_fetch_value( sub { ... } ) ;
+ my $old_filter;
+ $old_filter = $db->filter_store_key ( sub { ... } );
+ $old_filter = $db->filter_store_value( sub { ... } );
+ $old_filter = $db->filter_fetch_key ( sub { ... } );
+ $old_filter = $db->filter_fetch_value( sub { ... } );
=head1 DESCRIPTION
sure you have already guessed, this is a problem that DBM Filters can
fix very easily.
- use strict ;
- use warnings ;
- use SDBM_File ;
- use Fcntl ;
+ use strict;
+ use warnings;
+ use SDBM_File;
+ use Fcntl;
- my %hash ;
- my $filename = "/tmp/filt" ;
- unlink $filename ;
+ my %hash;
+ my $filename = '/tmp/filt';
+ unlink $filename;
my $db = tie(%hash, 'SDBM_File', $filename, O_RDWR|O_CREAT, 0640)
- or die "Cannot open $filename: $!\n" ;
+ or die "Cannot open $filename: $!\n";
# Install DBM Filters
- $db->filter_fetch_key ( sub { s/\0$// } ) ;
- $db->filter_store_key ( sub { $_ .= "\0" } ) ;
+ $db->filter_fetch_key ( sub { s/\0$// } );
+ $db->filter_store_key ( sub { $_ .= "\0" } );
$db->filter_fetch_value(
- sub { no warnings 'uninitialized' ;s/\0$// } ) ;
- $db->filter_store_value( sub { $_ .= "\0" } ) ;
+ sub { no warnings 'uninitialized'; s/\0$// } );
+ $db->filter_store_value( sub { $_ .= "\0" } );
- $hash{"abc"} = "def" ;
- my $a = $hash{"ABC"} ;
+ $hash{abc} = 'def';
+ my $a = $hash{ABC};
# ...
- undef $db ;
- untie %hash ;
+ undef $db;
+ untie %hash;
The code above uses SDBM_File, but it will work with any of the DBM
modules.
a DBM database it always writes the key and value as strings. So when
you use this:
- $hash{12345} = "something" ;
+ $hash{12345} = 'something';
the key 12345 will get stored in the DBM database as the 5 byte string
"12345". If you actually want the key to be stored in the DBM database
Here is a DBM Filter that does it:
- use strict ;
- use warnings ;
- use DB_File ;
- my %hash ;
- my $filename = "/tmp/filt" ;
- unlink $filename ;
+ use strict;
+ use warnings;
+ use DB_File;
+ my %hash;
+ my $filename = '/tmp/filt';
+ unlink $filename;
my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
- or die "Cannot open $filename: $!\n" ;
+ or die "Cannot open $filename: $!\n";
- $db->filter_fetch_key ( sub { $_ = unpack("i", $_) } ) ;
- $db->filter_store_key ( sub { $_ = pack ("i", $_) } ) ;
- $hash{123} = "def" ;
+ $db->filter_fetch_key ( sub { $_ = unpack('i', $_) } );
+ $db->filter_store_key ( sub { $_ = pack ('i', $_) } );
+ $hash{123} = 'def';
# ...
- undef $db ;
- untie %hash ;
+ undef $db;
+ untie %hash;
The code above uses DB_File, but again it will work with any of the
DBM modules.
=back
- perl -ne 'if(/(.{33})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
- -e '{printf("%s%-9o%-9o%-9o%o\n",$1,$2,$3,$4,$5)}' perlebcdic.pod
+ perldoc -m perlebcdic | \
+ perl -ne 'if(/(.{33})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
+ -e '{printf("%s%-9o%-9o%-9o%o\n",$1,$2,$3,$4,$5)}'
+
+Or, as a script, called like C<perldoc -m perlebcdic | extract.pl>:
+
+ my $regex = qr/
+ (.{33}) # any 33 characters
+
+ (\d+)\s+ # capture some digits, discard spaces
+ (\d+)\s+ # ".."
+ (\d+)\s+ # ".."
+ (\d+) # capture some digits
+ /x;
+
+ while ( <> ) {
+ if ( $_ =~ $regex ) {
+ printf(
+ "%s%-9o%-9o%-9o%o\n",
+ $1, $2, $3, $4, $5,
+ );
+ }
+ }
If you want to retain the UTF-x code points then in script form you
might want to write:
=back
- open(FH,"<perlebcdic.pod") or die "Could not open perlebcdic.pod: $!";
- while (<FH>) {
- if (/(.{33})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*)\s+(\d+)\.?(\d*)/) {
- if ($7 ne '' && $9 ne '') {
- printf("%s%-9o%-9o%-9o%-9o%-3o.%-5o%-3o.%o\n",$1,$2,$3,$4,$5,$6,$7,$8,$9);
- }
- elsif ($7 ne '') {
- printf("%s%-9o%-9o%-9o%-9o%-3o.%-5o%o\n",$1,$2,$3,$4,$5,$6,$7,$8);
- }
- else {
- printf("%s%-9o%-9o%-9o%-9o%-9o%o\n",$1,$2,$3,$4,$5,$6,$8);
+ my $regex = qr/
+ (.{33}) # $1: any 33 characters
+
+ (\d+)\s+ # $2, $3, $4, $5:
+ (\d+)\s+ # capture some digits, discard spaces
+ (\d+)\s+ # 4 times
+ (\d+)\s+
+
+ (\d+) # $6: capture some digits,
+ \.? # there may be a period,
+ (\d*) # $7: capture some digits if they're there,
+ \s+ # discard spaces
+
+ (\d+) # $8: capture some digits
+ \.? # there may be a period,
+ (\d*) # $9: capture some digits if they're there,
+ /x;
+
+ open( FH, 'perldoc -m perlebcdic |' ) ||
+ die "Could not open perlebcdic.pod: $!";
+ while ( <FH> ) {
+ if ( $_ =~ $regex ) {
+ if ( $7 ne '' && $9 ne '' ) {
+ printf(
+ "%s%-9o%-9o%-9o%-9o%-3o.%-5o%-3o.%o\n",
+ $1, $2, $3, $4, $5, $6, $7, $8, $9
+ );
+ } elsif ( $7 ne '' ) {
+ printf(
+ "%s%-9o%-9o%-9o%-9o%-3o.%-5o%o\n",
+ $1, $2, $3, $4, $5, $6, $7, $8
+ );
+ } else {
+ printf(
+ "%s%-9o%-9o%-9o%-9o%-9o%o\n",
+ $1, $2, $3, $4, $5, $6, $8
+ );
}
}
}
+ close FH;
If you would rather see this table listing hexadecimal values then
run the table through:
=back
- perl -ne 'if(/(.{33})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
- -e '{printf("%s%-9X%-9X%-9X%X\n",$1,$2,$3,$4,$5)}' perlebcdic.pod
+ perldoc -m perlebcdic | \
+ perl -ne 'if(/(.{33})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
+ -e '{printf("%s%-9X%-9X%-9X%X\n",$1,$2,$3,$4,$5)}'
Or, in order to retain the UTF-x code points in hexadecimal:
=back
- open(FH,"<perlebcdic.pod") or die "Could not open perlebcdic.pod: $!";
+ my $regex = qr/
+ (.{33}) # $1: any 33 characters
+
+ (\d+)\s+ # $2, $3, $4, $5:
+ (\d+)\s+ # capture some digits, discard spaces
+ (\d+)\s+ # 4 times
+ (\d+)\s+
+
+ (\d+) # $6: capture some digits,
+ \.? # there may be a period,
+ (\d*) # $7: capture some digits if they're there,
+ \s+ # discard spaces
+
+ (\d+) # $8: capture some digits
+ \.? # there may be a period,
+ (\d*) # $9: capture some digits if they're there,
+ /x;
+
+ open( FH, 'perldoc -m perlebcdic |' ) ||
+ die "Could not open perlebcdic.pod: $!";
while (<FH>) {
- if (/(.{33})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*)\s+(\d+)\.?(\d*)/) {
- if ($7 ne '' && $9 ne '') {
- printf("%s%-9X%-9X%-9X%-9X%-2X.%-6X%-2X.%X\n",$1,$2,$3,$4,$5,$6,$7,$8,$9);
+ if ( $_ =~ $regex ) {
+ if ( $7 ne '' && $9 ne '' ) {
+ printf(
+ "%s%-9X%-9X%-9X%-9X%-2X.%-6X%-2X.%X\n",
+ $1, $2, $3, $4, $5, $6, $7, $8, $9
+ );
}
- elsif ($7 ne '') {
- printf("%s%-9X%-9X%-9X%-9X%-2X.%-6X%X\n",$1,$2,$3,$4,$5,$6,$7,$8);
+ elsif ( $7 ne '' ) {
+ printf(
+ "%s%-9X%-9X%-9X%-9X%-2X.%-6X%X\n",
+ $1, $2, $3, $4, $5, $6, $7, $8
+ );
}
else {
- printf("%s%-9X%-9X%-9X%-9X%-9X%X\n",$1,$2,$3,$4,$5,$6,$8);
+ printf(
+ "%s%-9X%-9X%-9X%-9X%-9X%X\n",
+ $1, $2, $3, $4, $5, $6, $8
+ );
}
}
}
+=head2 THE SINGLE OCTET TABLE
incomp- incomp-
8859-1 lete lete
<SMALL LETTER thorn> 254 142 142 142 195.190 139.114
<y WITH DIAERESIS> 255 223 223 223 195.191 139.115
+
If you would rather see the above table in CCSID 0037 order rather than
ASCII + Latin-1 order then run the table through:
=back
- perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\
- -e '{push(@l,$_)}' \
- -e 'END{print map{$_->[0]}' \
- -e ' sort{$a->[1] <=> $b->[1]}' \
- -e ' map{[$_,substr($_,42,3)]}@l;}' perlebcdic.pod
+ perldoc -m perlebcdic | \
+ perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)' \
+ -e '{push(@l,$_)}' \
+ -e 'END{print map{$_->[0]}' \
+ -e 'sort{$a->[1] <=> $b->[1]}' \
+ -e 'map{[$_,substr($_,42,3)]}@l;}'
If you would rather see it in CCSID 1047 order then change the digit
42 in the last line to 51, like this:
=back
- perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\
- -e '{push(@l,$_)}' \
- -e 'END{print map{$_->[0]}' \
- -e ' sort{$a->[1] <=> $b->[1]}' \
- -e ' map{[$_,substr($_,51,3)]}@l;}' perlebcdic.pod
+ perldoc -m perlebcdic | \
+ perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)' \
+ -e '{push(@l,$_)}' \
+ -e 'END{print map{$_->[0]}' \
+ -e 'sort{$a->[1] <=> $b->[1]}' \
+ -e 'map{[$_,substr($_,51,3)]}@l;}'
If you would rather see it in POSIX-BC order then change the digit
51 in the last line to 60, like this:
=back
- perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)'\
- -e '{push(@l,$_)}' \
- -e 'END{print map{$_->[0]}' \
- -e ' sort{$a->[1] <=> $b->[1]}' \
- -e ' map{[$_,substr($_,60,3)]}@l;}' perlebcdic.pod
+ perldoc -m perlebcdic | \
+ perl -ne 'if(/.{33}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}\s{6,8}\d{1,3}/)' \
+ -e '{push(@l,$_)}' \
+ -e 'END{print map{$_->[0]}' \
+ -e 'sort{$a->[1] <=> $b->[1]}' \
+ -e 'map{[$_,substr($_,60,3)]}@l;}'
=head1 IDENTIFYING CHARACTER CODE SETS
could use the return value of ord() or chr() to test one or more
character values. For example:
- $is_ascii = "A" eq chr(65);
- $is_ebcdic = "A" eq chr(193);
+ my $is_ascii = "A" eq chr(65);
+ my $is_ebcdic = "A" eq chr(193);
Also, "\t" is a C<HORIZONTAL TABULATION> character so that:
- $is_ascii = ord("\t") == 9;
- $is_ebcdic = ord("\t") == 5;
+ my $is_ascii = ord("\t") == 9;
+ my $is_ebcdic = ord("\t") == 5;
To distinguish EBCDIC code pages try looking at one or more of
the characters that differ between them. For example:
- $is_ebcdic_37 = "\n" eq chr(37);
- $is_ebcdic_1047 = "\n" eq chr(21);
+ my $is_ebcdic_37 = "\n" eq chr(37);
+ my $is_ebcdic_1047 = "\n" eq chr(21);
Or better still choose a character that is uniquely encoded in any
of the code sets, e.g.:
- $is_ascii = ord('[') == 91;
- $is_ebcdic_37 = ord('[') == 186;
- $is_ebcdic_1047 = ord('[') == 173;
- $is_ebcdic_POSIX_BC = ord('[') == 187;
+ my $is_ascii = ord('[') == 91;
+ my $is_ebcdic_37 = ord('[') == 186;
+ my $is_ebcdic_1047 = ord('[') == 173;
+ my $is_ebcdic_POSIX_BC = ord('[') == 187;
However, it would be unwise to write tests such as:
- $is_ascii = "\r" ne chr(13); # WRONG
- $is_ascii = "\n" ne chr(10); # ILL ADVISED
+ my $is_ascii = "\r" ne chr(13); # WRONG
+ my $is_ascii = "\n" ne chr(10); # ILL ADVISED
Obviously the first of these will fail to distinguish most ASCII machines
-from either a CCSID 0037, a 1047, or a POSIX-BC EBCDIC machine since "\r" eq
-chr(13) under all of those coded character sets. But note too that
-because "\n" is chr(13) and "\r" is chr(10) on the MacIntosh (which is an
+from either a CCSID 0037, a 1047, or a POSIX-BC EBCDIC machine since "\r" eq
+chr(13) under all of those coded character sets. But note too that
+because "\n" is chr(13) and "\r" is chr(10) on the MacIntosh (which is an
ASCII machine) the second C<$is_ascii> test will lead to trouble there.
-To determine whether or not perl was built under an EBCDIC
+To determine whether or not perl was built under an EBCDIC
code page you can use the Config module like so:
use Config;
- $is_ebcdic = $Config{'ebcdic'} eq 'define';
+ my $is_ebcdic = $Config{'ebcdic'} eq 'define';
=head1 CONVERSIONS
provide easy to use ASCII to EBCDIC operations that are also easily
reversed.
-For example, to convert ASCII to code page 037 take the output of the second
-column from the output of recipe 0 (modified to add \\ characters) and use
+For example, to convert ASCII to code page 037 take the output of the second
+column from the output of recipe 0 (modified to add \\ characters) and use
it in tr/// like so:
- $cp_037 =
- '\000\001\002\003\234\011\206\177\227\215\216\013\014\015\016\017' .
- '\020\021\022\023\235\205\010\207\030\031\222\217\034\035\036\037' .
- '\200\201\202\203\204\012\027\033\210\211\212\213\214\005\006\007' .
- '\220\221\026\223\224\225\226\004\230\231\232\233\024\025\236\032' .
- '\040\240\342\344\340\341\343\345\347\361\242\056\074\050\053\174' .
- '\046\351\352\353\350\355\356\357\354\337\041\044\052\051\073\254' .
- '\055\057\302\304\300\301\303\305\307\321\246\054\045\137\076\077' .
- '\370\311\312\313\310\315\316\317\314\140\072\043\100\047\075\042' .
- '\330\141\142\143\144\145\146\147\150\151\253\273\360\375\376\261' .
- '\260\152\153\154\155\156\157\160\161\162\252\272\346\270\306\244' .
- '\265\176\163\164\165\166\167\170\171\172\241\277\320\335\336\256' .
- '\136\243\245\267\251\247\266\274\275\276\133\135\257\250\264\327' .
- '\173\101\102\103\104\105\106\107\110\111\255\364\366\362\363\365' .
- '\175\112\113\114\115\116\117\120\121\122\271\373\374\371\372\377' .
- '\134\367\123\124\125\126\127\130\131\132\262\324\326\322\323\325' .
- '\060\061\062\063\064\065\066\067\070\071\263\333\334\331\332\237' ;
+ my $cp_037 = join '',
+ qq[\000\001\002\003\234\011\206\177\227\215\216\013\014\015\016\017],
+ qq[\020\021\022\023\235\205\010\207\030\031\222\217\034\035\036\037],
+ qq[\200\201\202\203\204\012\027\033\210\211\212\213\214\005\006\007],
+ qq[\220\221\026\223\224\225\226\004\230\231\232\233\024\025\236\032],
+ qq[\040\240\342\344\340\341\343\345\347\361\242\056\074\050\053\174],
+ qq[\046\351\352\353\350\355\356\357\354\337\041\044\052\051\073\254],
+ qq[\055\057\302\304\300\301\303\305\307\321\246\054\045\137\076\077],
+ qq[\370\311\312\313\310\315\316\317\314\140\072\043\100\047\075\042],
+ qq[\330\141\142\143\144\145\146\147\150\151\253\273\360\375\376\261],
+ qq[\260\152\153\154\155\156\157\160\161\162\252\272\346\270\306\244],
+ qq[\265\176\163\164\165\166\167\170\171\172\241\277\320\335\336\256],
+ qq[\136\243\245\267\251\247\266\274\275\276\133\135\257\250\264\327],
+ qq[\173\101\102\103\104\105\106\107\110\111\255\364\366\362\363\365],
+ qq[\175\112\113\114\115\116\117\120\121\122\271\373\374\371\372\377],
+ qq[\134\367\123\124\125\126\127\130\131\132\262\324\326\322\323\325],
+ qq[\060\061\062\063\064\065\066\067\070\071\263\333\334\331\332\237];
my $ebcdic_string = $ascii_string;
+
eval '$ebcdic_string =~ tr/\000-\377/' . $cp_037 . '/';
To convert from EBCDIC 037 to ASCII just reverse the order of the tr///
shell utility from within perl would be to:
# OS/390 or z/OS example
- $ascii_data = `echo '$ebcdic_data'| iconv -f IBM-1047 -t ISO8859-1`
+ my $ascii_data = `echo '$ebcdic_data'| iconv -f IBM-1047 -t ISO8859-1`
or the inverse map:
# OS/390 or z/OS example
- $ebcdic_data = `echo '$ascii_data'| iconv -f ISO8859-1 -t IBM-1047`
+ my $ebcdic_data = `echo '$ascii_data'| iconv -f ISO8859-1 -t IBM-1047`
For other perl based conversion options see the Convert::* modules on CPAN.
will have twenty six elements on either an EBCDIC machine
or an ASCII machine:
- @alphabet = ('A'..'Z'); # $#alphabet == 25
+ my @alphabet = ( 'A'..'Z' ); # $#alphabet == 25
The bitwise operators such as & ^ | may return different results
when operating on string or character data in a perl program running
an example adapted from the one in L<perlop>:
# EBCDIC-based examples
- print "j p \n" ^ " a h"; # prints "JAPH\n"
- print "JA" | " ph\n"; # prints "japh\n"
- print "JAPH\nJunk" & "\277\277\277\277\277"; # prints "japh\n";
- print 'p N$' ^ " E<H\n"; # prints "Perl\n";
+ print "j p \n" ^ " a h"; # prints "JAPH\n"
+ print "JA" | " ph\n"; # prints "japh\n"
+ print "JAPH\nJunk" & "\277\277\277\277\277"; # prints "japh\n"
+ print 'p N$' ^ " E<H\n"; # prints "Perl\n"
An interesting property of the 32 C0 control characters
in the ASCII table is that they can "literally" be constructed
chr() must be given an EBCDIC code number argument to yield a desired
character return value on an EBCDIC machine. For example:
- $CAPITAL_LETTER_A = chr(193);
+ my $CAPITAL_LETTER_A = chr(193);
=item ord()
ord() will return EBCDIC code number values on an EBCDIC machine.
For example:
- $the_number_193 = ord("A");
+ my $the_number_193 = ord("A");
=item pack()
The c and C templates for pack() are dependent upon character set
encoding. Examples of usage on EBCDIC include:
+ my $foo;
$foo = pack("CCCC",193,194,195,196);
# $foo eq "ABCD"
- $foo = pack("C4",193,194,195,196);
+ $foo = pack("C4", 193,194,195,196);
# same thing
$foo = pack("ccxxcc",193,194,195,196);
See the discussion of printf() above. An example of the use
of sprintf would be:
- $CAPITAL_LETTER_A = sprintf("%c",193);
+ my $CAPITAL_LETTER_A = sprintf("%c",193);
=item unpack()
sub Is_c0 {
my $char = substr(shift,0,1);
- if (ord('^')==94) { # ascii
+ if ( ord('^') == 94 ) { # ascii
return $char =~ /[\000-\037]/;
- }
- if (ord('^')==176) { # 37
+ }
+ if ( ord('^') == 176 ) { # 37
return $char =~ /[\000-\003\067\055-\057\026\005\045\013-\023\074\075\062\046\030\031\077\047\034-\037]/;
}
- if (ord('^')==95 || ord('^')==106) { # 1047 || posix-bc
+ if ( ord('^') == 95 || ord('^') == 106 ) { # 1047 || posix-bc
return $char =~ /[\000-\003\067\055-\057\026\005\025\013-\023\074\075\062\046\030\031\077\047\034-\037]/;
}
}
sub Is_delete {
my $char = substr(shift,0,1);
- if (ord('^')==94) { # ascii
+ if ( ord('^') == 94 ) { # ascii
return $char eq "\177";
- }
- else { # ebcdic
+ } else { # ebcdic
return $char eq "\007";
}
}
sub Is_c1 {
my $char = substr(shift,0,1);
- if (ord('^')==94) { # ascii
+ if ( ord('^') == 94 ) { # ascii
return $char =~ /[\200-\237]/;
}
- if (ord('^')==176) { # 37
+ if ( ord('^') == 176 ) { # 37
return $char =~ /[\040-\044\025\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\040\024\076\377]/;
}
- if (ord('^')==95) { # 1047
+ if ( ord('^') == 95 ) { # 1047
return $char =~ /[\040-\045\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\040\024\076\377]/;
}
- if (ord('^')==106) { # posix-bc
- return $char =~
+ if ( ord('^') == 106 ) { # posix-bc
+ return $char =~
/[\040-\045\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\040\024\076\137]/;
}
}
sub Is_latin_1 {
my $char = substr(shift,0,1);
- if (ord('^')==94) { # ascii
+ if ( ord('^') == 94 ) { # ascii
return $char =~ /[\240-\377]/;
}
- if (ord('^')==176) { # 37
- return $char =~
+ if ( ord('^') == 176 ) { # 37
+ return $char =~
/[\101\252\112\261\237\262\152\265\275\264\232\212\137\312\257\274\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\375\376\373\374\255\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\335\336\333\334\215\216\337]/;
}
- if (ord('^')==95) { # 1047
+ if ( ord('^') == 95 ) { # 1047
return $char =~
/[\101\252\112\261\237\262\152\265\273\264\232\212\260\312\257\274\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\375\376\373\374\272\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\335\336\333\334\215\216\337]/;
}
- if (ord('^')==106) { # posix-bc
- return $char =~
+ if ( ord('^') == 106 ) { # posix-bc
+ return $char =~
/[\101\252\260\261\237\262\320\265\171\264\232\212\272\312\257\241\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\340\376\335\374\255\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\300\336\333\334\215\216\337]/;
}
}
two letter abbreviation for a physician comes before the two letter
for drive, that is:
- @sorted = sort(qw(Dr. dr.)); # @sorted holds ('Dr.','dr.') on ASCII,
- # but ('dr.','Dr.') on EBCDIC
+ my @sorted = sort(qw(Dr. dr.)); # @sorted holds ('Dr.','dr.') on ASCII,
+ # but ('dr.','Dr.') on EBCDIC
The property of lower case before uppercase letters in EBCDIC is
even carried to the Latin 1 EBCDIC pages such as 0037 and 1047.
apply tr/[A-Z]/[a-z]/ before sorting. If the data are primarily UPPERCASE
and include Latin-1 characters then apply:
- tr/[a-z]/[A-Z]/;
+ tr/[a-z]/[A-Z]/;
tr/[àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþ]/[ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ]/;
- s/ß/SS/g;
+ s/ß/SS/g;
then sort(). Do note however that such Latin-1 manipulation does not
address the E<yuml> C<y WITH DIAERESIS> character that will remain at
where 7E is the hexadecimal ASCII code point for '~'. Here is an example
of decoding such a URL under CCSID 1047:
- $url = 'http://www.pvhp.com/%7Epvhp/';
+ my $url = 'http://www.pvhp.com/%7Epvhp/';
# this array assumes code page 1047
my @a2e_1047 = (
0, 1, 2, 3, 55, 45, 46, 47, 22, 5, 21, 11, 12, 13, 14, 15,
Conversely, here is a partial solution for the task of encoding such
a URL under the 1047 code page:
- $url = 'http://www.pvhp.com/~pvhp/';
+ my $url = 'http://www.pvhp.com/~pvhp/';
# this array assumes code page 1047
my @e2a_1047 = (
0, 1, 2, 3,156, 9,134,127,151,141,142, 11, 12, 13, 14, 15,
92,247, 83, 84, 85, 86, 87, 88, 89, 90,178,212,214,210,211,213,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57,179,219,220,217,218,159
);
- # The following regular expression does not address the
+ # The following regular expression does not address the
# mappings for: ('.' => '%2E', '/' => '%2F', ':' => '%3A')
$url =~ s/([\t "#%&\(\),;<=>\?\@\[\\\]^`{|}~])/sprintf("%%%02X",$e2a_1047[ord($1)])/ge;
characters equivalent to their ASCII counterparts. For example, the
following will print "Yes indeed\n" on either an ASCII or EBCDIC computer:
- $all_byte_chrs = '';
- for (0..255) { $all_byte_chrs .= chr($_); }
- $uuencode_byte_chrs = pack('u', $all_byte_chrs);
- ($uu = <<' ENDOFHEREDOC') =~ s/^\s*//gm;
+ my $all_byte_chrs = '';
+
+ $all_byte_chrs .= chr($_) foreach 0 .. 255;
+
+ my $uuencode_byte_chrs = pack('u', $all_byte_chrs);
+
+ (my $uu = <<' ENDOFHEREDOC') =~ s/^\s*//gm;
M``$"`P0%!@<("0H+#`T.#Q`1$A,4%187&!D:&QP='A\@(2(C)"4F)R@I*BLL
M+2XO,#$R,S0U-C<X.3H[/#T^/T!!0D-$149'2$E*2TQ-3D]045)35%565UA9
M6EM<75Y?8&%B8V1E9F=H:6IK;&UN;W!Q<G-T=79W>'EZ>WQ]?G^`@8*#A(6&
MM+6VM[BYNKN\O;Z_P,'"P\3%QL?(R<K+S,W.S]#1TM/4U=;7V-G:V]S=WM_@
?X>+CY.7FY^CIZNOL[>[O\/'R\_3U]O?X^?K[_/W^_P``
ENDOFHEREDOC
- if ($uuencode_byte_chrs eq $uu) {
+ if ( $uuencode_byte_chrs eq $uu ) {
print "Yes ";
}
$uudecode_byte_chrs = unpack('u', $uuencode_byte_chrs);
- if ($uudecode_byte_chrs eq $all_byte_chrs) {
+ if ( $uudecode_byte_chrs eq $all_byte_chrs ) {
print "indeed\n";
}
Here is a very spartan uudecoder that will work on EBCDIC provided
that the @e2a array is filled in appropriately:
- #!/usr/local/bin/perl
- @e2a = ( # this must be filled in
- );
- $_ = <> until ($mode,$file) = /^begin\s*(\d*)\s*(\S*)/;
+ #!/usr/bin/perl
+ my @e2a = (
+ # this must be filled in
+ );
+ $_ = <> until my($mode,$file) = /^begin\s*(\d*)\s*(\S*)/;
open(OUT, "> $file") if $file ne "";
while(<>) {
last if /^end/;
the printable set using:
# This QP encoder works on ASCII only
- $qp_string =~ s/([=\x00-\x1F\x80-\xFF])/sprintf("=%02X",ord($1))/ge;
+ my $qp_string =~ s/([=\x00-\x1F\x80-\xFF])/sprintf("=%02X",ord($1))/ge;
Whereas a QP encoder that works on both ASCII and EBCDIC machines
would look somewhat like the following (where the EBCDIC branch @e2a
if (ord('A') == 65) { # ASCII
$delete = "\x7F"; # ASCII
@e2a = (0 .. 255) # ASCII to ASCII identity map
- }
- else { # EBCDIC
+
+ } else { # EBCDIC
$delete = "\x07"; # EBCDIC
- @e2a = # EBCDIC to ASCII map (as shown above)
+ @e2a = (
+ # EBCDIC to ASCII map (as shown above)
+ );
}
- $qp_string =~
+ my $qp_string =~
s/([^ !"\#\$%&'()*+,\-.\/0-9:;<>?\@A-Z[\\\]^_`a-z{|}~$delete])/sprintf("=%02X",$e2a[ord($1)])/ge;
(although in production code the substitutions might be done
#!/usr/local/bin/perl
- while(<>){
+ while ( <> ) {
tr/n-za-mN-ZA-M/a-zA-Z/;
print;
}
In one-liner form:
- perl -ne 'tr/n-za-mN-ZA-M/a-zA-Z/;print'
+ perl -pe 'tr/n-za-mN-ZA-M/a-zA-Z/'
=head1 Hashing order and checksums
Joe Smith. Trademarks, registered trademarks, service marks and
registered service marks used in this document are the property of
their respective owners.
-
-