Larry Wall [Mon, 15 Oct 1990 23:07:21 +0000]
perl 3.0 patch #36 patch #29, continued
See patch #29.
Larry Wall [Mon, 15 Oct 1990 23:05:15 +0000]
perl 3.0 patch #35 patch #29, continued
See patch #29.
Larry Wall [Mon, 15 Oct 1990 23:06:41 +0000]
perl 3.0 patch #34 patch #29, continued
See patch #29.
Larry Wall [Mon, 15 Oct 1990 23:03:11 +0000]
perl 3.0 patch #33 patch #29, continued
See patch #29.
Larry Wall [Tue, 16 Oct 1990 02:28:17 +0000]
perl 3.0 patch #32 patch #29, continued
See patch #29.
Larry Wall [Tue, 16 Oct 1990 02:30:59 +0000]
perl 3.0 patch #31 patch #29, continued
See patch #29.
Larry Wall [Mon, 15 Oct 1990 23:06:25 +0000]
perl 3.0 patch #30 patch #29, continued
See patch #29.
Larry Wall [Mon, 15 Oct 1990 23:06:10 +0000]
perl 3.0 patch #29 (combined patch)
This set of patches pretty much brings you up to the functionality
that version 4.0 will have. The Perl Book documents version 4.0.
Perhaps these should be called release notes... :-)
Enhancements:
Many of the changes relate to making the debugger work better.
It now runs your scripts at nearly full speed because it no longer
calls a subroutine on every statement. The debugger now doesn't
get confused about packages, evals and other filenames. More
variables (though still not all) are available within the debugger.
Related to this is the fact that every statement now knows which
package and filename it was compiled in, so package semantics are
now much more straightforward. Every variable also knows which
package it was compiled in. So many places that used to print
out just the variable name now prefix the variable name with the
package name. Notably, if you print *foo it now gives *package'foo.
Along with these, there is now a "caller" function which returns
the context of the current subroutine call. See the man page for
more details.
Chip Salzenberg sent the patches for System V IPC (msg, sem and shm)
so I dropped them in.
There was no way to wait for a specific pid, which was silly, since
Perl was already keeping track of the information. So I added
the waitpid() call, which uses Unix's wait4() or waitpid() if
available, and otherwise emulates them (at least as far as letting
you wait for a particular pid--it doesn't emulate non-blocking wait).
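As a rough illustration of waiting for one specific child, the sketch below forks two children and collects only the second. It is written for a current perl on a Unix-like system. The spawn_child helper, the exit codes and the use of POSIX::_exit are choices made for this example, not part of the patch.

```perl
use strict;
use warnings;
use POSIX ();

# Fork a child that exits at once with the given status (hypothetical helper).
sub spawn_child {
    my ($status) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    POSIX::_exit($status) if $pid == 0;   # child: leave without running END blocks
    return $pid;
}

my $first  = spawn_child(1);
my $second = spawn_child(3);

# Wait for the second child specifically, even if the first finishes sooner.
my $reaped      = waitpid($second, 0);
my $exit_status = $? >> 8;
print "reaped pid $reaped with status $exit_status\n";

waitpid($first, 0);   # tidy up the other child
```

The second argument of 0 asks for a blocking wait, which is the only mode the emulation described above is said to cover.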
For use in sorting routines, there are now two new operators,
cmp and <=>. These do string and numeric comparison, returning
-1, 0 or 1 when the first argument is less than, equal to or
greater than the second argument.
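A small sketch of the two operators in a sort follows. It uses the inline sort block of newer perls rather than a named sort subroutine, and the sample data is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the patch notes.
my @numbers = (10, 9, 100, 2);
my @words   = ('pear', 'apple', 'fig');

# <=> compares numerically; cmp compares as strings.
my @by_number    = sort { $a <=> $b } @numbers;   # 2 9 10 100
my @by_string    = sort { $a cmp $b } @numbers;   # 10 100 2 9
my @sorted_words = sort { $a cmp $b } @words;     # apple fig pear

print "@by_number\n@by_string\n@sorted_words\n";
```

Sorting the same numbers both ways shows why the distinction matters: as strings, "100" sorts before "2".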
Occasionally one finds that one wants to evaluate an operator in
a scalar context, even though it's part of a LIST. For this purpose,
there is now a scalar() operator. For instance, the approved
fix for the novice error of using <> in assigning to a local is now:
local($var) = scalar(<STDIN>);
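To see what the scalar() wrapper changes, here is a sketch that reads from an in-memory filehandle standing in for STDIN. It assumes a current perl; the variable names and the sample input are invented for the example.

```perl
use strict;
use warnings;

my $input = "first\nsecond\nthird\n";   # hypothetical stand-in for STDIN

# Without scalar(): the list assignment reads every line and keeps only the first.
open(my $in, '<', \$input) or die "cannot open in-memory file: $!";
my ($list_ctx) = <$in>;
my $anything_left = eof($in) ? 0 : 1;
close($in);

# With scalar(): exactly one line is consumed.
open($in, '<', \$input) or die "cannot open in-memory file: $!";
my ($scalar_ctx) = scalar(<$in>);
my $next = <$in>;
close($in);

print "list context left input behind: $anything_left\n";
print "next line after scalar(): $next";
```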
Perl's ordinary I/O is done using standard I/O routines. Every
now and then this gets in your way. You may now access the system
calls read() and write() via the Perl functions sysread() and
syswrite(). They should not be intermixed with ordinary I/O calls
unless you know what you're doing.
Along with this, both the sysread() and read() functions allow you
an optional 4th argument giving an offset into the string you're
reading into, so for instance you can easily finish up partial reads.
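One way to use the offset argument to finish up partial reads is sketched below. It assumes a current perl with File::Temp available; the 4-byte chunk size and the payload are arbitrary choices for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small hypothetical payload to a scratch file.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} 'abcdefghij';
close($out) or die "close failed: $!";

open(my $in, '<', $path) or die "cannot open $path: $!";
binmode($in);

my $buf = '';
while (1) {
    # Read at most 4 bytes, placing them at the current end of $buf.
    my $got = sysread($in, $buf, 4, length($buf));
    die "sysread failed: $!" unless defined $got;
    last if $got == 0;   # end of file
}
close($in);
print "$buf\n";
```

Because each call appends at length($buf), short reads simply accumulate until the file is exhausted.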
As a bit of syntactic sugar, you can now use the file tests -M, -A
and -C to determine the age of a file in (possibly fractional) days
as of the time the script started running. This makes it much
easier to write midnight cleanup scripts with precision.
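A cleanup test might look roughly like the sketch below. It assumes a current perl with File::Temp, and it backdates a scratch file with utime so the result is predictable; the one-day threshold is only an example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
close($fh);

# Pretend the file was last modified two days before the script started.
# $^T holds the script start time, which is what -M measures against.
my $two_days_ago = $^T - 2 * 24 * 60 * 60;
utime($two_days_ago, $two_days_ago, $path) or die "utime failed: $!";

my $age_days = -M $path;   # age in (possibly fractional) days
my $is_stale = $age_days > 1 ? 1 : 0;
printf "%s is %.2f days old%s\n", $path, $age_days, $is_stale ? ' (stale)' : '';
```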
The index() and rindex() functions now have an optional 3rd argument
which tells it where to start looking, so you can now iterate through
a string using these functions.
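Iterating through a string with the new argument can be sketched as follows, assuming a current perl. The input string and the search term are made up.

```perl
use strict;
use warnings;

my $text   = 'abracadabra';   # hypothetical input
my $needle = 'abra';

my @positions;
my $pos = index($text, $needle);
while ($pos >= 0) {
    push @positions, $pos;
    # Resume the search one character past the previous hit.
    $pos = index($text, $needle, $pos + 1);
}
print "@positions\n";   # 0 7
```

Restarting at $pos + 1 finds overlapping matches as well; restarting at $pos + length($needle) would skip them.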
The substr() function's 3rd argument is now optional, and if omitted,
the function returns everything to the end of the string.
The tr/// translation function now understands c, d and s options, just
like the tr program. (Well, almost just like. The d option only
deletes characters that aren't in the replacement string.) The
c option complements the character class to match, and the s option squishes
out multiple occurrences of any replacement class characters.
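The three options can be tried out with a sketch like the one below. It reflects the behaviour of a current perl, which may differ in detail from the 3.0 implementation, and the sample string is invented.

```perl
use strict;
use warnings;

my $text = 'hello,  world!!';   # hypothetical input

# d: delete the listed characters that have no replacement.
(my $no_punct = $text) =~ tr/,!//d;      # "hello  world"

# c with d: complement the class, so everything that is NOT a-z is deleted.
(my $letters = $text) =~ tr/a-z//cd;     # "helloworld"

# s: squeeze runs of the same translated character down to one.
(my $squeezed = $text) =~ tr/ !//s;      # "hello, world!"

print "$no_punct\n$letters\n$squeezed\n";
```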
The reverse function, used in a scalar context, now reverses its
scalar argument as a string.
Dale Worley posted a patch to add @###.## type fields to formats.
I said, "Neat!" and dropped it in, lock, stock and sinker.
Kai Uwe Rommel sent a bunch of MSDOS and OS/2 updates, which I (mostly)
incorporated. I can't vouch for them, but they look okay.
Any data stored after the __END__ marker can now be accessed via
the DATA filehandle, which is automatically opened onto the script
at that point. (Well, actually, it's just kept open, since it
was already open to read the script.)
The taintperl program now checks for world writable PATH components,
and complains if any are found (if PATH is used).
Bug fixes:
It used to be that you could get core dumps by such means as
@$foo=();
@foo[42];
(1,2,3)[42];
$#foo = 50;
foreach $elem (@foo) {
$elem = 1;
}
This is no longer so. (For those who are up on Perl internals, the
stack policy no longer allows Nullstr--all undefined values must
be passed as &str_undef.)
If you say something like
local($foo,$bar);
or
local($initialized,$foo,$bar) = ('one value');
$foo and $bar are now initialized to the undefined value, rather
than the defined null string.
Array assignment to special arrays is now better supported. For
instance, @ENV = () clears the environment, and %foo = () will
now clear any dbm file bound to %foo.
On the subject of dbm files, the highly visible bugs at patchlevel
28 have been fixed. You can now open dbm files readonly, and you
don't have to do a dummy assignment to make the cache allocate itself.
The modulus operator wasn't working right on negative values because
of a misplaced cast. For instance, -5 % 5 was returning
the value 5, which is clearly wrong.
Certain operations coredumped if you didn't supply a value:
close;
eof;
Previously, if the subroutine supplied for a sort operation didn't
exist, it failed quietly. Now it produces a fatal error.
The bitwise complement operator ~ didn't work on vec() strings longer
than one byte because of failure to increment a loop variable.
The oct and hex functions returned a negative result if the highest
bit was set. They now return an unsigned result, which seems a
little less confusing. Likewise, the token 0x80000000 also produces
an unsigned value now.
Some machines didn't like to see 0x87654321 in an #ifdef because
they think of the symbols as signed. The tests have been changed
to just look at the lower 4 nybbles of the value, which is sufficient
to determine endianness, at least as far as the #ifdefs are concerned.
The unshift operator did not return the documented value, which
was the number of elements in the new array. Instead it returned
the last unshifted argument, more or less by accident.
-w sometimes printed spurious warnings about ARGV and ENV when
referencing the arrays indirectly through shift or exec. This
was because the typo test was misplaced before the code that
exempts special variables from the typo test.
If you said 'require "./foo.pl"', it would look in someplace like
/usr/local/lib/perl/./foo.pl instead of the current directory. This
works more like people expect now. The require error messages also
referred to the wrong file, if they worked at all.
The h2ph program didn't translate includes right--it should have
changed .h to .ph.
Patterns with multiple short literal strings sometimes failed.
This was a problem with the code that looks for a maximal literal
string to feed to the Boyer-Moore searching routine. The code
was gluing together literal strings that weren't continuous.
The $* variable controls multi-line pattern matching. When it's
0, patterns are supposed to match as if the string contained a
single line. Unfortunately, /^pat/ occasionally matched in the middle
of the string under certain conditions.
Recently the regular expression routines were upgraded to do
{n,m} more efficiently. In doing this, however, I manufactured
a couple of bugs: /.{n,m}$/ could match with fewer than n characters
remaining on the line, and patterns like /\d{9}/ could match more
than 9 characters.
The undefined value has an actual physical location in Perl, and
pointers to it are passed around. By certain circuitous routes
it was possible to clobber the undefined value so that it
was no longer undefined--kind of like making /dev/null into
a real file. Hopefully this can't happen any more.
op.stat could fail if /bin/0 existed, because of a while (<*>) {...
This has been changed to a while (defined($_ = <*>)) {...
The length of a search pattern was limited by the length of
tokenbuf internally. This restriction has been removed.
The null character gave the tokener indigestion when used as
a delimiter for m// or s///.
There was a bunch of other cleanupish things that are too trivial
to mention here.
Larry Wall [Mon, 13 Aug 1990 09:45:26 +0000]
perl 3.0 patch #28 (combined patch)
Certain systems, notably Ultrix, set the close-on-exec flag
by default on dup'ed file descriptors. This is anti-social
when you're creating a new STDOUT. The flag is now forced
off for STDIN, STDOUT and STDERR.
Some yaccs report 29 shift/reduce conflicts and 59 reduce/reduce
conflicts, while other yaccs and bison report 27 and 61. The
Makefile now says to expect either thing. I'm not sure if there's
a bug lurking there somewhere.
The defined(@array) and defined(%array) ended up defining
the arrays they were trying to determine the status of. Oops.
Using the status of NSIG to determine whether <signal.h> had
been included didn't work right on Xenix. A fix seems to be
beyond Configure at the moment, so we've got some OS dependent
#ifdefs in there.
There were some syntax errors in the new code to determine whether
it is safe to emulate rename() with unlink/link/unlink. Obviously
heavily tested code... :-)
Patch 27 introduced the possibility of using identifiers as
unquoted strings, but the code to warn against the use of
totally lowercase identifiers looped infinitely.
I documented that you can't interpolate $) or $| in a pattern.
It was actually implied under s///, but it should have been
more explicit.
Patterns with {m} rather than {m,n} didn't work right.
Tests io.fs and op.stat had difficulties under AFS. They now
ignore the tests in question if they think they're running under
/afs.
The shift/reduce expectation message was off for a2p's Makefile.
Larry Wall [Wed, 8 Aug 1990 17:07:27 +0000]
perl 3.0 patch #27 patch #19, continued
See patch #19.
Larry Wall [Wed, 8 Aug 1990 17:06:25 +0000]
perl 3.0 patch #26 patch #19, continued
See patch #19.
Larry Wall [Wed, 8 Aug 1990 17:07:07 +0000]
perl 3.0 patch #25 patch #19, continued
See patch #19.
Larry Wall [Wed, 8 Aug 1990 17:04:39 +0000]
perl 3.0 patch #24 patch #19, continued
See patch #19.
Larry Wall [Wed, 8 Aug 1990 17:06:03 +0000]
perl 3.0 patch #23 patch #19, continued
See patch #19.
Larry Wall [Wed, 8 Aug 1990 17:01:53 +0000]
perl 3.0 patch #22 patch #19, continued
See patch #19.
Larry Wall [Wed, 8 Aug 1990 17:07:00 +0000]
perl 3.0 patch #21 patch #19, continued
See patch #19.
Larry Wall [Wed, 8 Aug 1990 17:07:00 +0000]
perl 3.0 patch #20 patch #19, continued
See patch #19.
Larry Wall [Wed, 8 Aug 1990 17:02:14 +0000]
perl 3.0 patch #19 (combined patch)
You now have the capability of linking C subroutines into a
special version of perl. See the files in usub/ for an example.
There is now an operator to include library modules with duplicate
suppression and error checking, called "require". (makelib has been
renamed to h2ph, and Tom Christiansen's h2pl stuff has been included
too. Perl .h files are now called .ph files to avoid confusion.)
It's now possible to truncate files if your machine supports any
of ftruncate(fd, size), chsize(fd, size) or fcntl(fd, F_FREESP, size).
Added -c switch to do compilation only, that is, to suppress
execution. Useful in combination with -D1024.
There's now a -x switch to extract a script from the input stream
so you can pipe articles containing Perl scripts directly into perl.
Previously, the only places you could use bare words in Perl were as
filehandles or labels. You can now put bare words (identifiers)
anywhere. If they have no interpretation as filehandles or labels,
they will be treated as if they had single quotes around them.
This works together nicely with the fact that you can use a
symbol name indirectly as a filehandle or to assign to *name.
It basically means you can write subroutines and pass filehandles
without quoting or *-ing them. (It also means the grammar is even
more ambiguous now--59 reduce/reduce conflicts!!! But it seems
to do the Right Thing.)
Added __LINE__ and __FILE__ tokens to let you interpolate the
current line number or filename, such as in a call to an error
routine, or to help you translate eval linenumbers to real
linenumbers.
Added __END__ token to let you mark the end of the program in
the input stream. (^D and ^Z are allowed synonyms.) Program text
and data can now both come from STDIN.
`command` in array context now returns array of lines. Previously
it would return a single element array holding all the lines.
An empty %array now returns 0 in scalar context so that you can
use it profitably in a conditional: &blurfl if %seen;
The include search path (@INC) now includes . explicitly at the
end, so you can change it if you wish. Library routines now
have precedence by default.
Several pattern matching optimizations: I sped up /x+y/ patterns
greatly by not retrying on every x, and disabled backoff on
patterns anchored to the end like /\s+$/. This made /\s+$/ run
100 times faster on a string containing 70 spaces followed by an X.
Actual improvements will generally be less than that. I also
sped up {m,n} on simple items by making it a variant of *.
And /.*whatever/ is now optimized to /^.*whatever/ to avoid
retrying at every position in the event of failure. I fixed
character classes to allow backslashing hyphen, by popular
request.
In the past, $ in a pattern would sometimes match in the middle
of the string and sometimes not, if $* == 0. Now it will never
match except at the end of the string, or just before a terminating
newline. When $* == 1 behavior is as before.
In the README file, I've expanded on just how I think the GNU
General Public License applies to Perl and to things you might
want to do with Perl.
The interpreter used to set the global variable "line" to be
the current line number. Instead, it now sets a global pointer
to the current Perl statement, which is no more overhead, but
now we will have access to the file name and package name associated
with that statement, so that the debugger can soon be upgraded to
allow debugging of evals and packages.
In the past, a conditional construct in an array context passed
the array context on to the conditional expression, causing
general consternation and confusion. Conditionals now always
supply a scalar context to the expression, and if that expression
turns out to be the one whose value is returned, the value is
coerced to an array value of one element.
The switch optimizer was confused by negative fractional values,
and truncated them in the wrong direction.
Configure now checks for chsize, select and truncate functions, and
now asks if you want to put scripts into some separate directory
from your binaries. More and more people are establishing a common
directory across architectures for scripts, so this is getting
important.
It used to be that a numeric literal ended up being stored both
as a string and as a double. This could make for lots of wasted
storage if you said things like "$seen{$key} = 1;". So
numeric literals are now stored only in floating point format,
which saves space, and generates at most one extra conversion per
literal over the life of the script.
The % operator had an off-by-one error if the left argument was
negative.
The pack and unpack functions have been upgraded. You
can now convert native float and double fields using f and d.
You can specify negative relative positions with X<n>, and absolute
positions in the record with @<n>. You can have a length of *
on the final field to indicate that it is to gobble all the rest
of the available fields. In unpack, if you precede a field
spec with %<n>, it does an n-bit checksum on it instead of the
value itself. (Thus "%16C*" will checksum just like the Sys V sum
program.) One totally wacked out evening I hacked a u format
in to pack and unpack uudecode-style strings.
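A few of these template features can be exercised as in the sketch below. It assumes a current perl, and the input strings and values are made up for the example.

```perl
use strict;
use warnings;

my $data = "Hello, world\n";   # hypothetical input

# %16 in front of a field asks unpack for a 16-bit checksum of that field.
my ($checksum) = unpack('%16C*', $data);

# Cross-check by summing the bytes by hand, modulo 2**16.
my $manual = 0;
$manual += $_ for unpack('C*', $data);
$manual %= 65536;

# d packs a native double; 1.5 is exactly representable, so it round-trips.
my ($round_trip) = unpack('d', pack('d', 1.5));

# * as the final length gobbles whatever is left.
my ($head, $rest) = unpack('a2 a*', 'abcdef');      # 'ab', 'cdef'

# @3 jumps to absolute offset 3; X2 backs up two bytes.
my ($at)            = unpack('@3 a2', 'abcdef');    # 'de'
my ($three, $again) = unpack('a3 X2 a2', 'abcdef'); # 'abc', 'bc'

print "$checksum $manual $round_trip $head $rest $at $three $again\n";
```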
A couple bugs were fixed in unpack--it couldn't unpack an A or a
format field in a scalar context, which is just supposed to
return the first field. The c and C formats were also calling
bcopy to copy each character. Yuck.
Machines without the setreuid() system call couldn't manipulate
$< and $> easily. Now, if you've got setuid(), you can say $< = $>
or $> = $< or even ($<, $>) = ($uid, $uid), as long as it's
something that can be done with setuid(). Similarly for setgid().
I've included various MSDOS and OS/2 patches that people have sent.
There's still more in the hopper...
An open on a pipe for output such as 'open(STDOUT,"|command")' left
STDOUT attached to the wrong file descriptor. This didn't matter
within Perl, but it made subprocesses expecting stdout to be on fd 1
rather irate.
The print command could fail to detect errors such as running out
of room on the disk. Now it checks a little better.
Saying "print @foo" might only print out some of the elements
if there were undefined elements in the middle of the array, due to
a reversed bit of logic in the print routine.
On machines with vfork the child process would allocate memory
in the parent without the parent knowing about it, or having any way
to free the memory so allocated. The parent now calls a cleanup
routine that knows whether that's what happened.
If the getsockname or getpeername functions returned a normal
Unix error, perl -w would report that you tried I/O on an
unopened socket, even though it was open.
MACH doesn't have seekdir or telldir. Who ever uses them anyway?
Under certain circumstances, an optimized pattern match could
pass a hint into the standard pattern matching routine which
the standard routine would then ignore. The next pattern match
after that would then get a "panic: hint in do_match" because the
hint didn't point into the current string of interest.
The $' variable returned a short string if it contained an
embedded null.
Two common split cases are now special-cased to avoid the regular
expression code. One is /\s+/ (and its cousin ' ', which also
trims leading whitespace). The other is /^/, which is very useful
for splitting a "here-is" quote into lines:
@lines = split(/^/, <<END);
Element 0
Element 1
Element 2
END
You couldn't split on a single case-insensitive letter because
the single character split optimization ignored the case folding
flag.
Sort now handles undefined strings right, and sorts lists
a little more efficiently because it weeds them out before
sorting so it doesn't have to check for them on every comparison.
The each() and keys() functions were returning garbage on null
keys in DBM files because the DBM iterator merely returns a pointer
into the buffer to a string that's not necessarily null terminated.
Internally, Perl keeps a null at the end of every string (though
allowing embedded nulls) and some routines make use of this
to avoid checking for the end of buffer on every comparison. So
this just needed to be treated as a special case.
The &, | and ^ operators will do bitwise operations on two strings,
but for some reason I hadn't implemented ~ to do a complement.
Using an associative array name with a % in dbmopen(%name...)
didn't work right, not because it didn't parse, but because the
dbm opening routine internally did the wrong thing with it.
You can now say dbmopen(name, 'filename', undef) to prevent it
from opening the dbm file if it doesn't exist.
The die operator simply exited if you didn't give an argument,
because that made sense before eval existed. But now it will be
equivalent to "die 'Died';".
Using the return function outside a subroutine returned a cryptic
message about not being able to pop a magical label off the stack.
It's now more informative.
On systems without the rename() system call, it's emulated with
unlink()/link()/unlink(), which could clobber a file if it
happened to unlink it before it linked it. Perl now checks to
make sure the source and destination filenames aren't in fact
the same directory entry.
The -s file test now returns size of file. Why not?
If you tried to write a general subroutine to open files, passing
in the filehandle as *filehandle, it didn't work because nobody
took responsibility to allocate the filehandle structure internally.
Now, passing *name to a subroutine forces filehandle and array
creation on that symbol if they're not already created.
Reading input via <HANDLE> is now a little more efficient--it
does one less string copy.
The dumpvar.pl routine now fixes weird chars to be printable, and
allows you to specify a list of variables to display. The debugger
takes advantage of this. The debugger also now allows \ continuation
lines, and has an = command to let you make aliases easily. Line
numbers should now be correct even after lines containing only
a semicolon.
The action code for parsing split; with no arguments didn't
pass a correct value of bufend to the scanpat it was
using to establish the /\s+/ pattern.
The $] variable returned the rcsid string and patchlevel. It still
returns that in a string context, but in a numeric context it
returns the version number (as in 4.0) + patchlevel / 1000.
So these patches are being applied to 3.018.
The variables $0, %ENV, @ARGV were retaining incorrect information
from the previous incarnation in dumped/undumped scripts.
The %ENV array is supposed to be global even inside packages, but
an off-by-one error kept it from being so.
The $| variable couldn't be set on a filehandle before the file
was opened. Now you can.
If errno == 0, the $! variable returned "Error 0" in a string
context, which is, unfortunately, a true string. It now returns ""
in string context if errno == 0, so you can use it reasonably in
a conditional without comparing it to 0: &cleanup if $!;
On some machines, conversion of a number to a string caused
a malloc string to be overrun by 1 character. More memory is
now allocated for such a string.
The tainting mechanism didn't work right on scripts that were setgid
but not setuid.
If you had a reference to an array such as @name in a program, but
didn't invoke any of the usual array operations, the array never
got initialized.
The FPS compiler doesn't do default in a switch very well if the
value can be interpreted as a signed character. There's now a
#ifdef BADSWITCH for such machines.
Certain combinations of backslashed backslashes weren't correctly
parsed inside double-quoted strings.
"Here" strings caused warnings about uninitialized variables because
the string used internally to accumulate the lines wasn't initialized
according to the standards of the -w switch.
The a2p translator couldn't parse {foo = (bar == 123)} due to
a hangover from the old awk syntax. It also needed to put a
chop into a program if the program referenced NF so that the
field count would come out right when the split was done.
There was a missing semicolon when local($_) was emitted.
I also didn't realize that an explicit awk split on ' ' trims
leading whitespace just like the implicit split at the beginning
of the loop. The awk for..in loop has to be translated in one
of two ways in a2p, depending on whether the array was produced
by a split or by subscripting. If the array was a normal array,
a2p put out code that iterated over the array values rather than
the numeric indexes, which was wrong.
The s2p didn't translate \n correctly, stripping the backslash.
Larry Wall [Tue, 27 Mar 1990 04:46:23 +0000]
perl 3.0 patch #18 patch #16, continued
See patch #16.
Larry Wall [Tue, 27 Mar 1990 04:26:14 +0000]
perl 3.0 patch #17 patch #16, continued
See patch #16.
Larry Wall [Tue, 27 Mar 1990 20:20:03 +0000]
perl 3.0 patch #16 (combined patch)
There is now support for compiling perl under the Microsoft C
compiler on MSDOS. Special thanks go to Diomidis Spinellis
<dds@cc.ic.ac.uk> for this. To compile under MSDOS, look at the
readme file in the msdos subdirectory.
As a part of this, six files will be renamed when you run
Configure. These are config.h.SH, perl.man.[1-4] and t/op.subst.
Suns (and perhaps other machines) can't cast negative floating
point numbers to unsigned ints reasonably. Configure now detects
this and takes appropriate action.
Configure looked for optional libraries but then didn't ever use
them, even if there was no config.sh value to override.
System V Release 4 provides us with yet another nm format for
Configure to parse. No doubt it's "better". Sigh.
MIPS CPUs running under Ultrix were getting configured for volatile
support, but they don't like volatile when applied to a type generated
by a typedef. Configure now tests for this.
I've added two new perl library routines: ctime.pl from
Waldemar Kebsch and Marion Hakanson, and syslog.pl from Tom
Christiansen and me.
In subroutines, non-terminal blocks should never have arrays
requested of them, even if the subroutine call's context is
looking for an array.
Formats didn't work inside eval. Now they do.
Any $foo++ that doesn't return a value is now optimized to ++$foo
since the latter doesn't require generation of a temporary to hold
the old value.
A self-referential printf pattern such as sprintf($s,...,$s,...)
would end up with a null as the first character of the next field.
On machines that don't support executing scripts in the kernel,
perl has to emulate that when an exec fails. In this case,
the do_exec() routine can lose arguments passed to the script.
A memory leakage in pattern matching triggered by use of $`, $& or $'
has been fixed.
A splice that pulls up the front of an array such as splice(@array,0,$n)
can cause a duplicate free error.
The grep operator blew up on undefined array values. It now handles
them reasonably, setting $_ to undef.
The .. operator in an array context is used to generate number
ranges. This has been generalized to allow any string ranges that
can be generated with the magical increment code of ++. So
you can say 'a' .. 'f', '000'..'999', etc.
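The generalized ranges can be tried with a sketch like this one. It shows what a current perl produces; the particular endpoints are just examples.

```perl
use strict;
use warnings;

# Letters step through the alphabet.
my @letters = ('a' .. 'f');      # a b c d e f

# A leading zero keeps the range in string mode, so the width is preserved.
my @padded = ('08' .. '11');     # 08 09 10 11

# The magical increment carries, just as ++ does on 'z'.
my @carried = ('x' .. 'ab');     # x y z aa ab

print "@letters\n@padded\n@carried\n";
```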
The ioctl function didn't return non-zero values correctly.
Associative array slices from dbm files like @dbmvalues{'foo','bar'}
could use the same cache entry for multiple values, causing loss of
some of the values of the slice. Cache values are now not flushed
until the end of a statement.
The do FILE operator blew up when used inside an eval, due to trying
to free the eval code it was still executing.
If you did s/^prefix// on a string, and subsequently assigned a
value that didn't contain a string value to the string, you could
get a bad free error.
One of the taint checks blew up on undefined array elements, which
showed up only when taintperl was run.
The final semicolon in a program is supposed to be optional now.
Unfortunately this wasn't true when -p or -n added extra code
around your code. Now it's true all the time.
A tail anchored pattern such as /foo$/ could cause grief if you
searched a string that was shorter than that.
Larry Wall [Tue, 13 Mar 1990 23:33:04 +0000]
perl 3.0 patch #15 (combined patch)
In patch 13, there was a fix to make the VAR=value construct
in a command force interpretation by the shell. This was botched,
causing an argv list to be occasionally allocated with too small
a size. This problem is hidden on some machines because of
BSD malloc's semantics.
The lib/dumpvar.pl file was missing the final 1; which made it
difficult to tell if it loaded right.
The lib/termcap.pl Tgetent subroutine didn't interpret ^x right
due to a missing ord().
In the section of the man page that gives hints for C programmers,
it falsely declared that you can't subscript array values. As of
patch 13, this statement is "inoperative".
The t/op.sleep test assumed that a sleep of 2 seconds would always
return a value of 2 seconds slept. Depending on the load and
the whimsey of the scheduler, it could actually sleep longer than
2 seconds upon occasion. It now allows sleeps of up to 10 seconds.
Larry Wall [Mon, 12 Mar 1990 04:13:22 +0000]
perl 3.0 patch #14 patch #13, continued
See patch #13.
Larry Wall [Mon, 12 Mar 1990 04:09:28 +0000]
perl 3.0 patch #13 (combined patch)
I added the list slice operator: (LIST)[LIST]
$hexdigit = (0..9,'a','b','c','d','e','f')[$fourbits]
There was no way to cut stuff out of the middle of an array
or to insert stuff without copying the head and tail of the array,
which is gross. I added the splice operator to do this:
@oldelems = splice(@array,$offset,$len,LIST)
Equivalencies:
shift(@array)               splice(@array,0,1)
unshift(@array,$x,$y)       splice(@array,0,0,$x,$y)
pop(@array)                 splice(@array,-1,1)
push(@array,$x,$y)          splice(@array,$#array+1,0,$x,$y)
$array[$x] = $y             splice(@array,$x,1,$y)
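The splice forms listed above can be checked against the built-in operations they correspond to (shift, unshift, pop, push and element assignment). The sketch below does so on a current perl; the check helper and the sample values are invented for the example.

```perl
use strict;
use warnings;

my @results;   # one [name, same?] entry per equivalence

# Run a built-in on one copy and a splice call on another, then compare
# the resulting arrays (hypothetical helper).
sub check {
    my ($name, $builtin, $via_splice) = @_;
    my @one = (1 .. 5);
    my @two = (1 .. 5);
    $builtin->(\@one);
    $via_splice->(\@two);
    push @results, [ $name, ("@one" eq "@two") ? 1 : 0 ];
}

check('shift',   sub { shift @{ $_[0] } },
                 sub { splice(@{ $_[0] }, 0, 1) });
check('unshift', sub { unshift @{ $_[0] }, 'x', 'y' },
                 sub { splice(@{ $_[0] }, 0, 0, 'x', 'y') });
check('pop',     sub { pop @{ $_[0] } },
                 sub { splice(@{ $_[0] }, -1, 1) });
check('push',    sub { push @{ $_[0] }, 'x', 'y' },
                 sub { my $r = $_[0]; splice(@$r, $#$r + 1, 0, 'x', 'y') });
check('assign',  sub { $_[0][2] = 'z' },
                 sub { splice(@{ $_[0] }, 2, 1, 'z') });

printf "%-8s %s\n", $_->[0], ($_->[1] ? 'same' : 'DIFFERENT') for @results;
```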
Having -lPW as one of the libraries that Configure looks for
was causing lots of people grief. It was only there for
people using bison who otherwise don't have alloca(), so I
zapped it.
Some of the questions that supported the ~name syntax didn't
say so, and some that should have supported it didn't. Now they do.
If you selected the manp directory for your man pages, the manext
variable was left set to 'n'.
When Configure sees that the optional libraries have previously
been determined in config.sh, it now believes it rather than using
the list it generates.
In the test for byteorder, some compilers get indigestion on the
constant 0x0807060504030201. It's now split into two parts.
Some compilers don't like it if you put CCFLAGS after the .c file
on the command line. Some of the Configure tests did this.
On some systems, the test for vprintf() needs to have stdio.h
included in order to give valid results.
Some machines don't support the volatile declaration as applied
to a pointer. The Configure test now checks for this.
Also, cmd.c had some VOLATILE declarations on pointed-to items
rather than the pointers themselves, causing MIPS heartburn.
In Makefile.SH, some of the t*.c files needed to have dependencies
on perly.h. Additionally, some parallel makes can't handle a
dependency line with two targets, so the perly.h and perl.c lines
have been separated. Also, when perly.h is generated, it will
now have a declaration added to it for yylval--bison wasn't supplying
this.
The construct "while (s/x//) {}" was partially fixed in patch 9, but
there were still some weirdnesses about it. Hopefully these are
ironed out now.
If you did a switch structure based on numeric value, and there
was some action attached to when the variable is greater than
the maximum specified value, that action would not happen. Instead,
any action for values under the minimum value happened.
The debugger had some difficulties after patch 9, due to changes
in the meaning of @array in a scalar context, and because of
a pointer error in patch 9.
Because of the fix in patch 9 to let return () work right, the
construct "return (@array)" did counter-intuitive things. It
now returns an array value. "return @array" and "return (@array)"
now mean the same thing.
A pack of ascii strings could call str_ncat() with negative length
when the length of the string was greater than the length specified
for the field.
Patch 9 fixed *name values so that they wouldn't collide with ordinary
string values, but there were two places I missed, one in perldb,
and one in the sprintf code.
Perl looks at commands it is going to execute to see if it can
bypass /bin/sh and execute them directly. Ordinarily = is not
a shell metacharacter, but in a command like "system 'FOO=bar command'"
it indicates that /bin/sh should be used, since it's setting an
environment variable. It now does that (other than that construct,
the = character is still not a shell metacharacter).
If a runtime pattern to split happens to be null, it was being
interpreted as if it were a space, that is, as the awk-emulating
split. It now splits all characters apart, since that's more in
line with what people expect, and the other behavior wasn't documented.
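A short sketch of the two behaviors side by side, assuming a current perl. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $pat   = '';                       # a pattern known only at run time
my @chars = split(/$pat/, 'abc');     # null pattern: one field per character
my @words = split(' ', ' a b  c');    # the awk-emulating split, for contrast

print join('|', @chars), "\n";
print join('|', @words), "\n";
```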
Patch 9 added the reserved word "pipe". The scripts eg/g/gsh and
eg/scan/scanner used pipe as a filehandle since they were written
before the recommendation of upper-case filehandles was devised.
They now use PIPE.
The undef $/ command was supposed to let you slurp in an entire
binary file with one <>, but it didn't work as advertised.
Xenix systems have been having problems with Configure setting
up ndir right. Hopefully this will work better now, but it's
possible the changes will blow someone else up. Such is life...
The construct (LIST,) is now legal, so that you can say
@foo = (
1,
2,
3,
);
Various changes were made to the documentation.
In double quoted strings, you could say \0 to mean the null
character. In pattern matches, only \000 was allowed since
\0 was taken to be a \<digit> backreference. Since it doesn't
make sense to refer to the whole matched string before it's done,
there's no reason \0 can't mean null in a pattern too. So now
it does.
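For instance, a null-separated record can be matched and split directly. This is a sketch for a current perl, and the record contents are invented.

```perl
use strict;
use warnings;

my $record = "name\0value";           # two fields separated by a null byte

# \0 in the pattern means the null character, just as in the string above.
my $has_null = ($record =~ /\0/) ? 1 : 0;
my @fields   = split(/\0/, $record);

print "has null: $has_null, fields: @fields\n";
```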
You could modify a numeric variable by using substr as an lvalue,
and if you then reference the variable numerically, you'd get
the old number out rather than one derived from the new string.
Now the old number is invalidated on lvalued substr.
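A small sketch of the case being described, run against a current perl. The values are arbitrary.

```perl
use strict;
use warnings;

my $n   = 1234;
my $old = $n + 1;             # numeric use before the edit: 1235
substr($n, 0, 1) = '9';       # edit the string form, giving "9234"
my $new = $n + 1;             # must come from the new string: 9235

print "$old $new\n";
```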
The test t/op.mkdir should create directories 0777 rather than 0666.
As Randal requested, the last semicolon of a program is now optional.
Actually, he just asked for -e 'prog' to have that behaviour, but
it seemed reasonable to generalize it slightly. It's been that
way with eval for some time.
Larry Wall [Wed, 28 Feb 1990 21:56:43 +0000]
perl 3.0 patch #12 patch #9, continued
See patch #9.
Larry Wall [Wed, 28 Feb 1990 21:55:09 +0000]
perl 3.0 patch #11 patch #9, continued
See patch #9.
Larry Wall [Wed, 28 Feb 1990 21:54:46 +0000]
perl 3.0 patch #10 patch #9, continued
See patch #9.
Larry Wall [Wed, 28 Feb 1990 21:56:11 +0000]
perl 3.0 patch #9 (combined patch)
Well, I didn't quite fix 100 things--only 94. There are still
some other things to do, so don't think if I didn't fix your
favorite bug that your bug report is in the bit bucket. (It
may be, but don't think it. :-)
There are very few enhancements here. One is the new pipe()
function. There was just no way to emulate this using the
current operations, unless you happened to have socketpair()
on your system. Not even syscall() was useful in this respect.
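A minimal sketch of pipe(), assuming a current perl. To keep it short it writes and reads in a single process; a real program would normally fork and give one end to each process. READER and WRITER are just example filehandle names.

```perl
use strict;
use warnings;

pipe(READER, WRITER) or die "pipe failed: $!";

print WRITER "hello through the pipe\n";
close(WRITER);                 # flushes the buffer and lets the reader see EOF

my $line = <READER>;
close(READER);
print $line;
```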
Configure now determines whether volatile is supported, since
some compilers implement volatile but don't define __STDC__.
Some compilers can put structure members and global variables
into registers, so more variables had to be declared volatile
to avoid clobbering during longjmp().
Some systems have wanted routines stashed away in libBSD.a and
libPW.a. Configure can now find them.
A number of Configure tests create a file called "try" and then
execute it. Unfortunately, if there was a "try" elsewhere in PATH
it got that one instead. All references are now to "./try".
On Ultrix machines running the Mips cpu, some header files define
things differently for assembly language than for the C language.
To differentiate these, cc passes a -DLANGUAGE_C to the C preprocessor.
Unfortunately, Configure, makedepend and perl want to use the
preprocessor independently of cc. Configure now defaults to
adding -DLANGUAGE_C on machines containing that symbol in signal.h.
In Configure, some libraries were getting into the list more than
once, causing extra extraction overhead. The names are now
uniquified.
Someone has invented yet another output format for nm. Sigh.
Why do people assume that only people read the output of programs?
Due to commentary between a declaration and its semicolon, some
standard versions of stdio weren't being considered standard, and the
type of char used by stdio was being misidentified.
People trying to use bison instead of yacc ran into two problems.
One, lack of alloca(), is solved on some machines by finding libPW.a.
The other is that you have to supply a -y switch to bison to get
it to emulate yacc naming conventions. Configure now prompts
correctly for bison -y.
The make clean had a rm -f $suidperl where it just wanted
a rm -f suidperl.
In the README, documented more weirdities on various machines,
including a pointer to the JMPCLOBBER symbol.
In the construct
OUTER: foreach (1,2,3) {
INNER: foreach (4,5) {
...
next OUTER;
}
}
the inner loop was not getting reset to the first element. This
was one of those bugs that arise because longjmp() doesn't
execute exit handlers as it unwinds the stack.
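With the fix, every pass of the outer loop starts the inner loop again at its first element. A sketch for a current perl, recording which pairs are visited:

```perl
use strict;
use warnings;

my @visited;
OUTER: foreach my $i (1, 2, 3) {
    INNER: foreach my $j (4, 5) {
        push(@visited, "$i-$j");
        next OUTER;            # abandon the inner loop after its first element
    }
}
print "@visited\n";
```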
Perl reallocs many things as they grow, including the stack (its
stack, not the C program's stack). This means that routines
have to be careful to retrieve the new stack when they call
subroutines that can do such a realloc. In cmd.c there was
such code but it was hidden inside an #ifdef JMPCLOBBER that
it should have been outside of, so you could get bad return
values if JMPCLOBBER wasn't defined. If you defined JMPCLOBBER
to work around this problem, you should consider undefining
it if your compiler guarantees that register variables get the value
they had either at setjmp() or longjmp() time. Perl runs
slightly faster without JMPCLOBBER defined.
The longjmp()s that perl does return known values, but as a
paranoid programming measure, it now checks that the values
are one of the expected ones.
If you say something like
while (s/ /_/) {}
the substitution almost always succeeds (on normal text). There
is an optimization that quickly discovers and bypasses operations
that are going to fail, but does nothing to help generally successful
ones such as the one above. So there's a heuristic that disables
the optimization if it isn't buying us anything. Unfortunately,
in the above case, it's in the conditional of a while loop,
which is duplicated by another optimization to be a
last unless s/ /_/;
at the end of the loop, to avoid unnecessary subroutine calls.
Because the conditional was duplicated (not the expression itself,
just the structure pointing to it), the heuristic mentioned above
tried to disable the first optimization twice, resulting in the
label stack getting corrupted.
Some subroutines mix both return mechanisms, like this:
sub foo {
local($foo);
return $foo if $whatever;
$foo;
}
This clobbered the return value of $foo when the end of the scope
of the local($foo) was reached. This was because such a routine
turns into something like this internally:
sub foo {
_SUB_: {
local($foo);
if ($whatever) {
$foo; last _SUB_;
}
$foo;
}
}
Because the outer _SUB_ block was manufactured by non-standard
means, it wasn't getting marked as an expression that could
return a value, ie a terminal expression. So the return value
wasn't getting properly saved off to the side before the local()
exited.
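A sketch of the expected behavior on a current perl: both exits from the subroutine should yield the localized value, and the global should be restored afterwards. The initial values 'outer' and 'inner' are invented for the illustration.

```perl
use strict;
use warnings;

our $foo      = 'outer';
our $whatever = 0;

sub foo {
    local($foo) = 'inner';
    return $foo if $whatever;    # explicit return
    $foo;                        # value of the last expression
}

my $implicit = foo();            # falls off the end of the sub
$whatever = 1;
my $explicit = foo();            # leaves through return

print "$implicit $explicit $foo\n";
```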
The internal label on subroutine blocks used to be SUB, but I
changed it to _SUB_ to avoid possible confusion. Evals now have
labels too, so they are labelled with _EVAL_. The reason evals
now have a label is that nested evals need separate longjmp
environments, or fatal errors end up getting a longjmp() botch.
So eval now uses the same label stack as loops and subroutines.
The eval routine used to always return undef on failure. In an
array context, however, this makes a non-null array, which when
assigned is TRUE, which is counter-intuitive. It now returns
a null array upon failure in an array context.
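A sketch of why this matters, assuming a current perl and a deliberately failing eval string:

```perl
use strict;
use warnings;

my @result = eval 'die "deliberate failure\n"';    # fails at run time
my $count  = @result;                              # number of elements returned

my $taken = 0;
if (@result = eval 'die "deliberate failure\n"') {
    $taken = 1;                # not reached: the assigned array is null
}

print "elements: $count, branch taken: $taken, error: $@";
```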
When a foreach operator works on a non-array, the compiler translates
foreach (1,2,3) {
into something like
@_GEN_0 = (1,2,3); foreach (@_GEN_0) {
Unfortunately, the line number was not correctly propagated to both
command structures, so huge line numbers could appear in error
messages and while debugging.
The x operator was stupidly written, just calling the internal
routine str_scat() multiple times, and not preextending the
string to the known new length. It now preextends the string
and calls a special routine to replicate the string quickly.
On long strings like '\0' x 1024, the operator is more than
10 times faster.
The split operator is supposed to split into @_ if called in
a scalar context. Unfortunately, it was also splitting into @_
in an array context that wasn't a real array, such as assignment
to a list:
($foo,$bar) = split;
This has now been fixed.
The split and substitute operators have a check to make sure
that they aren't looping endlessly. Unfortunately, they had a hardwired
limit of 10000 iterations. There are applications conceivable
where you could work on longer values than that, so they
now calculate a reasonable limit based on the length of the arguments.
Pack and unpack called atoi all the time on the template fields.
Since there are usually at most one or two digits of number,
this wasted a lot of time on machines with slow subroutine calls.
It now picks up the number itself.
There were several places that casts could blow up. In particular,
it appears that a sun3 can't cast a negative float to an unsigned
integer. Appropriate measures have been taken--hopefully this
won't blow someone else up.
A local($.) didn't work right because the actual value of the
current line number is derived from the last input filehandle.
This has been fixed by causing the last input filehandle to
be restored after the scope of a local($.) to what it was when
the local was executed.
Assignment is supposed to return the final value of the left
hand side. In the case of array assignment (in an array context),
it was actually returning the right hand side. This showed up in
things that referred to the actual elements of an array value,
such as grep(s/foo/bar/, @abc = @xyz), which modified @xyz rather
than @abc.
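The corrected behavior can be checked like this on a current perl. The array contents are made up.

```perl
use strict;
use warnings;

my @xyz = ('foo1', 'foo2', 'other');
my @abc;

# The assignment yields the elements of @abc, so the substitution edits the copy.
my @changed = grep(s/foo/bar/, @abc = @xyz);

print "abc: @abc\n";
print "xyz: @xyz\n";
```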
The syscall() function was returning a garbage value (the index of
the top of the stack, actually) rather than the value of the system call.
There was some discussion about how to open files with arbitrary
characters in the filename. In particular, the open function strips
trailing spaces. There was no way to suppress this. Now you can
put an explicit null at the end of the string
open(FOO,"$filename\0")
and this will hide any spaces on the end of the filename. The Unix
open() function will of course treat the null as the trailing delimiter.
As a hangover from when Perl was not useful on binary files, there
was a check to make sure that the file being opened was a normal
file or character special file or socket. Now that Perl can
handle binary data, this is useless, and has been removed.
Some versions of utime.h have microseconds specified as acusec and
modusec. Perl was referring to these in order to zero out the
fields. But not everyone has these. Perl now just bzero's out
the structure and refers only to fields that everyone has.
You used to have to say
($foo) = unpack("L",$bar);
Now you can say
$foo = unpack("L",$bar);
and it will just unpack the first thing specified by the template.
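Both spellings side by side, as a sketch for a current perl. The packed value 42 is arbitrary.

```perl
use strict;
use warnings;

my $bar = pack("L", 42);

my ($list_form) = unpack("L", $bar);    # the form that was always required
my $foo         = unpack("L", $bar);    # scalar form: first item of the template

print "$list_form $foo\n";
```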
The subscripts for slices were ignoring the value of $[. (This
never made any difference for people who leave $[ set to 0.)
It seems reasonable that grep in a scalar context should return the
number of items matched so that it can be used in, say, a conditional.
Formerly it returned an undef.
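For example, on a current perl (the log lines are invented):

```perl
use strict;
use warnings;

my @lines  = ('error: disk', 'ok', 'error: net');
my $errors = grep(/^error/, @lines);     # scalar context: number of matches

my $report = 'clean';
if (grep(/^error/, @lines)) {            # usable directly as a conditional
    $report = "found $errors errors";
}
print "$report\n";
```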
Another problem with grep was that if you said something like
grep(/$1/, @foo)
then each iteration of grep was executing in the context of the
previous iteration's regexp, so $1 might be wiped out after the
first iteration. All iterations of grep now operate in the regexp
context of the grep operator itself.
The eg/README file now explicitly states that the examples in
the eg directory are to be considered in the Public Domain, and
thus do not have the same restrictions as the Perl source.
In a previous patch the shift operator was made to shift @_ inside
of subroutines. This made some of the getopt code wrong.
The sample rename command (and the new relink command) can either
take a list of filenames from stdin, or if stdin is a terminal,
default to a * in the current directory.
A sample travesty program is now included. If you want to know what
it does, feed it about 10 Usenet articles, or the perl manual, and
see what it prints out.
If a return operator was embedded in an expression that supplied
a scalar context, but the subroutine containing the return was
called in an array context, an array was not returned correctly.
Now it is.
The !~ operator used to ignore the negation in an array context and
do the same thing as =~. It now always returns scalar even in
array context, so if you say
($foo) = ($bar !~ /(pat)/)
$foo will get a value of either 1 or ''.
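A sketch of both outcomes on a current perl, using invented strings:

```perl
use strict;
use warnings;

my ($absent)  = ('no match here' !~ /(pat)/);   # pattern not found: 1
my ($present) = ('a pattern'     !~ /(pat)/);   # pattern found: ''

print "absent: '$absent', present: '$present'\n";
```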
Opens on pipes were defined to return the child's pid in the parent,
and FALSE in the child. Unfortunately, what the child actually
got was an undef, making it indistinguishable from a failure to
open the pipe successfully. The child now gets a 0, and undef
means a failure to fork a child.
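The three cases can be told apart as in this sketch, which assumes a current perl on a system with fork. KID is an example filehandle name.

```perl
use strict;
use warnings;

my $pid = open(KID, "-|");          # implicit fork; KID reads the child's STDOUT
die "cannot fork: $!" unless defined $pid;    # undef: no child was created

if ($pid == 0) {                    # 0: this is the child
    print "hello from the child\n";
    exit(0);
}

my $line = <KID>;                   # anything else: the parent, holding the child's pid
close(KID);
print "parent got: $line";
```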
Formerly, @array in a scalar context returned the last value of
the array, by analogy to the comma operator. This makes for
counter-intuitive results when you say
if (@array)
if 0 or '' is a legal array value. @array now returns the length
of the array (not the subscript of the last element, which is $#array).
To get the last element of the array you must either pop(@array) or
refer to $array[$#array].
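A sketch with an array whose elements are all false values, assuming a current perl:

```perl
use strict;
use warnings;

my @array  = (0, '');                # legal values, but both are false
my $length = @array;                 # scalar context: number of elements
my $last_i = $#array;                # subscript of the last element
my $last   = $array[$#array];        # the last element itself

my $nonempty = 0;
if (@array) { $nonempty = 1; }       # true because the array has elements

print "length $length, last subscript $last_i, nonempty $nonempty\n";
```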
The chdir operator with no argument was supposed to change directory
to your home directory, but it core dumped instead.
The wait operator was ignoring SIGINT and SIGQUIT, by analogy to
the system and pipe operations. But wait is a lower level operation,
and it gives you more freedom if those signals aren't automatically
ignored. If you want them ignored, you now have to explicitly
ignore them by setting the proper %SIG entry.
Different versions of /bin/mkdir and /bin/rmdir return different
messages upon failure. Perl now knows about more of them.
-l FILEHANDLE now disallowed
The use of the -l file test makes no sense on a filehandle, since
you can't open symbolic links. So -l FILEHANDLE now is a fatal
error. This also means you can't say -l _, which is also a
useless operation.
The heavy wizardry involved in saying $#foo -= 2 didn't work quite
right.
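What the construct is meant to do, sketched for a current perl:

```perl
use strict;
use warnings;

my @foo = (1, 2, 3, 4, 5);
$#foo -= 2;                  # lower the last subscript by two, dropping 4 and 5

print "@foo\n";
```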
In formats, you can say ... in a ^ field to have ... output when
there is more for that field that is getting truncated. The
next field was getting shifted over by three characters, however.
The perl library routines abbrev.pl, complete.pl, getopt.pl and
getopts.pl were assuming $[ == 0. The Getopt routine wasn't
returning an error on unrecognized switches. The look.pl routine
had never been tested, and didn't work at all. Now it does.
There were several difficulties in termcap.pl. Tgoto was documented
backwards for $rows and $cols. The Tgetent routine could loop
endlessly if there was a tc entry. And it didn't interpret the ^x
form of specifying control characters right because of base
treachery (031 instead of 31). There were also problems with
using @_ as a temporary array.
In perl.h, the unused VREG symbol was deleted because it conflicted
with somebody's header files.
If perl detects a #! line that specifies some other interpreter
than perl, it will now start up that interpreter for you. This
lets you specify a SHELL of perl to some programs.
The $/ variable specifies the input record separator. It was
possible to set it to a non-text character and read in an entire
text file as one input, but it wasn't possible to do that
for a binary file. Now you can undef $/, and there will be
no record separator, so you are guaranteed to get the entire
file with one <>.
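A sketch of slurping a file that contains binary data, assuming a current perl. It uses the File::Temp module only to get a scratch file; the data and the IN filehandle name are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $data = "line one\nline two\0\377binary tail\n";

# Write the sample data to a scratch file that is removed at exit.
my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);
print $out $data;
close($out);

open(IN, "<$path") or die "cannot open $path: $!";
binmode(IN);
my $whole;
{
    local($/);               # put the separator back when the block exits
    undef $/;                # no record separator at all
    $whole = <IN>;           # one read returns the entire file
}
close(IN);

print length($whole), " bytes read\n";
```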
The example in the manual of an open() inside a ?: had the
branches of the ?: backwards. I documented the fact that
grep can modify arrays in place (with caveats about modifying
literal values). I also put in how to deal with filenames
that might have arbitrary characters, and mentioned about the
problem of unflushed buffers on opens that cause forks.
It's now documented how to force top of page before the next write.
Formerly, $0 was guaranteed to contain the name of the perl script
only till the first regular expression was executed. It now
keeps that value permanently. $0 can no longer be used as a synonym
for $&.
The regular expression evaluator didn't handle character classes
with the 8th bit set. None of /[\200-\377]/, \d, \w or \s worked
right--the character class because signed characters were not
interpreted right, and the builtins because the isdigit(), isalpha()
and isspace() macros are only defined if isascii() is true.
Patterns of the form /\bfoo/i didn't work right because the \b
wants to compare the preceding character with the next one
to look for word boundaries, and the i modifier forced a move
of the string to a place where it couldn't do that without
examining malloc garbage.
The type glob syntax *foo produces the symbol table entry for
all the various foo variables. Perl has to do certain bookkeeping
when moving such values around. The symbol table entry was not
adequately differentiated from normal data to prevent occasional
confusion, however.
On MICROPORTs, the CRIPPLED_CC option made the stab_array()
and stab_hash() macros into function calls, but neglected to
supply the function definitions.
The string length allocated to turn a number into a string
internally turned out to be too short on a Sun 4.
Several constructs were not recognized properly inside double-quoted
strings:
underline in name
required @foo to be defined rather than %foo
threw off bracket matcher
not identified with $1
The base.term test gives misleading results if /dev/null happens
not to be a character special file. So it now checks for that.
The op.stat could exceed the shell's maximum argument length
when evaluating </usr/bin/*>. It now chdirs to /usr/bin and does <*>.
return grandfathered to never be function call
The construct
return (1,2,3);
did not do what was expected, since return was swallowing the
parens in order to consider itself a function. The solution,
since return never wants any trailing expression such as
return (1,2,3) + 2;
is to simply make return an exception to the paren-makes-a-function
rule, and treat it the way it always was, so that it doesn't
strip the parens.
If perldb.pl doesn't exist, there was no reasonable error message
given when you invoke perl -d. It now does a do-or-die internally.
null hereis core dumped
The hereis construct dumped core on a null string:
print <<'FOO';
FOO
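On a current perl the same construct simply yields an empty string, as this sketch shows:

```perl
use strict;
use warnings;

my $empty = <<'FOO';
FOO
my $len = length($empty);

print "length of null hereis: $len\n";
```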
Certain pattern matches weren't working on patterns with embedded
nulls because the fbminstr() routine, when it decided it couldn't
do a fancy search, degenerated to using instr(), rather than
ninstr(), which is better about embedded nulls.
The s2p sed-to-perl translator didn't translate \< and \> to \b.
Now it does.
The a2p awk-to-perl translator didn't put a $ on ExitValue when
translating the awk exit construct. It also didn't allow
logical expressions inside normal expressions:
i = ($1 == 2 || $2 ~ /bar/)
a2p.h had definition of a bzero() macro inside an ifdef of BCOPY.
The two don't always go together, and since Configure is already
looking for both separately...
Larry Wall [Thu, 21 Dec 1989 07:38:27 +0000]
perl 3.0 patch #8 patch 7 continued
See patch 7.
Larry Wall [Thu, 21 Dec 1989 07:38:16 +0000]
perl 3.0 patch #7 (combined patch)
The select operator didn't interpret bit vectors correctly on
non-little-endian machines such as Suns. Rather than bollux up
the rather straightforward interpretation of bit vectors, I made
the select operator rearrange the bytes as necessary. So it
is still true that vec($foo,0,1) refers to the first bit of the
first byte of string $foo, even on big-endian machines.
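A sketch of the bit-vector layout, assuming a current perl. Descriptors 0 and 9 are arbitrary examples of what one might put in a select() read mask.

```perl
use strict;
use warnings;

my $foo = '';
vec($foo, 0, 1) = 1;                  # first bit of the first byte
my $first_byte = ord($foo);
my $bits       = unpack("b8", $foo);  # bits in ascending order within the byte

# Build a read mask the way one would for select().
my $rin = '';
vec($rin, 0, 1) = 1;
vec($rin, 9, 1) = 1;
my $mask = unpack("b*", $rin);

print "$first_byte $bits $mask\n";
```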
The send() socket operator didn't correctly allow you to specify
a TO argument even though this was documented. (The TO argument
is desirable for sending datagram packets.)
In ANSI standard C, they decided that longjmp() didn't have to
guarantee anything about registers. Several people sent me
some patches that declared certain variables as volatile
rather than register for such compilers. Rather than go that
route, however, I wanted to keep some of these variables in
registers, so I just made sure that the important ones are
restored from non-register locations after longjmp(). I think
"volatile" encourages people to punt too easily.
The foreach construct still had some difficulty with two nested
foreach loops referring to the same array, and to a single
foreach that called its enclosing subroutine recursively.
I think I've got this straight now. You wouldn't think
a little iterator would give so much trouble.
A pattern like /b*/ wouldn't match a null string before the
first character. And certain patterns didn't match correctly
at end of string. The upshot was that
$_ = 'aaa';
s/b*/x/g;
produced 'axaxa' rather than the expected 'xaxaxax'. This has
been fixed. Note however that the split operator will still
not match a null string before the first character, so that
split(/b*/,'aaa') produces ('a','a','a'), not ('','a','a','a','').
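Both results can be checked on a current perl with a sketch like this:

```perl
use strict;
use warnings;

my $text = 'aaa';
$text =~ s/b*/x/g;                     # null match before, between and after each a

my @parts = split(/b*/, 'aaa');        # split ignores the null match at the ends

print "$text\n";
print join(',', @parts), "\n";
```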
The saga continues, and hopefully concludes. I realized I was
fighting a losing battle trying to grep out all the includes
from <time.h> and <sys/time.h>. There are just too many funny
includes, symbols, links and such on too many kinds of machines.
Configure now compiles a test program several different ways to
figure out which way to define the various symbols.
Configure now lets you pick between yacc or bison for your
compiler compiler. If you pick bison, be sure you have alloca
somewhere on your system.
The ANSI function strerror() is now supported where available.
In addition, errno may now be a macro with an lvalue, so errno
isn't declared extern if it's defined as a macro in <errno.h>.
The memcpy() and memset() are now allowed to return void.
There is now support for sys/ndir.h for systems such as Xenix.
It's now also easier to cross compile on a 386 for a 286.
DG/UX has functions setpgrp2() and getpgrp2() to keep the BSD
semantics separate from the SystemV semantics. So now we have
yet another wonderful non-standard way of doing things. There
is also a utime.h file which lets them put time stamps on
files to microsecond resolutions, though perl doesn't take
advantage of this.
The list of optional libraries to be searched for now includes
-lnet_s, -lnsl_s, -lsocket and -lx. We can now find .h files
down in /usr/include/lan.
Microport systems have problems. I've added some CRIPPLED_CC
support for them, but you still need to read the README.uport
file for some extra rigamarole.
In the README file, there are now hints for what to do if your
compile doesn't work right, and specific hints for machines
known to require certain switches.
The grep operator with a simple first argument, such as grep(1,@array),
didn't work right. That one seems silly, but grep($_,@array)
didn't work either. Now it does.
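What the two calls are expected to return, sketched for a current perl with invented data:

```perl
use strict;
use warnings;

my @array = (0, 1, '', 'a', 2);

my @all  = grep(1, @array);       # constant true: every element is kept
my @true = grep($_, @array);      # keeps only the elements that are true

print scalar(@all), " kept by grep(1), true ones: @true\n";
```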
A /$pat/ followed by a // wrongly freed the runtime pattern twice,
causing ill-will on the part of all concerned.
The ord() function now always returns positive even on signed-char
machines. This seems to be less surprising to people. If you
still want a signed value on such machines, you can always use
unpack.
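For example, on a current perl:

```perl
use strict;
use warnings;

my $byte     = "\377";
my $unsigned = ord($byte);            # always positive
my $signed   = unpack("c", $byte);    # signed char interpretation via unpack

print "$unsigned $signed\n";
```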
The lib/complete.pl file misused the @_ array. The array has
been renamed.
In the man page, I clarified that s`pat`repl` does command
substitution on the replacement string, that $timeleft from
select() is likely not implemented in many places, and that
the qualified form package'filehandle works as well as
$package'variable. It is also explicitly stated that
certain identifiers (non-alpha, STDIN, etc.) are always
resolved in package main's symbol table.
Perl didn't grok setuid scripts that had a space on the
first line between the shebang and the interpreter name.
In stab.c, sighandler() may now return either void or int,
depending on the value of VOIDSIG.
You couldn't debug a script that used -p or -n because they would
try to slap an extra } on the end of the perldb.pl file. This
upset the parser.
The interpretation of strings like " ''$foo'' " caused problems
because the tokener didn't realize that neither single quote
following the variable was indicating a package qualifier.
(It knew the last one wasn't, but was confused about the first one.)
Merely changing an if to a while fixed it. Well, two if's.
Another place we don't want ' to be interpreted as a package
qualifier is if it's the delimiter for an m'pat' or s'pat'repl'.
These have been grandfathered to look like a match and a substitution.
There were a couple of problems in a2p. First, the ops array
was dimensioned too big on 286's. Second, there was a problem
involving passing a union where I should've passed a member of
the union, which meant user-defined functions didn't work right
on some machines.
Larry Wall [Fri, 17 Nov 1989 03:02:59 +0000]
perl 3.0 patch #6 patch 5 continued
See patch 5.
Larry Wall [Fri, 17 Nov 1989 03:02:33 +0000]
perl 3.0 patch #5 (combined patch)
Some machines have bcopy() but not bzero(), so Configure
tests for them separately now. Likewise for symlink() and lstat().
Some systems have dirent.h but not readdir(). The symbols BZERO,
LSTAT and READDIR are now used to differentiate.
Some machines have <time.h> including <sys/time.h>. Some do
the opposite. Some don't even have <sys/time.h>. Configure
now looks for both kinds of include, and the saga continues...
Configure tested twice for the presence of -lnm because x2p/Makefile.SH
had a reference to the obsolete $libnm variable. It now tests
only once.
Some machines have goodies stashed in /usr/include/sun,
/usr/include/bsd, -lsun and -lbsd. Configure now checks those
locations.
Configure could sometimes add an option to a default of none,
producing [none -DDEBUGGING] prompts. This is fixed.
Many of the units in metaconfig used the construct
if xxx=`loc...`; then
On most machines the exit status of loc ends up in $?, but on
a few machines, the assignment apparently sets $? to 0, since
it always succeeds. Oh well...
The tests for byte order had difficulties with illegal octal
digits and constants that were too long, as well as not defining
the union in try.c correctly.
When <dirent.h> was missing, it was assumed that the field d_namlen
existed. There is now an explicit check of <sys/dir.h> for the field.
The tests of <signal.h> to see how signal() is declared needed to have
signal.h run through the C preprocessor first because of POSIX ifdefs.
The type returned by getgroups() was defaulting wrong on Suns and
such. Configure now checks against the lint library if it exists
to produce a better default.
The construct
foreach $elem (@array) {
foreach $elem (@array) {
...
}
}
didn't work right because the iterator for the array was stored
with the array rather than with the node in the syntax tree.
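With one iterator per loop, nesting over the same array visits every pair. A sketch for a current perl; it uses two differently named loop variables only so that the pairs can be recorded.

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');
my @pairs;

foreach my $outer (@array) {
    foreach my $inner (@array) {      # same array, independent iterator
        push(@pairs, "$outer$inner");
    }
}
print "@pairs\n";
```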
If you said
defined $foo{'bar'}
it would create the element $foo{'bar'} while returning the
correct value. It now no longer creates the value.
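A quick check of the fixed behavior on a current perl:

```perl
use strict;
use warnings;

my %foo = ();
my $is_defined = defined($foo{'bar'}) ? 1 : 0;   # just a test, not a store
my $count      = keys(%foo);                     # should still be empty

print "defined: $is_defined, elements: $count\n";
```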
The grep() function was occasionally losing arguments or dumping core.
This was because it called eval() on each argument but didn't
account for the fact that eval() is capable of reallocating the
stack.
If you said
$something ? $foo[1] : $foo[2]
you ended up (usually) with
$something ? $foo[0] : $foo[0]
because of the way the ?: operator tries to fool the stack into
thinking there's only one argument there instead of three. This
only happened to constant subscripts. Interestingly enough,
$abc[1] ? $foo[1] : $bar[1]
would have worked, since the first argument has the same subscript.
Some machines already define TRUE and FALSE, so we have to undef
them to avoid warnings.
Several people sent in some fixes for manual typos and indent problems.
There was a request to clarify the difference between $! and $@, and
I added a gratuitous warning about print making an array context for
its arguments, since people seem to run into that frequently.
suidperl could correctly emulate a setgid script, but then it could
get confused about what the actual effective gid was.
Some machine or other defines sighandler(), so perl's sighandler()
needed to be made static.
We changed uchar to unchar for Crays, and it turns out that lots
of SysV machines typedef unchar instead. Sigh. It's now un_char.
If you did substitutions to chop leading components off a string,
and then set the string from <filehandle>, under certain circumstances
the input string could be corrupted because str_gets() called
str_grow() without making sure to change the string's current length to
be the number of characters just read, rather than the old length.
op.stat occasionally failed with NFS race condition, so it now waits
two seconds instead of one to guarantee that the NFS server advances
its clock at least one second.
IBM PC/RT compiler can't deal with UNI() and LOP() macros. If you
define CRIPPLED_CC it now will recast those macros as subroutines,
which runs a little slower but doesn't give the compiler heartburn.
The } character can terminate either an associative array subscript
or a BLOCK. These set up different expectations as to whether the
next token might be a term or an operator. There was a faulty
heuristic based on whether there was an intervening newline.
It turns out that if } simply leaves the current expectations alone,
the right thing happens.
The command y/abcde// didn't work because the length of the first
part was not correctly copied to the second part.
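With an empty second part the first part is reused, so the string is left alone and the command is handy for counting. A sketch for a current perl, with an invented string:

```perl
use strict;
use warnings;

my $text  = 'a bad cab fare';
my $count = ($text =~ y/abcde//);     # counts characters in the set a-e

print "$count characters matched, text still '$text'\n";
```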
In s2p, line labels without a subsequent statement were done wrong,
since an extra semicolon needs to be supplied. It wasn't always
supplied, and when it was supplied, it was in the wrong place.
S2p also needed to remove its /tmp files better.
A2p translates
for (a in b)
to
foreach $a} (keys(%b))
on Pyramids, because index(s, '}' + 128) doesn't find a } with the
top bit set. This has been fixed.
Larry Wall [Fri, 10 Nov 1989 16:20:57 +0000]
perl 3.0 patch #4 Patch #2 continued
Larry Wall [Fri, 10 Nov 1989 16:20:25 +0000]
perl 3.0 patch #3 Patch #2 continued
Larry Wall [Fri, 10 Nov 1989 16:10:36 +0000]
perl 3.0 patch #2 (combined patch)
The metaconfig problem with pw_* fields has been fixed.
When you specify extra libraries to link in, Configure now
uses those libraries as well as libc to look for the functions
that are available. From the ccflags you give it now derives
the corresponding flags for the C preprocessor. And it has
better support for the Gnu C preprocessor.
Configure now detects USGness by the behavior of the tr program.
If USGness isn't found, then SIGTSTP determines BSDness.
The define of DEBUGGING has been taken out of perl.h and a2p.h.
If you want debugging you have to add -DDEBUGGING in a cc flag.
If you give an optimizer flag of -g, you get DEBUGGING as a
default.
Machines like the Cray have longs longer than 4 bytes. There
is now support for that.
Some machines have csh in other places than /bin. Configure
now figures out where it is.
Configure now supports Wollongong sockets and knows about
/usr/netinclude and /usr/lib/libnet.a.
Configure now gets sig names directly from signal.h if possible,
and only if that fails does it try to use kill -l.
The $sockethdr variable has been incorporated into $ccflags.
Non-BSD machines required two ^D's to exit
while (<>) { ... }
This has been fixed, I believe, though I can't test it here.
It's now possible to compile perl without the DEBUGGING code.
It runs about 10% faster when you take the code out.
Configure now discovers if <sys/time.h> includes <time.h>, or
whether perl must include it itself.
Configure now finds the wait4() routine if available.
'-' x 26 made warnings about undefined value because of a bug
in evalstatic(). (Non-static 'x' didn't have the problem.)
A local list consisting of nothing but an array didn't work
right. Now it does.
A printf %c omitted the format string between the preceding % field
and the %c. Code to printf %D, %X and %O was misplaced.
Some machines complain about printing signed values with
unsigned format specifiers like %x. The unsigned specifiers
now have a separate cast from the signed specifiers like %d.
The various file modes were not orthogonal. Now you can use
any of:
< > >> +< +> +>> <& >& >>& +<& +>& +>>&
Perl can now detect when a parent process passes in a socket so
that you can write reasonable inetd servers.
File descriptors above 2 are now closed on exec, either by using
fcntl() or, if that is unavailable, by brute-force closing in a loop.
The return values of getsockopt(), getsockname() and getpeername()
were always undefined.
There were several places where a warn("shutdown") had to be
changed to some other function name.
The C routine gethostbyname() was misdeclared as gethostbynam().
telldir() is sometimes a macro, so we can't declare its return
value if it's defined.
Components of a slice corresponding to non-existent index elements
are now undefined rather than just null.
The mkdir and rmdir functions will call the mkdir and rmdir
programs if the corresponding system calls aren't available.
The name of the directory was not quoted properly, however.
Also, some attempt is now made to translate the odd messages
that some mkdir and rmdir programs return into reasonable
error codes.
As a final check for mkdir programs that return NO useful status,
a stat is done following the mkdir or rmdir to make sure the
directory is really there or gone.
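The fallback described above might be sketched roughly as follows. This is an illustration in Python, not the code in perl itself; the function names are hypothetical, and it assumes mkdir and rmdir programs are on the PATH.

```python
import os
import subprocess
import tempfile

def mkdir_via_program(path):
    """Run the external mkdir program, then verify the result with a stat."""
    # An argument list bypasses the shell, so the directory name needs
    # no quoting at all (an assumption of this sketch, not perl's approach).
    subprocess.run(["mkdir", path], stderr=subprocess.DEVNULL)
    # Final check: ignore the exit status and see if the directory is there.
    return os.path.isdir(path)

def rmdir_via_program(path):
    """Run the external rmdir program, then verify the directory is gone."""
    subprocess.run(["rmdir", path], stderr=subprocess.DEVNULL)
    return not os.path.exists(path)

# Usage: a name containing shell metacharacters still works.
base = tempfile.mkdtemp()
target = os.path.join(base, "odd name; with 'quotes'")
created = mkdir_via_program(target)
removed = rmdir_via_program(target)
os.rmdir(base)
```

Checking the filesystem afterwards is what makes the result trustworthy even when the program's exit status is useless.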
The fileno, seekdir, rewinddir and closedir functions now specifically
disallow defaults and return undef. Previously they would just crash
perl.
CX/UX needs to set the key each time when iterating over associative
arrays due to a non-standard dbm_nextkey() function.
The lib/getopts.pl routine needed to shift @ARGV explicitly in
several spots.
The malloc pointer corruption check was made more portable by just
checking for alignment errors. It is also removed if DEBUGGING
is not enabled.
The include of <netinet/in.h> needed to be moved down below the
include of <sys/types.h> for some machines.
Not all machines declare the yydebug variable as the same type.
The reference to yydebug was moved to perl.y where it doesn't care.
I documented that a space must separate any word and a subsequent
single-quoted string because of package name prefixes.
Some long lines were broken for nroff, but not for troff.
One example of unshift in the manual had its arguments backwards.
I clarified that the operation of ^ and $ on multiline strings when $*
is false is somewhat inconsistent.
People were forced to say !($foo++) when !$foo++ should be legal.
None of the unary operators correctly handled their default
arguments because of a screw-up in the parser actions.
/[\000]/ never matched a null due to some left over non-binary-ness
of perl 2.0.
/\b$foo/ gave up too early in trying to match at the end of a string.
sys_nerr was being used as the maximum error message number, when
in fact it's the maximum+1.
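The off-by-one can be illustrated with a small sketch, in Python for brevity. The message table here is a made-up stand-in for the C library's sys_errlist, and error_message is a hypothetical name.

```python
# Hypothetical stand-in for the C library's error message table.
sys_errlist = ["Error 0", "Not owner", "No such file or directory"]
sys_nerr = len(sys_errlist)  # number of entries, i.e. maximum index + 1

def error_message(errnum):
    # Valid indexes run from 0 through sys_nerr - 1, so the bound
    # check must be "< sys_nerr" rather than "<= sys_nerr".
    if 0 <= errnum < sys_nerr:
        return sys_errlist[errnum]
    return "Unknown error %d" % errnum
```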
The identifier "uchar" is a typedef on Crays, so the variable of that
name was changed to "unchar".
The TEST program tried to run patch reject files. The reject files
are now rejected by TEST.
One test failed on picky systems because it referred to a filename
longer than 14 chars.
The op.split test assumed that the perl -D switch was available,
when in fact it's only available if perl was compiled with DEBUGGING.
Some header file somewhere defined macro CLINE, which conflicted
with toke.c's CLINE macro.
In s2p, + within patterns needed backslashing because + isn't a
metacharacter for sed. s2p was also printing out some debugging
info to the output file.
In a2p, an awk script with no line actions didn't make a main
loop, but it needs one to keep the awk semantics.
Larry Wall [Thu, 26 Oct 1989 10:31:40 +0000]
perl 3.0 patch #1 (combined patch)
Configure had difficulties if the user's path had weird components.
Now Configure appends the user's path to its own.
Some machines need <netinet/in.h> included in order to define
certain macros for packing or unpacking network order data.
On Suns, the shared library is used by default. If it doesn't
contain something contained in /lib/libc.a, then Configure was
getting things wrong (such as gethostent()). Now Configure uses
the shared library if it's there in preference to libc.a.
When gcc was selected as the compiler, the cc flags defaulted to
-fpcc_struct_return. Unfortunately, the underlines should be hyphens.
Configure figures out if BSD shadow passwords are installed and
the getpw* routines now return slightly different data in the
affected fields.
Some of the prompts in Configure with regard to gid and uid types
were unclear as to their intended use. They are now a little
clearer.
Sometimes you could change a .h file and taintperl and suidperl
didn't get remade correctly because of missing dependencies
in the Makefile.
The README file was misleading about the fact that you have to
say "make test" before you can "cd t; TEST".
The reverse operator was busted in two different ways. Should work
better now. There are now regression tests for it.
Some of the optimizations that perl does are disabled after a period
of time if perl decides they aren't doing any good. One of these
caused a string to be freed that was later referenced via another
pointer, causing core dumps. The free turned out to be unnecessary,
so it was removed.
The unless modifier was broken when run under the debugger, due to
the invert() routine in perl.y inverting the logic on the DB
subroutine call instead of the command the unless was modifying.
The Configure vfork test was backwards. It now works like other defines.
The numeric switch optimization was broken, and caused code to be
bypassed. This has been fixed.
A split in a subroutine that has no target splits into @_.
Unfortunately, this wrongly freed any referenced arguments passed
in through @_, causing confusing behavior later in the program.
File globbing (<foo.*>) left one orphaned string each time it
called the shell to do the glob.
RCS expanded an unintended $Header in lib/perldb.pl. This has
been fixed simply by replacing the $ with a .
Some forward declarations of static functions were missing from
malloc.c.
There's a strut in malloc for mips machines to extend the overhead
union to the size of a double. This was also enabled for sparc
machines.
DEC risc machines are reported to have a buggy memcmp. I've put
some conditional code into perl.h which I think will undef MEMCMP
appropriately.
In perl.man.4, I documented the desirability of using parens even
where they aren't strictly necessary.
I've grandfathered "format stdout" to be the same as "format STDOUT".
Unary operators can be called with no argument. The corresponding
function call form using empty parens () didn't work right, though
it did for certain functions in 2.0. It now works in 3.0.
The string ordering tests were wrong for pairs of strings in which
one string was a prefix of the other. This affected lt, le, gt,
ge, and the sort operator when used with no subroutine.
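For reference, the intended ordering can be sketched like this (Python, for illustration only; str_cmp is a hypothetical name, not the routine in perl). When one string is a prefix of the other, the shorter one must sort first.

```python
import functools

def str_cmp(a, b):
    """Return -1, 0 or 1, ordering byte strings as lt/le/gt/ge should."""
    n = min(len(a), len(b))
    # Compare the common-length prefix byte by byte.
    for i in range(n):
        if a[i] != b[i]:
            return -1 if a[i] < b[i] else 1
    # Prefixes are identical: the length decides, shorter string first.
    if len(a) == len(b):
        return 0
    return -1 if len(a) < len(b) else 1

# Usage: sorting with no comparison subroutine should give this order.
words = sorted([b"foobar", b"foo", b"fop", b""],
               key=functools.cmp_to_key(str_cmp))
```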
$/ didn't work with the stupid code used when STDSTDIO was undefined.
The stupid code has been replaced with smarter code that can do
it right. Special thanks to Piet van Oostrum for the code.
Goulds work better if the union in STR is at an 8 byte boundary.
The fields were rearranged somewhat to provide this.
"sort keys %a" should now work right (though parens are still
desirable for readability).
bcopy() needed a forward declaration on some machines.
In x2p/Makefile.SH, added dependency on ../config.sh so that it
gets linked down from above if it got removed for some reason.
Larry Wall [Wed, 18 Oct 1989 00:00:00 +0000]
perl 3.0: (no announcement message available)
A few of the new features: (18 Oct)
* Perl can now handle binary data correctly and has functions to pack and unpack binary structures into arrays or lists. You can now do arbitrary ioctl functions.
* You can now pass things to subroutines by reference.
* Debugger enhancements.
* An array or associative array may now appear in a local() list.
* Array values may now be interpolated into strings.
* Subroutine names are now distinguished by prefixing with &. You can call subroutines without using do, and without passing any argument list at all.
* You can use the new -u switch to cause perl to dump core so that you can run undump and produce a binary executable image. Alternatively you can use the "dump" operator after initializing any variables and such.
* You can now chop lists.
* Perl now uses /bin/csh to do filename globbing, if available. This means that filenames with spaces or other strangenesses work right.
* New functions: mkdir and rmdir, getppid, getpgrp and setpgrp, getpriority and setpriority, chroot, ioctl and fcntl, flock, readlink, lstat, rindex, pack and unpack, read, warn, dbmopen and dbmclose, dump, reverse, defined, undef.
Larry Wall [Tue, 28 Jun 1988 03:41:16 +0000]
perl 2.0 patch 1: removed redundant debugging code in regexp.c
If you used ++ on a variable that had the value '' (as opposed to
being undefined) it would increment the numeric part but not
invalidate the string part, which could then give false results.
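A rough model of the ++ problem, in Python for illustration. It assumes a value that caches both a string part and a numeric part with a validity flag for each. The class and field names are hypothetical and do not reflect perl's actual STR layout.

```python
class Scalar:
    """Toy value holding a cached string part and a cached numeric part."""

    def __init__(self, text):
        self.text = text      # string part
        self.text_ok = True   # string part is current
        self.num = 0.0        # numeric part
        self.num_ok = False   # numeric part not yet computed

    def as_number(self):
        if not self.num_ok:
            try:
                self.num = float(self.text)
            except ValueError:
                self.num = 0.0  # '' and other non-numbers count as 0
            self.num_ok = True
        return self.num

    def increment(self):
        self.num = self.as_number() + 1
        self.num_ok = True
        # The step the bug skipped: the old string no longer matches.
        self.text_ok = False

    def as_string(self):
        if not self.text_ok:
            self.text = "%g" % self.num  # regenerate from the number
            self.text_ok = True
        return self.text
```

Without the invalidation in increment(), as_string() would keep returning '' after the increment, which is the false result described above.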
Berkeley recently sent out a patch that disables setuid #! scripts
because of an inherent problem in the semantics as they are
currently defined. If you have installed that patch, your setuid
and setgid bits are useless on scripts. I've added a means
for perl to examine those bits and emulate setuid/setgid scripts
itself in what I believe is a secure manner. If normal perl
detects such a script, it passes it off to another version of
perl that runs setuid root, and can run the script under the
desired uid/gid. This feature is optional, and Configure will
ask if you want to do it.
Some machines didn't like config.h when it said #/*undef SYMBOL.
Config.h.SH now is smart enough to tuck the # inside the comment.
There were several small problems in Configure: the return code from
ar was hidden by a piped call to sed, so if ar failed it went
undetected. The Cray uses a program called bld instead of ar.
Let's hear it for compatibility. At least one version of gnucpp
adds a space after symbol interpolation, which was giving the
C preprocessor detector fits. There was a call to grep '-i' that
needed to have the -i protected by a backslash. Also, Configure
should remove the UU subdirectory that it makes while running.
"make realclean" now knows about the alternate patch extension ~.
In the manual page, I fixed some quotes that were ugly in troff,
and did some clarification of LIST, study, tr and unlink.
regexp.c had some redundant debugging code.
tr/x/y/ could dump core if y is shorter than x. I found this out
when I tried translating a bunch of characters to space by saying
something like y/a-z/ /.
Larry Wall [Sun, 5 Jun 1988 00:00:00 +0000]
perl 2.0 (no announcement message available)
Some of the enhancements from Perl1 included:
* New regexp routines derived from Henry Spencer's.
o Support for /(foo|bar)/.
o Support for /(foo)*/ and /(foo)+/.
o \s for whitespace, \S for non-whitespace, \d for digit, \D for nondigit
* Local variables in blocks, subroutines and evals.
* Recursive subroutine calls are now supported.
* Array values may now be interpolated into lists: unlink 'foo', 'bar', @trashcan, 'tmp';
* File globbing.
* Use of <> in array contexts returns the whole file or glob list.
* New iterator for normal arrays, foreach, that allows both read and write.
* Ability to open pipe to a forked off script for secure pipes in setuid scripts.
* File inclusion via do 'foo.pl';
* More file tests, including -t to see if, for instance, stdin is a terminal. File tests now behave in a more correct manner. You can do file tests on filehandles as well as filenames. The special filetests -T and -B test a file to see if it's text or binary.
* An eof can now be used on each file of the <> input for such purposes as resetting the line numbers or appending to each file of an inplace edit.
* Assignments can now function as lvalues, so you can say things like ($HOST = $host) =~ tr/a-z/A-Z/; ($obj = $src) =~ s/\.c$/.o/;
* You can now do certain file operations with a variable which holds the name of a filehandle, e.g. open(++$incl,$includefilename); $foo = <$incl>;
* Warnings are now available (with -w) on use of uninitialized variables and on identifiers that are mentioned only once, and on reference to various undefined things.
* There is now a wait operator.
* There is now a sort operator.
* The manual is now not lying when it says that perl is generally faster than sed. I hope.
Jeff Siegal [Mon, 1 Feb 1988 22:56:10 +0000]
perl 1.0 patch 14: a2p incorrectly translates 'for (a in b)' construct.
The code a2p creates for the 'for (a in b)' construct ends
up assigning the wrong value to the key variable.
Kriton Kyrimis [Mon, 1 Feb 1988 22:28:33 +0000]
perl 1.0 patch 13: fix for faulty patch 12, plus random portability glitches
I botched patch #12, so that split(' ') only works on the first
line of input due to unintended interference by the optimization
that was added at the same time. Yes, I tested it, but only on
one line of input. *Sigh*
Some glitches have turned up on some of the rusty pig iron out there,
so here are some unglitchifications.
Kriton Kyrimis [Mon, 1 Feb 1988 04:35:21 +0000]
perl 1.0 patch 12: scripts made by a2p don't handle leading white space right on input
Awk ignores leading whitespace on split. Perl by default does not.
The a2p translator couldn't handle this. The fix is partly to a2p
and partly to perl. Perl now has a way to specify to split to
ignore leading white space as awk does. A2p now takes advantage of
that.
I also threw in an optimization that lets runtime patterns
compile just once if they are known to be constant, so that
split(' ') doesn't compile the pattern every time.
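The two ideas in this patch, awk-style splitting and compiling a constant pattern only once, might be sketched as follows. This is Python for illustration; split_fields, the cache and the counter are hypothetical names, and trailing whitespace is deliberately not handled.

```python
import re

_compiled = {}     # pattern text -> compiled regex
compile_count = 0  # how many times a pattern was actually compiled

def split_fields(pattern, line):
    """Split line on pattern, ignoring leading whitespace as awk does."""
    global compile_count
    regex = _compiled.get(pattern)
    if regex is None:
        # Compile the first time only; later calls reuse the result.
        regex = re.compile(pattern)
        _compiled[pattern] = regex
        compile_count += 1
    # Drop leading whitespace so it does not yield an empty first field.
    return regex.split(line.lstrip())

# Usage: every line must be handled, not just the first one.
rows = [split_fields(r"\s+", line) for line in ["  a b  c", "\td e", "f"]]
```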
Mark Biggar [Sun, 31 Jan 1988 20:00:34 +0000]
perl 1.0 patch 11: documentation upgrade
I documented the new eval operator for patch 8 but my automatic
patch generator overlooked it for some reason.
Here's the documentation for the eval operator, along with some
other documentation changes suggested by Mark.
Peter E. Yee [Fri, 29 Jan 1988 20:22:10 +0000]
perl 1.0 patch 10: if your libc is in a strange place, Configure blows up
There's a line in Configure that says libc=ans which should say
libc=$ans. This only shows up if libc.a isn't in /lib.
Marnix (ain't unix!) A. van Ammers [Fri, 29 Jan 1988 19:58:36 +0000]
perl 1.0 patch 9: 3 portability problems
There's a #define YYDEBUG; in perl.h that ought to be
#define YYDEBUG 1. Interesting that it works the former way on
any systems at all.
Patch 2 was defective and introduced a couple of lines with missing
right parens. Learn something old every day...
Some awks can't handle
awk '$6 != "" {print substr($6,2,100)}' </tmp/Cppsym2$$ ;;
if field 6 doesn't exist. Changed conditional to NF > 5.
There was also a problem that I fixed in metaconfig that involved
Configure grepping .SH files out of MANIFEST when the .SH was only
in the commentary. This doesn't affect perl's Configure because
there aren't any comments containing .SH in the MANIFEST file.
But that's the nice thing about metaconfig--you generate a new
Configure script and also get the changes you don't need (yet).
Larry Wall [Wed, 27 Jan 1988 22:18:25 +0000]
perl 1.0 patch 8: perl needed an eval operator and a symbolic debugger
I didn't add an eval operator to the original perl because
I hadn't thought of any good uses for it. Recently I thought
of some. Along with creating the eval operator, this patch
introduces a symbolic debugger for perl scripts, which makes
use of eval to interpret some debugging commands. Having eval
also lets me emulate awk's FOO=bar command line behavior with
a line such as the one a2p now inserts at the beginning of
translated scripts.
Arnold D. Robbins [Tue, 26 Jan 1988 01:16:41 +0000]
perl 1.0 patch 7: use of included malloc.c should be optional
The version of malloc.c that comes with perl was not really intended
to be used everywhere--it was included mostly for debugging purposes.
It's a nice little package, however, so I'm making it optional (via
Configure) as to whether you want it or not.
Andrew Burt [Mon, 25 Jan 1988 23:31:23 +0000]
perl 1.0 patch 6: printf doesn't finish processing format string when out of args.
printf "%% %d %%", 1; produces "% 1 %%", which is counterintuitive.
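The expected behavior can be sketched with a toy formatter, in Python for illustration. mini_printf is a hypothetical name, only %% and %d are handled, and printing nothing for a missing argument is an assumption of this sketch. The point is that the scan continues to the end of the format string even after the arguments run out.

```python
def mini_printf(fmt, *args):
    """Toy formatter: handles %% and %d, always scans the whole format."""
    out = []
    remaining = list(args)
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt):
            spec = fmt[i + 1]
            i += 2
            if spec == "%":
                out.append("%")  # %% is a literal percent sign
            elif spec == "d":
                # Out of arguments: emit nothing, but keep scanning.
                out.append(str(int(remaining.pop(0))) if remaining else "")
            else:
                out.append("%" + spec)  # unknown field, pass through
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```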
Arnold D. Robbins [Mon, 25 Jan 1988 20:53:22 +0000]
perl 1.0 patch 5: a2p didn't make use of the config.h generated by Configure
The a2p program used index() and bcopy(), neither of which exists
everywhere. Since Configure was already figuring out about those
functions, it is fairly trivial to get a2p to make use of the info.
Paul Eggert [Mon, 25 Jan 1988 19:48:31 +0000]
perl 1.0 patch 4: make depend doesn't work if . isn't in your PATH
make depend doesn't work if . isn't in your PATH.
Larry Wall [Sat, 23 Jan 1988 15:23:55 +0000]
perl 1.0 patch 3: Patch 2 was incomplete
I left one file out of patch 2. This is perhaps forgivable since
it is a file that is produced automatically by metaconfig along
with Configure.
Andrew Burt [Sat, 23 Jan 1988 14:57:57 +0000]
perl 1.0 patch 2: Various portability fixes.
Some things didn't work right on System V and Pyramids.
Dan Faigin, Doug Landauer [Thu, 21 Jan 1988 09:21:04 +0000]
perl 1.0 patch 1: Portability bugs and one possible SIGSEGV
On some systems the Configure script and C compilations get
warning messages that may scare some folks unnecessarily.
Also, use of the "redo" command if debugging is compiled in
overflows a stack on which the trace context is kept.
Larry Wall [Fri, 18 Dec 1987 00:00:00 +0000]
a "replacement" for awk and sed
[ Perl is kind of designed to make awk and sed semi-obsolete. This posting
will include the first 10 patches after the main source. The following
description is lifted from Larry's manpage. --r$ ]
Perl is an interpreted language optimized for scanning arbitrary text
files, extracting information from those text files, and printing
reports based on that information. It's also a good language for many
system management tasks. The language is intended to be practical
(easy to use, efficient, complete) rather than beautiful (tiny,
elegant, minimal). It combines (in the author's opinion, anyway) some
of the best features of C, sed, awk, and sh, so people familiar with
those languages should have little difficulty with it. (Language
historians will also note some vestiges of csh, Pascal, and even
BASIC-PLUS.) Expression syntax corresponds quite closely to C
expression syntax. If you have a problem that would ordinarily use sed
or awk or sh, but it exceeds their capabilities or must run a little
faster, and you don't want to write the silly thing in C, then perl may
be for you. There are also translators to turn your sed and awk
scripts into perl scripts.