initial commit

diff --git a/Changes b/Changes

new file mode 100644 (file)

index 0000000..8728426
--- /dev/null
+++ b/Changes
@@ -0,0 +1,104 @@
+Revision history for Perl extension File::FastSlurp.
+
+9999.13   Tue Oct 10 02:04:51 EDT 2006
+       - Refactored the extras/slurp_bench.pl script. It has options,
+         a key the benchmarks, help and more benchmarks.
+       - Reordered changes so recent entries are first
+
+9999.12   Thu Feb  2 02:26:31 EST 2006
+       - Fixed bug on windows with classic slurping and File::Slurp not
+         agreeing on newline conversion.
+       - Added t/newline.t test to check for that fix.
+       - When passing text data by scalar reference to write_file under
+         windows, the buffer is copied so the newline conversion won't
+         modify the caller's data.
+       - Thanks to Johan Lodin <lodin@cpan.org> for a test script which
+         I modified into t/newline.t
+
+9999.11   Fri Jan 20 01:24:00 EDT 2005
+       - Quick release to remove code that forced the faked SEEK_*
+         values to be used. Showed up when tested on OSX which doesn't
+         need that backport.
+
+9999.10   Thu Jan 19 11:38:00 EDT 2005
+       - t/*.t modules don't use Fcntl.pm
+       - using POSIX qw( :fcntl_h ) instead of Fcntl qw( :seek ) for
+         backwards compatiblity to 5.00503
+       - added conditional definitions of SEEK_* and O_* subs as they are not
+         defined in perl 5.004
+       - File::Slurp now runs on perl 5.004 and newer (see BUGS section)
+         All of the above thanks to Smylers <Smylers@stripey.com>,
+         Piers Kent <piers.kent@bbc.co.uk> and 
+         John Alden <john.alden@bbc.co.uk>
+       - Added pod.t and pod_coverage.t tests. This is to pass all
+         the CPANTS tests.
+
+9999.09  Tue Apr 19 01:21:55 EDT 2005
+       - t/original.t and read_dir.t no longer search for tempdirs. they just
+         use the current dir which should be in the build directory
+       - t/readdir.t renamed to read_dir.t for consistancy
+       - write_file return values are docuemented
+         Thanks to Adam Kennedy <adamk@cpan.org>
+       - added no_clobber option to write_file and t/no_clobber.t test for it
+         Thanks to <pagaltzis@gmx.de>
+       - fixed bug when appending a null string to a file which then
+         truncates it. seems to be an odd way for linux and OS X to
+         handle O_APPEND mode on sysopen. they don't seek to the end of
+         the file so it gets truncated. fixed by adding a seek to the
+         end if in append mode.n
+         Thanks to Chris Dolan <cdolan@cpan.org>
+
+
+9999.08  Sat Apr 16 01:01:27 EDT 2005
+       - read_dir returns an array ref in scalar context
+       - read_dir keeps . and .. if keep_dot_dot option is set.
+         Thanks to John Alden <john.alden@bbc.co.uk>
+       - slurp() is an optional exported alias to read_file
+         Thanks to Damian Conway <damian@conway.org>
+
+
+
+9999.07  Tue Jan 25 01:33:11 EST 2005
+       - Slurping in pseudo files (as in /proc) which show a size of 0
+         but actually have data works. This seems to be the case on
+         linux but on Solaris those files show their proper size.
+         Thanks to Juerd Waalboer <juerd@cpan.org>
+
+
+9999.06  Mon Sep 20 01:57:00 EDT 2004
+       - Slurping the DATA handle now works without the workaround.
+         tests are in t/data_scalar.t and t/data_list.t
+        - Paragraph mode in read_file is supported. As with <> when $/
+         (input record separator) is set to '', then the input file is
+         split on multiple newlines (/\n\n+/).
+         Thanks to Geoffrey Leach <geoff@direcway.com>
+
+
+9999.05  Tue Feb 24 21:14:55 EST 2004
+       - skip handle tests where socketpair is not supported (pre 5.8
+         on windows)
+         Thanks to Mike Arms <marms@sandia.gov>
+
+
+9999.04  Mon Feb 23 14:20:52 EST 2004
+       - fixed DATA handle bug in t/handle.t (not seen on most OS's)
+         Thanks to James Willmore <jwillmore@adelphia.net>
+
+
+9999.03  Mon Dec 22 01:44:43 EST 2003
+       - fixed DATA handle bugs in t/handle.t on osx (should be fixed
+         on BSD as well)
+       - added more comments to code
+
+9999.02  Wed Dec 17 03:40:49 EST 2003
+       - skip DATA test in handle.t on OSX (bug in perl with sysread on DATA)
+       - changed checking if file handle from fileno to ref
+               from Randal Schwartz <merlyn@stonehenge.com>
+       - added support for atomic spewing
+       - added new test stdin.t for the fileno/ref change
+       - added new test inode.t to test atomic spewing
+
+9999.01  Mon Sep  1 00:20:56 2003
+       - original version; created by h2xs 1.21 with options
+               -AX -n File::FastSlurp
+
diff --git a/MANIFEST b/MANIFEST

new file mode 100644 (file)

index 0000000..fa58954
--- /dev/null
+++ b/MANIFEST
@@ -0,0 +1,25 @@
+Changes
+lib/File/Slurp.pm
+Makefile.PL
+MANIFEST
+README
+t/error.t
+t/large.t
+t/handle.t
+t/data_scalar.t
+t/data_list.t
+t/paragraph.t
+t/original.t
+t/stdin.t
+t/inode.t
+t/pseudo.t
+t/read_dir.t
+t/slurp.t
+t/no_clobber.t
+t/append_null.t
+t/newline.t
+t/pod.t
+t/pod_coverage.t
+extras/slurp_bench.pl
+extras/slurp_article.pod
+META.yml                                 Module meta-data (added by MakeMaker)
diff --git a/META.yml b/META.yml

new file mode 100644 (file)

index 0000000..69b59e3
--- /dev/null
+++ b/META.yml
@@ -0,0 +1,10 @@
+# http://module-build.sourceforge.net/META-spec.html
+#XXXXXXX This is a prototype!!!  It will change in the future!!! XXXXX#
+name:         File-Slurp
+version:      9999.12
+version_from: lib/File/Slurp.pm
+installdirs:  site
+requires:
+
+distribution_type: module
+generated_by: ExtUtils::MakeMaker version 6.17
diff --git a/Makefile b/Makefile

new file mode 100644 (file)

index 0000000..4bb3b9b
--- /dev/null
+++ b/Makefile
@@ -0,0 +1,764 @@
+# This Makefile is for the File::Slurp extension to perl.
+#
+# It was generated automatically by MakeMaker version
+# 6.17 (Revision: 1.133) from the contents of
+# Makefile.PL. Don't edit this file, edit Makefile.PL instead.
+#
+#       ANY CHANGES MADE HERE WILL BE LOST!
+#
+#   MakeMaker ARGV: ()
+#
+#   MakeMaker Parameters:
+
+#     ABSTRACT_FROM => q[lib/File/Slurp.pm]
+#     AUTHOR => q[Uri Guttman <uri@stemsystems.com>]
+#     NAME => q[File::Slurp]
+#     PREREQ_PM => {  }
+#     VERSION_FROM => q[lib/File/Slurp.pm]
+
+# --- MakeMaker post_initialize section:
+
+
+# --- MakeMaker const_config section:
+
+# These definitions are from config.sh (via /usr/local/lib/perl5/5.8.6/sun4-solaris/Config.pm)
+
+# They may have been overridden via Makefile.PL or on the command line
+AR = ar
+CC = gcc
+CCCDLFLAGS = -fPIC
+CCDLFLAGS =  
+DLEXT = so
+DLSRC = dl_dlopen.xs
+LD = gcc
+LDDLFLAGS = -G -L/usr/local/lib
+LDFLAGS =  -L/usr/local/lib 
+LIBC = /lib/libc.so
+LIB_EXT = .a
+OBJ_EXT = .o
+OSNAME = solaris
+OSVERS = 2.9
+RANLIB = :
+SITELIBEXP = /usr/local/lib/perl5/site_perl/5.8.6
+SITEARCHEXP = /usr/local/lib/perl5/site_perl/5.8.6/sun4-solaris
+SO = so
+EXE_EXT = 
+FULL_AR = /usr/ccs/bin/ar
+VENDORARCHEXP = 
+VENDORLIBEXP = 
+
+
+# --- MakeMaker constants section:
+AR_STATIC_ARGS = cr
+DIRFILESEP = /
+NAME = File::Slurp
+NAME_SYM = File_Slurp
+VERSION = 9999.12
+VERSION_MACRO = VERSION
+VERSION_SYM = 9999_12
+DEFINE_VERSION = -D$(VERSION_MACRO)=\"$(VERSION)\"
+XS_VERSION = 9999.12
+XS_VERSION_MACRO = XS_VERSION
+XS_DEFINE_VERSION = -D$(XS_VERSION_MACRO)=\"$(XS_VERSION)\"
+INST_ARCHLIB = blib/arch
+INST_SCRIPT = blib/script
+INST_BIN = blib/bin
+INST_LIB = blib/lib
+INST_MAN1DIR = blib/man1
+INST_MAN3DIR = blib/man3
+MAN1EXT = 1
+MAN3EXT = 3
+INSTALLDIRS = site
+DESTDIR = 
+PREFIX = 
+PERLPREFIX = /usr/local
+SITEPREFIX = /usr/local
+VENDORPREFIX = 
+INSTALLPRIVLIB = $(PERLPREFIX)/lib/perl5/5.8.6
+DESTINSTALLPRIVLIB = $(DESTDIR)$(INSTALLPRIVLIB)
+INSTALLSITELIB = $(SITEPREFIX)/lib/perl5/site_perl/5.8.6
+DESTINSTALLSITELIB = $(DESTDIR)$(INSTALLSITELIB)
+INSTALLVENDORLIB = 
+DESTINSTALLVENDORLIB = $(DESTDIR)$(INSTALLVENDORLIB)
+INSTALLARCHLIB = $(PERLPREFIX)/lib/perl5/5.8.6/sun4-solaris
+DESTINSTALLARCHLIB = $(DESTDIR)$(INSTALLARCHLIB)
+INSTALLSITEARCH = $(SITEPREFIX)/lib/perl5/site_perl/5.8.6/sun4-solaris
+DESTINSTALLSITEARCH = $(DESTDIR)$(INSTALLSITEARCH)
+INSTALLVENDORARCH = 
+DESTINSTALLVENDORARCH = $(DESTDIR)$(INSTALLVENDORARCH)
+INSTALLBIN = $(PERLPREFIX)/bin
+DESTINSTALLBIN = $(DESTDIR)$(INSTALLBIN)
+INSTALLSITEBIN = $(SITEPREFIX)/bin
+DESTINSTALLSITEBIN = $(DESTDIR)$(INSTALLSITEBIN)
+INSTALLVENDORBIN = 
+DESTINSTALLVENDORBIN = $(DESTDIR)$(INSTALLVENDORBIN)
+INSTALLSCRIPT = $(PERLPREFIX)/bin
+DESTINSTALLSCRIPT = $(DESTDIR)$(INSTALLSCRIPT)
+INSTALLMAN1DIR = $(PERLPREFIX)/man/man1
+DESTINSTALLMAN1DIR = $(DESTDIR)$(INSTALLMAN1DIR)
+INSTALLSITEMAN1DIR = $(SITEPREFIX)/man/man1
+DESTINSTALLSITEMAN1DIR = $(DESTDIR)$(INSTALLSITEMAN1DIR)
+INSTALLVENDORMAN1DIR = 
+DESTINSTALLVENDORMAN1DIR = $(DESTDIR)$(INSTALLVENDORMAN1DIR)
+INSTALLMAN3DIR = $(PERLPREFIX)/man/man3
+DESTINSTALLMAN3DIR = $(DESTDIR)$(INSTALLMAN3DIR)
+INSTALLSITEMAN3DIR = $(SITEPREFIX)/man/man3
+DESTINSTALLSITEMAN3DIR = $(DESTDIR)$(INSTALLSITEMAN3DIR)
+INSTALLVENDORMAN3DIR = 
+DESTINSTALLVENDORMAN3DIR = $(DESTDIR)$(INSTALLVENDORMAN3DIR)
+PERL_LIB = /usr/local/lib/perl5/5.8.6
+PERL_ARCHLIB = /usr/local/lib/perl5/5.8.6/sun4-solaris
+LIBPERL_A = libperl.a
+FIRST_MAKEFILE = Makefile
+MAKEFILE_OLD = $(FIRST_MAKEFILE).old
+MAKE_APERL_FILE = $(FIRST_MAKEFILE).aperl
+PERLMAINCC = $(CC)
+PERL_INC = /usr/local/lib/perl5/5.8.6/sun4-solaris/CORE
+PERL = /usr/local/bin/perl
+FULLPERL = /usr/local/bin/perl
+ABSPERL = $(PERL)
+PERLRUN = $(PERL)
+FULLPERLRUN = $(FULLPERL)
+ABSPERLRUN = $(ABSPERL)
+PERLRUNINST = $(PERLRUN) "-I$(INST_ARCHLIB)" "-I$(INST_LIB)"
+FULLPERLRUNINST = $(FULLPERLRUN) "-I$(INST_ARCHLIB)" "-I$(INST_LIB)"
+ABSPERLRUNINST = $(ABSPERLRUN) "-I$(INST_ARCHLIB)" "-I$(INST_LIB)"
+PERL_CORE = 0
+PERM_RW = 644
+PERM_RWX = 755
+
+MAKEMAKER   = /usr/local/lib/perl5/5.8.6/ExtUtils/MakeMaker.pm
+MM_VERSION  = 6.17
+MM_REVISION = 1.133
+
+# FULLEXT = Pathname for extension directory (eg Foo/Bar/Oracle).
+# BASEEXT = Basename part of FULLEXT. May be just equal FULLEXT. (eg Oracle)
+# PARENT_NAME = NAME without BASEEXT and no trailing :: (eg Foo::Bar)
+# DLBASE  = Basename part of dynamic library. May be just equal BASEEXT.
+FULLEXT = File/Slurp
+BASEEXT = Slurp
+PARENT_NAME = File
+DLBASE = $(BASEEXT)
+VERSION_FROM = lib/File/Slurp.pm
+OBJECT = 
+LDFROM = $(OBJECT)
+LINKTYPE = dynamic
+
+# Handy lists of source code files:
+XS_FILES = 
+C_FILES  = 
+O_FILES  = 
+H_FILES  = 
+MAN1PODS = 
+MAN3PODS = Slurp.pm \
+       lib/File/Slurp.pm \
+       slurp_article.pod
+
+# Where is the Config information that we are using/depend on
+CONFIGDEP = $(PERL_ARCHLIB)$(DIRFILESEP)Config.pm $(PERL_INC)$(DIRFILESEP)config.h
+
+# Where to build things
+INST_LIBDIR      = $(INST_LIB)/File
+INST_ARCHLIBDIR  = $(INST_ARCHLIB)/File
+
+INST_AUTODIR     = $(INST_LIB)/auto/$(FULLEXT)
+INST_ARCHAUTODIR = $(INST_ARCHLIB)/auto/$(FULLEXT)
+
+INST_STATIC      = 
+INST_DYNAMIC     = 
+INST_BOOT        = 
+
+# Extra linker info
+EXPORT_LIST        = 
+PERL_ARCHIVE       = 
+PERL_ARCHIVE_AFTER = 
+
+
+TO_INST_PM = Slurp.pm \
+       carp.pl \
+       lib/File/Slurp.pm \
+       seek.pl \
+       slurp_article.pod \
+       slurp_bench.pl \
+       split.pl \
+       sysread.pl
+
+PM_TO_BLIB = Slurp.pm \
+       $(INST_LIB)/File/Slurp.pm \
+       carp.pl \
+       $(INST_LIB)/File/carp.pl \
+       lib/File/Slurp.pm \
+       blib/lib/File/Slurp.pm \
+       seek.pl \
+       $(INST_LIB)/File/seek.pl \
+       split.pl \
+       $(INST_LIB)/File/split.pl \
+       sysread.pl \
+       $(INST_LIB)/File/sysread.pl \
+       slurp_article.pod \
+       $(INST_LIB)/File/slurp_article.pod \
+       slurp_bench.pl \
+       $(INST_LIB)/File/slurp_bench.pl
+
+
+# --- MakeMaker platform_constants section:
+MM_Unix_VERSION = 1.42
+PERL_MALLOC_DEF = -DPERL_EXTMALLOC_DEF -Dmalloc=Perl_malloc -Dfree=Perl_mfree -Drealloc=Perl_realloc -Dcalloc=Perl_calloc
+
+
+# --- MakeMaker tool_autosplit section:
+# Usage: $(AUTOSPLITFILE) FileToSplit AutoDirToSplitInto
+AUTOSPLITFILE = $(PERLRUN)  -e 'use AutoSplit;  autosplit($$ARGV[0], $$ARGV[1], 0, 1, 1)'
+
+
+
+# --- MakeMaker tool_xsubpp section:
+
+
+# --- MakeMaker tools_other section:
+SHELL = /bin/sh
+CHMOD = chmod
+CP = cp
+MV = mv
+NOOP = $(SHELL) -c true
+NOECHO = @
+RM_F = rm -f
+RM_RF = rm -rf
+TEST_F = test -f
+TOUCH = touch
+UMASK_NULL = umask 0
+DEV_NULL = > /dev/null 2>&1
+MKPATH = $(PERLRUN) "-MExtUtils::Command" -e mkpath
+EQUALIZE_TIMESTAMP = $(PERLRUN) "-MExtUtils::Command" -e eqtime
+ECHO = echo
+ECHO_N = echo -n
+UNINST = 0
+VERBINST = 0
+MOD_INSTALL = $(PERLRUN) -MExtUtils::Install -e 'install({@ARGV}, '\''$(VERBINST)'\'', 0, '\''$(UNINST)'\'');'
+DOC_INSTALL = $(PERLRUN) "-MExtUtils::Command::MM" -e perllocal_install
+UNINSTALL = $(PERLRUN) "-MExtUtils::Command::MM" -e uninstall
+WARN_IF_OLD_PACKLIST = $(PERLRUN) "-MExtUtils::Command::MM" -e warn_if_old_packlist
+
+
+# --- MakeMaker makemakerdflt section:
+makemakerdflt: all
+       $(NOECHO) $(NOOP)
+
+
+# --- MakeMaker dist section:
+TAR = tar
+TARFLAGS = cvf
+ZIP = zip
+ZIPFLAGS = -r
+COMPRESS = gzip --best
+SUFFIX = .gz
+SHAR = shar
+PREOP = $(NOECHO) $(NOOP)
+POSTOP = $(NOECHO) $(NOOP)
+TO_UNIX = $(NOECHO) $(NOOP)
+CI = ci -u
+RCS_LABEL = rcs -Nv$(VERSION_SYM): -q
+DIST_CP = best
+DIST_DEFAULT = tardist
+DISTNAME = File-Slurp
+DISTVNAME = File-Slurp-9999.12
+
+
+# --- MakeMaker macro section:
+
+
+# --- MakeMaker depend section:
+
+
+# --- MakeMaker cflags section:
+
+
+# --- MakeMaker const_loadlibs section:
+
+
+# --- MakeMaker const_cccmd section:
+
+
+# --- MakeMaker post_constants section:
+
+
+# --- MakeMaker pasthru section:
+
+PASTHRU = LIB="$(LIB)"\
+       LIBPERL_A="$(LIBPERL_A)"\
+       LINKTYPE="$(LINKTYPE)"\
+       PREFIX="$(PREFIX)"\
+       OPTIMIZE="$(OPTIMIZE)"\
+       PASTHRU_DEFINE="$(PASTHRU_DEFINE)"\
+       PASTHRU_INC="$(PASTHRU_INC)"
+
+
+# --- MakeMaker special_targets section:
+.SUFFIXES: .xs .c .C .cpp .i .s .cxx .cc $(OBJ_EXT)
+
+.PHONY: all config static dynamic test linkext manifest
+
+
+
+# --- MakeMaker c_o section:
+
+
+# --- MakeMaker xs_c section:
+
+
+# --- MakeMaker xs_o section:
+
+
+# --- MakeMaker top_targets section:
+all :: pure_all manifypods
+       $(NOECHO) $(NOOP)
+
+
+pure_all :: config pm_to_blib subdirs linkext
+       $(NOECHO) $(NOOP)
+
+subdirs :: $(MYEXTLIB)
+       $(NOECHO) $(NOOP)
+
+config :: $(FIRST_MAKEFILE) $(INST_LIBDIR)$(DIRFILESEP).exists
+       $(NOECHO) $(NOOP)
+
+config :: $(INST_ARCHAUTODIR)$(DIRFILESEP).exists
+       $(NOECHO) $(NOOP)
+
+config :: $(INST_AUTODIR)$(DIRFILESEP).exists
+       $(NOECHO) $(NOOP)
+
+$(INST_AUTODIR)/.exists :: /usr/local/lib/perl5/5.8.6/sun4-solaris/CORE/perl.h
+       $(NOECHO) $(MKPATH) $(INST_AUTODIR)
+       $(NOECHO) $(EQUALIZE_TIMESTAMP) /usr/local/lib/perl5/5.8.6/sun4-solaris/CORE/perl.h $(INST_AUTODIR)/.exists
+
+       -$(NOECHO) $(CHMOD) $(PERM_RWX) $(INST_AUTODIR)
+
+$(INST_LIBDIR)/.exists :: /usr/local/lib/perl5/5.8.6/sun4-solaris/CORE/perl.h
+       $(NOECHO) $(MKPATH) $(INST_LIBDIR)
+       $(NOECHO) $(EQUALIZE_TIMESTAMP) /usr/local/lib/perl5/5.8.6/sun4-solaris/CORE/perl.h $(INST_LIBDIR)/.exists
+
+       -$(NOECHO) $(CHMOD) $(PERM_RWX) $(INST_LIBDIR)
+
+$(INST_ARCHAUTODIR)/.exists :: /usr/local/lib/perl5/5.8.6/sun4-solaris/CORE/perl.h
+       $(NOECHO) $(MKPATH) $(INST_ARCHAUTODIR)
+       $(NOECHO) $(EQUALIZE_TIMESTAMP) /usr/local/lib/perl5/5.8.6/sun4-solaris/CORE/perl.h $(INST_ARCHAUTODIR)/.exists
+
+       -$(NOECHO) $(CHMOD) $(PERM_RWX) $(INST_ARCHAUTODIR)
+
+config :: $(INST_MAN3DIR)$(DIRFILESEP).exists
+       $(NOECHO) $(NOOP)
+
+
+$(INST_MAN3DIR)/.exists :: /usr/local/lib/perl5/5.8.6/sun4-solaris/CORE/perl.h
+       $(NOECHO) $(MKPATH) $(INST_MAN3DIR)
+       $(NOECHO) $(EQUALIZE_TIMESTAMP) /usr/local/lib/perl5/5.8.6/sun4-solaris/CORE/perl.h $(INST_MAN3DIR)/.exists
+
+       -$(NOECHO) $(CHMOD) $(PERM_RWX) $(INST_MAN3DIR)
+
+help:
+       perldoc ExtUtils::MakeMaker
+
+
+# --- MakeMaker linkext section:
+
+linkext :: $(LINKTYPE)
+       $(NOECHO) $(NOOP)
+
+
+# --- MakeMaker dlsyms section:
+
+
+# --- MakeMaker dynamic section:
+
+dynamic :: $(FIRST_MAKEFILE) $(INST_DYNAMIC) $(INST_BOOT)
+       $(NOECHO) $(NOOP)
+
+
+# --- MakeMaker dynamic_bs section:
+
+BOOTSTRAP =
+
+
+# --- MakeMaker dynamic_lib section:
+
+
+# --- MakeMaker static section:
+
+## $(INST_PM) has been moved to the all: target.
+## It remains here for awhile to allow for old usage: "make static"
+static :: $(FIRST_MAKEFILE) $(INST_STATIC)
+       $(NOECHO) $(NOOP)
+
+
+# --- MakeMaker static_lib section:
+
+
+# --- MakeMaker manifypods section:
+
+POD2MAN_EXE = $(PERLRUN) "-MExtUtils::Command::MM" -e pod2man "--"
+POD2MAN = $(POD2MAN_EXE)
+
+
+manifypods : pure_all  \
+       Slurp.pm \
+       lib/File/Slurp.pm \
+       slurp_article.pod \
+       Slurp.pm \
+       lib/File/Slurp.pm \
+       slurp_article.pod
+       $(NOECHO) $(POD2MAN) --section=3 --perm_rw=$(PERM_RW)\
+         Slurp.pm $(INST_MAN3DIR)/File::Slurp.$(MAN3EXT) \
+         lib/File/Slurp.pm $(INST_MAN3DIR)/File::Slurp.$(MAN3EXT) \
+         slurp_article.pod $(INST_MAN3DIR)/File::slurp_article.$(MAN3EXT) 
+
+
+
+
+# --- MakeMaker processPL section:
+
+
+# --- MakeMaker installbin section:
+
+
+# --- MakeMaker subdirs section:
+
+# none
+
+# --- MakeMaker clean_subdirs section:
+clean_subdirs :
+       $(NOECHO) $(NOOP)
+
+
+# --- MakeMaker clean section:
+
+# Delete temporary files but do not touch installed files. We don't delete
+# the Makefile here so a later make realclean still has a makefile to use.
+
+clean :: clean_subdirs
+       -$(RM_RF) ./blib $(MAKE_APERL_FILE) $(INST_ARCHAUTODIR)/extralibs.all $(INST_ARCHAUTODIR)/extralibs.ld perlmain.c tmon.out mon.out so_locations pm_to_blib *$(OBJ_EXT) *$(LIB_EXT) perl.exe perl perl$(EXE_EXT) $(BOOTSTRAP) $(BASEEXT).bso $(BASEEXT).def lib$(BASEEXT).def $(BASEEXT).exp $(BASEEXT).x core core.*perl.*.? *perl.core core.[0-9] core.[0-9][0-9] core.[0-9][0-9][0-9] core.[0-9][0-9][0-9][0-9] core.[0-9][0-9][0-9][0-9][0-9]
+       -$(MV) $(FIRST_MAKEFILE) $(MAKEFILE_OLD) $(DEV_NULL)
+
+
+# --- MakeMaker realclean_subdirs section:
+realclean_subdirs :
+       $(NOECHO) $(NOOP)
+
+
+# --- MakeMaker realclean section:
+
+# Delete temporary files (via clean) and also delete installed files
+realclean purge ::  clean realclean_subdirs
+       $(RM_RF) $(INST_AUTODIR) $(INST_ARCHAUTODIR)
+       $(RM_RF) $(DISTVNAME)
+       $(RM_F)  $(INST_LIB)/File/slurp_article.pod $(INST_LIB)/File/seek.pl blib/lib/File/Slurp.pm $(MAKEFILE_OLD) $(INST_LIB)/File/split.pl $(INST_LIB)/File/slurp_bench.pl $(INST_LIB)/File/Slurp.pm
+       $(RM_F) $(INST_LIB)/File/sysread.pl $(FIRST_MAKEFILE) $(INST_LIB)/File/carp.pl
+
+
+# --- MakeMaker metafile section:
+metafile :
+       $(NOECHO) $(ECHO) '# http://module-build.sourceforge.net/META-spec.html' > META.yml
+       $(NOECHO) $(ECHO) '#XXXXXXX This is a prototype!!!  It will change in the future!!! XXXXX#' >> META.yml
+       $(NOECHO) $(ECHO) 'name:         File-Slurp' >> META.yml
+       $(NOECHO) $(ECHO) 'version:      9999.12' >> META.yml
+       $(NOECHO) $(ECHO) 'version_from: lib/File/Slurp.pm' >> META.yml
+       $(NOECHO) $(ECHO) 'installdirs:  site' >> META.yml
+       $(NOECHO) $(ECHO) 'requires:' >> META.yml
+       $(NOECHO) $(ECHO) '' >> META.yml
+       $(NOECHO) $(ECHO) 'distribution_type: module' >> META.yml
+       $(NOECHO) $(ECHO) 'generated_by: ExtUtils::MakeMaker version 6.17' >> META.yml
+
+
+# --- MakeMaker metafile_addtomanifest section:
+metafile_addtomanifest:
+       $(NOECHO) $(PERLRUN) -MExtUtils::Manifest=maniadd -e 'eval { maniadd({q{META.yml} => q{Module meta-data (added by MakeMaker)}}) } ' \
+       -e '    or print "Could not add META.yml to MANIFEST: $${'\''@'\''}\n"'
+
+
+# --- MakeMaker dist_basics section:
+distclean :: realclean distcheck
+       $(NOECHO) $(NOOP)
+
+distcheck :
+       $(PERLRUN) "-MExtUtils::Manifest=fullcheck" -e fullcheck
+
+skipcheck :
+       $(PERLRUN) "-MExtUtils::Manifest=skipcheck" -e skipcheck
+
+manifest :
+       $(PERLRUN) "-MExtUtils::Manifest=mkmanifest" -e mkmanifest
+
+veryclean : realclean
+       $(RM_F) *~ *.orig */*~ */*.orig
+
+
+
+# --- MakeMaker dist_core section:
+
+dist : $(DIST_DEFAULT) $(FIRST_MAKEFILE)
+       $(NOECHO) $(PERLRUN) -l -e 'print '\''Warning: Makefile possibly out of date with $(VERSION_FROM)'\''' \
+       -e '    if -e '\''$(VERSION_FROM)'\'' and -M '\''$(VERSION_FROM)'\'' < -M '\''$(FIRST_MAKEFILE)'\'';'
+
+tardist : $(DISTVNAME).tar$(SUFFIX)
+       $(NOECHO) $(NOOP)
+
+uutardist : $(DISTVNAME).tar$(SUFFIX)
+       uuencode $(DISTVNAME).tar$(SUFFIX) $(DISTVNAME).tar$(SUFFIX) > $(DISTVNAME).tar$(SUFFIX)_uu
+
+$(DISTVNAME).tar$(SUFFIX) : distdir
+       $(PREOP)
+       $(TO_UNIX)
+       $(TAR) $(TARFLAGS) $(DISTVNAME).tar $(DISTVNAME)
+       $(RM_RF) $(DISTVNAME)
+       $(COMPRESS) $(DISTVNAME).tar
+       $(POSTOP)
+
+zipdist : $(DISTVNAME).zip
+       $(NOECHO) $(NOOP)
+
+$(DISTVNAME).zip : distdir
+       $(PREOP)
+       $(ZIP) $(ZIPFLAGS) $(DISTVNAME).zip $(DISTVNAME)
+       $(RM_RF) $(DISTVNAME)
+       $(POSTOP)
+
+shdist : distdir
+       $(PREOP)
+       $(SHAR) $(DISTVNAME) > $(DISTVNAME).shar
+       $(RM_RF) $(DISTVNAME)
+       $(POSTOP)
+
+
+# --- MakeMaker distdir section:
+distdir : metafile metafile_addtomanifest
+       $(RM_RF) $(DISTVNAME)
+       $(PERLRUN) "-MExtUtils::Manifest=manicopy,maniread" \
+               -e "manicopy(maniread(),'$(DISTVNAME)', '$(DIST_CP)');"
+
+
+
+# --- MakeMaker dist_test section:
+
+disttest : distdir
+       cd $(DISTVNAME) && $(ABSPERLRUN) Makefile.PL
+       cd $(DISTVNAME) && $(MAKE) $(PASTHRU)
+       cd $(DISTVNAME) && $(MAKE) test $(PASTHRU)
+
+
+# --- MakeMaker dist_ci section:
+
+ci :
+       $(PERLRUN) "-MExtUtils::Manifest=maniread" \
+         -e "@all = keys %{ maniread() };" \
+         -e "print(qq{Executing $(CI) @all\n}); system(qq{$(CI) @all});" \
+         -e "print(qq{Executing $(RCS_LABEL) ...\n}); system(qq{$(RCS_LABEL) @all});"
+
+
+# --- MakeMaker install section:
+
+install :: all pure_install doc_install
+
+install_perl :: all pure_perl_install doc_perl_install
+
+install_site :: all pure_site_install doc_site_install
+
+install_vendor :: all pure_vendor_install doc_vendor_install
+
+pure_install :: pure_$(INSTALLDIRS)_install
+
+doc_install :: doc_$(INSTALLDIRS)_install
+
+pure__install : pure_site_install
+       $(NOECHO) $(ECHO) INSTALLDIRS not defined, defaulting to INSTALLDIRS=site
+
+doc__install : doc_site_install
+       $(NOECHO) $(ECHO) INSTALLDIRS not defined, defaulting to INSTALLDIRS=site
+
+pure_perl_install ::
+       $(NOECHO) $(MOD_INSTALL) \
+               read $(PERL_ARCHLIB)/auto/$(FULLEXT)/.packlist \
+               write $(DESTINSTALLARCHLIB)/auto/$(FULLEXT)/.packlist \
+               $(INST_LIB) $(DESTINSTALLPRIVLIB) \
+               $(INST_ARCHLIB) $(DESTINSTALLARCHLIB) \
+               $(INST_BIN) $(DESTINSTALLBIN) \
+               $(INST_SCRIPT) $(DESTINSTALLSCRIPT) \
+               $(INST_MAN1DIR) $(DESTINSTALLMAN1DIR) \
+               $(INST_MAN3DIR) $(DESTINSTALLMAN3DIR)
+       $(NOECHO) $(WARN_IF_OLD_PACKLIST) \
+               $(SITEARCHEXP)/auto/$(FULLEXT)
+
+
+pure_site_install ::
+       $(NOECHO) $(MOD_INSTALL) \
+               read $(SITEARCHEXP)/auto/$(FULLEXT)/.packlist \
+               write $(DESTINSTALLSITEARCH)/auto/$(FULLEXT)/.packlist \
+               $(INST_LIB) $(DESTINSTALLSITELIB) \
+               $(INST_ARCHLIB) $(DESTINSTALLSITEARCH) \
+               $(INST_BIN) $(DESTINSTALLSITEBIN) \
+               $(INST_SCRIPT) $(DESTINSTALLSCRIPT) \
+               $(INST_MAN1DIR) $(DESTINSTALLSITEMAN1DIR) \
+               $(INST_MAN3DIR) $(DESTINSTALLSITEMAN3DIR)
+       $(NOECHO) $(WARN_IF_OLD_PACKLIST) \
+               $(PERL_ARCHLIB)/auto/$(FULLEXT)
+
+pure_vendor_install ::
+       $(NOECHO) $(MOD_INSTALL) \
+               read $(VENDORARCHEXP)/auto/$(FULLEXT)/.packlist \
+               write $(DESTINSTALLVENDORARCH)/auto/$(FULLEXT)/.packlist \
+               $(INST_LIB) $(DESTINSTALLVENDORLIB) \
+               $(INST_ARCHLIB) $(DESTINSTALLVENDORARCH) \
+               $(INST_BIN) $(DESTINSTALLVENDORBIN) \
+               $(INST_SCRIPT) $(DESTINSTALLSCRIPT) \
+               $(INST_MAN1DIR) $(DESTINSTALLVENDORMAN1DIR) \
+               $(INST_MAN3DIR) $(DESTINSTALLVENDORMAN3DIR)
+
+doc_perl_install ::
+       $(NOECHO) $(ECHO) Appending installation info to $(DESTINSTALLARCHLIB)/perllocal.pod
+       -$(NOECHO) $(MKPATH) $(DESTINSTALLARCHLIB)
+       -$(NOECHO) $(DOC_INSTALL) \
+               "Module" "$(NAME)" \
+               "installed into" "$(INSTALLPRIVLIB)" \
+               LINKTYPE "$(LINKTYPE)" \
+               VERSION "$(VERSION)" \
+               EXE_FILES "$(EXE_FILES)" \
+               >> $(DESTINSTALLARCHLIB)/perllocal.pod
+
+doc_site_install ::
+       $(NOECHO) $(ECHO) Appending installation info to $(DESTINSTALLARCHLIB)/perllocal.pod
+       -$(NOECHO) $(MKPATH) $(DESTINSTALLARCHLIB)
+       -$(NOECHO) $(DOC_INSTALL) \
+               "Module" "$(NAME)" \
+               "installed into" "$(INSTALLSITELIB)" \
+               LINKTYPE "$(LINKTYPE)" \
+               VERSION "$(VERSION)" \
+               EXE_FILES "$(EXE_FILES)" \
+               >> $(DESTINSTALLARCHLIB)/perllocal.pod
+
+doc_vendor_install ::
+       $(NOECHO) $(ECHO) Appending installation info to $(DESTINSTALLARCHLIB)/perllocal.pod
+       -$(NOECHO) $(MKPATH) $(DESTINSTALLARCHLIB)
+       -$(NOECHO) $(DOC_INSTALL) \
+               "Module" "$(NAME)" \
+               "installed into" "$(INSTALLVENDORLIB)" \
+               LINKTYPE "$(LINKTYPE)" \
+               VERSION "$(VERSION)" \
+               EXE_FILES "$(EXE_FILES)" \
+               >> $(DESTINSTALLARCHLIB)/perllocal.pod
+
+
+uninstall :: uninstall_from_$(INSTALLDIRS)dirs
+
+uninstall_from_perldirs ::
+       $(NOECHO) $(UNINSTALL) $(PERL_ARCHLIB)/auto/$(FULLEXT)/.packlist
+
+uninstall_from_sitedirs ::
+       $(NOECHO) $(UNINSTALL) $(SITEARCHEXP)/auto/$(FULLEXT)/.packlist
+
+uninstall_from_vendordirs ::
+       $(NOECHO) $(UNINSTALL) $(VENDORARCHEXP)/auto/$(FULLEXT)/.packlist
+
+
+# --- MakeMaker force section:
+# Phony target to force checking subdirectories.
+FORCE:
+       $(NOECHO) $(NOOP)
+
+
+# --- MakeMaker perldepend section:
+
+
+# --- MakeMaker makefile section:
+
+# We take a very conservative approach here, but it's worth it.
+# We move Makefile to Makefile.old here to avoid gnu make looping.
+$(FIRST_MAKEFILE) : Makefile.PL $(CONFIGDEP)
+       $(NOECHO) $(ECHO) "Makefile out-of-date with respect to $?"
+       $(NOECHO) $(ECHO) "Cleaning current config before rebuilding Makefile..."
+       $(NOECHO) $(RM_F) $(MAKEFILE_OLD)
+       $(NOECHO) $(MV)   $(FIRST_MAKEFILE) $(MAKEFILE_OLD)
+       -$(MAKE) -f $(MAKEFILE_OLD) clean $(DEV_NULL) || $(NOOP)
+       $(PERLRUN) Makefile.PL 
+       $(NOECHO) $(ECHO) "==> Your Makefile has been rebuilt. <=="
+       $(NOECHO) $(ECHO) "==> Please rerun the make command.  <=="
+       false
+
+
+
+# --- MakeMaker staticmake section:
+
+# --- MakeMaker makeaperl section ---
+MAP_TARGET    = perl
+FULLPERL      = /usr/local/bin/perl
+
+$(MAP_TARGET) :: static $(MAKE_APERL_FILE)
+       $(MAKE) -f $(MAKE_APERL_FILE) $@
+
+$(MAKE_APERL_FILE) : $(FIRST_MAKEFILE)
+       $(NOECHO) $(ECHO) Writing \"$(MAKE_APERL_FILE)\" for this $(MAP_TARGET)
+       $(NOECHO) $(PERLRUNINST) \
+               Makefile.PL DIR= \
+               MAKEFILE=$(MAKE_APERL_FILE) LINKTYPE=static \
+               MAKEAPERL=1 NORECURS=1 CCCDLFLAGS=
+
+
+# --- MakeMaker test section:
+
+TEST_VERBOSE=0
+TEST_TYPE=test_$(LINKTYPE)
+TEST_FILE = test.pl
+TEST_FILES = t/*.t
+TESTDB_SW = -d
+
+testdb :: testdb_$(LINKTYPE)
+
+test :: $(TEST_TYPE)
+
+test_dynamic :: pure_all
+       PERL_DL_NONLAZY=1 $(FULLPERLRUN) "-MExtUtils::Command::MM" "-e" "test_harness($(TEST_VERBOSE), '$(INST_LIB)', '$(INST_ARCHLIB)')" $(TEST_FILES)
+
+testdb_dynamic :: pure_all
+       PERL_DL_NONLAZY=1 $(FULLPERLRUN) $(TESTDB_SW) "-I$(INST_LIB)" "-I$(INST_ARCHLIB)" $(TEST_FILE)
+
+test_ : test_dynamic
+
+test_static :: test_dynamic
+testdb_static :: testdb_dynamic
+
+
+# --- MakeMaker ppd section:
+# Creates a PPD (Perl Package Description) for a binary distribution.
+ppd:
+       $(NOECHO) $(ECHO) '<SOFTPKG NAME="$(DISTNAME)" VERSION="9999,12,0,0">' > $(DISTNAME).ppd
+       $(NOECHO) $(ECHO) '    <TITLE>$(DISTNAME)</TITLE>' >> $(DISTNAME).ppd
+       $(NOECHO) $(ECHO) '    <ABSTRACT>Efficient Reading/Writing of Complete Files</ABSTRACT>' >> $(DISTNAME).ppd
+       $(NOECHO) $(ECHO) '    <AUTHOR>Uri Guttman &lt;uri@stemsystems.com&gt;</AUTHOR>' >> $(DISTNAME).ppd
+       $(NOECHO) $(ECHO) '    <IMPLEMENTATION>' >> $(DISTNAME).ppd
+       $(NOECHO) $(ECHO) '        <OS NAME="$(OSNAME)" />' >> $(DISTNAME).ppd
+       $(NOECHO) $(ECHO) '        <ARCHITECTURE NAME="sun4-solaris" />' >> $(DISTNAME).ppd
+       $(NOECHO) $(ECHO) '        <CODEBASE HREF="" />' >> $(DISTNAME).ppd
+       $(NOECHO) $(ECHO) '    </IMPLEMENTATION>' >> $(DISTNAME).ppd
+       $(NOECHO) $(ECHO) '</SOFTPKG>' >> $(DISTNAME).ppd
+
+
+# --- MakeMaker pm_to_blib section:
+
+pm_to_blib: $(TO_INST_PM)
+       $(NOECHO) $(PERLRUN) -MExtUtils::Install -e 'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
+         Slurp.pm $(INST_LIB)/File/Slurp.pm \
+         carp.pl $(INST_LIB)/File/carp.pl \
+         lib/File/Slurp.pm blib/lib/File/Slurp.pm \
+         seek.pl $(INST_LIB)/File/seek.pl \
+         split.pl $(INST_LIB)/File/split.pl \
+         sysread.pl $(INST_LIB)/File/sysread.pl \
+         slurp_article.pod $(INST_LIB)/File/slurp_article.pod \
+         slurp_bench.pl $(INST_LIB)/File/slurp_bench.pl 
+       $(NOECHO) $(TOUCH) $@
+
+# --- MakeMaker selfdocument section:
+
+
+# --- MakeMaker postamble section:
+
+
+# End.
diff --git a/Makefile.PL b/Makefile.PL

new file mode 100644 (file)

index 0000000..07a60d5
--- /dev/null
+++ b/Makefile.PL
@@ -0,0 +1,11 @@
+use ExtUtils::MakeMaker;
+# See lib/ExtUtils/MakeMaker.pm for details of how to influence
+# the contents of the Makefile that is written.
+WriteMakefile(
+    'NAME'             => 'File::Slurp',
+    'VERSION_FROM'     => 'lib/File/Slurp.pm', # finds $VERSION
+    'PREREQ_PM'                => {}, # e.g., Module::Name => 1.1
+    ($] >= 5.005 ?    ## Add these new keywords supported since 5.005
+      (ABSTRACT_FROM => 'lib/File/Slurp.pm', # retrieve abstract from module
+       AUTHOR     => 'Uri Guttman <uri@stemsystems.com>') : ()),
+);
diff --git a/Makefile.old b/Makefile.old

new file mode 100644 (file)

index 0000000..811e0f6
--- /dev/null
+++ b/Makefile.old
@@ -0,0 +1,700 @@
+# This Makefile is for the File::Slurp extension to perl.
+#
+# It was generated automatically by MakeMaker version
+# 5.45 (Revision: 1.222) from the contents of
+# Makefile.PL. Don't edit this file, edit Makefile.PL instead.
+#
+#      ANY CHANGES MADE HERE WILL BE LOST!
+#
+#   MakeMaker ARGV: ()
+#
+#   MakeMaker Parameters:
+
+#      ABSTRACT_FROM => q[lib/File/Slurp.pm]
+#      AUTHOR => q[Uri Guttman <uri@stemsystems.com>]
+#      NAME => q[File::Slurp]
+#      PREREQ_PM => {  }
+#      VERSION_FROM => q[lib/File/Slurp.pm]
+
+# --- MakeMaker post_initialize section:
+
+
+# --- MakeMaker const_config section:
+
+# These definitions are from config.sh (via /usr/local/lib/perl5/5.6.1/sun4-solaris/Config.pm)
+
+# They may have been overridden via Makefile.PL or on the command line
+AR = ar
+CC = gcc
+CCCDLFLAGS = -fPIC
+CCDLFLAGS =  
+DLEXT = so
+DLSRC = dl_dlopen.xs
+LD = gcc
+LDDLFLAGS = -G -L/usr/local/lib
+LDFLAGS =  -L/usr/local/lib 
+LIBC = /lib/libc.so
+LIB_EXT = .a
+OBJ_EXT = .o
+OSNAME = solaris
+OSVERS = 2.7
+RANLIB = :
+SO = so
+EXE_EXT = 
+FULL_AR = /usr/ccs/bin/ar
+
+
+# --- MakeMaker constants section:
+AR_STATIC_ARGS = cr
+NAME = File::Slurp
+DISTNAME = File-Slurp
+NAME_SYM = File_Slurp
+VERSION = 9999.03
+VERSION_SYM = 9999_03
+XS_VERSION = 9999.03
+INST_BIN = blib/bin
+INST_EXE = blib/script
+INST_LIB = blib/lib
+INST_ARCHLIB = blib/arch
+INST_SCRIPT = blib/script
+PREFIX = /usr/local
+INSTALLDIRS = site
+INSTALLPRIVLIB = $(PREFIX)/lib/perl5/5.6.1
+INSTALLARCHLIB = $(PREFIX)/lib/perl5/5.6.1/sun4-solaris
+INSTALLSITELIB = $(PREFIX)/lib/perl5/site_perl/5.6.1
+INSTALLSITEARCH = $(PREFIX)/lib/perl5/site_perl/5.6.1/sun4-solaris
+INSTALLBIN = $(PREFIX)/bin
+INSTALLSCRIPT = $(PREFIX)/bin
+PERL_LIB = /usr/local/lib/perl5/5.6.1
+PERL_ARCHLIB = /usr/local/lib/perl5/5.6.1/sun4-solaris
+SITELIBEXP = /usr/local/lib/perl5/site_perl/5.6.1
+SITEARCHEXP = /usr/local/lib/perl5/site_perl/5.6.1/sun4-solaris
+LIBPERL_A = libperl.a
+FIRST_MAKEFILE = Makefile
+MAKE_APERL_FILE = Makefile.aperl
+PERLMAINCC = $(CC)
+PERL_INC = /usr/local/lib/perl5/5.6.1/sun4-solaris/CORE
+PERL = /usr/local/bin/perl
+FULLPERL = /usr/local/bin/perl
+FULL_AR = /usr/ccs/bin/ar
+
+VERSION_MACRO = VERSION
+DEFINE_VERSION = -D$(VERSION_MACRO)=\"$(VERSION)\"
+XS_VERSION_MACRO = XS_VERSION
+XS_DEFINE_VERSION = -D$(XS_VERSION_MACRO)=\"$(XS_VERSION)\"
+PERL_MALLOC_DEF = -DPERL_EXTMALLOC_DEF -Dmalloc=Perl_malloc -Dfree=Perl_mfree -Drealloc=Perl_realloc -Dcalloc=Perl_calloc
+
+MAKEMAKER = /usr/local/lib/perl5/5.6.1/ExtUtils/MakeMaker.pm
+MM_VERSION = 5.45
+
+# FULLEXT = Pathname for extension directory (eg Foo/Bar/Oracle).
+# BASEEXT = Basename part of FULLEXT. May be just equal FULLEXT. (eg Oracle)
+# ROOTEXT = Directory part of FULLEXT with leading slash (eg /DBD)  !!! Deprecated from MM 5.32  !!!
+# PARENT_NAME = NAME without BASEEXT and no trailing :: (eg Foo::Bar)
+# DLBASE  = Basename part of dynamic library. May be just equal BASEEXT.
+FULLEXT = File/Slurp
+BASEEXT = Slurp
+PARENT_NAME = File
+DLBASE = $(BASEEXT)
+VERSION_FROM = lib/File/Slurp.pm
+OBJECT = 
+LDFROM = $(OBJECT)
+LINKTYPE = dynamic
+
+# Handy lists of source code files:
+XS_FILES= 
+C_FILES = 
+O_FILES = 
+H_FILES = 
+HTMLLIBPODS    = 
+HTMLSCRIPTPODS = 
+MAN1PODS = 
+MAN3PODS = lib/File/Slurp.pm \
+       slurp_article.pod
+HTMLEXT = html
+INST_MAN1DIR = blib/man1
+INSTALLMAN1DIR = $(PREFIX)/man/man1
+MAN1EXT = 1
+INST_MAN3DIR = blib/man3
+INSTALLMAN3DIR = $(PREFIX)/man/man3
+MAN3EXT = 3
+PERM_RW = 644
+PERM_RWX = 755
+
+# work around a famous dec-osf make(1) feature(?):
+makemakerdflt: all
+
+.SUFFIXES: .xs .c .C .cpp .cxx .cc $(OBJ_EXT)
+
+# Nick wanted to get rid of .PRECIOUS. I don't remember why. I seem to recall, that
+# some make implementations will delete the Makefile when we rebuild it. Because
+# we call false(1) when we rebuild it. So make(1) is not completely wrong when it
+# does so. Our milage may vary.
+# .PRECIOUS: Makefile    # seems to be not necessary anymore
+
+.PHONY: all config static dynamic test linkext manifest
+
+# Where is the Config information that we are using/depend on
+CONFIGDEP = $(PERL_ARCHLIB)/Config.pm $(PERL_INC)/config.h
+
+# Where to put things:
+INST_LIBDIR      = $(INST_LIB)/File
+INST_ARCHLIBDIR  = $(INST_ARCHLIB)/File
+
+INST_AUTODIR     = $(INST_LIB)/auto/$(FULLEXT)
+INST_ARCHAUTODIR = $(INST_ARCHLIB)/auto/$(FULLEXT)
+
+INST_STATIC  =
+INST_DYNAMIC =
+INST_BOOT    =
+
+EXPORT_LIST = 
+
+PERL_ARCHIVE = 
+
+PERL_ARCHIVE_AFTER = 
+
+TO_INST_PM = carp.pl \
+       lib/File/Slurp.pm \
+       lib/File/Slurp.pm.~1.15.~ \
+       slurp_article.pod \
+       slurp_bench.pl \
+       split.pl \
+       sysread.pl
+
+PM_TO_BLIB = slurp_article.pod \
+       $(INST_LIBDIR)/slurp_article.pod \
+       carp.pl \
+       $(INST_LIBDIR)/carp.pl \
+       lib/File/Slurp.pm.~1.15.~ \
+       $(INST_LIB)/File/Slurp.pm.~1.15.~ \
+       split.pl \
+       $(INST_LIBDIR)/split.pl \
+       sysread.pl \
+       $(INST_LIBDIR)/sysread.pl \
+       lib/File/Slurp.pm \
+       $(INST_LIB)/File/Slurp.pm \
+       slurp_bench.pl \
+       $(INST_LIBDIR)/slurp_bench.pl
+
+
+# --- MakeMaker tool_autosplit section:
+
+# Usage: $(AUTOSPLITFILE) FileToSplit AutoDirToSplitInto
+AUTOSPLITFILE = $(PERL) "-I$(PERL_ARCHLIB)" "-I$(PERL_LIB)" -e 'use AutoSplit;autosplit($$ARGV[0], $$ARGV[1], 0, 1, 1) ;'
+
+
+# --- MakeMaker tool_xsubpp section:
+
+
+# --- MakeMaker tools_other section:
+
+SHELL = /bin/sh
+CHMOD = chmod
+CP = cp
+LD = gcc
+MV = mv
+NOOP = $(SHELL) -c true
+RM_F = rm -f
+RM_RF = rm -rf
+TEST_F = test -f
+TOUCH = touch
+UMASK_NULL = umask 0
+DEV_NULL = > /dev/null 2>&1
+
+# The following is a portable way to say mkdir -p
+# To see which directories are created, change the if 0 to if 1
+MKPATH = $(PERL) -I$(PERL_ARCHLIB) -I$(PERL_LIB) -MExtUtils::Command -e mkpath
+
+# This helps us to minimize the effect of the .exists files A yet
+# better solution would be to have a stable file in the perl
+# distribution with a timestamp of zero. But this solution doesn't
+# need any changes to the core distribution and works with older perls
+EQUALIZE_TIMESTAMP = $(PERL) -I$(PERL_ARCHLIB) -I$(PERL_LIB) -MExtUtils::Command -e eqtime
+
+# Here we warn users that an old packlist file was found somewhere,
+# and that they should call some uninstall routine
+WARN_IF_OLD_PACKLIST = $(PERL) -we 'exit unless -f $$ARGV[0];' \
+-e 'print "WARNING: I have found an old package in\n";' \
+-e 'print "\t$$ARGV[0].\n";' \
+-e 'print "Please make sure the two installations are not conflicting\n";'
+
+UNINST=0
+VERBINST=0
+
+MOD_INSTALL = $(PERL) -I$(INST_LIB) -I$(PERL_LIB) -MExtUtils::Install \
+-e "install({@ARGV},'$(VERBINST)',0,'$(UNINST)');"
+
+DOC_INSTALL = $(PERL) -e '$$\="\n\n";' \
+-e 'print "=head2 ", scalar(localtime), ": C<", shift, ">", " L<", $$arg=shift, "|", $$arg, ">";' \
+-e 'print "=over 4";' \
+-e 'while (defined($$key = shift) and defined($$val = shift)){print "=item *";print "C<$$key: $$val>";}' \
+-e 'print "=back";'
+
+UNINSTALL =   $(PERL) -MExtUtils::Install \
+-e 'uninstall($$ARGV[0],1,1); print "\nUninstall is deprecated. Please check the";' \
+-e 'print " packlist above carefully.\n  There may be errors. Remove the";' \
+-e 'print " appropriate files manually.\n  Sorry for the inconveniences.\n"'
+
+
+# --- MakeMaker dist section:
+
+DISTVNAME = $(DISTNAME)-$(VERSION)
+TAR  = tar
+TARFLAGS = cvf
+ZIP  = zip
+ZIPFLAGS = -r
+COMPRESS = gzip --best
+SUFFIX = .gz
+SHAR = shar
+PREOP = @$(NOOP)
+POSTOP = @$(NOOP)
+TO_UNIX = @$(NOOP)
+CI = ci -u
+RCS_LABEL = rcs -Nv$(VERSION_SYM): -q
+DIST_CP = best
+DIST_DEFAULT = tardist
+
+
+# --- MakeMaker macro section:
+
+
+# --- MakeMaker depend section:
+
+
+# --- MakeMaker cflags section:
+
+
+# --- MakeMaker const_loadlibs section:
+
+
+# --- MakeMaker const_cccmd section:
+
+
+# --- MakeMaker post_constants section:
+
+
+# --- MakeMaker pasthru section:
+
+PASTHRU = LIB="$(LIB)"\
+       LIBPERL_A="$(LIBPERL_A)"\
+       LINKTYPE="$(LINKTYPE)"\
+       PREFIX="$(PREFIX)"\
+       OPTIMIZE="$(OPTIMIZE)"
+
+
+# --- MakeMaker c_o section:
+
+
+# --- MakeMaker xs_c section:
+
+
+# --- MakeMaker xs_o section:
+
+
+# --- MakeMaker top_targets section:
+
+#all ::        config $(INST_PM) subdirs linkext manifypods
+
+all :: pure_all htmlifypods manifypods
+       @$(NOOP)
+
+pure_all :: config pm_to_blib subdirs linkext
+       @$(NOOP)
+
+subdirs :: $(MYEXTLIB)
+       @$(NOOP)
+
+config :: Makefile $(INST_LIBDIR)/.exists
+       @$(NOOP)
+
+config :: $(INST_ARCHAUTODIR)/.exists
+       @$(NOOP)
+
+config :: $(INST_AUTODIR)/.exists
+       @$(NOOP)
+
+$(INST_AUTODIR)/.exists :: /usr/local/lib/perl5/5.6.1/sun4-solaris/CORE/perl.h
+       @$(MKPATH) $(INST_AUTODIR)
+       @$(EQUALIZE_TIMESTAMP) /usr/local/lib/perl5/5.6.1/sun4-solaris/CORE/perl.h $(INST_AUTODIR)/.exists
+
+       -@$(CHMOD) $(PERM_RWX) $(INST_AUTODIR)
+
+$(INST_LIBDIR)/.exists :: /usr/local/lib/perl5/5.6.1/sun4-solaris/CORE/perl.h
+       @$(MKPATH) $(INST_LIBDIR)
+       @$(EQUALIZE_TIMESTAMP) /usr/local/lib/perl5/5.6.1/sun4-solaris/CORE/perl.h $(INST_LIBDIR)/.exists
+
+       -@$(CHMOD) $(PERM_RWX) $(INST_LIBDIR)
+
+$(INST_ARCHAUTODIR)/.exists :: /usr/local/lib/perl5/5.6.1/sun4-solaris/CORE/perl.h
+       @$(MKPATH) $(INST_ARCHAUTODIR)
+       @$(EQUALIZE_TIMESTAMP) /usr/local/lib/perl5/5.6.1/sun4-solaris/CORE/perl.h $(INST_ARCHAUTODIR)/.exists
+
+       -@$(CHMOD) $(PERM_RWX) $(INST_ARCHAUTODIR)
+
+config :: $(INST_MAN3DIR)/.exists
+       @$(NOOP)
+
+
+$(INST_MAN3DIR)/.exists :: /usr/local/lib/perl5/5.6.1/sun4-solaris/CORE/perl.h
+       @$(MKPATH) $(INST_MAN3DIR)
+       @$(EQUALIZE_TIMESTAMP) /usr/local/lib/perl5/5.6.1/sun4-solaris/CORE/perl.h $(INST_MAN3DIR)/.exists
+
+       -@$(CHMOD) $(PERM_RWX) $(INST_MAN3DIR)
+
+help:
+       perldoc ExtUtils::MakeMaker
+
+Version_check:
+       @$(PERL) -I$(PERL_ARCHLIB) -I$(PERL_LIB) \
+               -MExtUtils::MakeMaker=Version_check \
+               -e "Version_check('$(MM_VERSION)')"
+
+
+# --- MakeMaker linkext section:
+
+linkext :: $(LINKTYPE)
+       @$(NOOP)
+
+
+# --- MakeMaker dlsyms section:
+
+
+# --- MakeMaker dynamic section:
+
+## $(INST_PM) has been moved to the all: target.
+## It remains here for awhile to allow for old usage: "make dynamic"
+#dynamic :: Makefile $(INST_DYNAMIC) $(INST_BOOT) $(INST_PM)
+dynamic :: Makefile $(INST_DYNAMIC) $(INST_BOOT)
+       @$(NOOP)
+
+
+# --- MakeMaker dynamic_bs section:
+
+BOOTSTRAP =
+
+
+# --- MakeMaker dynamic_lib section:
+
+
+# --- MakeMaker static section:
+
+## $(INST_PM) has been moved to the all: target.
+## It remains here for awhile to allow for old usage: "make static"
+#static :: Makefile $(INST_STATIC) $(INST_PM)
+static :: Makefile $(INST_STATIC)
+       @$(NOOP)
+
+
+# --- MakeMaker static_lib section:
+
+
+# --- MakeMaker htmlifypods section:
+
+htmlifypods : pure_all
+       @$(NOOP)
+
+
+# --- MakeMaker manifypods section:
+POD2MAN_EXE = /usr/local/bin/pod2man
+POD2MAN = $(PERL) -we '%m=@ARGV;for (keys %m){' \
+-e 'next if -e $$m{$$_} && -M $$m{$$_} < -M $$_ && -M $$m{$$_} < -M "Makefile";' \
+-e 'print "Manifying $$m{$$_}\n";' \
+-e 'system(qq[$$^X ].q["-I$(PERL_ARCHLIB)" "-I$(PERL_LIB)" $(POD2MAN_EXE) ].qq[$$_>$$m{$$_}])==0 or warn "Couldn\047t install $$m{$$_}\n";' \
+-e 'chmod(oct($(PERM_RW))), $$m{$$_} or warn "chmod $(PERM_RW) $$m{$$_}: $$!\n";}'
+
+manifypods : pure_all slurp_article.pod \
+       lib/File/Slurp.pm
+       @$(POD2MAN) \
+       slurp_article.pod \
+       $(INST_MAN3DIR)/File::slurp_article.$(MAN3EXT) \
+       lib/File/Slurp.pm \
+       $(INST_MAN3DIR)/File::Slurp.$(MAN3EXT)
+
+# --- MakeMaker processPL section:
+
+
+# --- MakeMaker installbin section:
+
+
+# --- MakeMaker subdirs section:
+
+# none
+
+# --- MakeMaker clean section:
+
+# Delete temporary files but do not touch installed files. We don't delete
+# the Makefile here so a later make realclean still has a makefile to use.
+
+clean ::
+       -rm -rf ./blib $(MAKE_APERL_FILE) $(INST_ARCHAUTODIR)/extralibs.all perlmain.c mon.out core core.*perl.*.? *perl.core so_locations pm_to_blib *$(OBJ_EXT) *$(LIB_EXT) perl.exe $(BOOTSTRAP) $(BASEEXT).bso $(BASEEXT).def $(BASEEXT).exp
+       -mv Makefile Makefile.old $(DEV_NULL)
+
+
+# --- MakeMaker realclean section:
+
+# Delete temporary files (via clean) and also delete installed files
+realclean purge ::  clean
+       rm -rf $(INST_AUTODIR) $(INST_ARCHAUTODIR)
+       rm -f  $(INST_LIBDIR)/slurp_article.pod $(INST_LIBDIR)/carp.pl
+       rm -f $(INST_LIB)/File/Slurp.pm.~1.15.~ $(INST_LIBDIR)/split.pl
+       rm -f $(INST_LIBDIR)/sysread.pl $(INST_LIB)/File/Slurp.pm $(INST_LIBDIR)/slurp_bench.pl
+       rm -rf Makefile Makefile.old
+
+
+# --- MakeMaker dist_basics section:
+
+distclean :: realclean distcheck
+
+distcheck :
+       $(PERL) -I$(PERL_ARCHLIB) -I$(PERL_LIB) -MExtUtils::Manifest=fullcheck \
+               -e fullcheck
+
+skipcheck :
+       $(PERL) -I$(PERL_ARCHLIB) -I$(PERL_LIB) -MExtUtils::Manifest=skipcheck \
+               -e skipcheck
+
+manifest :
+       $(PERL) -I$(PERL_ARCHLIB) -I$(PERL_LIB) -MExtUtils::Manifest=mkmanifest \
+               -e mkmanifest
+
+veryclean : realclean
+       $(RM_F) *~ *.orig */*~ */*.orig
+
+
+# --- MakeMaker dist_core section:
+
+dist : $(DIST_DEFAULT)
+       @$(PERL) -le 'print "Warning: Makefile possibly out of date with $$vf" if ' \
+           -e '-e ($$vf="$(VERSION_FROM)") and -M $$vf < -M "Makefile";'
+
+tardist : $(DISTVNAME).tar$(SUFFIX)
+
+zipdist : $(DISTVNAME).zip
+
+$(DISTVNAME).tar$(SUFFIX) : distdir
+       $(PREOP)
+       $(TO_UNIX)
+       $(TAR) $(TARFLAGS) $(DISTVNAME).tar $(DISTVNAME)
+       $(RM_RF) $(DISTVNAME)
+       $(COMPRESS) $(DISTVNAME).tar
+       $(POSTOP)
+
+$(DISTVNAME).zip : distdir
+       $(PREOP)
+       $(ZIP) $(ZIPFLAGS) $(DISTVNAME).zip $(DISTVNAME)
+       $(RM_RF) $(DISTVNAME)
+       $(POSTOP)
+
+uutardist : $(DISTVNAME).tar$(SUFFIX)
+       uuencode $(DISTVNAME).tar$(SUFFIX) \
+               $(DISTVNAME).tar$(SUFFIX) > \
+               $(DISTVNAME).tar$(SUFFIX)_uu
+
+shdist : distdir
+       $(PREOP)
+       $(SHAR) $(DISTVNAME) > $(DISTVNAME).shar
+       $(RM_RF) $(DISTVNAME)
+       $(POSTOP)
+
+
+# --- MakeMaker dist_dir section:
+
+distdir :
+       $(RM_RF) $(DISTVNAME)
+       $(PERL) -I$(PERL_ARCHLIB) -I$(PERL_LIB) -MExtUtils::Manifest=manicopy,maniread \
+               -e "manicopy(maniread(),'$(DISTVNAME)', '$(DIST_CP)');"
+
+
+# --- MakeMaker dist_test section:
+
+disttest : distdir
+       cd $(DISTVNAME) && $(PERL) -I$(PERL_ARCHLIB) -I$(PERL_LIB) Makefile.PL
+       cd $(DISTVNAME) && $(MAKE)
+       cd $(DISTVNAME) && $(MAKE) test
+
+
+# --- MakeMaker dist_ci section:
+
+ci :
+       $(PERL) -I$(PERL_ARCHLIB) -I$(PERL_LIB) -MExtUtils::Manifest=maniread \
+               -e "@all = keys %{ maniread() };" \
+               -e 'print("Executing $(CI) @all\n"); system("$(CI) @all");' \
+               -e 'print("Executing $(RCS_LABEL) ...\n"); system("$(RCS_LABEL) @all");'
+
+
+# --- MakeMaker install section:
+
+install :: all pure_install doc_install
+
+install_perl :: all pure_perl_install doc_perl_install
+
+install_site :: all pure_site_install doc_site_install
+
+install_ :: install_site
+       @echo INSTALLDIRS not defined, defaulting to INSTALLDIRS=site
+
+pure_install :: pure_$(INSTALLDIRS)_install
+
+doc_install :: doc_$(INSTALLDIRS)_install
+       @echo Appending installation info to $(INSTALLARCHLIB)/perllocal.pod
+
+pure__install : pure_site_install
+       @echo INSTALLDIRS not defined, defaulting to INSTALLDIRS=site
+
+doc__install : doc_site_install
+       @echo INSTALLDIRS not defined, defaulting to INSTALLDIRS=site
+
+pure_perl_install ::
+       @$(MOD_INSTALL) \
+               read $(PERL_ARCHLIB)/auto/$(FULLEXT)/.packlist \
+               write $(INSTALLARCHLIB)/auto/$(FULLEXT)/.packlist \
+               $(INST_LIB) $(INSTALLPRIVLIB) \
+               $(INST_ARCHLIB) $(INSTALLARCHLIB) \
+               $(INST_BIN) $(INSTALLBIN) \
+               $(INST_SCRIPT) $(INSTALLSCRIPT) \
+               $(INST_HTMLLIBDIR) $(INSTALLHTMLPRIVLIBDIR) \
+               $(INST_HTMLSCRIPTDIR) $(INSTALLHTMLSCRIPTDIR) \
+               $(INST_MAN1DIR) $(INSTALLMAN1DIR) \
+               $(INST_MAN3DIR) $(INSTALLMAN3DIR)
+       @$(WARN_IF_OLD_PACKLIST) \
+               $(SITEARCHEXP)/auto/$(FULLEXT)
+
+
+pure_site_install ::
+       @$(MOD_INSTALL) \
+               read $(SITEARCHEXP)/auto/$(FULLEXT)/.packlist \
+               write $(INSTALLSITEARCH)/auto/$(FULLEXT)/.packlist \
+               $(INST_LIB) $(INSTALLSITELIB) \
+               $(INST_ARCHLIB) $(INSTALLSITEARCH) \
+               $(INST_BIN) $(INSTALLBIN) \
+               $(INST_SCRIPT) $(INSTALLSCRIPT) \
+               $(INST_HTMLLIBDIR) $(INSTALLHTMLSITELIBDIR) \
+               $(INST_HTMLSCRIPTDIR) $(INSTALLHTMLSCRIPTDIR) \
+               $(INST_MAN1DIR) $(INSTALLMAN1DIR) \
+               $(INST_MAN3DIR) $(INSTALLMAN3DIR)
+       @$(WARN_IF_OLD_PACKLIST) \
+               $(PERL_ARCHLIB)/auto/$(FULLEXT)
+
+doc_perl_install ::
+       -@$(MKPATH) $(INSTALLARCHLIB)
+       -@$(DOC_INSTALL) \
+               "Module" "$(NAME)" \
+               "installed into" "$(INSTALLPRIVLIB)" \
+               LINKTYPE "$(LINKTYPE)" \
+               VERSION "$(VERSION)" \
+               EXE_FILES "$(EXE_FILES)" \
+               >> $(INSTALLARCHLIB)/perllocal.pod
+
+doc_site_install ::
+       -@$(MKPATH) $(INSTALLARCHLIB)
+       -@$(DOC_INSTALL) \
+               "Module" "$(NAME)" \
+               "installed into" "$(INSTALLSITELIB)" \
+               LINKTYPE "$(LINKTYPE)" \
+               VERSION "$(VERSION)" \
+               EXE_FILES "$(EXE_FILES)" \
+               >> $(INSTALLARCHLIB)/perllocal.pod
+
+
+uninstall :: uninstall_from_$(INSTALLDIRS)dirs
+
+uninstall_from_perldirs ::
+       @$(UNINSTALL) $(PERL_ARCHLIB)/auto/$(FULLEXT)/.packlist
+
+uninstall_from_sitedirs ::
+       @$(UNINSTALL) $(SITEARCHEXP)/auto/$(FULLEXT)/.packlist
+
+
+# --- MakeMaker force section:
+# Phony target to force checking subdirectories.
+FORCE:
+       @$(NOOP)
+
+
+# --- MakeMaker perldepend section:
+
+
+# --- MakeMaker makefile section:
+
+# We take a very conservative approach here, but it\'s worth it.
+# We move Makefile to Makefile.old here to avoid gnu make looping.
+Makefile : Makefile.PL $(CONFIGDEP)
+       @echo "Makefile out-of-date with respect to $?"
+       @echo "Cleaning current config before rebuilding Makefile..."
+       -@$(RM_F) Makefile.old
+       -@$(MV) Makefile Makefile.old
+       -$(MAKE) -f Makefile.old clean $(DEV_NULL) || $(NOOP)
+       $(PERL) "-I$(PERL_ARCHLIB)" "-I$(PERL_LIB)" Makefile.PL 
+       @echo "==> Your Makefile has been rebuilt. <=="
+       @echo "==> Please rerun the make command.  <=="
+       false
+
+# To change behavior to :: would be nice, but would break Tk b9.02
+# so you find such a warning below the dist target.
+#Makefile :: $(VERSION_FROM)
+#      @echo "Warning: Makefile possibly out of date with $(VERSION_FROM)"
+
+
+# --- MakeMaker staticmake section:
+
+# --- MakeMaker makeaperl section ---
+MAP_TARGET    = perl
+FULLPERL      = /usr/local/bin/perl
+
+$(MAP_TARGET) :: static $(MAKE_APERL_FILE)
+       $(MAKE) -f $(MAKE_APERL_FILE) $@
+
+$(MAKE_APERL_FILE) : $(FIRST_MAKEFILE)
+       @echo Writing \"$(MAKE_APERL_FILE)\" for this $(MAP_TARGET)
+       @$(PERL) -I$(INST_ARCHLIB) -I$(INST_LIB) -I$(PERL_ARCHLIB) -I$(PERL_LIB) \
+               Makefile.PL DIR= \
+               MAKEFILE=$(MAKE_APERL_FILE) LINKTYPE=static \
+               MAKEAPERL=1 NORECURS=1 CCCDLFLAGS=
+
+
+# --- MakeMaker test section:
+
+TEST_VERBOSE=0
+TEST_TYPE=test_$(LINKTYPE)
+TEST_FILE = test.pl
+TEST_FILES = t/*.t
+TESTDB_SW = -d
+
+testdb :: testdb_$(LINKTYPE)
+
+test :: $(TEST_TYPE)
+
+test_dynamic :: pure_all
+       PERL_DL_NONLAZY=1 $(FULLPERL) -I$(INST_ARCHLIB) -I$(INST_LIB) -I$(PERL_ARCHLIB) -I$(PERL_LIB) -e 'use Test::Harness qw(&runtests $$verbose); $$verbose=$(TEST_VERBOSE); runtests @ARGV;' $(TEST_FILES)
+
+testdb_dynamic :: pure_all
+       PERL_DL_NONLAZY=1 $(FULLPERL) $(TESTDB_SW) -I$(INST_ARCHLIB) -I$(INST_LIB) -I$(PERL_ARCHLIB) -I$(PERL_LIB) $(TEST_FILE)
+
+test_ : test_dynamic
+
+test_static :: test_dynamic
+testdb_static :: testdb_dynamic
+
+
+# --- MakeMaker ppd section:
+# Creates a PPD (Perl Package Description) for a binary distribution.
+ppd:
+       @$(PERL) -e "print qq{<SOFTPKG NAME=\"File-Slurp\" VERSION=\"9999,03,0,0\">\n}. qq{\t<TITLE>File-Slurp</TITLE>\n}. qq{\t<ABSTRACT>Efficient Reading/Writing of Complete Files</ABSTRACT>\n}. qq{\t<AUTHOR>Uri Guttman &lt;uri\@stemsystems.com&gt;</AUTHOR>\n}. qq{\t<IMPLEMENTATION>\n}. qq{\t\t<OS NAME=\"$(OSNAME)\" />\n}. qq{\t\t<ARCHITECTURE NAME=\"sun4-solaris\" />\n}. qq{\t\t<CODEBASE HREF=\"\" />\n}. qq{\t</IMPLEMENTATION>\n}. qq{</SOFTPKG>\n}" > File-Slurp.ppd
+
+# --- MakeMaker pm_to_blib section:
+
+pm_to_blib: $(TO_INST_PM)
+       @$(PERL) "-I$(INST_ARCHLIB)" "-I$(INST_LIB)" \
+       "-I$(PERL_ARCHLIB)" "-I$(PERL_LIB)" -MExtUtils::Install \
+        -e "pm_to_blib({qw{$(PM_TO_BLIB)}},'$(INST_LIB)/auto','$(PM_FILTER)')"
+       @$(TOUCH) $@
+
+
+# --- MakeMaker selfdocument section:
+
+
+# --- MakeMaker postamble section:
+
+
+# End.
diff --git a/README b/README

new file mode 100644 (file)

index 0000000..1a7a9d4
--- /dev/null
+++ b/README
@@ -0,0 +1,41 @@
+File::Slurp.pm version 0.04
+===========================
+
+This module provides subroutines to read or write entire files with a
+simple call.  It also has a subroutine for reading the list of filenames
+in a directory.
+
+In the extras/ directory you can read an article (slurp_article.pod)
+about file slurping and also run a benchmark (slurp_bench.pl) that
+compares many ways of slurping/spewing files.
+
+This module was first written and owned by David Muir Sharnoff (MUIR on
+CPAN).  I checked out his module and decided to write a new version
+which would be faster, and with many more features.  To that end, David
+graciously transfered the namespace to me.
+
+Since then, I discovered and fixed a bug in the original module's test
+script (which had only 7 tests), which is included now as t/original.t.
+This module now has 164 tests in 7 test scripts, and passes on Windows,
+Linux, Solaris and Mac OS X.
+
+There have been some comments about the somewhat unusual version number.
+The problem was that David used a future date (2004.0904) in his version
+number, and the only way I could get CPAN to index my new module was to
+make it have a version number higher than the old one, so I chose the
+9999 prefix and appended the real revision number to it.
+
+INSTALLATION
+
+To install this module type the following:
+
+   perl Makefile.PL
+   make
+   make test
+   make install
+
+COPYRIGHT AND LICENCE
+
+Copyright (C) 2003 Uri Guttman <uri@stemsystems.com>
+
+Licensed the same as Perl.
diff --git a/Slurp.pm b/Slurp.pm

new file mode 100755 (executable)

index 0000000..74d73a7
--- /dev/null
+++ b/Slurp.pm
@@ -0,0 +1,745 @@
+package File::Slurp;
+
+use strict;
+
+use Carp ;
+use Fcntl qw( :DEFAULT ) ;
+use POSIX qw( :fcntl_h ) ;
+use Symbol ;
+
+use base 'Exporter' ;
+use vars qw( %EXPORT_TAGS @EXPORT_OK $VERSION @EXPORT ) ;
+
+%EXPORT_TAGS = ( 'all' => [
+       qw( read_file write_file overwrite_file append_file read_dir ) ] ) ;
+
+@EXPORT = ( @{ $EXPORT_TAGS{'all'} } );
+@EXPORT_OK = qw( slurp ) ;
+
+$VERSION = '9999.13';
+
+my $is_win32 = $^O =~ /win32/i ;
+
+# Install subs for various constants that aren't set in older perls
+# (< 5.005).  Fcntl on old perls uses Exporter to define subs without a
+# () prototype These can't be overridden with the constant pragma or
+# we get a prototype mismatch.  Hence this less than aesthetically
+# appealing BEGIN block:
+
+BEGIN {
+       unless( eval { defined SEEK_SET() } ) {
+               *SEEK_SET = sub { 0 };
+               *SEEK_CUR = sub { 1 };
+               *SEEK_END = sub { 2 };
+       }
+
+       unless( eval { defined O_BINARY() } ) {
+               *O_BINARY = sub { 0 };
+               *O_RDONLY = sub { 0 };
+               *O_WRONLY = sub { 1 };
+       }
+
+       unless ( eval { defined O_APPEND() } ) {
+
+               if ( $^O =~ /olaris/ ) {
+                       *O_APPEND = sub { 8 };
+                       *O_CREAT = sub { 256 };
+                       *O_EXCL = sub { 1024 };
+               }
+               elsif ( $^O =~ /inux/ ) {
+                       *O_APPEND = sub { 1024 };
+                       *O_CREAT = sub { 64 };
+                       *O_EXCL = sub { 128 };
+               }
+               elsif ( $^O =~ /BSD/i ) {
+                       *O_APPEND = sub { 8 };
+                       *O_CREAT = sub { 512 };
+                       *O_EXCL = sub { 2048 };
+               }
+       }
+}
+
+# print "OS [$^O]\n" ;
+
+# print "O_BINARY = ", O_BINARY(), "\n" ;
+# print "O_RDONLY = ", O_RDONLY(), "\n" ;
+# print "O_WRONLY = ", O_WRONLY(), "\n" ;
+# print "O_APPEND = ", O_APPEND(), "\n" ;
+# print "O_CREAT   ", O_CREAT(), "\n" ;
+# print "O_EXCL   ", O_EXCL(), "\n" ;
+
+
+*slurp = \&read_file ;
+
+sub read_file {
+
+       my( $file_name, %args ) = @_ ;
+
+# set the buffer to either the passed in one or ours and init it to the null
+# string
+
+       my $buf ;
+       my $buf_ref = $args{'buf_ref'} || \$buf ;
+       ${$buf_ref} = '' ;
+
+       my( $read_fh, $size_left, $blk_size ) ;
+
+# check if we are reading from a handle (glob ref or IO:: object)
+
+       if ( ref $file_name ) {
+
+# slurping a handle so use it and don't open anything.
+# set the block size so we know it is a handle and read that amount
+
+               $read_fh = $file_name ;
+               $blk_size = $args{'blk_size'} || 1024 * 1024 ;
+               $size_left = $blk_size ;
+
+# DEEP DARK MAGIC. this checks the UNTAINT IO flag of a
+# glob/handle. only the DATA handle is untainted (since it is from
+# trusted data in the source file). this allows us to test if this is
+# the DATA handle and then to do a sysseek to make sure it gets
+# slurped correctly. on some systems, the buffered i/o pointer is not
+# left at the same place as the fd pointer. this sysseek makes them
+# the same so slurping with sysread will work.
+
+               eval{ require B } ;
+
+               if ( $@ ) {
+
+                       @_ = ( \%args, <<ERR ) ;
+Can't find B.pm with this Perl: $!.
+That module is needed to slurp the DATA handle.
+ERR
+                       goto &_error ;
+               }
+
+               if ( B::svref_2object( $read_fh )->IO->IoFLAGS & 16 ) {
+
+# set the seek position to the current tell.
+
+                       sysseek( $read_fh, tell( $read_fh ), SEEK_SET ) ||
+                               croak "sysseek $!" ;
+               }
+       }
+       else {
+
+# a regular file. set the sysopen mode
+
+               my $mode = O_RDONLY ;
+               $mode |= O_BINARY if $args{'binmode'} ;
+
+#printf "RD: BINARY %x MODE %x\n", O_BINARY, $mode ;
+
+# open the file and handle any error
+
+               $read_fh = gensym ;
+               unless ( sysopen( $read_fh, $file_name, $mode ) ) {
+                       @_ = ( \%args, "read_file '$file_name' - sysopen: $!");
+                       goto &_error ;
+               }
+
+# get the size of the file for use in the read loop
+
+               $size_left = -s $read_fh ;
+
+               unless( $size_left ) {
+
+                       $blk_size = $args{'blk_size'} || 1024 * 1024 ;
+                       $size_left = $blk_size ;
+               }
+       }
+
+# infinite read loop. we exit when we are done slurping
+
+       while( 1 ) {
+
+# do the read and see how much we got
+
+               my $read_cnt = sysread( $read_fh, ${$buf_ref},
+                               $size_left, length ${$buf_ref} ) ;
+
+               if ( defined $read_cnt ) {
+
+# good read. see if we hit EOF (nothing left to read)
+
+                       last if $read_cnt == 0 ;
+
+# loop if we are slurping a handle. we don't track $size_left then.
+
+                       next if $blk_size ;
+
+# count down how much we read and loop if we have more to read.
+                       $size_left -= $read_cnt ;
+                       last if $size_left <= 0 ;
+                       next ;
+               }
+
+# handle the read error
+
+               @_ = ( \%args, "read_file '$file_name' - sysread: $!");
+               goto &_error ;
+       }
+
+# fix up cr/lf to be a newline if this is a windows text file
+
+       ${$buf_ref} =~ s/\015\012/\n/g if $is_win32 && !$args{'binmode'} ;
+
+# this is the 5 returns in a row. each handles one possible
+# combination of caller context and requested return type
+
+       my $sep = $/ ;
+       $sep = '\n\n+' if defined $sep && $sep eq '' ;
+
+# caller wants to get an array ref of lines
+
+# this split doesn't work since it tries to use variable length lookbehind
+# the m// line works.
+#      return [ split( m|(?<=$sep)|, ${$buf_ref} ) ] if $args{'array_ref'}  ;
+       return [ length(${$buf_ref}) ? ${$buf_ref} =~ /(.*?$sep|.+)/sg : () ]
+               if $args{'array_ref'}  ;
+
+# caller wants a list of lines (normal list context)
+
+# same problem with this split as before.
+#      return split( m|(?<=$sep)|, ${$buf_ref} ) if wantarray ;
+       return length(${$buf_ref}) ? ${$buf_ref} =~ /(.*?$sep|.+)/sg : ()
+               if wantarray ;
+
+# caller wants a scalar ref to the slurped text
+
+       return $buf_ref if $args{'scalar_ref'} ;
+
+# caller wants a scalar with the slurped text (normal scalar context)
+
+       return ${$buf_ref} if defined wantarray ;
+
+# caller passed in an i/o buffer by reference (normal void context)
+
+       return ;
+}
+
+sub write_file {
+
+       my $file_name = shift ;
+
+# get the optional argument hash ref from @_ or an empty hash ref.
+
+       my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;
+
+       my( $buf_ref, $write_fh, $no_truncate, $orig_file_name, $data_is_ref ) ;
+
+# get the buffer ref - it depends on how the data is passed into write_file
+# after this if/else $buf_ref will have a scalar ref to the data.
+
+       if ( ref $args->{'buf_ref'} eq 'SCALAR' ) {
+
+# a scalar ref passed in %args has the data
+# note that the data was passed by ref
+
+               $buf_ref = $args->{'buf_ref'} ;
+               $data_is_ref = 1 ;
+       }
+       elsif ( ref $_[0] eq 'SCALAR' ) {
+
+# the first value in @_ is the scalar ref to the data
+# note that the data was passed by ref
+
+               $buf_ref = shift ;
+               $data_is_ref = 1 ;
+       }
+       elsif ( ref $_[0] eq 'ARRAY' ) {
+
+# the first value in @_ is the array ref to the data so join it.
+
+               ${$buf_ref} = join '', @{$_[0]} ;
+       }
+       else {
+
+# good old @_ has all the data so join it.
+
+               ${$buf_ref} = join '', @_ ;
+       }
+
+# see if we were passed a open handle to spew to.
+
+       if ( ref $file_name ) {
+
+# we have a handle. make sure we don't call truncate on it.
+
+               $write_fh = $file_name ;
+               $no_truncate = 1 ;
+       }
+       else {
+
+# spew to regular file.
+
+               if ( $args->{'atomic'} ) {
+
+# in atomic mode, we spew to a temp file so make one and save the original
+# file name.
+                       $orig_file_name = $file_name ;
+                       $file_name .= ".$$" ;
+               }
+
+# set the mode for the sysopen
+
+               my $mode = O_WRONLY | O_CREAT ;
+               $mode |= O_BINARY if $args->{'binmode'} ;
+               $mode |= O_APPEND if $args->{'append'} ;
+               $mode |= O_EXCL if $args->{'no_clobber'} ;
+
+#printf "WR: BINARY %x MODE %x\n", O_BINARY, $mode ;
+
+# open the file and handle any error.
+
+               $write_fh = gensym ;
+               unless ( sysopen( $write_fh, $file_name, $mode ) ) {
+                       @_ = ( $args, "write_file '$file_name' - sysopen: $!");
+                       goto &_error ;
+               }
+       }
+
+       sysseek( $write_fh, 0, SEEK_END ) if $args->{'append'} ;
+
+
+#print 'WR before data ', unpack( 'H*', ${$buf_ref}), "\n" ;
+
+# fix up newline to write cr/lf if this is a windows text file
+
+       if ( $is_win32 && !$args->{'binmode'} ) {
+
+# copy the write data if it was passed by ref so we don't clobber the
+# caller's data
+               $buf_ref = \do{ my $copy = ${$buf_ref}; } if $data_is_ref ;
+               ${$buf_ref} =~ s/\n/\015\012/g ;
+       }
+
+#print 'after data ', unpack( 'H*', ${$buf_ref}), "\n" ;
+
+# get the size of how much we are writing and init the offset into that buffer
+
+       my $size_left = length( ${$buf_ref} ) ;
+       my $offset = 0 ;
+
+# loop until we have no more data left to write
+
+       do {
+
+# do the write and track how much we just wrote
+
+               my $write_cnt = syswrite( $write_fh, ${$buf_ref},
+                               $size_left, $offset ) ;
+
+               unless ( defined $write_cnt ) {
+
+# the write failed
+                       @_ = ( $args, "write_file '$file_name' - syswrite: $!");
+                       goto &_error ;
+               }
+
+# track much left to write and where to write from in the buffer
+
+               $size_left -= $write_cnt ;
+               $offset += $write_cnt ;
+
+       } while( $size_left > 0 ) ;
+
+# we truncate regular files in case we overwrite a long file with a shorter file
+# so seek to the current position to get it (same as tell()).
+
+       truncate( $write_fh,
+                 sysseek( $write_fh, 0, SEEK_CUR ) ) unless $no_truncate ;
+
+       close( $write_fh ) ;
+
+# handle the atomic mode - move the temp file to the original filename.
+
+       rename( $file_name, $orig_file_name ) if $args->{'atomic'} ;
+
+       return 1 ;
+}
+
+# this is for backwards compatibility with the previous File::Slurp module. 
+# write_file always overwrites an existing file
+
+*overwrite_file = \&write_file ;
+
+# the current write_file has an append mode so we use that. this
+# supports the same API with an optional second argument which is a
+# hash ref of options.
+
+sub append_file {
+
+# get the optional args hash ref
+       my $args = $_[1] ;
+       if ( ref $args eq 'HASH' ) {
+
+# we were passed an args ref so just mark the append mode
+
+               $args->{append} = 1 ;
+       }
+       else {
+
+# no args hash so insert one with the append mode
+
+               splice( @_, 1, 0, { append => 1 } ) ;
+       }
+
+# magic goto the main write_file sub. this overlays the sub without touching
+# the stack or @_
+
+       goto &write_file
+}
+
+# basic wrapper around opendir/readdir
+
+sub read_dir {
+
+       my ($dir, %args ) = @_;
+
+# this handle will be destroyed upon return
+
+       local(*DIRH);
+
+# open the dir and handle any errors
+
+       unless ( opendir( DIRH, $dir ) ) {
+
+               @_ = ( \%args, "read_dir '$dir' - opendir: $!" ) ;
+               goto &_error ;
+       }
+
+       my @dir_entries = readdir(DIRH) ;
+
+       @dir_entries = grep( $_ ne "." && $_ ne "..", @dir_entries )
+               unless $args{'keep_dot_dot'} ;
+
+       return @dir_entries if wantarray ;
+       return \@dir_entries ;
+}
+
+# error handling section
+#
+# all the error handling uses magic goto so the caller will get the
+# error message as if from their code and not this module. if we just
+# did a call on the error code, the carp/croak would report it from
+# this module since the error sub is one level down on the call stack
+# from read_file/write_file/read_dir.
+
+
+my %err_func = (
+       'carp'  => \&carp,
+       'croak' => \&croak,
+) ;
+
+sub _error {
+
+       my( $args, $err_msg ) = @_ ;
+
+# get the error function to use
+
+       my $func = $err_func{ $args->{'err_mode'} || 'croak' } ;
+
+# if we didn't find it in our error function hash, they must have set
+# it to quiet and we don't do anything.
+
+       return unless $func ;
+
+# call the carp/croak function
+
+       $func->($err_msg) ;
+
+# return a hard undef (in list context this will be a single value of
+# undef which is not a legal in-band value)
+
+       return undef ;
+}
+
+1;
+__END__
+
+=head1 NAME
+
+File::Slurp - Efficient Reading/Writing of Complete Files
+
+=head1 SYNOPSIS
+
+  use File::Slurp;
+
+  my $text = read_file( 'filename' ) ;
+  my @lines = read_file( 'filename' ) ;
+
+  write_file( 'filename', @lines ) ;
+
+  use File::Slurp qw( slurp ) ;
+
+  my $text = slurp( 'filename' ) ;
+
+
+=head1 DESCRIPTION
+
+This module provides subs that allow you to read or write entire files
+with one simple call. They are designed to be simple to use, have
+flexible ways to pass in or get the file contents and to be very
+efficient.  There is also a sub to read in all the files in a
+directory other than C<.> and C<..>
+
+These slurp/spew subs work for files, pipes and
+sockets, and stdio, pseudo-files, and DATA.
+
+=head2 B<read_file>
+
+This sub reads in an entire file and returns its contents to the
+caller. In list context it will return a list of lines (using the
+current value of $/ as the separator including support for paragraph
+mode when it is set to ''). In scalar context it returns the entire
+file as a single scalar.
+
+  my $text = read_file( 'filename' ) ;
+  my @lines = read_file( 'filename' ) ;
+
+The first argument to C<read_file> is the filename and the rest of the
+arguments are key/value pairs which are optional and which modify the
+behavior of the call. Other than binmode the options all control how
+the slurped file is returned to the caller.
+
+If the first argument is a file handle reference or I/O object (if ref
+is true), then that handle is slurped in. This mode is supported so
+you slurp handles such as C<DATA>, C<STDIN>. See the test handle.t
+for an example that does C<open( '-|' )> and child process spews data
+to the parant which slurps it in.  All of the options that control how
+the data is returned to the caller still work in this case.
+
+NOTE: as of version 9999.06, read_file works correctly on the C<DATA>
+handle. It used to need a sysseek workaround but that is now handled
+when needed by the module itself.
+
+You can optionally request that C<slurp()> is exported to your code. This
+is an alias for read_file and is meant to be forward compatible with
+Perl 6 (which will have slurp() built-in).
+
+The options are:
+
+=head3 binmode
+
+If you set the binmode option, then the file will be slurped in binary
+mode.
+
+       my $bin_data = read_file( $bin_file, binmode => ':raw' ) ;
+
+NOTE: this actually sets the O_BINARY mode flag for sysopen. It
+probably should call binmode and pass its argument to support other
+file modes.
+
+=head3 array_ref
+
+If this boolean option is set, the return value (only in scalar
+context) will be an array reference which contains the lines of the
+slurped file. The following two calls are equivalent:
+
+       my $lines_ref = read_file( $bin_file, array_ref => 1 ) ;
+       my $lines_ref = [ read_file( $bin_file ) ] ;
+
+=head3 scalar_ref
+
+If this boolean option is set, the return value (only in scalar
+context) will be an scalar reference to a string which is the contents
+of the slurped file. This will usually be faster than returning the
+plain scalar.
+
+       my $text_ref = read_file( $bin_file, scalar_ref => 1 ) ;
+
+=head3 buf_ref
+
+You can use this option to pass in a scalar reference and the slurped
+file contents will be stored in the scalar. This can be used in
+conjunction with any of the other options.
+
+       my $text_ref = read_file( $bin_file, buf_ref => \$buffer,
+                                            array_ref => 1 ) ;
+       my @lines = read_file( $bin_file, buf_ref => \$buffer ) ;
+
+=head3 blk_size
+
+You can use this option to set the block size used when slurping from an already open handle (like \*STDIN). It defaults to 1MB.
+
+       my $text_ref = read_file( $bin_file, blk_size => 10_000_000,
+                                            array_ref => 1 ) ;
+
+=head3 err_mode
+
+You can use this option to control how read_file behaves when an error
+occurs. This option defaults to 'croak'. You can set it to 'carp' or
+to 'quiet to have no error handling. This code wants to carp and then
+read abother file if it fails.
+
+       my $text_ref = read_file( $file, err_mode => 'carp' ) ;
+       unless ( $text_ref ) {
+
+               # read a different file but croak if not found
+               $text_ref = read_file( $another_file ) ;
+       }
+       
+       # process ${$text_ref}
+
+=head2 B<write_file>
+
+This sub writes out an entire file in one call.
+
+  write_file( 'filename', @data ) ;
+
+The first argument to C<write_file> is the filename. The next argument
+is an optional hash reference and it contains key/values that can
+modify the behavior of C<write_file>. The rest of the argument list is
+the data to be written to the file.
+
+  write_file( 'filename', {append => 1 }, @data ) ;
+  write_file( 'filename', {binmode => ':raw' }, $buffer ) ;
+
+As a shortcut if the first data argument is a scalar or array
+reference, it is used as the only data to be written to the file. Any
+following arguments in @_ are ignored. This is a faster way to pass in
+the output to be written to the file and is equivilent to the
+C<buf_ref> option. These following pairs are equivilent but the pass
+by reference call will be faster in most cases (especially with larger
+files).
+
+  write_file( 'filename', \$buffer ) ;
+  write_file( 'filename', $buffer ) ;
+
+  write_file( 'filename', \@lines ) ;
+  write_file( 'filename', @lines ) ;
+
+If the first argument is a file handle reference or I/O object (if ref
+is true), then that handle is slurped in. This mode is supported so
+you spew to handles such as \*STDOUT. See the test handle.t for an
+example that does C<open( '-|' )> and child process spews data to the
+parant which slurps it in.  All of the options that control how the
+data is passes into C<write_file> still work in this case.
+
+C<write_file> returns 1 upon successfully writing the file or undef if
+it encountered an error.
+
+The options are:
+
+=head3 binmode
+
+If you set the binmode option, then the file will be written in binary
+mode.
+
+       write_file( $bin_file, {binmode => ':raw'}, @data ) ;
+
+NOTE: this actually sets the O_BINARY mode flag for sysopen. It
+probably should call binmode and pass its argument to support other
+file modes.
+
+=head3 buf_ref
+
+You can use this option to pass in a scalar reference which has the
+data to be written. If this is set then any data arguments (including
+the scalar reference shortcut) in @_ will be ignored. These are
+equivilent:
+
+       write_file( $bin_file, { buf_ref => \$buffer } ) ;
+       write_file( $bin_file, \$buffer ) ;
+       write_file( $bin_file, $buffer ) ;
+
+=head3 atomic
+
+If you set this boolean option, the file will be written to in an
+atomic fashion. A temporary file name is created by appending the pid
+($$) to the file name argument and that file is spewed to. After the
+file is closed it is renamed to the original file name (and rename is
+an atomic operation on most OS's). If the program using this were to
+crash in the middle of this, then the file with the pid suffix could
+be left behind.
+
+=head3 append
+
+If you set this boolean option, the data will be written at the end of
+the current file.
+
+       write_file( $file, {append => 1}, @data ) ;
+
+C<write_file> croaks if it cannot open the file. It returns true if it
+succeeded in writing out the file and undef if there was an
+error. (Yes, I know if it croaks it can't return anything but that is
+for when I add the options to select the error handling mode).
+
+=head3 no_clobber
+
+If you set this boolean option, an existing file will not be overwritten.
+
+       write_file( $file, {no_clobber => 1}, @data ) ;
+
+=head3 err_mode
+
+You can use this option to control how C<write_file> behaves when an
+error occurs. This option defaults to 'croak'. You can set it to
+'carp' or to 'quiet' to have no error handling other than the return
+value. If the first call to C<write_file> fails it will carp and then
+write to another file. If the second call to C<write_file> fails, it
+will croak.
+
+       unless ( write_file( $file, { err_mode => 'carp', \$data ) ;
+
+               # write a different file but croak if not found
+               write_file( $other_file, \$data ) ;
+       }
+
+=head2 overwrite_file
+
+This sub is just a typeglob alias to write_file since write_file
+always overwrites an existing file. This sub is supported for
+backwards compatibility with the original version of this module. See
+write_file for its API and behavior.
+
+=head2 append_file
+
+This sub will write its data to the end of the file. It is a wrapper
+around write_file and it has the same API so see that for the full
+documentation. These calls are equivilent:
+
+       append_file( $file, @data ) ;
+       write_file( $file, {append => 1}, @data ) ;
+
+=head2 read_dir
+
+This sub reads all the file names from directory and returns them to
+the caller but C<.> and C<..> are removed by default.
+
+       my @files = read_dir( '/path/to/dir' ) ;
+
+It croaks if it cannot open the directory.
+
+In a list context C<read_dir> returns a list of the entries in the
+directory. In a scalar context it returns an array reference which has
+the entries.
+
+=head3 keep_dot_dot
+
+If this boolean option is set, C<.> and C<..> are not removed from the
+list of files.
+
+       my @all_files = read_dir( '/path/to/dir', keep_dot_dot => 1 ) ;
+
+=head2 EXPORT
+
+  read_file write_file overwrite_file append_file read_dir
+
+=head2 SEE ALSO
+
+An article on file slurping in extras/slurp_article.pod. There is
+also a benchmarking script in extras/slurp_bench.pl.
+
+=head2 BUGS
+
+If run under Perl 5.004, slurping from the DATA handle will fail as
+that requires B.pm which didn't get into core until 5.005.
+
+=head1 AUTHOR
+
+Uri Guttman, E<lt>uri@stemsystems.comE<gt>
+
+=cut
diff --git a/TODO b/TODO

new file mode 100644 (file)

index 0000000..ea70a25
--- /dev/null
+++ b/TODO
@@ -0,0 +1,27 @@
+
+optimize read_file for smaller scalar slurps with no options
+
+prepend_file()
+
+       options: lock file?
+
+edit_file()
+
+       options: lock file?
+
+edit_file_lines()
+
+       options: lock file?
+
+add options to read_dir
+       prepend_dir
+       grep filter qr or code ref
+
+
+BUGS:
+
+restart sysread/write after a signal (or check i/o count)
+
+fix SEEK stuff
+
+
diff --git a/experiment/DATA_taint_check b/experiment/DATA_taint_check

new file mode 100644 (file)

index 0000000..f7b933a
--- /dev/null
+++ b/experiment/DATA_taint_check
@@ -0,0 +1,5 @@
+
+# DATA handle is always untainted but all other handles are tainted
+
+use B
+svref_2object($foo)->IoFLAGS & 16
diff --git a/experiment/carp.pl b/experiment/carp.pl

new file mode 100644 (file)

index 0000000..3e2dce5
--- /dev/null
+++ b/experiment/carp.pl
@@ -0,0 +1,10 @@
+#!/usr/local/bin/perl
+
+
+use Carp ;
+
+$carp = shift ;
+
+( ( $carp eq 'carp' ) ? \&carp : \&croak )->( "can't open file\n ) ;
+
+print "ok\n" ;
diff --git a/experiment/seek.pl b/experiment/seek.pl

new file mode 100644 (file)

index 0000000..159d269
--- /dev/null
+++ b/experiment/seek.pl
@@ -0,0 +1,12 @@
+#!/usr/local/bin/perl
+
+use Fcntl qw( :seek ) ;
+my $tell = tell( \*DATA );
+my $sysseek = sysseek( \*DATA, 0, SEEK_CUR ) ;
+
+print "TELL $tell SYSSEEK $sysseek\n" ;
+
+__DATA__
+foo
+bar
+
diff --git a/experiment/split.pl b/experiment/split.pl

new file mode 100644 (file)

index 0000000..3c4db00
--- /dev/null
+++ b/experiment/split.pl
@@ -0,0 +1,42 @@
+#!/usr/local/bin/perl
+
+use strict ;
+
+use Benchmark qw( timethese cmpthese ) ;
+
+my $dur = shift || -2 ;
+
+my $data = 'abc' x 30 . "\n"  x 1000 ;
+
+my $sep = $/ ;
+
+# my $result = timethese( $dur, {
+#              split => 'my @lines = splitter()',
+#              regex => 'my @lines = regex()',
+#              damian => 'my @lines = damian()',
+# } ) ;
+
+# cmpthese( $result ) ;
+
+$data = "abcdefgh\n\n\n" x 5 ;
+$data = "abcdefgh\n" x 2 . 'z' ;
+
+$data = '' ;
+
+#$sep = '\n\n+' ;
+$sep = '\n' ;
+
+my @paras ;
+
+@paras = regex() ;
+print "REG\n", map "[$_]\n", @paras ;
+
+#@paras = damian() ;
+#print "DAM\n", map "[$_]\n", @paras ;
+
+sub splitter { split( m|(?<=$sep)|, $data ) }
+sub regex { $data =~ /(.*?$sep|.*)/sg }
+sub damian { $data =~ /.*?(?:$sep|\Z)/gs }
+
+
+exit ;
diff --git a/experiment/sysread.pl b/experiment/sysread.pl

new file mode 100644 (file)

index 0000000..dc40a49
--- /dev/null
+++ b/experiment/sysread.pl
@@ -0,0 +1,28 @@
+#!/usr/local/bin/perl
+
+print tell(\*DATA), "\n" ;
+print sysseek(\*DATA, 0, 1), "\n" ;
+
+my $read_cnt = sysread( \*DATA, $buf, 1000 ) ;
+print "CNT $read_cnt\n[$buf]\n" ;
+
+my $read_cnt = sysread( *DATA, $buf, 1000 ) ;
+print "CNT $read_cnt\n[$buf]\n" ;
+
+open( FOO, "<&DATA" ) || die "reopen $!" ;
+
+
+my $read_cnt = sysread( \*FOO, $buf, 1000 ) ;
+print "CNT $read_cnt\n[$buf]\n" ;
+
+my $read_cnt = read( \*FOO, $buf, 1000 ) ;
+print "CNT $read_cnt\n[$buf]\n" ;
+
+@lines = <DATA> ;
+print "lines [@lines]\n" ;
+
+
+__END__
+line 1
+foo bar
+
diff --git a/extras/new_text b/extras/new_text

new file mode 100644 (file)

index 0000000..169e9b4
--- /dev/null
+++ b/extras/new_text
@@ -0,0 +1,148 @@
+Somewhere along the line, I learned about a way to slurp files faster
+than by setting $/ to undef. The method is very simple, you do a single
+read call with the size of the file (which the -s operator provides).
+This bypasses the I/O loop inside perl that checks for EOF and does all
+sorts of processing. I then decided to experiment and found that
+sysread is even faster as you would expect. sysread bypasses all of
+Perl's stdio and reads the file from the kernel buffers directly into a
+Perl scalar. This is why the slurp code in File::Slurp uses
+sysopen/sysread/syswrite. All the rest of the code is just to support
+the various options and data passing techniques.
+
+
+Benchmarks can be enlightening, informative, frustrating and
+deceiving. It would make no sense to create a new and more complex slurp
+module unless it also gained signifigantly in speed. So I created a
+benchmark script which compares various slurp methods with differing
+file sizes and calling contexts. This script can be run from the main
+directory of the tarball like this:
+
+       perl -Ilib extras/slurp_bench.pl
+
+If you pass in an argument on the command line, it will be passed to
+timethese() and it will control the duration. It defaults to -2 which
+makes each benchmark run to at least 2 seconds of cpu time.
+
+The following numbers are from a run I did on my 300Mhz sparc. You will
+most likely get much faster counts on your boxes but the relative speeds
+shouldn't change by much. If you see major differences on your
+benchmarks, please send me the results and your Perl and OS
+versions. Also you can play with the benchmark script and add more slurp
+variations or data files.
+
+The rest of this section will be discussing the results of the
+benchmarks. You can refer to extras/slurp_bench.pl to see the code for
+the individual benchmarks. If the benchmark name starts with cpan_, it
+is either from Slurp.pm or File::Slurp.pm. Those starting with new_ are
+from the new File::Slurp.pm. Those that start with file_contents_ are
+from a client's code base. The rest are variations I created to
+highlight certain aspects of the benchmarks.
+
+The short and long file data is made like this:
+
+       my @lines = ( 'abc' x 30 . "\n")  x 100 ;
+       my $text = join( '', @lines ) ;
+
+       @lines = ( 'abc' x 40 . "\n")  x 1000 ;
+       $text = join( '', @lines ) ;
+
+So the short file is 9,100 bytes and the long file is 121,000
+bytes. 
+
+=head3         Scalar Slurp of Short File
+
+       file_contents        651/s
+       file_contents_no_OO  828/s
+       cpan_read_file      1866/s
+       cpan_slurp          1934/s
+       read_file           2079/s
+       new                 2270/s
+       new_buf_ref         2403/s
+       new_scalar_ref      2415/s
+       sysread_file        2572/s
+
+=head3         Scalar Slurp of Long File
+
+       file_contents_no_OO 82.9/s
+       file_contents       85.4/s
+       cpan_read_file       250/s
+       cpan_slurp           257/s
+       read_file            323/s
+       new                  468/s
+       sysread_file         489/s
+       new_scalar_ref       766/s
+       new_buf_ref          767/s
+
+The primary inference you get from looking at the mumbers above is that
+when slurping a file into a scalar, the longer the file, the more time
+you save by returning the result via a scalar reference. The time for
+the extra buffer copy can add up. The new module came out on top overall
+except for the very simple sysread_file entry which was added to
+highlight the overhead of the more flexible new module which isn't that
+much. The file_contents entries are always the worst since they do a
+list slurp and then a join, which is a classic newbie and cargo culted
+style which is extremely slow. Also the OO code in file_contents slows
+it down even more (I added the file_contents_no_OO entry to show this).
+The two CPAN modules are decent with small files but they are laggards
+compared to the new module when the file gets much larger.
+
+=head3         List Slurp of Short File
+
+       cpan_read_file          589/s
+       cpan_slurp_to_array     620/s
+       read_file               824/s
+       new_array_ref           824/s
+       sysread_file            828/s
+       new                     829/s
+       new_in_anon_array       833/s
+       cpan_slurp_to_array_ref 836/s
+
+=head3         List Slurp of Long File
+
+       cpan_read_file          62.4/s
+       cpan_slurp_to_array     62.7/s
+       read_file               92.9/s
+       sysread_file            94.8/s
+       new_array_ref           95.5/s
+       new                     96.2/s
+       cpan_slurp_to_array_ref 96.3/s
+       new_in_anon_array       97.2/s
+
+
+=head3         Scalar Spew of Short File
+
+       cpan_write_file 1035/s
+       print_file      1055/s
+       syswrite_file   1135/s
+       new             1519/s
+       print_join_file 1766/s
+       new_ref         1900/s
+       syswrite_file2  2138/s
+
+=head3         Scalar Spew of Long File
+
+       cpan_write_file 164/s   20
+       print_file      211/s   26
+       syswrite_file   236/s   25
+       print_join_file 277/s   2
+       new             295/s   2
+       syswrite_file2  428/s   25
+       new_ref         608/s   2
+
+
+=head3 List Spew of Short File
+
+       cpan_write_file  794/s
+       syswrite_file   1000/s
+       print_file      1013/s
+       new             1399/s
+       print_join_file 1557/s
+
+=head3         List Spew of Long File
+
+       cpan_write_file 112/s   12
+       print_file      179/s   21
+       syswrite_file   181/s   19
+       print_join_file 205/s   2
+       new             228/s   2
+
diff --git a/extras/slurp2.pod b/extras/slurp2.pod

new file mode 100644 (file)

index 0000000..264c174
--- /dev/null
+++ b/extras/slurp2.pod
@@ -0,0 +1,516 @@
+=head1 Perl Slurp Ease
+
+=head2 Introduction
+ 
+
+One of the common Perl idioms is processing text files line by line
+
+       while( <FH> ) {
+               do something with $_
+       }
+
+This idiom has several variants but the key point is that it reads in
+only one line from the file in each loop iteration. This has several
+advantages including limiting memory use to one line, the ability to
+handle any size file (including data piped in via STDIN), and it is
+easily taught and understood to Perl newbies. In fact newbies are the
+ones who do silly things like this:
+
+       while( <FH> ) {
+               push @lines, $_ ;
+       }
+
+       foreach ( @lines ) {
+               do something with $_
+       }
+
+Line by line processing is fine but it isn't the only way to deal with
+reading files. The other common style is reading the entire file into a
+scalar or array and that is commonly known as slurping. Now slurping has
+somewhat of a poor reputation and this article is an attempt at
+rehabilitating it. Slurping files has advantages and limitations and is
+not something you should just do when line by line processing is fine.
+It is best when you need the entire file in memory for processing all at
+once. Slurping with in memory processing can be faster and lead to
+simpler code than line by line if done properly.
+
+The biggest issue to watch for with slurping is file size. Slurping very
+large files or unknown amounts of data from STDIN can be disastrous to
+your memory usage and cause swap disk thrashing. I advocate slurping
+only disk files and only when you know their size is reasonable and you
+have a real reason to process the file as a whole. Note that reasonable
+size these days is larger than the bad old days of limited RAM. Slurping
+in a megabyte size file is not an issue on most systems. But most of the
+files I tend to slurp in are much smaller than that. Typical files that
+work well with slurping are configuration files, (mini)language scripts,
+some data (especially binary) files, and other files of known sizes
+which need fast processing.
+
+Another major win for slurping over line by line is speed. Perl's IO
+system (like many others) is slow. Calling <> for each line requires a
+check for the end of line, checks for EOF, copying a line, munging the
+internal handle structure, etc. Plenty of work for each line read
+in. Whereas slurping, if done correctly, will usually involve only one
+IO call and no extra data copying. The same is true for writing files to
+disk and we will cover that as well (even though the term slurping is
+traditionally a read operation, I use the term slurp for the concept of
+doing IO with an entire file in one operation).
+
+Finally, when you have slurped the entire file into memory, you can do
+operations on the data that are not possible or easily done with line by
+line processing. These include global search/replace (without regard for
+newlines), grabbing all matches with one call of m//g, complex parsing
+(which in many cases must ignore newlines), processing *ML (where line
+endings are just white space) and performing complex transformations
+such as template expansion.
+
+=head2 Global Operations
+
+Here are some simple global operations that can be done quickly and
+easily on an entire file that has been slurped in. They could also be
+done with line by line processing but that would be slower and require
+more code.
+
+A common problem is reading in a file with key/value pairs. There are
+modules which do this but who needs them for simple formats? Just slurp
+in the file and do a single parse to grab all the key/value pairs.
+
+       my $text = read_file( $file ) ;
+       my %config = $test =~ /^(\w+)=(.+)$/mg ;
+
+That matches a key which starts a line (anywhere inside the string
+because of the /m modifier), the '=' char and the text to the end of the
+line (again /m makes that work). In fact the ending $ is not even needed
+since . will not normally match a newline. Since the key and value are
+grabbed and the m// is in list context with the /g modifier, it will
+grab all key/value pairs and return them. The %config hash will be
+assigned this list and now you have the file fully parsed into a hash.
+
+Various projects I have worked on needed some simple templating and I
+wasn't in the mood to use a full module (please,no flames about your
+favorite template module :-). So I rolled my own by slurping in the
+template file, setting up a template hash and doing this one line:
+
+       $text =~ s/<%(.+?)%>/$template{$1}/g ;
+
+That only works if the entire file was slurped in. With a little
+extra work it can handle chunks of text to be expanded:
+
+       $text =~ s/<%(\w+)_START%>(.+)<%\1_END%>/ template($1, $2)/sge ;
+
+Just supply a template sub to expand the text between the markers and
+you have yourself a simple system with minimal code. Note that this will
+work and grab over multiple lines due the the /s modifier. This is
+something that is much trickier with line by line processing.
+
+Note that this is a very simple templating system and it can't directly
+handle nested tags and other complex features. But even if you use one
+of the myriad of template modules on the CPAN, you will gain by having
+speedier ways to read/write files.
+
+Slurping in a file into an array also offers some useful advantages. 
+
+
+=head2 Traditional Slurping
+
+Perl has always supported slurping files with minimal code. Slurping of
+a file to a list of lines is trivial, just call the <> operator in a
+list context:
+
+       my @lines = <FH> ;
+
+and slurping to a scalar isn't much more work. Just set the built in
+variable $/ (the input record separator to the undefined value and read
+in the file with <>:
+
+       {
+               local( $/, *FH ) ;
+               open( FH, $file ) or die "sudden flaming death\n"
+               $text = <FH>
+       }
+
+Notice the use of local(). It sets $/ to undef for you and when the
+scope exits it will revert $/ back to its previous value (most likely
+"\n"). Here is a Perl idiom that allows the $text variable to be
+declared and there is no need for a tightly nested block. The do block
+will execute the <FH> in a scalar context and slurp in the file which is
+assigned to $text.
+
+       local( *FH ) ;
+       open( FH, $file ) or die "sudden flaming death\n"
+       my $text = do { local( $/ ) ; <FH> } ;
+
+Both of those slurps used localized filehandles to be compatible with
+5.005. Here they are with 5.6.0 lexical autovivified handles:
+
+       {
+               local( $/ ) ;
+               open( my $fh, $file ) or die "sudden flaming death\n"
+               $text = <$fh>
+       }
+
+       open( my $fh, $file ) or die "sudden flaming death\n"
+       my $text = do { local( $/ ) ; <$fh> } ;
+
+And this is a variant of that idiom that removes the need for the open
+call:
+
+       my $text = do { local( @ARGV, $/ ) = $file ; <> } ;
+
+The filename in $file is assigned to a localized @ARGV and the null
+filehandle is used which reads the data from the files in @ARGV.
+
+Instead of assigning to a scalar, all the above slurps can assign to an
+array and it will get the file but split into lines (using $/ as the end
+of line marker).
+
+There is one common variant of those slurps which is very slow and not
+good code. You see it around and it is almost always cargo cult code:
+
+       my $text = join( '', <FH> ) ;
+
+That needlessly splits the input file into lines (join provides a list
+context to <FH>) and then joins up those lines again. The original coder
+of this idiom obviously never read perlvar and learned how to use $/ to
+allow scalar slurping.
+
+=head2 Write Slurping
+
+While reading in entire files at one time is common, writing out entire
+files is also done. We call it slurping when we read in files but there
+is no commonly accepted term for the write operation. I asked some Perl
+colleagues and got two interesting nominations. Peter Scott said to call
+it burping (rhymes with slurping and the noise is the opposite
+direction). Others suggested spewing which has a stronger visual image
+:-) Tell me your favorite or suggest your own. I will use both in this
+section so you can see how they work for you.
+
+Spewing a file is a much simpler operation than slurping. You don't have
+context issues to worry about and there is no efficiency problem with
+returning a buffer. Here is a simple burp sub:
+
+       sub burp {
+               my( $file_name ) = shift ;
+               open( my $fh, ">$file_name" ) || 
+                                die "can't create $file_name $!" ;
+               print $fh @_ ;
+       }
+
+Note that it doesn't copy the input text but passes @_ directly to
+print. We will look at faster variations of that later on.
+
+=head2 Slurp on the CPAN
+
+As you would expect there are modules in the CPAN that will slurp files
+for you. The two I found are called Slurp.pm (by Rob Casey - ROBAU on
+CPAN) and File::Slurp.pm (by David Muir Sharnoff - MUIR on CPAN).
+
+Here is the code from Slurp.pm:
+
+    sub slurp { 
+       local( $/, @ARGV ) = ( wantarray ? $/ : undef, @_ ); 
+       return <ARGV>;
+    }
+
+    sub to_array {
+       my @array = slurp( @_ );
+       return wantarray ? @array : \@array;
+    }
+
+    sub to_scalar {
+       my $scalar = slurp( @_ );
+       return $scalar;
+    }
+
+The sub slurp uses the magic undefined value of $/ and the magic file
+handle ARGV to support slurping into a scalar or array. It also provides
+two wrapper subs that allow the caller to control the context of the
+slurp. And the to_array sub will return the list of slurped lines or a
+anonymous array of them according to its caller's context by checking
+wantarray. It has 'slurp' in @EXPORT and all three subs in @EXPORT_OK.
+A final point is that Slurp.pm is poorly named and it shouldn't be in
+the top level namespace.
+
+File::Slurp.pm has this code:
+
+sub read_file
+{
+       my ($file) = @_;
+
+       local($/) = wantarray ? $/ : undef;
+       local(*F);
+       my $r;
+       my (@r);
+
+       open(F, "<$file") || croak "open $file: $!";
+       @r = <F>;
+       close(F) || croak "close $file: $!";
+
+       return $r[0] unless wantarray;
+       return @r;
+}
+
+This module provides several subs including read_file (more on the
+others later). read_file behaves simularly to Slurp::slurp in that it
+will slurp a list of lines or a single scalar depending on the caller's
+context. It also uses the magic undefined value of $/ for scalar
+slurping but it uses an explicit open call rather than using a localized
+@ARGV and the other module did. Also it doesn't provide a way to get an
+anonymous array of the lines but that can easily be rectified by calling
+it inside an anonymous array constuctor [].
+
+Both of these modules make it easier for Perl coders to slurp in
+files. They both use the magic $/ to slurp in scalar mode and the
+natural behavior of <> in list context to slurp as lines. But neither is
+optmized for speed nor can they handle binmode to support binary or
+unicode files. See below for more on slurp features and speedups.
+
+=head2 Slurping API Design
+
+The slurp modules on CPAN are have a very simple API and don't support
+binmode. This section will cover various API design issues such as
+efficient return by reference, binmode and calling variations.
+
+Let's start with the call variations. Slurped files can be returned in
+four formats, as a single scalar, as a reference to a scalar, as a list
+of lines and as an anonymous array of lines. But the caller can only
+provide two contexts, scalar or list. So we have to either provide an
+API with more than one sub as Slurp.pm did or just provide one sub which
+only returns a scalar or a list (no anonymous array) as File::Slurp.pm
+does.
+
+I have used my own read_file sub for years and it has the same API as
+File::Slurp.pm, a single sub which returns a scalar or a list of lines
+depending on context. But I recognize the interest of those that want an
+anonymous array for line slurping. For one thing it is easier to pass
+around to other subs and another it eliminates the extra copying of the
+lines via return. So my module will support multiple subs with one that
+returns the file based on context and the other returns only lines
+(either as a list or as an anonymous array). So this API is in between
+the two CPAN modules. There is no need for a specific slurp in as a
+scalar sub as the general slurp will do that in scalar context. If you
+wanted to slurp a scalar into an array, just select the desired array
+element and that will provide scalar context to the read_file sub.
+
+The next area to cover is what to name these subs. I will go with
+read_file and read_file_lines. They are descriptive, simple and don't
+use the 'slurp' nickname (though that nick is in the module name).
+
+Another critical area when designing APIs is how to pass in
+arguments. The read_file subs takes one required argument which is the
+file name. To support binmode we need another optional argument. And a
+third optional argument is needed to support returning a slurped scalar
+by reference. My first thought was to design the API with 3 positional
+arguments - file name, buffer reference and binmode. But if you want to
+set the binmode and not pass in a buffer reference, you have to fill the
+second argument with undef and that is ugly. So I decided to make the
+filename argument positional and the other two are pass by name.
+The sub will start off like this:
+
+       sub read_file {
+
+               my( $file_name, %args ) = @_ ;
+
+               my $buf ;
+               my $buf_ref = $args{'buf'} || \$buf ;
+
+The binmode argument will be handled later (see code below).
+
+The other sub read_file_lines will only take an optional binmode (so you
+can read files with binary delimiters). It doesn't need a buffer
+reference argument since it can return an anonymous array if the called
+in a scalar context. So this sub could use positional arguments but to
+keep its API similar to the API of read_file, it will also use pass by
+name for the optional arguments. This also means that new optional
+arguments can be added later without breaking any legacy code. A bonus
+with keeping the API the same for both subs will be seen how the two
+subs are optimized to work together.
+
+Write slurping (or spewing or burping :-) needs to have its API designed
+as well. The biggest issue is not only needing to support optional
+arguments but a list of arguments to be written is needed. Perl 6 can
+handle that with optional named arguments and a final slurp
+argument. Since this is Perl 5 we have to do it using some
+cleverness. The first argument is the file name and it will be
+positional as with the read_file sub. But how can we pass in the
+optional arguments and also a list of data? The solution lies in the
+fact that the data list should never contain a reference.
+Burping/spewing works only on plain data. So if the next argument is a
+hash reference, we can assume it is the optional arguments and the rest
+of the arguments is the data list. So the write_file sub will start off
+like this:
+
+       sub write_file {
+
+               my $file_name = shift ;
+
+               my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;
+
+Whether or not optional arguments are passed in, we leave the data list
+in @_ to minimize any more copying. You call write_file like this:
+
+       write_file( 'foo', { binmode => ':raw' }, @data ) ;
+       write_file( 'junk', { append => 1 }, @more_junk ) ;
+       write_file( 'bar', @spew ) ;
+
+=head2 Fast Slurping
+
+ 
+=head2 Benchmarks
+
+
+=head2 Error Handling
+
+Slurp subs are subject to conditions such as not being able to open the
+file or I/O errors. How these errors are handled and what the caller
+will see are important aspects of the design of an API. The classic
+error handling for slurping has been to call die or even better,
+croak. But sometimes you want to either the slurp to either warn/carp
+and allow your code to handle the error. Sure, this can be done by
+wrapping the slurp in a eval block to catch a fatal error, but not
+everyone wants all that extra code. So I have added another option to
+all the subs which selects the error handling. If the 'err_mode' option
+is 'croak' (which is also the default, the called sub will croak. If set
+to 'carp' then carp will be called. Set to any other string (use 'quiet'
+by convention) and no error handler call is made. Then the caller can
+use the error status from the call.
+
+C<write_file> doesn't use the return value for data so it can return a
+false status value in-band to mark an error. C<read_file> does use its
+return value for data but we can still make it pass back the error
+status. A successful read in any scalar mode will return either a
+defined data string or a (scalar or array) reference. So a bare return
+would work here. But if you slurp in lines by calling it in a list
+context, a bare return will return an empty list which is the same value
+it would from from an existing but empty file. So now, C<read_file> will
+do something I strongly advocate against, which is returning a call to
+undef. In the scalar contexts this still returns a error and now in list
+context, the returned first value will be undef and that is not legal
+data for the first element. So the list context also gets a error status
+it can detect:
+
+       my @lines = read_file( $file_name, err_mode => 'quiet' ) ;
+       your_handle_error( "$file_name can't be read\n" ) unless
+                                       @lines && defined $lines[0] ;
+
+
+=head2 File::FastSlurp
+
+       sub read_file {
+
+               my( $file_name, %args ) = @_ ;
+
+               my $buf ;
+               my $buf_ref = $args{'buf_ref'} || \$buf ;
+
+               my $mode = O_RDONLY ;
+               $mode |= O_BINARY if $args{'binmode'} ;
+
+               local( *FH ) ;
+               sysopen( FH, $file_name, $mode ) or
+                                       carp "Can't open $file_name: $!" ;
+
+               my $size_left = -s FH ;
+
+               while( $size_left > 0 ) {
+
+                       my $read_cnt = sysread( FH, ${$buf_ref},
+                                       $size_left, length ${$buf_ref} ) ;
+
+                       unless( $read_cnt ) {
+
+                               carp "read error in file $file_name: $!" ;
+                               last ;
+                       }
+
+                       $size_left -= $read_cnt ;
+               }
+
+       # handle void context (return scalar by buffer reference)
+
+               return unless defined wantarray ;
+
+       # handle list context
+
+               return split m|?<$/|g, ${$buf_ref} if wantarray ;
+
+       # handle scalar context
+
+               return ${$buf_ref} ;
+       }
+
+
+       sub read_file_lines {
+
+       # handle list context
+
+               return &read_file if wantarray ;;
+
+       # otherwise handle scalar context
+
+               return [ &read_file ] ;
+       }
+
+
+       sub write_file {
+
+               my $file_name = shift ;
+
+               my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;
+               my $buf = join '', @_ ;
+
+
+               my $mode = O_WRONLY ;
+               $mode |= O_BINARY if $args->{'binmode'} ;
+               $mode |= O_APPEND if $args->{'append'} ;
+
+               local( *FH ) ;
+               sysopen( FH, $file_name, $mode ) or
+                                       carp "Can't open $file_name: $!" ;
+
+               my $size_left = length( $buf ) ;
+               my $offset = 0 ;
+
+               while( $size_left > 0 ) {
+
+                       my $write_cnt = syswrite( FH, $buf,
+                                       $size_left, $offset ) ;
+
+                       unless( $write_cnt ) {
+
+                               carp "write error in file $file_name: $!" ;
+                               last ;
+                       }
+
+                       $size_left -= $write_cnt ;
+                       $offset += $write_cnt ;
+               }
+
+               return ;
+       }
+
+=head2 Slurping in Perl 6
+
+As usual with Perl 6, much of the work in this article will be put to
+pasture. Perl 6 will allow you to set a 'slurp' property on file handles
+and when you read from such a handle, the file is slurped. List and
+scalar context will still be supported so you can slurp into lines or a
+<scalar. I would expect that support for slurping in Perl 6 will be
+optimized and bypass the stdio subsystem since it can use the slurp
+property to trigger a call to special code. Otherwise some enterprising
+individual will just create a File::FastSlurp module for Perl 6. The
+code in the Perl 5 module could easily be modified to Perl 6 syntax and
+semantics. Any volunteers?
+
+=head2 In Summary
+
+We have compared classic line by line processing with munging a whole
+file in memory. Slurping files can speed up your programs and simplify
+your code if done properly. You must still be aware to not slurp
+humongous files (logs, DNA sequences, etc.) or STDIN where you don't
+know how much data you will read in. But slurping megabyte sized files
+of is not an major issue on today's systems with the typical amount of
+RAM installed. When Perl was first being used in depth (Perl 4),
+slurping was limited by the smalle RAM size of 10 years ago. Now you
+should be able to slurp most any reasonably sized file be they
+configurations, source code, data, etc.
diff --git a/extras/slurp_article.pod b/extras/slurp_article.pod

new file mode 100644 (file)

index 0000000..8b000f7
--- /dev/null
+++ b/extras/slurp_article.pod
@@ -0,0 +1,743 @@
+=head1 Perl Slurp Ease
+
+=head2 Introduction
+ 
+
+One of the common Perl idioms is processing text files line by line:
+
+       while( <FH> ) {
+               do something with $_
+       }
+
+This idiom has several variants, but the key point is that it reads in
+only one line from the file in each loop iteration. This has several
+advantages, including limiting memory use to one line, the ability to
+handle any size file (including data piped in via STDIN), and it is
+easily taught and understood to Perl newbies. In fact newbies are the
+ones who do silly things like this:
+
+       while( <FH> ) {
+               push @lines, $_ ;
+       }
+
+       foreach ( @lines ) {
+               do something with $_
+       }
+
+Line by line processing is fine, but it isn't the only way to deal with
+reading files. The other common style is reading the entire file into a
+scalar or array, and that is commonly known as slurping. Now, slurping has
+somewhat of a poor reputation, and this article is an attempt at
+rehabilitating it. Slurping files has advantages and limitations, and is
+not something you should just do when line by line processing is fine.
+It is best when you need the entire file in memory for processing all at
+once. Slurping with in memory processing can be faster and lead to
+simpler code than line by line if done properly.
+
+The biggest issue to watch for with slurping is file size. Slurping very
+large files or unknown amounts of data from STDIN can be disastrous to
+your memory usage and cause swap disk thrashing.  You can slurp STDIN if
+you know that you can handle the maximum size input without
+detrimentally affecting your memory usage. So I advocate slurping only
+disk files and only when you know their size is reasonable and you have
+a real reason to process the file as a whole.  Note that reasonable size
+these days is larger than the bad old days of limited RAM. Slurping in a
+megabyte is not an issue on most systems. But most of the
+files I tend to slurp in are much smaller than that. Typical files that
+work well with slurping are configuration files, (mini-)language scripts,
+some data (especially binary) files, and other files of known sizes
+which need fast processing.
+
+Another major win for slurping over line by line is speed. Perl's IO
+system (like many others) is slow. Calling C<< <> >> for each line
+requires a check for the end of line, checks for EOF, copying a line,
+munging the internal handle structure, etc. Plenty of work for each line
+read in. Whereas slurping, if done correctly, will usually involve only
+one I/O call and no extra data copying. The same is true for writing
+files to disk, and we will cover that as well (even though the term
+slurping is traditionally a read operation, I use the term ``slurp'' for
+the concept of doing I/O with an entire file in one operation).
+
+Finally, when you have slurped the entire file into memory, you can do
+operations on the data that are not possible or easily done with line by
+line processing. These include global search/replace (without regard for
+newlines), grabbing all matches with one call of C<//g>, complex parsing
+(which in many cases must ignore newlines), processing *ML (where line
+endings are just white space) and performing complex transformations
+such as template expansion.
+
+=head2 Global Operations
+
+Here are some simple global operations that can be done quickly and
+easily on an entire file that has been slurped in. They could also be
+done with line by line processing but that would be slower and require
+more code.
+
+A common problem is reading in a file with key/value pairs. There are
+modules which do this but who needs them for simple formats? Just slurp
+in the file and do a single parse to grab all the key/value pairs.
+
+       my $text = read_file( $file ) ;
+       my %config = $text =~ /^(\w+)=(.+)$/mg ;
+
+That matches a key which starts a line (anywhere inside the string
+because of the C</m> modifier), the '=' char and the text to the end of the
+line (again, C</m> makes that work). In fact the ending C<$> is not even needed
+since C<.> will not normally match a newline. Since the key and value are
+grabbed and the C<m//> is in list context with the C</g> modifier, it will
+grab all key/value pairs and return them. The C<%config>hash will be
+assigned this list and now you have the file fully parsed into a hash.
+
+Various projects I have worked on needed some simple templating and I
+wasn't in the mood to use a full module (please, no flames about your
+favorite template module :-). So I rolled my own by slurping in the
+template file, setting up a template hash and doing this one line:
+
+       $text =~ s/<%(.+?)%>/$template{$1}/g ;
+
+That only works if the entire file was slurped in. With a little
+extra work it can handle chunks of text to be expanded:
+
+       $text =~ s/<%(\w+)_START%>(.+?)<%\1_END%>/ template($1, $2)/sge ;
+
+Just supply a C<template> sub to expand the text between the markers and
+you have yourself a simple system with minimal code. Note that this will
+work and grab over multiple lines due the the C</s> modifier. This is
+something that is much trickier with line by line processing.
+
+Note that this is a very simple templating system, and it can't directly
+handle nested tags and other complex features. But even if you use one
+of the myriad of template modules on the CPAN, you will gain by having
+speedier ways to read and write files.
+
+Slurping in a file into an array also offers some useful advantages. 
+One simple example is reading in a flat database where each record has
+fields separated by a character such as C<:>:
+
+       my @pw_fields = map [ split /:/ ], read_file( '/etc/passwd' ) ;
+
+Random access to any line of the slurped file is another advantage. Also
+a line index could be built to speed up searching the array of lines.
+
+
+=head2 Traditional Slurping
+
+Perl has always supported slurping files with minimal code. Slurping of
+a file to a list of lines is trivial, just call the C<< <> >> operator
+in a list context:
+
+       my @lines = <FH> ;
+
+and slurping to a scalar isn't much more work. Just set the built in
+variable C<$/> (the input record separator to the undefined value and read
+in the file with C<< <> >>:
+
+       {
+               local( $/, *FH ) ;
+               open( FH, $file ) or die "sudden flaming death\n"
+               $text = <FH>
+       }
+
+Notice the use of C<local()>. It sets C<$/> to C<undef> for you and when
+the scope exits it will revert C<$/> back to its previous value (most
+likely "\n").
+
+Here is a Perl idiom that allows the C<$text> variable to be declared,
+and there is no need for a tightly nested block. The C<do> block will
+execute C<< <FH> >> in a scalar context and slurp in the file named by
+C<$text>:
+
+       local( *FH ) ;
+       open( FH, $file ) or die "sudden flaming death\n"
+       my $text = do { local( $/ ) ; <FH> } ;
+
+Both of those slurps used localized filehandles to be compatible with
+5.005. Here they are with 5.6.0 lexical autovivified handles:
+
+       {
+               local( $/ ) ;
+               open( my $fh, $file ) or die "sudden flaming death\n"
+               $text = <$fh>
+       }
+
+       open( my $fh, $file ) or die "sudden flaming death\n"
+       my $text = do { local( $/ ) ; <$fh> } ;
+
+And this is a variant of that idiom that removes the need for the open
+call:
+
+       my $text = do { local( @ARGV, $/ ) = $file ; <> } ;
+
+The filename in C<$file> is assigned to a localized C<@ARGV> and the
+null filehandle is used which reads the data from the files in C<@ARGV>.
+
+Instead of assigning to a scalar, all the above slurps can assign to an
+array and it will get the file but split into lines (using C<$/> as the
+end of line marker).
+
+There is one common variant of those slurps which is very slow and not
+good code. You see it around, and it is almost always cargo cult code:
+
+       my $text = join( '', <FH> ) ;
+
+That needlessly splits the input file into lines (C<join> provides a
+list context to C<< <FH> >>) and then joins up those lines again. The
+original coder of this idiom obviously never read I<perlvar> and learned
+how to use C<$/> to allow scalar slurping.
+
+=head2 Write Slurping
+
+While reading in entire files at one time is common, writing out entire
+files is also done. We call it ``slurping'' when we read in files, but
+there is no commonly accepted term for the write operation. I asked some
+Perl colleagues and got two interesting nominations. Peter Scott said to
+call it ``burping'' (rhymes with ``slurping'' and suggests movement in
+the opposite direction). Others suggested ``spewing'' which has a
+stronger visual image :-) Tell me your favorite or suggest your own. I
+will use both in this section so you can see how they work for you.
+
+Spewing a file is a much simpler operation than slurping. You don't have
+context issues to worry about and there is no efficiency problem with
+returning a buffer. Here is a simple burp subroutine:
+
+       sub burp {
+               my( $file_name ) = shift ;
+               open( my $fh, ">$file_name" ) || 
+                                die "can't create $file_name $!" ;
+               print $fh @_ ;
+       }
+
+Note that it doesn't copy the input text but passes @_ directly to
+print. We will look at faster variations of that later on.
+
+=head2 Slurp on the CPAN
+
+As you would expect there are modules in the CPAN that will slurp files
+for you. The two I found are called Slurp.pm (by Rob Casey - ROBAU on
+CPAN) and File::Slurp.pm (by David Muir Sharnoff - MUIR on CPAN).
+
+Here is the code from Slurp.pm:
+
+    sub slurp { 
+       local( $/, @ARGV ) = ( wantarray ? $/ : undef, @_ ); 
+       return <ARGV>;
+    }
+
+    sub to_array {
+       my @array = slurp( @_ );
+       return wantarray ? @array : \@array;
+    }
+
+    sub to_scalar {
+       my $scalar = slurp( @_ );
+       return $scalar;
+    }
+
++The subroutine C<slurp()> uses the magic undefined value of C<$/> and
+the magic file +handle C<ARGV> to support slurping into a scalar or
+array. It also provides two wrapper subs that allow the caller to
+control the context of the slurp. And the C<to_array()> subroutine will
+return the list of slurped lines or a anonymous array of them according
+to its caller's context by checking C<wantarray>. It has 'slurp' in
+C<@EXPORT> and all three subroutines in C<@EXPORT_OK>.
+
+<Footnote: Slurp.pm is poorly named and it shouldn't be in the top level
+namespace.>
+
+The original File::Slurp.pm has this code:
+
+sub read_file
+{
+       my ($file) = @_;
+
+       local($/) = wantarray ? $/ : undef;
+       local(*F);
+       my $r;
+       my (@r);
+
+       open(F, "<$file") || croak "open $file: $!";
+       @r = <F>;
+       close(F) || croak "close $file: $!";
+
+       return $r[0] unless wantarray;
+       return @r;
+}
+
+This module provides several subroutines including C<read_file()> (more
+on the others later). C<read_file()> behaves simularly to
+C<Slurp::slurp()> in that it will slurp a list of lines or a single
+scalar depending on the caller's context. It also uses the magic
+undefined value of C<$/> for scalar slurping but it uses an explicit
+open call rather than using a localized C<@ARGV> and the other module
+did. Also it doesn't provide a way to get an anonymous array of the
+lines but that can easily be rectified by calling it inside an anonymous
+array constuctor C<[]>.
+
+Both of these modules make it easier for Perl coders to slurp in
+files. They both use the magic C<$/> to slurp in scalar mode and the
+natural behavior of C<< <> >> in list context to slurp as lines. But
+neither is optmized for speed nor can they handle C<binmode()> to
+support binary or unicode files. See below for more on slurp features
+and speedups.
+
+=head2 Slurping API Design
+
+The slurp modules on CPAN are have a very simple API and don't support
+C<binmode()>. This section will cover various API design issues such as
+efficient return by reference, C<binmode()> and calling variations.
+
+Let's start with the call variations. Slurped files can be returned in
+four formats: as a single scalar, as a reference to a scalar, as a list
+of lines or as an anonymous array of lines. But the caller can only
+provide two contexts: scalar or list. So we have to either provide an
+API with more than one subroutine (as Slurp.pm did) or just provide one
+subroutine which only returns a scalar or a list (not an anonymous
+array) as File::Slurp does.
+
+I have used my own C<read_file()> subroutine for years and it has the
+same API as File::Slurp: a single subroutine that returns a scalar or a
+list of lines depending on context. But I recognize the interest of
+those that want an anonymous array for line slurping. For one thing, it
+is easier to pass around to other subs and for another, it eliminates
+the extra copying of the lines via C<return>. So my module provides only
+one slurp subroutine that returns the file data based on context and any
+format options passed in. There is no need for a specific
+slurp-in-as-a-scalar or list subroutine as the general C<read_file()>
+sub will do that by default in the appropriate context. If you want
+C<read_file()> to return a scalar reference or anonymous array of lines,
+you can request those formats with options. You can even pass in a
+reference to a scalar (e.g. a previously allocated buffer) and have that
+filled with the slurped data (and that is one of the fastest slurp
+modes. see the benchmark section for more on that).  If you want to
+slurp a scalar into an array, just select the desired array element and
+that will provide scalar context to the C<read_file()> subroutine.
+
+The next area to cover is what to name the slurp sub. I will go with
+C<read_file()>. It is descriptive and keeps compatibilty with the
+current simple and don't use the 'slurp' nickname (though that nickname
+is in the module name). Also I decided to keep the  File::Slurp
+namespace which was graciously handed over to me by its current owner,
+David Muir.
+
+Another critical area when designing APIs is how to pass in
+arguments. The C<read_file()> subroutine takes one required argument
+which is the file name. To support C<binmode()> we need another optional
+argument. A third optional argument is needed to support returning a
+slurped scalar by reference. My first thought was to design the API with
+3 positional arguments - file name, buffer reference and binmode. But if
+you want to set the binmode and not pass in a buffer reference, you have
+to fill the second argument with C<undef> and that is ugly. So I decided
+to make the filename argument positional and the other two named. The
+subroutine starts off like this:
+
+       sub read_file {
+
+               my( $file_name, %args ) = @_ ;
+
+               my $buf ;
+               my $buf_ref = $args{'buf'} || \$buf ;
+
+The other sub (C<read_file_lines()>) will only take an optional binmode
+(so you can read files with binary delimiters). It doesn't need a buffer
+reference argument since it can return an anonymous array if the called
+in a scalar context. So this subroutine could use positional arguments,
+but to keep its API similar to the API of C<read_file()>, it will also
+use pass by name for the optional arguments. This also means that new
+optional arguments can be added later without breaking any legacy
+code. A bonus with keeping the API the same for both subs will be seen
+how the two subs are optimized to work together.
+
+Write slurping (or spewing or burping :-)) needs to have its API
+designed as well. The biggest issue is not only needing to support
+optional arguments but a list of arguments to be written is needed. Perl
+6 will be able to handle that with optional named arguments and a final
+slurp argument. Since this is Perl 5 we have to do it using some
+cleverness. The first argument is the file name and it will be
+positional as with the C<read_file> subroutine. But how can we pass in
+the optional arguments and also a list of data? The solution lies in the
+fact that the data list should never contain a reference.
+Burping/spewing works only on plain data. So if the next argument is a
+hash reference, we can assume it cointains the optional arguments and
+the rest of the arguments is the data list. So the C<write_file()>
+subroutine will start off like this:
+
+       sub write_file {
+
+               my $file_name = shift ;
+
+               my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;
+
+Whether or not optional arguments are passed in, we leave the data list
+in C<@_> to minimize any more copying. You call C<write_file()> like this:
+
+       write_file( 'foo', { binmode => ':raw' }, @data ) ;
+       write_file( 'junk', { append => 1 }, @more_junk ) ;
+       write_file( 'bar', @spew ) ;
+
+=head2 Fast Slurping
+
+Somewhere along the line, I learned about a way to slurp files faster
+than by setting $/ to undef. The method is very simple, you do a single
+read call with the size of the file (which the -s operator provides).
+This bypasses the I/O loop inside perl that checks for EOF and does all
+sorts of processing. I then decided to experiment and found that
+sysread is even faster as you would expect. sysread bypasses all of
+Perl's stdio and reads the file from the kernel buffers directly into a
+Perl scalar. This is why the slurp code in File::Slurp uses
+sysopen/sysread/syswrite. All the rest of the code is just to support
+the various options and data passing techniques.
+
+ 
+=head2 Benchmarks
+
+Benchmarks can be enlightening, informative, frustrating and
+deceiving. It would make no sense to create a new and more complex slurp
+module unless it also gained signifigantly in speed. So I created a
+benchmark script which compares various slurp methods with differing
+file sizes and calling contexts. This script can be run from the main
+directory of the tarball like this:
+
+       perl -Ilib extras/slurp_bench.pl
+
+If you pass in an argument on the command line, it will be passed to
+timethese() and it will control the duration. It defaults to -2 which
+makes each benchmark run to at least 2 seconds of cpu time.
+
+The following numbers are from a run I did on my 300Mhz sparc. You will
+most likely get much faster counts on your boxes but the relative speeds
+shouldn't change by much. If you see major differences on your
+benchmarks, please send me the results and your Perl and OS
+versions. Also you can play with the benchmark script and add more slurp
+variations or data files.
+
+The rest of this section will be discussing the results of the
+benchmarks. You can refer to extras/slurp_bench.pl to see the code for
+the individual benchmarks. If the benchmark name starts with cpan_, it
+is either from Slurp.pm or File::Slurp.pm. Those starting with new_ are
+from the new File::Slurp.pm. Those that start with file_contents_ are
+from a client's code base. The rest are variations I created to
+highlight certain aspects of the benchmarks.
+
+The short and long file data is made like this:
+
+       my @lines = ( 'abc' x 30 . "\n")  x 100 ;
+       my $text = join( '', @lines ) ;
+
+       @lines = ( 'abc' x 40 . "\n")  x 1000 ;
+       $text = join( '', @lines ) ;
+
+So the short file is 9,100 bytes and the long file is 121,000
+bytes. 
+
+=head3         Scalar Slurp of Short File
+
+       file_contents        651/s
+       file_contents_no_OO  828/s
+       cpan_read_file      1866/s
+       cpan_slurp          1934/s
+       read_file           2079/s
+       new                 2270/s
+       new_buf_ref         2403/s
+       new_scalar_ref      2415/s
+       sysread_file        2572/s
+
+=head3         Scalar Slurp of Long File
+
+       file_contents_no_OO 82.9/s
+       file_contents       85.4/s
+       cpan_read_file       250/s
+       cpan_slurp           257/s
+       read_file            323/s
+       new                  468/s
+       sysread_file         489/s
+       new_scalar_ref       766/s
+       new_buf_ref          767/s
+
+The primary inference you get from looking at the mumbers above is that
+when slurping a file into a scalar, the longer the file, the more time
+you save by returning the result via a scalar reference. The time for
+the extra buffer copy can add up. The new module came out on top overall
+except for the very simple sysread_file entry which was added to
+highlight the overhead of the more flexible new module which isn't that
+much. The file_contents entries are always the worst since they do a
+list slurp and then a join, which is a classic newbie and cargo culted
+style which is extremely slow. Also the OO code in file_contents slows
+it down even more (I added the file_contents_no_OO entry to show this).
+The two CPAN modules are decent with small files but they are laggards
+compared to the new module when the file gets much larger.
+
+=head3         List Slurp of Short File
+
+       cpan_read_file          589/s
+       cpan_slurp_to_array     620/s
+       read_file               824/s
+       new_array_ref           824/s
+       sysread_file            828/s
+       new                     829/s
+       new_in_anon_array       833/s
+       cpan_slurp_to_array_ref 836/s
+
+=head3         List Slurp of Long File
+
+       cpan_read_file          62.4/s
+       cpan_slurp_to_array     62.7/s
+       read_file               92.9/s
+       sysread_file            94.8/s
+       new_array_ref           95.5/s
+       new                     96.2/s
+       cpan_slurp_to_array_ref 96.3/s
+       new_in_anon_array       97.2/s
+
+This is perhaps the most interesting result of this benchmark. Five
+different entries have effectively tied for the lead. The logical
+conclusion is that splitting the input into lines is the bounding
+operation, no matter how the file gets slurped. This is the only
+benchmark where the new module isn't the clear winner (in the long file
+entries - it is no worse than a close second in the short file
+entries). 
+
+
+Note: In the benchmark information for all the spew entries, the extra
+number at the end of each line is how many wallclock seconds the whole
+entry took. The benchmarks were run for at least 2 CPU seconds per
+entry. The unusually large wallclock times will be discussed below.
+
+=head3         Scalar Spew of Short File
+
+       cpan_write_file 1035/s  38
+       print_file      1055/s  41
+       syswrite_file   1135/s  44
+       new             1519/s  2
+       print_join_file 1766/s  2
+       new_ref         1900/s  2
+       syswrite_file2  2138/s  2
+
+=head3         Scalar Spew of Long File
+
+       cpan_write_file 164/s   20
+       print_file      211/s   26
+       syswrite_file   236/s   25
+       print_join_file 277/s   2
+       new             295/s   2
+       syswrite_file2  428/s   2
+       new_ref         608/s   2
+
+In the scalar spew entries, the new module API wins when it is passed a
+reference to the scalar buffer. The C<syswrite_file2> entry beats it
+with the shorter file due to its simpler code. The old CPAN module is
+the slowest due to its extra copying of the data and its use of print.
+
+=head3 List Spew of Short File
+
+       cpan_write_file  794/s  29
+       syswrite_file   1000/s  38
+       print_file      1013/s  42
+       new             1399/s  2
+       print_join_file 1557/s  2
+
+=head3         List Spew of Long File
+
+       cpan_write_file 112/s   12
+       print_file      179/s   21
+       syswrite_file   181/s   19
+       print_join_file 205/s   2
+       new             228/s   2
+
+Again, the simple C<print_join_file> entry beats the new module when
+spewing a short list of lines to a file. But is loses to the new module
+when the file size gets longer. The old CPAN module lags behind the
+others since it first makes an extra copy of the lines and then it calls
+C<print> on the output list and that is much slower than passing to
+C<print> a single scalar generated by join. The C<print_file> entry
+shows the advantage of directly printing C<@_> and the
+C<print_join_file> adds the join optimization.
+
+Now about those long wallclock times. If you look carefully at the
+benchmark code of all the spew entries, you will find that some always
+write to new files and some overwrite existing files. When I asked David
+Muir why the old File::Slurp module had an C<overwrite> subroutine, he
+answered that by overwriting a file, you always guarantee something
+readable is in the file. If you create a new file, there is a moment
+when the new file is created but has no data in it. I feel this is not a
+good enough answer. Even when overwriting, you can write a shorter file
+than the existing file and then you have to truncate the file to the new
+size. There is a small race window there where another process can slurp
+in the file with the new data followed by leftover junk from the
+previous version of the file. This reinforces the point that the only
+way to ensure consistant file data is the proper use of file locks.
+
+But what about those long times? Well it is all about the difference
+between creating files and overwriting existing ones. The former have to
+allocate new inodes (or the equivilent on other file systems) and the
+latter can reuse the exising inode. This mean the overwrite will save on
+disk seeks as well as on cpu time. In fact when running this benchmark,
+I could hear my disk going crazy allocating inodes during the spew
+operations. This speedup in both cpu and wallclock is why the new module
+always does overwriting when spewing files. It also does the proper
+truncate (and this is checked in the tests by spewing shorter files
+after longer ones had previously been written). The C<overwrite>
+subroutine is just an typeglob alias to C<write_file> and is there for
+backwards compatibilty with the old File::Slurp module.
+
+=head3 Benchmark Conclusion
+
+Other than a few cases where a simpler entry beat it out, the new
+File::Slurp module is either the speed leader or among the leaders. Its
+special APIs for passing buffers by reference prove to be very useful
+speedups. Also it uses all the other optimizations including using
+C<sysread/syswrite> and joining output lines. I expect many projects
+that extensively use slurping will notice the speed improvements,
+especially if they rewrite their code to take advantage of the new API
+features. Even if they don't touch their code and use the simple API
+they will get a significant speedup.
+
+=head2 Error Handling
+
+Slurp subroutines are subject to conditions such as not being able to
+open the file, or I/O errors. How these errors are handled, and what the
+caller will see, are important aspects of the design of an API. The
+classic error handling for slurping has been to call C<die()> or even
+better, C<croak()>. But sometimes you want the slurp to either
+C<warn()>/C<carp()> or allow your code to handle the error. Sure, this
+can be done by wrapping the slurp in a C<eval> block to catch a fatal
+error, but not everyone wants all that extra code. So I have added
+another option to all the subroutines which selects the error
+handling. If the 'err_mode' option is 'croak' (which is also the
+default), the called subroutine will croak. If set to 'carp' then carp
+will be called. Set to any other string (use 'quiet' when you want to
+be explicit) and no error handler is called. Then the caller can use the
+error status from the call.
+
+C<write_file()> doesn't use the return value for data so it can return a
+false status value in-band to mark an error. C<read_file()> does use its
+return value for data, but we can still make it pass back the error
+status. A successful read in any scalar mode will return either a
+defined data string or a reference to a scalar or array. So a bare
+return would work here. But if you slurp in lines by calling it in a
+list context, a bare C<return> will return an empty list, which is the
+same value it would get from an existing but empty file. So now,
+C<read_file()> will do something I normally strongly advocate against,
+i.e., returning an explicit C<undef> value. In the scalar context this
+still returns a error, and in list context, the returned first value
+will be C<undef>, and that is not legal data for the first element. So
+the list context also gets a error status it can detect:
+
+       my @lines = read_file( $file_name, err_mode => 'quiet' ) ;
+       your_handle_error( "$file_name can't be read\n" ) unless
+                                       @lines && defined $lines[0] ;
+
+
+=head2 File::FastSlurp
+
+       sub read_file {
+
+               my( $file_name, %args ) = @_ ;
+
+               my $buf ;
+               my $buf_ref = $args{'buf_ref'} || \$buf ;
+
+               my $mode = O_RDONLY ;
+               $mode |= O_BINARY if $args{'binmode'} ;
+
+               local( *FH ) ;
+               sysopen( FH, $file_name, $mode ) or
+                                       carp "Can't open $file_name: $!" ;
+
+               my $size_left = -s FH ;
+
+               while( $size_left > 0 ) {
+
+                       my $read_cnt = sysread( FH, ${$buf_ref},
+                                       $size_left, length ${$buf_ref} ) ;
+
+                       unless( $read_cnt ) {
+
+                               carp "read error in file $file_name: $!" ;
+                               last ;
+                       }
+
+                       $size_left -= $read_cnt ;
+               }
+
+       # handle void context (return scalar by buffer reference)
+
+               return unless defined wantarray ;
+
+       # handle list context
+
+               return split m|?<$/|g, ${$buf_ref} if wantarray ;
+
+       # handle scalar context
+
+               return ${$buf_ref} ;
+       }
+
+       sub write_file {
+
+               my $file_name = shift ;
+
+               my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;
+               my $buf = join '', @_ ;
+
+
+               my $mode = O_WRONLY ;
+               $mode |= O_BINARY if $args->{'binmode'} ;
+               $mode |= O_APPEND if $args->{'append'} ;
+
+               local( *FH ) ;
+               sysopen( FH, $file_name, $mode ) or
+                                       carp "Can't open $file_name: $!" ;
+
+               my $size_left = length( $buf ) ;
+               my $offset = 0 ;
+
+               while( $size_left > 0 ) {
+
+                       my $write_cnt = syswrite( FH, $buf,
+                                       $size_left, $offset ) ;
+
+                       unless( $write_cnt ) {
+
+                               carp "write error in file $file_name: $!" ;
+                               last ;
+                       }
+
+                       $size_left -= $write_cnt ;
+                       $offset += $write_cnt ;
+               }
+
+               return ;
+       }
+
+=head2 Slurping in Perl 6
+
+As usual with Perl 6, much of the work in this article will be put to
+pasture. Perl 6 will allow you to set a 'slurp' property on file handles
+and when you read from such a handle, the file is slurped. List and
+scalar context will still be supported so you can slurp into lines or a
+<scalar. I would expect that support for slurping in Perl 6 will be
+optimized and bypass the stdio subsystem since it can use the slurp
+property to trigger a call to special code. Otherwise some enterprising
+individual will just create a File::FastSlurp module for Perl 6. The
+code in the Perl 5 module could easily be modified to Perl 6 syntax and
+semantics. Any volunteers?
+
+=head2 In Summary
+
+We have compared classic line by line processing with munging a whole
+file in memory. Slurping files can speed up your programs and simplify
+your code if done properly. You must still be aware to not slurp
+humongous files (logs, DNA sequences, etc.) or STDIN where you don't
+know how much data you will read in. But slurping megabyte sized files
+is not an major issue on today's systems with the typical amount of RAM
+installed. When Perl was first being used in depth (Perl 4), slurping
+was limited by the smaller RAM size of 10 years ago. Now, you should be
+able to slurp almost any reasonably sized file, whether it contains
+configuration, source code, data, etc.
+
+=head2 Acknowledgements
+
+
+
+
+
diff --git a/extras/slurp_bench.pl b/extras/slurp_bench.pl

new file mode 100755 (executable)

index 0000000..0b3a001
--- /dev/null
+++ b/extras/slurp_bench.pl
@@ -0,0 +1,564 @@
+#!/usr/local/bin/perl
+
+use strict ;
+use warnings ;
+
+use Getopt::Long ;
+use Benchmark qw( timethese cmpthese ) ;
+use Carp ;
+use FileHandle ;
+use Fcntl qw( :DEFAULT :seek );
+
+use File::Slurp () ;
+
+my $file = 'slurp_data' ;
+
+GetOptions (\my %opts,
+       qw( slurp spew scalar list sizes=s key duration=i help ) ) ;
+
+parse_options() ;
+
+run_benchmarks() ;
+
+unlink $file ;
+
+exit ;
+
+my( @lines, $text, $size ) ;
+
+sub run_benchmarks {
+
+       foreach my $size ( @{$opts{sizes}} ) {
+
+               @lines = ( 'a' x 80 . "\n") x $size ;
+               $text = join( '', @lines ) ;
+               $size = length $text ;
+
+               File::Slurp::write_file( $file, $text ) ;
+
+#              bench_list_slurp( $size ) if $opts{list} && $opts{slurp} ;
+#              bench_scalar_slurp( $size ) if $opts{scalar} && $opts{slurp} ;
+               bench_spew_list()
+#              bench_scalar_spew( $size ) if $opts{scalar} && $opts{spew} ;
+       }
+}
+
+sub bench_spew_list {
+
+       return unless $opts{list} && $opts{spew} ;
+
+       print "\n\nWriting (Spew) a list of lines: Size = $size bytes\n\n" ;
+
+       my $result = timethese( $opts{duration}, {
+
+               'FS::write_file' =>
+                       sub { File::Slurp::write_file( $file, @lines ) },
+
+               'print' =>
+                       sub { print_file( $file, @lines ) },
+
+               'print/join' =>
+                       sub { print_join_file( $file, @lines ) },
+
+               'syswrite/join' =>
+                       sub { syswrite_join_file( $file, @lines ) },
+
+               'original write_file' =>
+                       sub { orig_write_file( $file, @lines ) },
+
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+# sub bench_scalar_spew {
+
+#      my ( $size ) = @_ ;
+
+#      print "\n\nScalar Spew of $size file\n\n" ;
+
+#      my $result = timethese( $dur, {
+
+#              new =>
+#                      sub { File::Slurp::write_file( $file, $text ) },
+
+#              new_ref =>
+#                      sub { File::Slurp::write_file( $file, \$text ) },
+
+#              print_file =>
+#                      sub { print_file( $file, $text ) },
+
+#              print_join_file =>
+#                      sub { print_join_file( $file, $text ) },
+
+#              syswrite_file =>
+#                      sub { syswrite_file( $file, $text ) },
+
+#              syswrite_file2 =>
+#                      sub { syswrite_file2( $file, $text ) },
+
+#              orig_write_file =>
+#                      sub { orig_write_file( $file, $text ) },
+
+#      } ) ;
+
+#      cmpthese( $result ) ;
+# }
+
+# sub bench_scalar_slurp {
+
+#      my ( $size ) = @_ ;
+
+#      print "\n\nScalar Slurp of $size file\n\n" ;
+
+#      my $buffer ;
+
+#      my $result = timethese( $dur, {
+
+#              new =>
+#                      sub { my $text = File::Slurp::read_file( $file ) },
+
+#              new_buf_ref =>
+#                      sub { my $text ;
+#                         File::Slurp::read_file( $file, buf_ref => \$text ) },
+#              new_buf_ref2 =>
+#                      sub { 
+#                         File::Slurp::read_file( $file, buf_ref => \$buffer ) },
+#              new_scalar_ref =>
+#                      sub { my $text =
+#                          File::Slurp::read_file( $file, scalar_ref => 1 ) },
+
+#              read_file =>
+#                      sub { my $text = read_file( $file ) },
+
+#              sysread_file =>
+#                      sub { my $text = sysread_file( $file ) },
+
+#              orig_read_file =>
+#                      sub { my $text = orig_read_file( $file ) },
+
+#              'Slurp.pm scalar' =>
+#                      sub { my $text = slurp_scalar( $file ) },
+
+#              file_contents =>
+#                      sub { my $text = file_contents( $file ) },
+
+#              file_contents_no_OO =>
+#                      sub { my $text = file_contents_no_OO( $file ) },
+#      } ) ;
+
+#      cmpthese( $result ) ;
+# }
+
+# sub bench_list_slurp {
+
+#      my ( $size ) = @_ ;
+
+#      print "\n\nList Slurp of $size file\n\n" ;
+
+#      my $result = timethese( $dur, {
+
+#              new =>
+#                      sub { my @lines = File::Slurp::read_file( $file ) },
+
+#              new_array_ref =>
+#                      sub { my $lines_ref =
+#                           File::Slurp::read_file( $file, array_ref => 1 ) },
+
+#              new_in_anon_array =>
+#                      sub { my $lines_ref =
+#                           [ File::Slurp::read_file( $file ) ] },
+
+#              read_file =>
+#                      sub { my @lines = read_file( $file ) },
+
+#              sysread_file =>
+#                      sub { my @lines = sysread_file( $file ) },
+
+#              orig_read_file =>
+#                      sub { my @lines = orig_read_file( $file ) },
+
+#              'Slurp.pm to array' =>
+#                      sub { my @lines = slurp_array( $file ) },
+
+#              orig_slurp_to_array_ref =>
+#                      sub { my $lines_ref = orig_slurp_to_array( $file ) },
+#      } ) ;
+
+#      cmpthese( $result ) ;
+# }
+
+
+###########################
+# write file benchmark subs
+###########################
+
+
+sub print_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       open( FH, ">$file_name" ) || carp "can't create $file_name $!" ;
+
+       print FH @_ ;
+}
+
+sub print_file2 {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       my $mode = ( -e $file_name ) ? '<' : '>' ;
+
+       open( FH, "+$mode$file_name" ) || carp "can't create $file_name $!" ;
+
+       print FH @_ ;
+}
+
+sub print_join_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       my $mode = ( -e $file_name ) ? '<' : '>' ;
+
+       open( FH, "+$mode$file_name" ) || carp "can't create $file_name $!" ;
+
+       print FH join( '', @_ ) ;
+}
+
+
+sub syswrite_join_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       open( FH, ">$file_name" ) || carp "can't create $file_name $!" ;
+
+       syswrite( FH, join( '', @_ ) ) ;
+}
+
+sub sysopen_syswrite_join_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       sysopen( FH, $file_name, O_WRONLY | O_CREAT ) ||
+                               carp "can't create $file_name $!" ;
+
+       syswrite( FH, join( '', @_ ) ) ;
+}
+
+sub orig_write_file
+{
+       my ($f, @data) = @_;
+
+       local(*F);
+
+       open(F, ">$f") || croak "open >$f: $!";
+       (print F @data) || croak "write $f: $!";
+       close(F) || croak "close $f: $!";
+       return 1;
+}
+
+
+#######################
+# top level subs for script
+
+#######################
+
+sub parse_options {
+
+       help() if $opts{help} ;
+
+       key() if $opts{key} ;
+
+       unless( $opts{spew} || $opts{slurp} ) {
+
+               $opts{spew} = 1 ;
+               $opts{slurp} = 1 ;
+       }
+
+       unless( $opts{list} || $opts{scalar} ) {
+
+               $opts{list} = 1 ;
+               $opts{scalar} = 1 ;
+       }
+
+       $opts{sizes} = [split ',', ( $opts{sizes} || '10,100,1000' ) ];
+
+       $opts{duration} ||= -2 ;
+}
+
+sub key {
+
+       print <<'KEY' ;
+
+Key to Slurp/Spew Benchmarks
+
+
+Write a list of lines to a file
+
+       Key                     Description/Source
+       ---                     ------------------
+
+       FS:write_file           Current File::Slurp::write_file
+       FS:write_file_ref       Current File::Slurp::write_file (scalar ref)
+       print                   Open a file and call print()
+       syswrite/join           Open a file, call syswrite on joined lines
+       sysopen/syswrite        Sysopen a file, call syswrite on joined lines
+       original write_file     Original (pre 9999.*) File::Slurp::write_file
+
+
+KEY
+
+       exit ;
+}
+
+sub help {
+
+       die <<DIE ;
+
+Usage: $0 [--list] [--scalar] [--slurp] [--spew]
+          [--sizes=10,100] [--help]
+
+       --list          Run the list context benchmarks
+       --scalar        Run the scalar context benchmarks
+                       Those default to on unless one is set
+
+       --slurp         Run the slurp benchmarks
+       --spew          Run the spew benchmarks
+                       Those default to on unless one is set
+
+       --sizes         Comma separated list of file sizes to benchmark
+                       Defaults to 10,100,1000
+
+       --key           Print the benchmark names and code orig_ins
+
+       --help          Print this help text
+
+DIE
+
+}
+
+
+__END__
+
+
+sub bench_scalar_spew {
+
+       my ( $size ) = @_ ;
+
+       print "\n\nScalar Spew of $size file\n\n" ;
+
+       my $result = timethese( $dur, {
+
+               new =>
+                       sub { File::Slurp::write_file( $file, $text ) },
+
+               new_ref =>
+                       sub { File::Slurp::write_file( $file, \$text ) },
+
+               print_file =>
+                       sub { print_file( $file, $text ) },
+
+               print_join_file =>
+                       sub { print_join_file( $file, $text ) },
+
+               syswrite_file =>
+                       sub { syswrite_file( $file, $text ) },
+
+               syswrite_file2 =>
+                       sub { syswrite_file2( $file, $text ) },
+
+               orig_write_file =>
+                       sub { orig_write_file( $file, $text ) },
+
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+sub bench_scalar_slurp {
+
+       my ( $size ) = @_ ;
+
+       print "\n\nScalar Slurp of $size file\n\n" ;
+
+       my $buffer ;
+
+       my $result = timethese( $dur, {
+
+               new =>
+                       sub { my $text = File::Slurp::read_file( $file ) },
+
+               new_buf_ref =>
+                       sub { my $text ;
+                          File::Slurp::read_file( $file, buf_ref => \$text ) },
+               new_buf_ref2 =>
+                       sub { 
+                          File::Slurp::read_file( $file, buf_ref => \$buffer ) },
+               new_scalar_ref =>
+                       sub { my $text =
+                           File::Slurp::read_file( $file, scalar_ref => 1 ) },
+
+               read_file =>
+                       sub { my $text = read_file( $file ) },
+
+               sysread_file =>
+                       sub { my $text = sysread_file( $file ) },
+
+               orig_read_file =>
+                       sub { my $text = orig_read_file( $file ) },
+
+               orig_slurp =>
+                       sub { my $text = orig_slurp_to_scalar( $file ) },
+
+               file_contents =>
+                       sub { my $text = file_contents( $file ) },
+
+               file_contents_no_OO =>
+                       sub { my $text = file_contents_no_OO( $file ) },
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+sub bench_list_slurp {
+
+       my ( $size ) = @_ ;
+
+       print "\n\nList Slurp of $size file\n\n" ;
+
+       my $result = timethese( $dur, {
+
+               new =>
+                       sub { my @lines = File::Slurp::read_file( $file ) },
+
+               new_array_ref =>
+                       sub { my $lines_ref =
+                            File::Slurp::read_file( $file, array_ref => 1 ) },
+
+               new_in_anon_array =>
+                       sub { my $lines_ref =
+                            [ File::Slurp::read_file( $file ) ] },
+
+               read_file =>
+                       sub { my @lines = read_file( $file ) },
+
+               sysread_file =>
+                       sub { my @lines = sysread_file( $file ) },
+
+               orig_read_file =>
+                       sub { my @lines = orig_read_file( $file ) },
+
+               orig_slurp_to_array =>
+                       sub { my @lines = orig_slurp_to_array( $file ) },
+
+               orig_slurp_to_array_ref =>
+                       sub { my $lines_ref = orig_slurp_to_array( $file ) },
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+######################################
+# uri's old fast slurp
+
+sub read_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+       open( FH, $file_name ) || carp "can't open $file_name $!" ;
+
+       return <FH> if wantarray ;
+
+       my $buf ;
+
+       read( FH, $buf, -s FH ) ;
+       return $buf ;
+}
+
+sub sysread_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+       open( FH, $file_name ) || carp "can't open $file_name $!" ;
+
+       return <FH> if wantarray ;
+
+       my $buf ;
+
+       sysread( FH, $buf, -s FH ) ;
+       return $buf ;
+}
+
+######################################
+# from File::Slurp.pm on cpan
+
+sub orig_read_file
+{
+       my ($file) = @_;
+
+       local($/) = wantarray ? $/ : undef;
+       local(*F);
+       my $r;
+       my (@r);
+
+       open(F, "<$file") || croak "open $file: $!";
+       @r = <F>;
+       close(F) || croak "close $file: $!";
+
+       return $r[0] unless wantarray;
+       return @r;
+}
+
+
+######################################
+# from Slurp.pm on cpan
+
+sub slurp { 
+    local( $/, @ARGV ) = ( wantarray ? $/ : undef, @_ ); 
+    return <ARGV>;
+}
+
+sub slurp_array {
+    my @array = slurp( @_ );
+    return wantarray ? @array : \@array;
+}
+
+sub slurp_scalar {
+    my $scalar = slurp( @_ );
+    return $scalar;
+}
+
+######################################
+# very slow slurp code used by a client
+
+sub file_contents {
+    my $file = shift;
+    my $fh = new FileHandle $file or
+        warn("Util::file_contents:Can't open file $file"), return '';
+    return join '', <$fh>;
+}
+
+# same code but doesn't use FileHandle.pm
+
+sub file_contents_no_OO {
+    my $file = shift;
+
+       local( *FH ) ;
+       open( FH, $file ) || carp "can't open $file $!" ;
+
+    return join '', <FH>;
+}
+
+##########################
diff --git a/extras/slurp_bench.pl.~1.2.~ b/extras/slurp_bench.pl.~1.2.~

new file mode 100755 (executable)

index 0000000..376a75b
--- /dev/null
+++ b/extras/slurp_bench.pl.~1.2.~
@@ -0,0 +1,351 @@
+#!/usr/local/bin/perl
+
+use strict ;
+
+use Benchmark qw( timethese cmpthese ) ;
+use Carp ;
+use FileHandle ;
+use Fcntl qw( :DEFAULT :seek );
+
+use File::Slurp () ;
+
+my $dur = shift || -2 ;
+
+my $file = 'slurp_data' ;
+
+my @lines = ( 'abc' x 30 . "\n")  x 100 ;
+my $text = join( '', @lines ) ;
+
+bench_list_spew( 'SHORT' ) ;
+bench_scalar_spew( 'SHORT' ) ;
+
+File::Slurp::write_file( $file, $text ) ;
+
+bench_scalar_slurp( 'SHORT' ) ;
+bench_list_slurp( 'SHORT' ) ;
+
+@lines = ( 'abc' x 40 . "\n")  x 1000 ;
+$text = join( '', @lines ) ;
+
+bench_list_spew( 'LONG' ) ;
+bench_scalar_spew( 'LONG' ) ;
+
+File::Slurp::write_file( $file, $text ) ;
+
+bench_scalar_slurp( 'LONG' ) ;
+bench_list_slurp( 'LONG' ) ;
+
+exit ;
+
+sub bench_list_spew {
+
+       my ( $size ) = @_ ;
+
+       print "\n\nList Spew of $size file\n\n" ;
+
+       my $result = timethese( $dur, {
+
+               new =>
+                       sub { File::Slurp::write_file( $file, @lines ) },
+
+               print_file =>
+                       sub { print_file( $file, @lines ) },
+
+               print_join_file =>
+                       sub { print_join_file( $file, @lines ) },
+
+               syswrite_file =>
+                       sub { syswrite_file( $file, @lines ) },
+
+               cpan_write_file =>
+                       sub { cpan_write_file( $file, @lines ) },
+
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+sub bench_scalar_spew {
+
+       my ( $size ) = @_ ;
+
+       print "\n\nScalar Spew of $size file\n\n" ;
+
+       my $result = timethese( $dur, {
+
+               new =>
+                       sub { File::Slurp::write_file( $file, $text ) },
+
+               new_ref =>
+                       sub { File::Slurp::write_file( $file, \$text ) },
+
+               print_file =>
+                       sub { print_file( $file, $text ) },
+
+               print_join_file =>
+                       sub { print_join_file( $file, $text ) },
+
+               syswrite_file =>
+                       sub { syswrite_file( $file, $text ) },
+
+               syswrite_file2 =>
+                       sub { syswrite_file2( $file, $text ) },
+
+               cpan_write_file =>
+                       sub { cpan_write_file( $file, $text ) },
+
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+sub bench_scalar_slurp {
+
+       my ( $size ) = @_ ;
+
+       print "\n\nScalar Slurp of $size file\n\n" ;
+
+       my $buffer ;
+
+       my $result = timethese( $dur, {
+
+               new =>
+                       sub { my $text = File::Slurp::read_file( $file ) },
+
+               new_buf_ref =>
+                       sub { my $text ;
+                          File::Slurp::read_file( $file, buf_ref => \$text ) },
+               new_buf_ref2 =>
+                       sub { 
+                          File::Slurp::read_file( $file, buf_ref => \$buffer ) },
+               new_scalar_ref =>
+                       sub { my $text =
+                           File::Slurp::read_file( $file, scalar_ref => 1 ) },
+
+               read_file =>
+                       sub { my $text = read_file( $file ) },
+
+               sysread_file =>
+                       sub { my $text = sysread_file( $file ) },
+
+               cpan_read_file =>
+                       sub { my $text = cpan_read_file( $file ) },
+
+               cpan_slurp =>
+                       sub { my $text = cpan_slurp_to_scalar( $file ) },
+
+               file_contents =>
+                       sub { my $text = file_contents( $file ) },
+
+               file_contents_no_OO =>
+                       sub { my $text = file_contents_no_OO( $file ) },
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+sub bench_list_slurp {
+
+       my ( $size ) = @_ ;
+
+       print "\n\nList Slurp of $size file\n\n" ;
+
+       my $result = timethese( $dur, {
+
+               new =>
+                       sub { my @lines = File::Slurp::read_file( $file ) },
+
+               new_array_ref =>
+                       sub { my $lines_ref =
+                            File::Slurp::read_file( $file, array_ref => 1 ) },
+
+               new_in_anon_array =>
+                       sub { my $lines_ref =
+                            [ File::Slurp::read_file( $file ) ] },
+
+               read_file =>
+                       sub { my @lines = read_file( $file ) },
+
+               sysread_file =>
+                       sub { my @lines = sysread_file( $file ) },
+
+               cpan_read_file =>
+                       sub { my @lines = cpan_read_file( $file ) },
+
+               cpan_slurp_to_array =>
+                       sub { my @lines = cpan_slurp_to_array( $file ) },
+
+               cpan_slurp_to_array_ref =>
+                       sub { my $lines_ref = cpan_slurp_to_array( $file ) },
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+######################################
+# uri's old fast slurp
+
+sub read_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+       open( FH, $file_name ) || carp "can't open $file_name $!" ;
+
+       return <FH> if wantarray ;
+
+       my $buf ;
+
+       read( FH, $buf, -s FH ) ;
+       return $buf ;
+}
+
+sub sysread_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+       open( FH, $file_name ) || carp "can't open $file_name $!" ;
+
+       return <FH> if wantarray ;
+
+       my $buf ;
+
+       sysread( FH, $buf, -s FH ) ;
+       return $buf ;
+}
+
+######################################
+# from File::Slurp.pm on cpan
+
+sub cpan_read_file
+{
+       my ($file) = @_;
+
+       local($/) = wantarray ? $/ : undef;
+       local(*F);
+       my $r;
+       my (@r);
+
+       open(F, "<$file") || croak "open $file: $!";
+       @r = <F>;
+       close(F) || croak "close $file: $!";
+
+       return $r[0] unless wantarray;
+       return @r;
+}
+
+sub cpan_write_file
+{
+       my ($f, @data) = @_;
+
+       local(*F);
+
+       open(F, ">$f") || croak "open >$f: $!";
+       (print F @data) || croak "write $f: $!";
+       close(F) || croak "close $f: $!";
+       return 1;
+}
+
+
+######################################
+# from Slurp.pm on cpan
+
+sub slurp { 
+    local( $/, @ARGV ) = ( wantarray ? $/ : undef, @_ ); 
+    return <ARGV>;
+}
+
+sub cpan_slurp_to_array {
+    my @array = slurp( @_ );
+    return wantarray ? @array : \@array;
+}
+
+sub cpan_slurp_to_scalar {
+    my $scalar = slurp( @_ );
+    return $scalar;
+}
+
+######################################
+# very slow slurp code used by a client
+
+sub file_contents {
+    my $file = shift;
+    my $fh = new FileHandle $file or
+        warn("Util::file_contents:Can't open file $file"), return '';
+    return join '', <$fh>;
+}
+
+# same code but doesn't use FileHandle.pm
+
+sub file_contents_no_OO {
+    my $file = shift;
+
+       local( *FH ) ;
+       open( FH, $file ) || carp "can't open $file $!" ;
+
+    return join '', <FH>;
+}
+
+##########################
+
+sub print_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       open( FH, ">$file_name" ) || carp "can't create $file_name $!" ;
+
+       print FH @_ ;
+}
+
+sub print_file2 {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       my $mode = ( -e $file_name ) ? '<' : '>' ;
+
+       open( FH, "+$mode$file_name" ) || carp "can't create $file_name $!" ;
+
+       print FH @_ ;
+}
+
+sub print_join_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       my $mode = ( -e $file_name ) ? '<' : '>' ;
+
+       open( FH, "+$mode$file_name" ) || carp "can't create $file_name $!" ;
+
+       print FH join( '', @_ ) ;
+}
+
+
+sub syswrite_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       open( FH, ">$file_name" ) || carp "can't create $file_name $!" ;
+
+       syswrite( FH, join( '', @_ ) ) ;
+}
+
+sub syswrite_file2 {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       sysopen( FH, $file_name, O_WRONLY | O_CREAT ) ||
+                               carp "can't create $file_name $!" ;
+
+       syswrite( FH, join( '', @_ ) ) ;
+}
diff --git a/extras/slurp_data b/extras/slurp_data

new file mode 100644 (file)

index 0000000..e69de29
diff --git a/lib/File/Slurp.pm b/lib/File/Slurp.pm

new file mode 100755 (executable)

index 0000000..b958f3b
--- /dev/null
+++ b/lib/File/Slurp.pm
@@ -0,0 +1,744 @@
+package File::Slurp;
+
+use strict;
+
+use Carp ;
+use POSIX qw( :fcntl_h ) ;
+use Fcntl qw( :DEFAULT ) ;
+use Symbol ;
+
+my $is_win32 = $^O =~ /win32/i ;
+
+# Install subs for various constants that aren't set in older perls
+# (< 5.005).  Fcntl on old perls uses Exporter to define subs without a
+# () prototype These can't be overridden with the constant pragma or
+# we get a prototype mismatch.  Hence this less than aesthetically
+# appealing BEGIN block:
+
+BEGIN {
+       unless( eval { defined SEEK_SET() } ) {
+               *SEEK_SET = sub { 0 };
+               *SEEK_CUR = sub { 1 };
+               *SEEK_END = sub { 2 };
+       }
+
+       unless( eval { defined O_BINARY() } ) {
+               *O_BINARY = sub { 0 };
+               *O_RDONLY = sub { 0 };
+               *O_WRONLY = sub { 1 };
+       }
+
+       unless ( eval { defined O_APPEND() } ) {
+
+               if ( $^O =~ /olaris/ ) {
+                       *O_APPEND = sub { 8 };
+                       *O_CREAT = sub { 256 };
+                       *O_EXCL = sub { 1024 };
+               }
+               elsif ( $^O =~ /inux/ ) {
+                       *O_APPEND = sub { 1024 };
+                       *O_CREAT = sub { 64 };
+                       *O_EXCL = sub { 128 };
+               }
+               elsif ( $^O =~ /BSD/i ) {
+                       *O_APPEND = sub { 8 };
+                       *O_CREAT = sub { 512 };
+                       *O_EXCL = sub { 2048 };
+               }
+       }
+}
+
+# print "OS [$^O]\n" ;
+
+# print "O_BINARY = ", O_BINARY(), "\n" ;
+# print "O_RDONLY = ", O_RDONLY(), "\n" ;
+# print "O_WRONLY = ", O_WRONLY(), "\n" ;
+# print "O_APPEND = ", O_APPEND(), "\n" ;
+# print "O_CREAT   ", O_CREAT(), "\n" ;
+# print "O_EXCL   ", O_EXCL(), "\n" ;
+
+use base 'Exporter' ;
+use vars qw( %EXPORT_TAGS @EXPORT_OK $VERSION @EXPORT ) ;
+
+%EXPORT_TAGS = ( 'all' => [
+       qw( read_file write_file overwrite_file append_file read_dir ) ] ) ;
+
+@EXPORT = ( @{ $EXPORT_TAGS{'all'} } );
+@EXPORT_OK = qw( slurp ) ;
+
+$VERSION = '9999.12';
+
+*slurp = \&read_file ;
+
+sub read_file {
+
+       my( $file_name, %args ) = @_ ;
+
+# set the buffer to either the passed in one or ours and init it to the null
+# string
+
+       my $buf ;
+       my $buf_ref = $args{'buf_ref'} || \$buf ;
+       ${$buf_ref} = '' ;
+
+       my( $read_fh, $size_left, $blk_size ) ;
+
+# check if we are reading from a handle (glob ref or IO:: object)
+
+       if ( ref $file_name ) {
+
+# slurping a handle so use it and don't open anything.
+# set the block size so we know it is a handle and read that amount
+
+               $read_fh = $file_name ;
+               $blk_size = $args{'blk_size'} || 1024 * 1024 ;
+               $size_left = $blk_size ;
+
+# DEEP DARK MAGIC. this checks the UNTAINT IO flag of a
+# glob/handle. only the DATA handle is untainted (since it is from
+# trusted data in the source file). this allows us to test if this is
+# the DATA handle and then to do a sysseek to make sure it gets
+# slurped correctly. on some systems, the buffered i/o pointer is not
+# left at the same place as the fd pointer. this sysseek makes them
+# the same so slurping with sysread will work.
+
+               eval{ require B } ;
+
+               if ( $@ ) {
+
+                       @_ = ( \%args, <<ERR ) ;
+Can't find B.pm with this Perl: $!.
+That module is needed to slurp the DATA handle.
+ERR
+                       goto &_error ;
+               }
+
+               if ( B::svref_2object( $read_fh )->IO->IoFLAGS & 16 ) {
+
+# set the seek position to the current tell.
+
+                       sysseek( $read_fh, tell( $read_fh ), SEEK_SET ) ||
+                               croak "sysseek $!" ;
+               }
+       }
+       else {
+
+# a regular file. set the sysopen mode
+
+               my $mode = O_RDONLY ;
+               $mode |= O_BINARY if $args{'binmode'} ;
+
+#printf "RD: BINARY %x MODE %x\n", O_BINARY, $mode ;
+
+# open the file and handle any error
+
+               $read_fh = gensym ;
+               unless ( sysopen( $read_fh, $file_name, $mode ) ) {
+                       @_ = ( \%args, "read_file '$file_name' - sysopen: $!");
+                       goto &_error ;
+               }
+
+# get the size of the file for use in the read loop
+
+               $size_left = -s $read_fh ;
+
+               unless( $size_left ) {
+
+                       $blk_size = $args{'blk_size'} || 1024 * 1024 ;
+                       $size_left = $blk_size ;
+               }
+       }
+
+# infinite read loop. we exit when we are done slurping
+
+       while( 1 ) {
+
+# do the read and see how much we got
+
+               my $read_cnt = sysread( $read_fh, ${$buf_ref},
+                               $size_left, length ${$buf_ref} ) ;
+
+               if ( defined $read_cnt ) {
+
+# good read. see if we hit EOF (nothing left to read)
+
+                       last if $read_cnt == 0 ;
+
+# loop if we are slurping a handle. we don't track $size_left then.
+
+                       next if $blk_size ;
+
+# count down how much we read and loop if we have more to read.
+                       $size_left -= $read_cnt ;
+                       last if $size_left <= 0 ;
+                       next ;
+               }
+
+# handle the read error
+
+               @_ = ( \%args, "read_file '$file_name' - sysread: $!");
+               goto &_error ;
+       }
+
+# fix up cr/lf to be a newline if this is a windows text file
+
+       ${$buf_ref} =~ s/\015\012/\n/g if $is_win32 && !$args{'binmode'} ;
+
+# this is the 5 returns in a row. each handles one possible
+# combination of caller context and requested return type
+
+       my $sep = $/ ;
+       $sep = '\n\n+' if defined $sep && $sep eq '' ;
+
+# caller wants to get an array ref of lines
+
+# this split doesn't work since it tries to use variable length lookbehind
+# the m// line works.
+#      return [ split( m|(?<=$sep)|, ${$buf_ref} ) ] if $args{'array_ref'}  ;
+       return [ length(${$buf_ref}) ? ${$buf_ref} =~ /(.*?$sep|.+)/sg : () ]
+               if $args{'array_ref'}  ;
+
+# caller wants a list of lines (normal list context)
+
+# same problem with this split as before.
+#      return split( m|(?<=$sep)|, ${$buf_ref} ) if wantarray ;
+       return length(${$buf_ref}) ? ${$buf_ref} =~ /(.*?$sep|.+)/sg : ()
+               if wantarray ;
+
+# caller wants a scalar ref to the slurped text
+
+       return $buf_ref if $args{'scalar_ref'} ;
+
+# caller wants a scalar with the slurped text (normal scalar context)
+
+       return ${$buf_ref} if defined wantarray ;
+
+# caller passed in an i/o buffer by reference (normal void context)
+
+       return ;
+}
+
+sub write_file {
+
+       my $file_name = shift ;
+
+# get the optional argument hash ref from @_ or an empty hash ref.
+
+       my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;
+
+       my( $buf_ref, $write_fh, $no_truncate, $orig_file_name, $data_is_ref ) ;
+
+# get the buffer ref - it depends on how the data is passed into write_file
+# after this if/else $buf_ref will have a scalar ref to the data.
+
+       if ( ref $args->{'buf_ref'} eq 'SCALAR' ) {
+
+# a scalar ref passed in %args has the data
+# note that the data was passed by ref
+
+               $buf_ref = $args->{'buf_ref'} ;
+               $data_is_ref = 1 ;
+       }
+       elsif ( ref $_[0] eq 'SCALAR' ) {
+
+# the first value in @_ is the scalar ref to the data
+# note that the data was passed by ref
+
+               $buf_ref = shift ;
+               $data_is_ref = 1 ;
+       }
+       elsif ( ref $_[0] eq 'ARRAY' ) {
+
+# the first value in @_ is the array ref to the data so join it.
+
+               ${$buf_ref} = join '', @{$_[0]} ;
+       }
+       else {
+
+# good old @_ has all the data so join it.
+
+               ${$buf_ref} = join '', @_ ;
+       }
+
+# see if we were passed a open handle to spew to.
+
+       if ( ref $file_name ) {
+
+# we have a handle. make sure we don't call truncate on it.
+
+               $write_fh = $file_name ;
+               $no_truncate = 1 ;
+       }
+       else {
+
+# spew to regular file.
+
+               if ( $args->{'atomic'} ) {
+
+# in atomic mode, we spew to a temp file so make one and save the original
+# file name.
+                       $orig_file_name = $file_name ;
+                       $file_name .= ".$$" ;
+               }
+
+# set the mode for the sysopen
+
+               my $mode = O_WRONLY | O_CREAT ;
+               $mode |= O_BINARY if $args->{'binmode'} ;
+               $mode |= O_APPEND if $args->{'append'} ;
+               $mode |= O_EXCL if $args->{'no_clobber'} ;
+
+#printf "WR: BINARY %x MODE %x\n", O_BINARY, $mode ;
+
+# open the file and handle any error.
+
+               $write_fh = gensym ;
+               unless ( sysopen( $write_fh, $file_name, $mode ) ) {
+                       @_ = ( $args, "write_file '$file_name' - sysopen: $!");
+                       goto &_error ;
+               }
+       }
+
+       sysseek( $write_fh, 0, SEEK_END ) if $args->{'append'} ;
+
+
+#print 'WR before data ', unpack( 'H*', ${$buf_ref}), "\n" ;
+
+# fix up newline to write cr/lf if this is a windows text file
+
+       if ( $is_win32 && !$args->{'binmode'} ) {
+
+# copy the write data if it was passed by ref so we don't clobber the
+# caller's data
+               $buf_ref = \do{ my $copy = ${$buf_ref}; } if $data_is_ref ;
+               ${$buf_ref} =~ s/\n/\015\012/g ;
+       }
+
+#print 'after data ', unpack( 'H*', ${$buf_ref}), "\n" ;
+
+# get the size of how much we are writing and init the offset into that buffer
+
+       my $size_left = length( ${$buf_ref} ) ;
+       my $offset = 0 ;
+
+# loop until we have no more data left to write
+
+       do {
+
+# do the write and track how much we just wrote
+
+               my $write_cnt = syswrite( $write_fh, ${$buf_ref},
+                               $size_left, $offset ) ;
+
+               unless ( defined $write_cnt ) {
+
+# the write failed
+                       @_ = ( $args, "write_file '$file_name' - syswrite: $!");
+                       goto &_error ;
+               }
+
+# track much left to write and where to write from in the buffer
+
+               $size_left -= $write_cnt ;
+               $offset += $write_cnt ;
+
+       } while( $size_left > 0 ) ;
+
+# we truncate regular files in case we overwrite a long file with a shorter file
+# so seek to the current position to get it (same as tell()).
+
+       truncate( $write_fh,
+                 sysseek( $write_fh, 0, SEEK_CUR ) ) unless $no_truncate ;
+
+       close( $write_fh ) ;
+
+# handle the atomic mode - move the temp file to the original filename.
+
+       rename( $file_name, $orig_file_name ) if $args->{'atomic'} ;
+
+       return 1 ;
+}
+
+# this is for backwards compatibility with the previous File::Slurp module. 
+# write_file always overwrites an existing file
+
+*overwrite_file = \&write_file ;
+
+# the current write_file has an append mode so we use that. this
+# supports the same API with an optional second argument which is a
+# hash ref of options.
+
+sub append_file {
+
+# get the optional args hash ref
+       my $args = $_[1] ;
+       if ( ref $args eq 'HASH' ) {
+
+# we were passed an args ref so just mark the append mode
+
+               $args->{append} = 1 ;
+       }
+       else {
+
+# no args hash so insert one with the append mode
+
+               splice( @_, 1, 0, { append => 1 } ) ;
+       }
+
+# magic goto the main write_file sub. this overlays the sub without touching
+# the stack or @_
+
+       goto &write_file
+}
+
+# basic wrapper around opendir/readdir
+
+sub read_dir {
+
+       my ($dir, %args ) = @_;
+
+# this handle will be destroyed upon return
+
+       local(*DIRH);
+
+# open the dir and handle any errors
+
+       unless ( opendir( DIRH, $dir ) ) {
+
+               @_ = ( \%args, "read_dir '$dir' - opendir: $!" ) ;
+               goto &_error ;
+       }
+
+       my @dir_entries = readdir(DIRH) ;
+
+       @dir_entries = grep( $_ ne "." && $_ ne "..", @dir_entries )
+               unless $args{'keep_dot_dot'} ;
+
+       return @dir_entries if wantarray ;
+       return \@dir_entries ;
+}
+
+# error handling section
+#
+# all the error handling uses magic goto so the caller will get the
+# error message as if from their code and not this module. if we just
+# did a call on the error code, the carp/croak would report it from
+# this module since the error sub is one level down on the call stack
+# from read_file/write_file/read_dir.
+
+
+my %err_func = (
+       'carp'  => \&carp,
+       'croak' => \&croak,
+) ;
+
+sub _error {
+
+       my( $args, $err_msg ) = @_ ;
+
+# get the error function to use
+
+       my $func = $err_func{ $args->{'err_mode'} || 'croak' } ;
+
+# if we didn't find it in our error function hash, they must have set
+# it to quiet and we don't do anything.
+
+       return unless $func ;
+
+# call the carp/croak function
+
+       $func->($err_msg) ;
+
+# return a hard undef (in list context this will be a single value of
+# undef which is not a legal in-band value)
+
+       return undef ;
+}
+
+1;
+__END__
+
+=head1 NAME
+
+File::Slurp - Efficient Reading/Writing of Complete Files
+
+=head1 SYNOPSIS
+
+  use File::Slurp;
+
+  my $text = read_file( 'filename' ) ;
+  my @lines = read_file( 'filename' ) ;
+
+  write_file( 'filename', @lines ) ;
+
+  use File::Slurp qw( slurp ) ;
+
+  my $text = slurp( 'filename' ) ;
+
+
+=head1 DESCRIPTION
+
+This module provides subs that allow you to read or write entire files
+with one simple call. They are designed to be simple to use, have
+flexible ways to pass in or get the file contents and to be very
+efficient.  There is also a sub to read in all the files in a
+directory other than C<.> and C<..>
+
+These slurp/spew subs work for files, pipes and
+sockets, and stdio, pseudo-files, and DATA.
+
+=head2 B<read_file>
+
+This sub reads in an entire file and returns its contents to the
+caller. In list context it will return a list of lines (using the
+current value of $/ as the separator including support for paragraph
+mode when it is set to ''). In scalar context it returns the entire
+file as a single scalar.
+
+  my $text = read_file( 'filename' ) ;
+  my @lines = read_file( 'filename' ) ;
+
+The first argument to C<read_file> is the filename and the rest of the
+arguments are key/value pairs which are optional and which modify the
+behavior of the call. Other than binmode the options all control how
+the slurped file is returned to the caller.
+
+If the first argument is a file handle reference or I/O object (if ref
+is true), then that handle is slurped in. This mode is supported so
+you slurp handles such as C<DATA>, C<STDIN>. See the test handle.t
+for an example that does C<open( '-|' )> and child process spews data
+to the parant which slurps it in.  All of the options that control how
+the data is returned to the caller still work in this case.
+
+NOTE: as of version 9999.06, read_file works correctly on the C<DATA>
+handle. It used to need a sysseek workaround but that is now handled
+when needed by the module itself.
+
+You can optionally request that C<slurp()> is exported to your code. This
+is an alias for read_file and is meant to be forward compatible with
+Perl 6 (which will have slurp() built-in).
+
+The options are:
+
+=head3 binmode
+
+If you set the binmode option, then the file will be slurped in binary
+mode.
+
+       my $bin_data = read_file( $bin_file, binmode => ':raw' ) ;
+
+NOTE: this actually sets the O_BINARY mode flag for sysopen. It
+probably should call binmode and pass its argument to support other
+file modes.
+
+=head3 array_ref
+
+If this boolean option is set, the return value (only in scalar
+context) will be an array reference which contains the lines of the
+slurped file. The following two calls are equivalent:
+
+       my $lines_ref = read_file( $bin_file, array_ref => 1 ) ;
+       my $lines_ref = [ read_file( $bin_file ) ] ;
+
+=head3 scalar_ref
+
+If this boolean option is set, the return value (only in scalar
+context) will be an scalar reference to a string which is the contents
+of the slurped file. This will usually be faster than returning the
+plain scalar.
+
+       my $text_ref = read_file( $bin_file, scalar_ref => 1 ) ;
+
+=head3 buf_ref
+
+You can use this option to pass in a scalar reference and the slurped
+file contents will be stored in the scalar. This can be used in
+conjunction with any of the other options.
+
+       my $text_ref = read_file( $bin_file, buf_ref => \$buffer,
+                                            array_ref => 1 ) ;
+       my @lines = read_file( $bin_file, buf_ref => \$buffer ) ;
+
+=head3 blk_size
+
+You can use this option to set the block size used when slurping from an already open handle (like \*STDIN). It defaults to 1MB.
+
+       my $text_ref = read_file( $bin_file, blk_size => 10_000_000,
+                                            array_ref => 1 ) ;
+
+=head3 err_mode
+
+You can use this option to control how read_file behaves when an error
+occurs. This option defaults to 'croak'. You can set it to 'carp' or
+to 'quiet to have no error handling. This code wants to carp and then
+read abother file if it fails.
+
+       my $text_ref = read_file( $file, err_mode => 'carp' ) ;
+       unless ( $text_ref ) {
+
+               # read a different file but croak if not found
+               $text_ref = read_file( $another_file ) ;
+       }
+       
+       # process ${$text_ref}
+
+=head2 B<write_file>
+
+This sub writes out an entire file in one call.
+
+  write_file( 'filename', @data ) ;
+
+The first argument to C<write_file> is the filename. The next argument
+is an optional hash reference and it contains key/values that can
+modify the behavior of C<write_file>. The rest of the argument list is
+the data to be written to the file.
+
+  write_file( 'filename', {append => 1 }, @data ) ;
+  write_file( 'filename', {binmode => ':raw' }, $buffer ) ;
+
+As a shortcut if the first data argument is a scalar or array
+reference, it is used as the only data to be written to the file. Any
+following arguments in @_ are ignored. This is a faster way to pass in
+the output to be written to the file and is equivilent to the
+C<buf_ref> option. These following pairs are equivilent but the pass
+by reference call will be faster in most cases (especially with larger
+files).
+
+  write_file( 'filename', \$buffer ) ;
+  write_file( 'filename', $buffer ) ;
+
+  write_file( 'filename', \@lines ) ;
+  write_file( 'filename', @lines ) ;
+
+If the first argument is a file handle reference or I/O object (if ref
+is true), then that handle is slurped in. This mode is supported so
+you spew to handles such as \*STDOUT. See the test handle.t for an
+example that does C<open( '-|' )> and child process spews data to the
+parant which slurps it in.  All of the options that control how the
+data is passes into C<write_file> still work in this case.
+
+C<write_file> returns 1 upon successfully writing the file or undef if
+it encountered an error.
+
+The options are:
+
+=head3 binmode
+
+If you set the binmode option, then the file will be written in binary
+mode.
+
+       write_file( $bin_file, {binmode => ':raw'}, @data ) ;
+
+NOTE: this actually sets the O_BINARY mode flag for sysopen. It
+probably should call binmode and pass its argument to support other
+file modes.
+
+=head3 buf_ref
+
+You can use this option to pass in a scalar reference which has the
+data to be written. If this is set then any data arguments (including
+the scalar reference shortcut) in @_ will be ignored. These are
+equivilent:
+
+       write_file( $bin_file, { buf_ref => \$buffer } ) ;
+       write_file( $bin_file, \$buffer ) ;
+       write_file( $bin_file, $buffer ) ;
+
+=head3 atomic
+
+If you set this boolean option, the file will be written to in an
+atomic fashion. A temporary file name is created by appending the pid
+($$) to the file name argument and that file is spewed to. After the
+file is closed it is renamed to the original file name (and rename is
+an atomic operation on most OS's). If the program using this were to
+crash in the middle of this, then the file with the pid suffix could
+be left behind.
+
+=head3 append
+
+If you set this boolean option, the data will be written at the end of
+the current file.
+
+       write_file( $file, {append => 1}, @data ) ;
+
+C<write_file> croaks if it cannot open the file. It returns true if it
+succeeded in writing out the file and undef if there was an
+error. (Yes, I know if it croaks it can't return anything but that is
+for when I add the options to select the error handling mode).
+
+=head3 no_clobber
+
+If you set this boolean option, an existing file will not be overwritten.
+
+       write_file( $file, {no_clobber => 1}, @data ) ;
+
+=head3 err_mode
+
+You can use this option to control how C<write_file> behaves when an
+error occurs. This option defaults to 'croak'. You can set it to
+'carp' or to 'quiet' to have no error handling other than the return
+value. If the first call to C<write_file> fails it will carp and then
+write to another file. If the second call to C<write_file> fails, it
+will croak.
+
+       unless ( write_file( $file, { err_mode => 'carp', \$data ) ;
+
+               # write a different file but croak if not found
+               write_file( $other_file, \$data ) ;
+       }
+
+=head2 overwrite_file
+
+This sub is just a typeglob alias to write_file since write_file
+always overwrites an existing file. This sub is supported for
+backwards compatibility with the original version of this module. See
+write_file for its API and behavior.
+
+=head2 append_file
+
+This sub will write its data to the end of the file. It is a wrapper
+around write_file and it has the same API so see that for the full
+documentation. These calls are equivilent:
+
+       append_file( $file, @data ) ;
+       write_file( $file, {append => 1}, @data ) ;
+
+=head2 read_dir
+
+This sub reads all the file names from directory and returns them to
+the caller but C<.> and C<..> are removed by default.
+
+       my @files = read_dir( '/path/to/dir' ) ;
+
+It croaks if it cannot open the directory.
+
+In a list context C<read_dir> returns a list of the entries in the
+directory. In a scalar context it returns an array reference which has
+the entries.
+
+=head3 keep_dot_dot
+
+If this boolean option is set, C<.> and C<..> are not removed from the
+list of files.
+
+       my @all_files = read_dir( '/path/to/dir', keep_dot_dot => 1 ) ;
+
+=head2 EXPORT
+
+  read_file write_file overwrite_file append_file read_dir
+
+=head2 SEE ALSO
+
+An article on file slurping in extras/slurp_article.pod. There is
+also a benchmarking script in extras/slurp_bench.pl.
+
+=head2 BUGS
+
+If run under Perl 5.004, slurping from the DATA handle will fail as
+that requires B.pm which didn't get into core until 5.005.
+
+=head1 AUTHOR
+
+Uri Guttman, E<lt>uri@stemsystems.comE<gt>
+
+=cut
diff --git a/lib/File/Slurp.pm.~1.26.~ b/lib/File/Slurp.pm.~1.26.~

new file mode 100755 (executable)

index 0000000..ce890aa
--- /dev/null
+++ b/lib/File/Slurp.pm.~1.26.~
@@ -0,0 +1,744 @@
+package File::Slurp;
+
+use strict;
+
+use Carp ;
+use Fcntl qw( :DEFAULT ) ;
+use POSIX qw( :fcntl_h ) ;
+use Symbol ;
+
+my $is_win32 = $^O =~ /win32/i ;
+
+# Install subs for various constants that aren't set in older perls
+# (< 5.005).  Fcntl on old perls uses Exporter to define subs without a
+# () prototype These can't be overridden with the constant pragma or
+# we get a prototype mismatch.  Hence this less than aesthetically
+# appealing BEGIN block:
+
+BEGIN {
+       unless( eval { defined SEEK_SET() } ) {
+               *SEEK_SET = sub { 0 };
+               *SEEK_CUR = sub { 1 };
+               *SEEK_END = sub { 2 };
+       }
+
+       unless( eval { defined O_BINARY() } ) {
+               *O_BINARY = sub { 0 };
+               *O_RDONLY = sub { 0 };
+               *O_WRONLY = sub { 1 };
+       }
+
+       unless ( eval { defined O_APPEND() } ) {
+
+               if ( $^O =~ /olaris/ ) {
+                       *O_APPEND = sub { 8 };
+                       *O_CREAT = sub { 256 };
+                       *O_EXCL = sub { 1024 };
+               }
+               elsif ( $^O =~ /inux/ ) {
+                       *O_APPEND = sub { 1024 };
+                       *O_CREAT = sub { 64 };
+                       *O_EXCL = sub { 128 };
+               }
+               elsif ( $^O =~ /BSD/i ) {
+                       *O_APPEND = sub { 8 };
+                       *O_CREAT = sub { 512 };
+                       *O_EXCL = sub { 2048 };
+               }
+       }
+}
+
+# print "OS [$^O]\n" ;
+
+# print "O_BINARY = ", O_BINARY(), "\n" ;
+# print "O_RDONLY = ", O_RDONLY(), "\n" ;
+# print "O_WRONLY = ", O_WRONLY(), "\n" ;
+# print "O_APPEND = ", O_APPEND(), "\n" ;
+# print "O_CREAT   ", O_CREAT(), "\n" ;
+# print "O_EXCL   ", O_EXCL(), "\n" ;
+
+use base 'Exporter' ;
+use vars qw( %EXPORT_TAGS @EXPORT_OK $VERSION @EXPORT ) ;
+
+%EXPORT_TAGS = ( 'all' => [
+       qw( read_file write_file overwrite_file append_file read_dir ) ] ) ;
+
+@EXPORT = ( @{ $EXPORT_TAGS{'all'} } );
+@EXPORT_OK = qw( slurp ) ;
+
+$VERSION = '9999.12';
+
+*slurp = \&read_file ;
+
+sub read_file {
+
+       my( $file_name, %args ) = @_ ;
+
+# set the buffer to either the passed in one or ours and init it to the null
+# string
+
+       my $buf ;
+       my $buf_ref = $args{'buf_ref'} || \$buf ;
+       ${$buf_ref} = '' ;
+
+       my( $read_fh, $size_left, $blk_size ) ;
+
+# check if we are reading from a handle (glob ref or IO:: object)
+
+       if ( ref $file_name ) {
+
+# slurping a handle so use it and don't open anything.
+# set the block size so we know it is a handle and read that amount
+
+               $read_fh = $file_name ;
+               $blk_size = $args{'blk_size'} || 1024 * 1024 ;
+               $size_left = $blk_size ;
+
+# DEEP DARK MAGIC. this checks the UNTAINT IO flag of a
+# glob/handle. only the DATA handle is untainted (since it is from
+# trusted data in the source file). this allows us to test if this is
+# the DATA handle and then to do a sysseek to make sure it gets
+# slurped correctly. on some systems, the buffered i/o pointer is not
+# left at the same place as the fd pointer. this sysseek makes them
+# the same so slurping with sysread will work.
+
+               eval{ require B } ;
+
+               if ( $@ ) {
+
+                       @_ = ( \%args, <<ERR ) ;
+Can't find B.pm with this Perl: $!.
+That module is needed to slurp the DATA handle.
+ERR
+                       goto &_error ;
+               }
+
+               if ( B::svref_2object( $read_fh )->IO->IoFLAGS & 16 ) {
+
+# set the seek position to the current tell.
+
+                       sysseek( $read_fh, tell( $read_fh ), SEEK_SET ) ||
+                               croak "sysseek $!" ;
+               }
+       }
+       else {
+
+# a regular file. set the sysopen mode
+
+               my $mode = O_RDONLY ;
+               $mode |= O_BINARY if $args{'binmode'} ;
+
+#printf "RD: BINARY %x MODE %x\n", O_BINARY, $mode ;
+
+# open the file and handle any error
+
+               $read_fh = gensym ;
+               unless ( sysopen( $read_fh, $file_name, $mode ) ) {
+                       @_ = ( \%args, "read_file '$file_name' - sysopen: $!");
+                       goto &_error ;
+               }
+
+# get the size of the file for use in the read loop
+
+               $size_left = -s $read_fh ;
+
+               unless( $size_left ) {
+
+                       $blk_size = $args{'blk_size'} || 1024 * 1024 ;
+                       $size_left = $blk_size ;
+               }
+       }
+
+# infinite read loop. we exit when we are done slurping
+
+       while( 1 ) {
+
+# do the read and see how much we got
+
+               my $read_cnt = sysread( $read_fh, ${$buf_ref},
+                               $size_left, length ${$buf_ref} ) ;
+
+               if ( defined $read_cnt ) {
+
+# good read. see if we hit EOF (nothing left to read)
+
+                       last if $read_cnt == 0 ;
+
+# loop if we are slurping a handle. we don't track $size_left then.
+
+                       next if $blk_size ;
+
+# count down how much we read and loop if we have more to read.
+                       $size_left -= $read_cnt ;
+                       last if $size_left <= 0 ;
+                       next ;
+               }
+
+# handle the read error
+
+               @_ = ( \%args, "read_file '$file_name' - sysread: $!");
+               goto &_error ;
+       }
+
+# fix up cr/lf to be a newline if this is a windows text file
+
+       ${$buf_ref} =~ s/\015\012/\n/g if $is_win32 && !$args{'binmode'} ;
+
+# this is the 5 returns in a row. each handles one possible
+# combination of caller context and requested return type
+
+       my $sep = $/ ;
+       $sep = '\n\n+' if defined $sep && $sep eq '' ;
+
+# caller wants to get an array ref of lines
+
+# this split doesn't work since it tries to use variable length lookbehind
+# the m// line works.
+#      return [ split( m|(?<=$sep)|, ${$buf_ref} ) ] if $args{'array_ref'}  ;
+       return [ length(${$buf_ref}) ? ${$buf_ref} =~ /(.*?$sep|.+)/sg : () ]
+               if $args{'array_ref'}  ;
+
+# caller wants a list of lines (normal list context)
+
+# same problem with this split as before.
+#      return split( m|(?<=$sep)|, ${$buf_ref} ) if wantarray ;
+       return length(${$buf_ref}) ? ${$buf_ref} =~ /(.*?$sep|.+)/sg : ()
+               if wantarray ;
+
+# caller wants a scalar ref to the slurped text
+
+       return $buf_ref if $args{'scalar_ref'} ;
+
+# caller wants a scalar with the slurped text (normal scalar context)
+
+       return ${$buf_ref} if defined wantarray ;
+
+# caller passed in an i/o buffer by reference (normal void context)
+
+       return ;
+}
+
+sub write_file {
+
+       my $file_name = shift ;
+
+# get the optional argument hash ref from @_ or an empty hash ref.
+
+       my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;
+
+       my( $buf_ref, $write_fh, $no_truncate, $orig_file_name, $data_is_ref ) ;
+
+# get the buffer ref - it depends on how the data is passed into write_file
+# after this if/else $buf_ref will have a scalar ref to the data.
+
+       if ( ref $args->{'buf_ref'} eq 'SCALAR' ) {
+
+# a scalar ref passed in %args has the data
+# note that the data was passed by ref
+
+               $buf_ref = $args->{'buf_ref'} ;
+               $data_is_ref = 1 ;
+       }
+       elsif ( ref $_[0] eq 'SCALAR' ) {
+
+# the first value in @_ is the scalar ref to the data
+# note that the data was passed by ref
+
+               $buf_ref = shift ;
+               $data_is_ref = 1 ;
+       }
+       elsif ( ref $_[0] eq 'ARRAY' ) {
+
+# the first value in @_ is the array ref to the data so join it.
+
+               ${$buf_ref} = join '', @{$_[0]} ;
+       }
+       else {
+
+# good old @_ has all the data so join it.
+
+               ${$buf_ref} = join '', @_ ;
+       }
+
+# see if we were passed a open handle to spew to.
+
+       if ( ref $file_name ) {
+
+# we have a handle. make sure we don't call truncate on it.
+
+               $write_fh = $file_name ;
+               $no_truncate = 1 ;
+       }
+       else {
+
+# spew to regular file.
+
+               if ( $args->{'atomic'} ) {
+
+# in atomic mode, we spew to a temp file so make one and save the original
+# file name.
+                       $orig_file_name = $file_name ;
+                       $file_name .= ".$$" ;
+               }
+
+# set the mode for the sysopen
+
+               my $mode = O_WRONLY | O_CREAT ;
+               $mode |= O_BINARY if $args->{'binmode'} ;
+               $mode |= O_APPEND if $args->{'append'} ;
+               $mode |= O_EXCL if $args->{'no_clobber'} ;
+
+#printf "WR: BINARY %x MODE %x\n", O_BINARY, $mode ;
+
+# open the file and handle any error.
+
+               $write_fh = gensym ;
+               unless ( sysopen( $write_fh, $file_name, $mode ) ) {
+                       @_ = ( $args, "write_file '$file_name' - sysopen: $!");
+                       goto &_error ;
+               }
+       }
+
+       sysseek( $write_fh, 0, SEEK_END ) if $args->{'append'} ;
+
+
+#print 'WR before data ', unpack( 'H*', ${$buf_ref}), "\n" ;
+
+# fix up newline to write cr/lf if this is a windows text file
+
+       if ( $is_win32 && !$args->{'binmode'} ) {
+
+# copy the write data if it was passed by ref so we don't clobber the
+# caller's data
+               $buf_ref = \do{ my $copy = ${$buf_ref}; } if $data_is_ref ;
+               ${$buf_ref} =~ s/\n/\015\012/g ;
+       }
+
+#print 'after data ', unpack( 'H*', ${$buf_ref}), "\n" ;
+
+# get the size of how much we are writing and init the offset into that buffer
+
+       my $size_left = length( ${$buf_ref} ) ;
+       my $offset = 0 ;
+
+# loop until we have no more data left to write
+
+       do {
+
+# do the write and track how much we just wrote
+
+               my $write_cnt = syswrite( $write_fh, ${$buf_ref},
+                               $size_left, $offset ) ;
+
+               unless ( defined $write_cnt ) {
+
+# the write failed
+                       @_ = ( $args, "write_file '$file_name' - syswrite: $!");
+                       goto &_error ;
+               }
+
+# track much left to write and where to write from in the buffer
+
+               $size_left -= $write_cnt ;
+               $offset += $write_cnt ;
+
+       } while( $size_left > 0 ) ;
+
+# we truncate regular files in case we overwrite a long file with a shorter file
+# so seek to the current position to get it (same as tell()).
+
+       truncate( $write_fh,
+                 sysseek( $write_fh, 0, SEEK_CUR ) ) unless $no_truncate ;
+
+       close( $write_fh ) ;
+
+# handle the atomic mode - move the temp file to the original filename.
+
+       rename( $file_name, $orig_file_name ) if $args->{'atomic'} ;
+
+       return 1 ;
+}
+
+# this is for backwards compatibility with the previous File::Slurp module. 
+# write_file always overwrites an existing file
+
+*overwrite_file = \&write_file ;
+
+# the current write_file has an append mode so we use that. this
+# supports the same API with an optional second argument which is a
+# hash ref of options.
+
+sub append_file {
+
+# get the optional args hash ref
+       my $args = $_[1] ;
+       if ( ref $args eq 'HASH' ) {
+
+# we were passed an args ref so just mark the append mode
+
+               $args->{append} = 1 ;
+       }
+       else {
+
+# no args hash so insert one with the append mode
+
+               splice( @_, 1, 0, { append => 1 } ) ;
+       }
+
+# magic goto the main write_file sub. this overlays the sub without touching
+# the stack or @_
+
+       goto &write_file
+}
+
+# basic wrapper around opendir/readdir
+
+sub read_dir {
+
+       my ($dir, %args ) = @_;
+
+# this handle will be destroyed upon return
+
+       local(*DIRH);
+
+# open the dir and handle any errors
+
+       unless ( opendir( DIRH, $dir ) ) {
+
+               @_ = ( \%args, "read_dir '$dir' - opendir: $!" ) ;
+               goto &_error ;
+       }
+
+       my @dir_entries = readdir(DIRH) ;
+
+       @dir_entries = grep( $_ ne "." && $_ ne "..", @dir_entries )
+               unless $args{'keep_dot_dot'} ;
+
+       return @dir_entries if wantarray ;
+       return \@dir_entries ;
+}
+
+# error handling section
+#
+# all the error handling uses magic goto so the caller will get the
+# error message as if from their code and not this module. if we just
+# did a call on the error code, the carp/croak would report it from
+# this module since the error sub is one level down on the call stack
+# from read_file/write_file/read_dir.
+
+
+my %err_func = (
+       'carp'  => \&carp,
+       'croak' => \&croak,
+) ;
+
+sub _error {
+
+       my( $args, $err_msg ) = @_ ;
+
+# get the error function to use
+
+       my $func = $err_func{ $args->{'err_mode'} || 'croak' } ;
+
+# if we didn't find it in our error function hash, they must have set
+# it to quiet and we don't do anything.
+
+       return unless $func ;
+
+# call the carp/croak function
+
+       $func->($err_msg) ;
+
+# return a hard undef (in list context this will be a single value of
+# undef which is not a legal in-band value)
+
+       return undef ;
+}
+
+1;
+__END__
+
+=head1 NAME
+
+File::Slurp - Efficient Reading/Writing of Complete Files
+
+=head1 SYNOPSIS
+
+  use File::Slurp;
+
+  my $text = read_file( 'filename' ) ;
+  my @lines = read_file( 'filename' ) ;
+
+  write_file( 'filename', @lines ) ;
+
+  use File::Slurp qw( slurp ) ;
+
+  my $text = slurp( 'filename' ) ;
+
+
+=head1 DESCRIPTION
+
+This module provides subs that allow you to read or write entire files
+with one simple call. They are designed to be simple to use, have
+flexible ways to pass in or get the file contents and to be very
+efficient.  There is also a sub to read in all the files in a
+directory other than C<.> and C<..>
+
+These slurp/spew subs work for files, pipes and
+sockets, and stdio, pseudo-files, and DATA.
+
+=head2 B<read_file>
+
+This sub reads in an entire file and returns its contents to the
+caller. In list context it will return a list of lines (using the
+current value of $/ as the separator including support for paragraph
+mode when it is set to ''). In scalar context it returns the entire
+file as a single scalar.
+
+  my $text = read_file( 'filename' ) ;
+  my @lines = read_file( 'filename' ) ;
+
+The first argument to C<read_file> is the filename and the rest of the
+arguments are key/value pairs which are optional and which modify the
+behavior of the call. Other than binmode the options all control how
+the slurped file is returned to the caller.
+
+If the first argument is a file handle reference or I/O object (if ref
+is true), then that handle is slurped in. This mode is supported so
+you slurp handles such as C<DATA>, C<STDIN>. See the test handle.t
+for an example that does C<open( '-|' )> and child process spews data
+to the parant which slurps it in.  All of the options that control how
+the data is returned to the caller still work in this case.
+
+NOTE: as of version 9999.06, read_file works correctly on the C<DATA>
+handle. It used to need a sysseek workaround but that is now handled
+when needed by the module itself.
+
+You can optionally request that C<slurp()> is exported to your code. This
+is an alias for read_file and is meant to be forward compatible with
+Perl 6 (which will have slurp() built-in).
+
+The options are:
+
+=head3 binmode
+
+If you set the binmode option, then the file will be slurped in binary
+mode.
+
+       my $bin_data = read_file( $bin_file, binmode => ':raw' ) ;
+
+NOTE: this actually sets the O_BINARY mode flag for sysopen. It
+probably should call binmode and pass its argument to support other
+file modes.
+
+=head3 array_ref
+
+If this boolean option is set, the return value (only in scalar
+context) will be an array reference which contains the lines of the
+slurped file. The following two calls are equivalent:
+
+       my $lines_ref = read_file( $bin_file, array_ref => 1 ) ;
+       my $lines_ref = [ read_file( $bin_file ) ] ;
+
+=head3 scalar_ref
+
+If this boolean option is set, the return value (only in scalar
+context) will be an scalar reference to a string which is the contents
+of the slurped file. This will usually be faster than returning the
+plain scalar.
+
+       my $text_ref = read_file( $bin_file, scalar_ref => 1 ) ;
+
+=head3 buf_ref
+
+You can use this option to pass in a scalar reference and the slurped
+file contents will be stored in the scalar. This can be used in
+conjunction with any of the other options.
+
+       my $text_ref = read_file( $bin_file, buf_ref => \$buffer,
+                                            array_ref => 1 ) ;
+       my @lines = read_file( $bin_file, buf_ref => \$buffer ) ;
+
+=head3 blk_size
+
+You can use this option to set the block size used when slurping from an already open handle (like \*STDIN). It defaults to 1MB.
+
+       my $text_ref = read_file( $bin_file, blk_size => 10_000_000,
+                                            array_ref => 1 ) ;
+
+=head3 err_mode
+
+You can use this option to control how read_file behaves when an error
+occurs. This option defaults to 'croak'. You can set it to 'carp' or
+to 'quiet to have no error handling. This code wants to carp and then
+read abother file if it fails.
+
+       my $text_ref = read_file( $file, err_mode => 'carp' ) ;
+       unless ( $text_ref ) {
+
+               # read a different file but croak if not found
+               $text_ref = read_file( $another_file ) ;
+       }
+       
+       # process ${$text_ref}
+
+=head2 B<write_file>
+
+This sub writes out an entire file in one call.
+
+  write_file( 'filename', @data ) ;
+
+The first argument to C<write_file> is the filename. The next argument
+is an optional hash reference and it contains key/values that can
+modify the behavior of C<write_file>. The rest of the argument list is
+the data to be written to the file.
+
+  write_file( 'filename', {append => 1 }, @data ) ;
+  write_file( 'filename', {binmode => ':raw' }, $buffer ) ;
+
+As a shortcut if the first data argument is a scalar or array
+reference, it is used as the only data to be written to the file. Any
+following arguments in @_ are ignored. This is a faster way to pass in
+the output to be written to the file and is equivilent to the
+C<buf_ref> option. These following pairs are equivilent but the pass
+by reference call will be faster in most cases (especially with larger
+files).
+
+  write_file( 'filename', \$buffer ) ;
+  write_file( 'filename', $buffer ) ;
+
+  write_file( 'filename', \@lines ) ;
+  write_file( 'filename', @lines ) ;
+
+If the first argument is a file handle reference or I/O object (if ref
+is true), then that handle is slurped in. This mode is supported so
+you spew to handles such as \*STDOUT. See the test handle.t for an
+example that does C<open( '-|' )> and child process spews data to the
+parant which slurps it in.  All of the options that control how the
+data is passes into C<write_file> still work in this case.
+
+C<write_file> returns 1 upon successfully writing the file or undef if
+it encountered an error.
+
+The options are:
+
+=head3 binmode
+
+If you set the binmode option, then the file will be written in binary
+mode.
+
+       write_file( $bin_file, {binmode => ':raw'}, @data ) ;
+
+NOTE: this actually sets the O_BINARY mode flag for sysopen. It
+probably should call binmode and pass its argument to support other
+file modes.
+
+=head3 buf_ref
+
+You can use this option to pass in a scalar reference which has the
+data to be written. If this is set then any data arguments (including
+the scalar reference shortcut) in @_ will be ignored. These are
+equivilent:
+
+       write_file( $bin_file, { buf_ref => \$buffer } ) ;
+       write_file( $bin_file, \$buffer ) ;
+       write_file( $bin_file, $buffer ) ;
+
+=head3 atomic
+
+If you set this boolean option, the file will be written to in an
+atomic fashion. A temporary file name is created by appending the pid
+($$) to the file name argument and that file is spewed to. After the
+file is closed it is renamed to the original file name (and rename is
+an atomic operation on most OS's). If the program using this were to
+crash in the middle of this, then the file with the pid suffix could
+be left behind.
+
+=head3 append
+
+If you set this boolean option, the data will be written at the end of
+the current file.
+
+       write_file( $file, {append => 1}, @data ) ;
+
+C<write_file> croaks if it cannot open the file. It returns true if it
+succeeded in writing out the file and undef if there was an
+error. (Yes, I know if it croaks it can't return anything but that is
+for when I add the options to select the error handling mode).
+
+=head3 no_clobber
+
+If you set this boolean option, an existing file will not be overwritten.
+
+       write_file( $file, {no_clobber => 1}, @data ) ;
+
+=head3 err_mode
+
+You can use this option to control how C<write_file> behaves when an
+error occurs. This option defaults to 'croak'. You can set it to
+'carp' or to 'quiet' to have no error handling other than the return
+value. If the first call to C<write_file> fails it will carp and then
+write to another file. If the second call to C<write_file> fails, it
+will croak.
+
+       unless ( write_file( $file, { err_mode => 'carp', \$data ) ;
+
+               # write a different file but croak if not found
+               write_file( $other_file, \$data ) ;
+       }
+
+=head2 overwrite_file
+
+This sub is just a typeglob alias to write_file since write_file
+always overwrites an existing file. This sub is supported for
+backwards compatibility with the original version of this module. See
+write_file for its API and behavior.
+
+=head2 append_file
+
+This sub will write its data to the end of the file. It is a wrapper
+around write_file and it has the same API so see that for the full
+documentation. These calls are equivilent:
+
+       append_file( $file, @data ) ;
+       write_file( $file, {append => 1}, @data ) ;
+
+=head2 read_dir
+
+This sub reads all the file names from directory and returns them to
+the caller but C<.> and C<..> are removed by default.
+
+       my @files = read_dir( '/path/to/dir' ) ;
+
+It croaks if it cannot open the directory.
+
+In a list context C<read_dir> returns a list of the entries in the
+directory. In a scalar context it returns an array reference which has
+the entries.
+
+=head3 keep_dot_dot
+
+If this boolean option is set, C<.> and C<..> are not removed from the
+list of files.
+
+       my @all_files = read_dir( '/path/to/dir', keep_dot_dot => 1 ) ;
+
+=head2 EXPORT
+
+  read_file write_file overwrite_file append_file read_dir
+
+=head2 SEE ALSO
+
+An article on file slurping in extras/slurp_article.pod. There is
+also a benchmarking script in extras/slurp_bench.pl.
+
+=head2 BUGS
+
+If run under Perl 5.004, slurping from the DATA handle will fail as
+that requires B.pm which didn't get into core until 5.005.
+
+=head1 AUTHOR
+
+Uri Guttman, E<lt>uri@stemsystems.comE<gt>
+
+=cut
diff --git a/pm_to_blib b/pm_to_blib

new file mode 100644 (file)

index 0000000..e69de29
diff --git a/slurp_talk/Slurp.pm b/slurp_talk/Slurp.pm

new file mode 100755 (executable)

index 0000000..00a56f9
--- /dev/null
+++ b/slurp_talk/Slurp.pm
@@ -0,0 +1,460 @@
+package File::Slurp;
+
+use strict;
+
+use Carp ;
+use Fcntl qw( :DEFAULT :seek ) ;
+use Symbol ;
+
+use base 'Exporter' ;
+use vars qw( %EXPORT_TAGS @EXPORT_OK $VERSION  @EXPORT) ;
+
+%EXPORT_TAGS = ( 'all' => [
+       qw( read_file write_file overwrite_file append_file read_dir ) ] ) ;
+
+#@EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );
+@EXPORT = ( @{ $EXPORT_TAGS{'all'} } );
+
+$VERSION = '9999.01';
+
+
+sub read_file {
+
+       my( $file_name, %args ) = @_ ;
+
+       my $buf ;
+       my $buf_ref = $args{'buf_ref'} || \$buf ;
+
+       ${$buf_ref} = '' ;
+
+       my( $read_fh, $size_left, $blk_size ) ;
+
+       if ( defined( fileno( $file_name ) ) ) {
+
+               $read_fh = $file_name ;
+               $blk_size = $args{'blk_size'} || 1024 * 1024 ;
+               $size_left = $blk_size ;
+       }
+       else {
+
+               my $mode = O_RDONLY ;
+               $mode |= O_BINARY if $args{'binmode'} ;
+
+
+               $read_fh = gensym ;
+               unless ( sysopen( $read_fh, $file_name, $mode ) ) {
+                       @_ = ( \%args, "read_file '$file_name' - sysopen: $!");
+                       goto &error ;
+               }
+
+               $size_left = -s $read_fh ;
+       }
+
+       while( 1 ) {
+
+               my $read_cnt = sysread( $read_fh, ${$buf_ref},
+                               $size_left, length ${$buf_ref} ) ;
+
+               if ( defined $read_cnt ) {
+
+                       last if $read_cnt == 0 ;
+                       next if $blk_size ;
+
+                       $size_left -= $read_cnt ;
+                       last if $size_left <= 0 ;
+                       next ;
+               }
+
+# handle the read error
+
+               @_ = ( \%args, "read_file '$file_name' - sysread: $!");
+               goto &error ;
+       }
+
+# handle array ref
+
+       return [ split( m|(?<=$/)|, ${$buf_ref} ) ] if $args{'array_ref'}  ;
+
+# handle list context
+
+       return split( m|(?<=$/)|, ${$buf_ref} ) if wantarray ;
+
+# handle scalar ref
+
+       return $buf_ref if $args{'scalar_ref'} ;
+
+# handle scalar context
+
+       return ${$buf_ref} if defined wantarray ;
+
+# handle void context (return scalar by buffer reference)
+
+       return ;
+}
+
+sub write_file {
+
+       my $file_name = shift ;
+
+       my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;
+
+       my( $buf_ref, $write_fh, $no_truncate ) ;
+
+# get the buffer ref - either passed by name or first data arg or autovivified
+# ${$buf_ref} will have the data after this
+
+       if ( ref $args->{'buf_ref'} eq 'SCALAR' ) {
+
+               $buf_ref = $args->{'buf_ref'} ;
+       }
+       elsif ( ref $_[0] eq 'SCALAR' ) {
+
+               $buf_ref = shift ;
+       }
+       elsif ( ref $_[0] eq 'ARRAY' ) {
+
+               ${$buf_ref} = join '', @{$_[0]} ;
+       }
+       else {
+
+               ${$buf_ref} = join '', @_ ;
+       }
+
+       if ( defined( fileno( $file_name ) ) ) {
+
+               $write_fh = $file_name ;
+               $no_truncate = 1 ;
+       }
+       else {
+
+               my $mode = O_WRONLY | O_CREAT ;
+               $mode |= O_BINARY if $args->{'binmode'} ;
+               $mode |= O_APPEND if $args->{'append'} ;
+
+               $write_fh = gensym ;
+               unless ( sysopen( $write_fh, $file_name, $mode ) ) {
+                       @_ = ( $args, "write_file '$file_name' - sysopen: $!");
+                       goto &error ;
+               }
+
+       }
+
+       my $size_left = length( ${$buf_ref} ) ;
+       my $offset = 0 ;
+
+       do {
+               my $write_cnt = syswrite( $write_fh, ${$buf_ref},
+                               $size_left, $offset ) ;
+
+               unless ( defined $write_cnt ) {
+
+                       @_ = ( $args, "write_file '$file_name' - syswrite: $!");
+                       goto &error ;
+               }
+
+               $size_left -= $write_cnt ;
+               $offset += $write_cnt ;
+
+       } while( $size_left > 0 ) ;
+
+       truncate( $write_fh,
+                 sysseek( $write_fh, 0, SEEK_CUR ) ) unless $no_truncate ;
+
+       close( $write_fh ) ;
+
+       return 1 ;
+}
+
+# this is for backwards compatibility with the previous File::Slurp module. 
+# write_file always overwrites an existing file
+
+*overwrite_file = \&write_file ;
+
+# the current write_file has an append mode so we use that. this
+# supports the same API with an optional second argument which is a
+# hash ref of options.
+
+sub append_file {
+
+       my $args = $_[1] ;
+       if ( ref $args eq 'HASH' ) {
+               $args->{append} = 1 ;
+       }
+       else {
+
+               splice( @_, 1, 0, { append => 1 } ) ;
+       }
+       
+       goto &write_file
+}
+
+sub read_dir {
+       my ($dir, %args ) = @_;
+
+       local(*DIRH);
+
+       if ( opendir( DIRH, $dir ) ) {
+               return grep( $_ ne "." && $_ ne "..", readdir(DIRH));
+       }
+
+       @_ = ( \%args, "read_dir '$dir' - opendir: $!" ) ; goto &error ;
+
+       return undef ;
+}
+
+my %err_func = (
+       carp => \&carp,
+       croak => \&croak,
+) ;
+
+sub error {
+
+       my( $args, $err_msg ) = @_ ;
+
+#print $err_msg ;
+
+       my $func = $err_func{ $args->{'err_mode'} || 'croak' } ;
+
+       return unless $func ;
+
+       $func->($err_msg) ;
+
+       return undef ;
+}
+
+1;
+__END__
+
+=head1 NAME
+
+File::Slurp - Efficient Reading/Writing of Complete Files
+
+=head1 SYNOPSIS
+
+  use File::Slurp;
+
+  my $text = read_file( 'filename' ) ;
+  my @lines = read_file( 'filename' ) ;
+
+  write_file( 'filename', @lines ) ;
+
+=head1 DESCRIPTION
+
+This module provides subs that allow you to read or write entire files
+with one simple call. They are designed to be simple to use, have
+flexible ways to pass in or get the file contents and to be very
+efficient.  There is also a sub to read in all the files in a
+directory other than C<.> and C<..>
+
+Note that these slurp/spew subs work only for files and not for pipes
+or stdio. If you want to slurp the latter, use the standard techniques
+such as setting $/ to undef, reading <> in a list context, or printing
+all you want to STDOUT.
+
+=head2 B<read_file>
+
+This sub reads in an entire file and returns its contents to the
+caller. In list context it will return a list of lines (using the
+current value of $/ as the separator. In scalar context it returns the
+entire file as a single scalar.
+
+  my $text = read_file( 'filename' ) ;
+  my @lines = read_file( 'filename' ) ;
+
+The first argument to C<read_file> is the filename and the rest of the
+arguments are key/value pairs which are optional and which modify the
+behavior of the call. Other than binmode the options all control how
+the slurped file is returned to the caller.
+
+If the first argument is a file handle reference or I/O object (if
+fileno returns a defined value), then that handle is slurped in. This
+mode is supported so you slurp handles such as <DATA>, \*STDIN. See
+the test handle.t for an example that does C<open( '-|' )> and child
+process spews data to the parant which slurps it in.  All of the
+options that control how the data is returned to the caller still work
+in this case.
+
+The options are:
+
+=head3 binmode
+
+If you set the binmode option, then the file will be slurped in binary
+mode.
+
+       my $bin_data = read_file( $bin_file, binmode => ':raw' ) ;
+
+NOTE: this actually sets the O_BINARY mode flag for sysopen. It
+probably should call binmode and pass its argument to support other
+file modes.
+
+=head3 array_ref
+
+If this boolean option is set, the return value (only in scalar
+context) will be an array reference which contains the lines of the
+slurped file. The following two calls are equivilent:
+
+       my $lines_ref = read_file( $bin_file, array_ref => 1 ) ;
+       my $lines_ref = [ read_file( $bin_file ) ] ;
+
+=head3 scalar_ref
+
+If this boolean option is set, the return value (only in scalar
+context) will be an scalar reference to a string which is the contents
+of the slurped file. This will usually be faster than returning the
+plain scalar.
+
+       my $text_ref = read_file( $bin_file, scalar_ref => 1 ) ;
+
+=head3 buf_ref
+
+You can use this option to pass in a scalar reference and the slurped
+file contents will be stored in the scalar. This can be used in
+conjunction with any of the other options.
+
+       my $text_ref = read_file( $bin_file, buf_ref => \$buffer,
+                                            array_ref => 1 ) ;
+       my @lines = read_file( $bin_file, buf_ref => \$buffer ) ;
+
+=head3 blk_size
+
+You can use this option to set the block size used when slurping from an already open handle (like \*STDIN). It defaults to 1MB.
+
+       my $text_ref = read_file( $bin_file, blk_size => 10_000_000,
+                                            array_ref => 1 ) ;
+
+=head3 err_mode
+
+You can use this option to control how read_file behaves when an error
+occurs. This option defaults to 'croak'. You can set it to 'carp' or
+to 'quiet to have no error handling. This code wants to carp and then
+read abother file if it fails.
+
+       my $text_ref = read_file( $file, err_mode => 'carp' ) ;
+       unless ( $text_ref ) {
+
+               # read a different file but croak if not found
+               $text_ref = read_file( $another_file ) ;
+       }
+       
+       # process ${$text_ref}
+
+=head2 B<write_file>
+
+This sub writes out an entire file in one call.
+
+  write_file( 'filename', @data ) ;
+
+The first argument to C<write_file> is the filename. The next argument
+is an optional hash reference and it contains key/values that can
+modify the behavior of C<write_file>. The rest of the argument list is
+the data to be written to the file.
+
+  write_file( 'filename', {append => 1 }, @data ) ;
+  write_file( 'filename', {binmode => ':raw' }, $buffer ) ;
+
+As a shortcut if the first data argument is a scalar or array
+reference, it is used as the only data to be written to the file. Any
+following arguments in @_ are ignored. This is a faster way to pass in
+the output to be written to the file and is equivilent to the
+C<buf_ref> option. These following pairs are equivilent but the pass
+by reference call will be faster in most cases (especially with larger
+files).
+
+  write_file( 'filename', \$buffer ) ;
+  write_file( 'filename', $buffer ) ;
+
+  write_file( 'filename', \@lines ) ;
+  write_file( 'filename', @lines ) ;
+
+If the first argument is a file handle reference or I/O object (if
+fileno returns a defined value), then that handle is slurped in. This
+mode is supported so you spew to handles such as \*STDOUT. See the
+test handle.t for an example that does C<open( '-|' )> and child
+process spews data to the parant which slurps it in.  All of the
+options that control how the data is passes into C<write_file> still
+work in this case.
+
+The options are:
+
+=head3 binmode
+
+If you set the binmode option, then the file will be written in binary
+mode.
+
+       write_file( $bin_file, {binmode => ':raw'}, @data ) ;
+
+NOTE: this actually sets the O_BINARY mode flag for sysopen. It
+probably should call binmode and pass its argument to support other
+file modes.
+
+=head3 buf_ref
+
+You can use this option to pass in a scalar reference which has the
+data to be written. If this is set then any data arguments (including
+the scalar reference shortcut) in @_ will be ignored. These are
+equivilent:
+
+       write_file( $bin_file, { buf_ref => \$buffer } ) ;
+       write_file( $bin_file, \$buffer ) ;
+       write_file( $bin_file, $buffer ) ;
+
+=head3 append
+
+If you set this boolean option, the data will be written at the end of
+the current file.
+
+       write_file( $file, {append => 1}, @data ) ;
+
+C<write_file> croaks if it cannot open the file. It returns true if it
+succeeded in writing out the file and undef if there was an
+error. (Yes, I know if it croaks it can't return anything but that is
+for when I add the options to select the error handling mode).
+
+=head3 err_mode
+
+You can use this option to control how C<write_file> behaves when an
+error occurs. This option defaults to 'croak'. You can set it to
+'carp' or to 'quiet to have no error handling. If the first call to
+C<write_file> fails it will carp and then write to another file. If the
+second call to C<write_file> fails, it will croak.
+
+       unless ( write_file( $file, { err_mode => 'carp', \$data ) ;
+
+               # write a different file but croak if not found
+               write_file( $other_file, \$data ) ;
+       }
+
+=head2 overwrite_file
+
+This sub is just a typeglob alias to write_file since write_file
+always overwrites an existing file. This sub is supported for
+backwards compatibility with the original version of this module. See
+write_file for its API and behavior.
+
+=head2 append_file
+
+This sub will write its data to the end of the file. It is a wrapper
+around write_file and it has the same API so see that for the full
+documentation. These calls are equivilent:
+
+       append_file( $file, @data ) ;
+       write_file( $file, {append => 1}, @data ) ;
+
+=head2 read_dir
+
+This sub reads all the file names from directory and returns them to
+the caller but C<.> and C<..> are removed.
+
+       my @files = read_dir( '/path/to/dir' ) ;
+
+It croaks if it cannot open the directory.
+
+=head2 EXPORT
+
+  read_file write_file overwrite_file append_file read_dir
+
+=head1 AUTHOR
+
+Uri Guttman, E<lt>uri@stemsystems.comE<gt>
+
+=cut
diff --git a/slurp_talk/Slurp.pm.html b/slurp_talk/Slurp.pm.html

new file mode 100644 (file)

index 0000000..2aefb11
--- /dev/null
+++ b/slurp_talk/Slurp.pm.html
@@ -0,0 +1,277 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html>
+ <head>
+  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
+  <link rel="stylesheet" href="/s/style.css" type="text/css">
+  <title>search.cpan.org: File::Slurp - Efficient Reading/Writing of Complete Files</title>
+ </head>
+ <body  id=cpansearch class=std>
+
+<table width="100%"><tr><td rowspan=2 width="5%">
+<a href="/"><img src="/s/img/cpan_banner.png" alt="The CPAN Search Site"></a>
+</td>
+<td>
+<div class=menubar>&nbsp;
+ <a class=m href="/">Home</a> &nbsp;&nbsp;
+ <a class=m href="/author/">Authors</a> &nbsp;&nbsp;
+ <a class=m href="/recent">Recent</a> &nbsp;&nbsp;
+ <a class=m href="/site.html">About</a> &nbsp;&nbsp;
+ <a class=m href="/mirror">Mirrors</a> &nbsp;&nbsp;
+ <a class=m href="/faq.html">FAQ</a> &nbsp;&nbsp;
+ <a class=m href="/feedback">Feedback</a>
+&nbsp;</div>
+
+</td></tr><tr><td>
+ 
+<form method=get action="/search" name=f><table><tr><td>
+<input type="text" name="query" value="" size=35 ></td></tr><tr><td>in <select name="mode">
+ <option value="all">All</option>
+ <option value="module" >Modules</option>
+ <option value="dist" >Distributions</option>
+ <option value="author" >Authors</option>
+</select>&nbsp;<input type="submit" value="CPAN Search">
+</td></tr></table>
+</form>
+
+</td></tr></table>
+
+  
+
+ <a name="_top"></a>
+ 
+ <div class=path>
+  <a href="/~uri/">Uri Guttman</a> >
+  <a href="/~uri/File-Slurp-9999.01/">File-Slurp-9999.01</a> >
+  File::Slurp
+ </div>
+
+  Module Version:  9999.01 &nbsp;
+  <a href="/src/URI/File-Slurp-9999.01/lib/File/Slurp.pm">Source</a> &nbsp;
+ <p /><div class=pod>
+<a name="_top"></a>
+<div class=pod>
+<div class=toc><ul><li><a href="#NAME">NAME</a>
+<li><a href="#SYNOPSIS">SYNOPSIS</a>
+<li><a href="#DESCRIPTION">DESCRIPTION</a>
+<ul><li><a href="#read_file">read_file</a>
+<ul><li><a href="#binmode">binmode</a>
+<li><a href="#array_ref">array_ref</a>
+<li><a href="#scalar_ref">scalar_ref</a>
+<li><a href="#buf_ref">buf_ref</a>
+<li><a href="#blk_size">blk_size</a>
+<li><a href="#err_mode">err_mode</a>
+</ul>
+<li><a href="#write_file">write_file</a>
+<ul><li><a href="#binmode">binmode</a>
+<li><a href="#buf_ref">buf_ref</a>
+<li><a href="#append">append</a>
+<li><a href="#err_mode">err_mode</a>
+</ul>
+<li><a href="#overwrite_file">overwrite_file</a>
+<li><a href="#append_file">append_file</a>
+<li><a href="#read_dir">read_dir</a>
+<li><a href="#EXPORT">EXPORT</a>
+</ul>
+<li><a href="#AUTHOR">AUTHOR</a>
+</ul>
+</div>
+<!-- generated by TUCS::Pod2HTML v, using Pod::Simple::PullParser v2.02, under Perl v5.006001 at Tue Dec  9 05:21:57 2003 GMT -->
+<!-- start doc -->
+
+<h1><a name="NAME"></a
+>NAME <a href='#_top'><img alt='^' src='/s/img/up.gif'></a></h1>
+
+<p>File::Slurp &#45; Efficient Reading/Writing of Complete Files</p>
+
+<h1><a name="SYNOPSIS"></a
+>SYNOPSIS <a href='#_top'><img alt='^' src='/s/img/up.gif'></a></h1>
+
+<pre>  use File::Slurp;
+
+  my $text = read_file( &#39;filename&#39; ) ;
+  my @lines = read_file( &#39;filename&#39; ) ;
+
+  write_file( &#39;filename&#39;, @lines ) ;</pre>
+
+<h1><a name="DESCRIPTION"></a
+>DESCRIPTION <a href='#_top'><img alt='^' src='/s/img/up.gif'></a></h1>
+
+<p>This module provides subs that allow you to read or write entire files with one simple call. They are designed to be simple to use, have flexible ways to pass in or get the file contents and to be very efficient. There is also a sub to read in all the files in a directory other than <code>.</code> and <code>..</code></p>
+
+<p>Note that these slurp/spew subs work only for files and not for pipes or stdio. If you want to slurp the latter, use the standard techniques such as setting $/ to undef, reading &#60;&#62; in a list context, or printing all you want to STDOUT.</p>
+
+<h2><a name="read_file"></a
+><b>read_file</b></h2>
+
+<p>This sub reads in an entire file and returns its contents to the caller. In list context it will return a list of lines (using the current value of $/ as the separator. In scalar context it returns the entire file as a single scalar.</p>
+
+<pre>  my $text = read_file( &#39;filename&#39; ) ;
+  my @lines = read_file( &#39;filename&#39; ) ;</pre>
+
+<p>The first argument to <code>read_file</code> is the filename and the rest of the arguments are key/value pairs which are optional and which modify the behavior of the call. Other than binmode the options all control how the slurped file is returned to the caller.</p>
+
+<p>If the first argument is a file handle reference or I/O object (if fileno returns a defined value), then that handle is slurped in. This mode is supported so you slurp handles such as &#60;DATA&#62;, \*STDIN. See the test handle.t for an example that does <code>open( &#39;&#45;|&#39; )</code> and child process spews data to the parant which slurps it in. All of the options that control how the data is returned to the caller still work in this case.</p>
+
+<p>The options are:</p>
+
+<h3><a name="binmode"></a
+>binmode</h3>
+
+<p>If you set the binmode option, then the file will be slurped in binary mode.</p>
+
+<pre>        my $bin_data = read_file( $bin_file, binmode =&#62; &#39;:raw&#39; ) ;</pre>
+
+<p>NOTE: this actually sets the O_BINARY mode flag for sysopen. It probably should call binmode and pass its argument to support other file modes.</p>
+
+<h3><a name="array_ref"></a
+>array_ref</h3>
+
+<p>If this boolean option is set, the return value (only in scalar context) will be an array reference which contains the lines of the slurped file. The following two calls are equivilent:</p>
+
+<pre>        my $lines_ref = read_file( $bin_file, array_ref =&#62; 1 ) ;
+        my $lines_ref = [ read_file( $bin_file ) ] ;</pre>
+
+<h3><a name="scalar_ref"></a
+>scalar_ref</h3>
+
+<p>If this boolean option is set, the return value (only in scalar context) will be an scalar reference to a string which is the contents of the slurped file. This will usually be faster than returning the plain scalar.</p>
+
+<pre>        my $text_ref = read_file( $bin_file, scalar_ref =&#62; 1 ) ;</pre>
+
+<h3><a name="buf_ref"></a
+>buf_ref</h3>
+
+<p>You can use this option to pass in a scalar reference and the slurped file contents will be stored in the scalar. This can be used in conjunction with any of the other options.</p>
+
+<pre>        my $text_ref = read_file( $bin_file, buf_ref =&#62; \$buffer,
+                                             array_ref =&#62; 1 ) ;
+        my @lines = read_file( $bin_file, buf_ref =&#62; \$buffer ) ;</pre>
+
+<h3><a name="blk_size"></a
+>blk_size</h3>
+
+<p>You can use this option to set the block size used when slurping from an already open handle (like \*STDIN). It defaults to 1MB.</p>
+
+<pre>        my $text_ref = read_file( $bin_file, blk_size =&#62; 10_000_000,
+                                             array_ref =&#62; 1 ) ;</pre>
+
+<h3><a name="err_mode"></a
+>err_mode</h3>
+
+<p>You can use this option to control how read_file behaves when an error occurs. This option defaults to &#39;croak&#39;. You can set it to &#39;carp&#39; or to &#39;quiet to have no error handling. This code wants to carp and then read abother file if it fails.</p>
+
+<pre>        my $text_ref = read_file( $file, err_mode =&#62; &#39;carp&#39; ) ;
+        unless ( $text_ref ) {
+
+                # read a different file but croak if not found
+                $text_ref = read_file( $another_file ) ;
+        }
+        
+        # process ${$text_ref}</pre>
+
+<h2><a name="write_file"></a
+><b>write_file</b></h2>
+
+<p>This sub writes out an entire file in one call.</p>
+
+<pre>  write_file( &#39;filename&#39;, @data ) ;</pre>
+
+<p>The first argument to <code>write_file</code> is the filename. The next argument is an optional hash reference and it contains key/values that can modify the behavior of <code>write_file</code>. The rest of the argument list is the data to be written to the file.</p>
+
+<pre>  write_file( &#39;filename&#39;, {append =&#62; 1 }, @data ) ;
+  write_file( &#39;filename&#39;, {binmode =&#62; &#39;:raw&#39; }, $buffer ) ;</pre>
+
+<p>As a shortcut if the first data argument is a scalar or array reference, it is used as the only data to be written to the file. Any following arguments in @_ are ignored. This is a faster way to pass in the output to be written to the file and is equivilent to the <code>buf_ref</code> option. These following pairs are equivilent but the pass by reference call will be faster in most cases (especially with larger files).</p>
+
+<pre>  write_file( &#39;filename&#39;, \$buffer ) ;
+  write_file( &#39;filename&#39;, $buffer ) ;
+
+  write_file( &#39;filename&#39;, \@lines ) ;
+  write_file( &#39;filename&#39;, @lines ) ;</pre>
+
+<p>If the first argument is a file handle reference or I/O object (if fileno returns a defined value), then that handle is slurped in. This mode is supported so you spew to handles such as \*STDOUT. See the test handle.t for an example that does <code>open( &#39;&#45;|&#39; )</code> and child process spews data to the parant which slurps it in. All of the options that control how the data is passes into <code>write_file</code> still work in this case.</p>
+
+<p>The options are:</p>
+
+<h3><a name="binmode"></a
+>binmode</h3>
+
+<p>If you set the binmode option, then the file will be written in binary mode.</p>
+
+<pre>        write_file( $bin_file, {binmode =&#62; &#39;:raw&#39;}, @data ) ;</pre>
+
+<p>NOTE: this actually sets the O_BINARY mode flag for sysopen. It probably should call binmode and pass its argument to support other file modes.</p>
+
+<h3><a name="buf_ref"></a
+>buf_ref</h3>
+
+<p>You can use this option to pass in a scalar reference which has the data to be written. If this is set then any data arguments (including the scalar reference shortcut) in @_ will be ignored. These are equivilent:</p>
+
+<pre>        write_file( $bin_file, { buf_ref =&#62; \$buffer } ) ;
+        write_file( $bin_file, \$buffer ) ;
+        write_file( $bin_file, $buffer ) ;</pre>
+
+<h3><a name="append"></a
+>append</h3>
+
+<p>If you set this boolean option, the data will be written at the end of the current file.</p>
+
+<pre>        write_file( $file, {append =&#62; 1}, @data ) ;</pre>
+
+<p><code>write_file</code> croaks if it cannot open the file. It returns true if it succeeded in writing out the file and undef if there was an error. (Yes, I know if it croaks it can&#39;t return anything but that is for when I add the options to select the error handling mode).</p>
+
+<h3><a name="err_mode"></a
+>err_mode</h3>
+
+<p>You can use this option to control how <code>write_file</code> behaves when an error occurs. This option defaults to &#39;croak&#39;. You can set it to &#39;carp&#39; or to &#39;quiet to have no error handling. If the first call to <code>write_file</code> fails it will carp and then write to another file. If the second call to <code>write_file</code> fails, it will croak.</p>
+
+<pre>        unless ( write_file( $file, { err_mode =&#62; &#39;carp&#39;, \$data ) ;
+
+                # write a different file but croak if not found
+                write_file( $other_file, \$data ) ;
+        }</pre>
+
+<h2><a name="overwrite_file"></a
+>overwrite_file</h2>
+
+<p>This sub is just a typeglob alias to write_file since write_file always overwrites an existing file. This sub is supported for backwards compatibility with the original version of this module. See write_file for its API and behavior.</p>
+
+<h2><a name="append_file"></a
+>append_file</h2>
+
+<p>This sub will write its data to the end of the file. It is a wrapper around write_file and it has the same API so see that for the full documentation. These calls are equivilent:</p>
+
+<pre>        append_file( $file, @data ) ;
+        write_file( $file, {append =&#62; 1}, @data ) ;</pre>
+
+<h2><a name="read_dir"></a
+>read_dir</h2>
+
+<p>This sub reads all the file names from directory and returns them to the caller but <code>.</code> and <code>..</code> are removed.</p>
+
+<pre>        my @files = read_dir( &#39;/path/to/dir&#39; ) ;</pre>
+
+<p>It croaks if it cannot open the directory.</p>
+
+<h2><a name="EXPORT"></a
+>EXPORT</h2>
+
+<pre>  read_file write_file overwrite_file append_file read_dir</pre>
+
+<h1><a name="AUTHOR"></a
+>AUTHOR <a href='#_top'><img alt='^' src='/s/img/up.gif'></a></h1>
+
+<p>Uri Guttman, &#60;uri@stemsystems.com&#62;</p>
+<!-- end doc -->
+
+</div>
+
+
+</div>
+
+<!-- Tue Dec  9 05:21:57 2003 GMT (0.239951014518738) -->
+ </body>
+</html>
+
+
+
diff --git a/slurp_talk/slurp_article.html b/slurp_talk/slurp_article.html

new file mode 100644 (file)

index 0000000..7adcaf8
--- /dev/null
+++ b/slurp_talk/slurp_article.html
@@ -0,0 +1,723 @@
+<HTML>
+<HEAD>
+<TITLE>Perl Slurp Ease</TITLE>
+<LINK REV="made" HREF="mailto:steve@dewitt.vnet.net">
+</HEAD>
+
+<BODY>
+
+<A NAME="__index__"></A>
+<!-- INDEX BEGIN -->
+
+<UL>
+
+       <LI><A HREF="#perl slurp ease">Perl Slurp Ease</A></LI>
+       <UL>
+
+               <LI><A HREF="#introduction">Introduction</A></LI>
+               <LI><A HREF="#global operations">Global Operations</A></LI>
+               <LI><A HREF="#traditional slurping">Traditional Slurping</A></LI>
+               <LI><A HREF="#write slurping">Write Slurping</A></LI>
+               <LI><A HREF="#slurp on the cpan">Slurp on the CPAN</A></LI>
+               <LI><A HREF="#slurping api design">Slurping API Design</A></LI>
+               <LI><A HREF="#fast slurping">Fast Slurping</A></LI>
+               <UL>
+
+                       <LI><A HREF="#scalar slurp of short file">Scalar Slurp of Short File</A></LI>
+                       <LI><A HREF="#scalar slurp of long file">Scalar Slurp of Long File</A></LI>
+                       <LI><A HREF="#list slurp of short file">List Slurp of Short File</A></LI>
+                       <LI><A HREF="#list slurp of long file">List Slurp of Long File</A></LI>
+                       <LI><A HREF="#scalar spew of short file">Scalar Spew of Short File</A></LI>
+                       <LI><A HREF="#scalar spew of long file">Scalar Spew of Long File</A></LI>
+                       <LI><A HREF="#list spew of short file">List Spew of Short File</A></LI>
+                       <LI><A HREF="#list spew of long file">List Spew of Long File</A></LI>
+                       <LI><A HREF="#benchmark conclusion">Benchmark Conclusion</A></LI>
+               </UL>
+
+               <LI><A HREF="#error handling">Error Handling</A></LI>
+               <LI><A HREF="#file::fastslurp">File::FastSlurp</A></LI>
+               <LI><A HREF="#slurping in perl 6">Slurping in Perl 6</A></LI>
+               <LI><A HREF="#in summary">In Summary</A></LI>
+               <LI><A HREF="#acknowledgements">Acknowledgements</A></LI>
+       </UL>
+
+</UL>
+<!-- INDEX END -->
+
+<HR>
+<P>
+<H1><A NAME="perl slurp ease">Perl Slurp Ease</A></H1>
+<P>
+<H2><A NAME="introduction">Introduction</A></H2>
+<P>One of the common Perl idioms is processing text files line by line:</P>
+<PRE>
+        while( &lt;FH&gt; ) {
+                do something with $_
+        }</PRE>
+<P>This idiom has several variants, but the key point is that it reads in
+only one line from the file in each loop iteration. This has several
+advantages, including limiting memory use to one line, the ability to
+handle any size file (including data piped in via STDIN), and it is
+easily taught and understood to Perl newbies. In fact newbies are the
+ones who do silly things like this:</P>
+<PRE>
+        while( &lt;FH&gt; ) {
+                push @lines, $_ ;
+        }</PRE>
+<PRE>
+        foreach ( @lines ) {
+                do something with $_
+        }</PRE>
+<P>Line by line processing is fine, but it isn't the only way to deal with
+reading files. The other common style is reading the entire file into a
+scalar or array, and that is commonly known as slurping. Now, slurping has
+somewhat of a poor reputation, and this article is an attempt at
+rehabilitating it. Slurping files has advantages and limitations, and is
+not something you should just do when line by line processing is fine.
+It is best when you need the entire file in memory for processing all at
+once. Slurping with in memory processing can be faster and lead to
+simpler code than line by line if done properly.</P>
+<P>The biggest issue to watch for with slurping is file size. Slurping very
+large files or unknown amounts of data from STDIN can be disastrous to
+your memory usage and cause swap disk thrashing.  You can slurp STDIN if
+you know that you can handle the maximum size input without
+detrimentally affecting your memory usage. So I advocate slurping only
+disk files and only when you know their size is reasonable and you have
+a real reason to process the file as a whole.  Note that reasonable size
+these days is larger than the bad old days of limited RAM. Slurping in a
+megabyte is not an issue on most systems. But most of the
+files I tend to slurp in are much smaller than that. Typical files that
+work well with slurping are configuration files, (mini-)language scripts,
+some data (especially binary) files, and other files of known sizes
+which need fast processing.</P>
+<P>Another major win for slurping over line by line is speed. Perl's IO
+system (like many others) is slow. Calling <CODE>&lt;&gt;</CODE> for each line
+requires a check for the end of line, checks for EOF, copying a line,
+munging the internal handle structure, etc. Plenty of work for each line
+read in. Whereas slurping, if done correctly, will usually involve only
+one I/O call and no extra data copying. The same is true for writing
+files to disk, and we will cover that as well (even though the term
+slurping is traditionally a read operation, I use the term ``slurp'' for
+the concept of doing I/O with an entire file in one operation).</P>
+<P>Finally, when you have slurped the entire file into memory, you can do
+operations on the data that are not possible or easily done with line by
+line processing. These include global search/replace (without regard for
+newlines), grabbing all matches with one call of <CODE>//g</CODE>, complex parsing
+(which in many cases must ignore newlines), processing *ML (where line
+endings are just white space) and performing complex transformations
+such as template expansion.</P>
+<P>
+<H2><A NAME="global operations">Global Operations</A></H2>
+<P>Here are some simple global operations that can be done quickly and
+easily on an entire file that has been slurped in. They could also be
+done with line by line processing but that would be slower and require
+more code.</P>
+<P>A common problem is reading in a file with key/value pairs. There are
+modules which do this but who needs them for simple formats? Just slurp
+in the file and do a single parse to grab all the key/value pairs.</P>
+<PRE>
+        my $text = read_file( $file ) ;
+        my %config = $test =~ /^(\w+)=(.+)$/mg ;</PRE>
+<P>That matches a key which starts a line (anywhere inside the string
+because of the <CODE>/m</CODE> modifier), the '=' char and the text to the end of the
+line (again, <CODE>/m</CODE> makes that work). In fact the ending <CODE>$</CODE> is not even needed
+since <CODE>.</CODE> will not normally match a newline. Since the key and value are
+grabbed and the <CODE>m//</CODE> is in list context with the <CODE>/g</CODE> modifier, it will
+grab all key/value pairs and return them. The <CODE>%config</CODE>hash will be
+assigned this list and now you have the file fully parsed into a hash.</P>
+<P>Various projects I have worked on needed some simple templating and I
+wasn't in the mood to use a full module (please, no flames about your
+favorite template module :-). So I rolled my own by slurping in the
+template file, setting up a template hash and doing this one line:</P>
+<PRE>
+        $text =~ s/&lt;%(.+?)%&gt;/$template{$1}/g ;</PRE>
+<P>That only works if the entire file was slurped in. With a little
+extra work it can handle chunks of text to be expanded:</P>
+<PRE>
+        $text =~ s/&lt;%(\w+)_START%&gt;(.+?)&lt;%\1_END%&gt;/ template($1, $2)/sge ;</PRE>
+<P>Just supply a <CODE>template</CODE> sub to expand the text between the markers and
+you have yourself a simple system with minimal code. Note that this will
+work and grab over multiple lines due the the <CODE>/s</CODE> modifier. This is
+something that is much trickier with line by line processing.</P>
+<P>Note that this is a very simple templating system, and it can't directly
+handle nested tags and other complex features. But even if you use one
+of the myriad of template modules on the CPAN, you will gain by having
+speedier ways to read and write files.</P>
+<P>Slurping in a file into an array also offers some useful advantages. 
+One simple example is reading in a flat database where each record has
+fields separated by a character such as <CODE>:</CODE>:</P>
+<PRE>
+        my @pw_fields = map [ split /:/ ], read_file( '/etc/passwd' ) ;</PRE>
+<P>Random access to any line of the slurped file is another advantage. Also
+a line index could be built to speed up searching the array of lines.</P>
+<P>
+<H2><A NAME="traditional slurping">Traditional Slurping</A></H2>
+<P>Perl has always supported slurping files with minimal code. Slurping of
+a file to a list of lines is trivial, just call the <CODE>&lt;&gt;</CODE> operator
+in a list context:</P>
+<PRE>
+        my @lines = &lt;FH&gt; ;</PRE>
+<P>and slurping to a scalar isn't much more work. Just set the built in
+variable <CODE>$/</CODE> (the input record separator to the undefined value and read
+in the file with <CODE>&lt;&gt;</CODE>:</P>
+<PRE>
+        {
+                local( $/, *FH ) ;
+                open( FH, $file ) or die &quot;sudden flaming death\n&quot;
+                $text = &lt;FH&gt;
+        }</PRE>
+<P>Notice the use of <CODE>local()</CODE>. It sets <CODE>$/</CODE> to <CODE>undef</CODE> for you and when
+the scope exits it will revert <CODE>$/</CODE> back to its previous value (most
+likely ``\n'').</P>
+<P>Here is a Perl idiom that allows the <CODE>$text</CODE> variable to be declared,
+and there is no need for a tightly nested block. The <CODE>do</CODE> block will
+execute <CODE>&lt;FH&gt;</CODE> in a scalar context and slurp in the file named by
+<CODE>$text</CODE>:</P>
+<PRE>
+        local( *FH ) ;
+        open( FH, $file ) or die &quot;sudden flaming death\n&quot;
+        my $text = do { local( $/ ) ; &lt;FH&gt; } ;</PRE>
+<P>Both of those slurps used localized filehandles to be compatible with
+5.005. Here they are with 5.6.0 lexical autovivified handles:</P>
+<PRE>
+        {
+                local( $/ ) ;
+                open( my $fh, $file ) or die &quot;sudden flaming death\n&quot;
+                $text = &lt;$fh&gt;
+        }</PRE>
+<PRE>
+        open( my $fh, $file ) or die &quot;sudden flaming death\n&quot;
+        my $text = do { local( $/ ) ; &lt;$fh&gt; } ;</PRE>
+<P>And this is a variant of that idiom that removes the need for the open
+call:</P>
+<PRE>
+        my $text = do { local( @ARGV, $/ ) = $file ; &lt;&gt; } ;</PRE>
+<P>The filename in <CODE>$file</CODE> is assigned to a localized <CODE>@ARGV</CODE> and the
+null filehandle is used which reads the data from the files in <CODE>@ARGV</CODE>.</P>
+<P>Instead of assigning to a scalar, all the above slurps can assign to an
+array and it will get the file but split into lines (using <CODE>$/</CODE> as the
+end of line marker).</P>
+<P>There is one common variant of those slurps which is very slow and not
+good code. You see it around, and it is almost always cargo cult code:</P>
+<PRE>
+        my $text = join( '', &lt;FH&gt; ) ;</PRE>
+<P>That needlessly splits the input file into lines (<CODE>join</CODE> provides a
+list context to <CODE>&lt;FH&gt;</CODE>) and then joins up those lines again. The
+original coder of this idiom obviously never read <EM>perlvar</EM> and learned
+how to use <CODE>$/</CODE> to allow scalar slurping.</P>
+<P>
+<H2><A NAME="write slurping">Write Slurping</A></H2>
+<P>While reading in entire files at one time is common, writing out entire
+files is also done. We call it ``slurping'' when we read in files, but
+there is no commonly accepted term for the write operation. I asked some
+Perl colleagues and got two interesting nominations. Peter Scott said to
+call it ``burping'' (rhymes with ``slurping'' and suggests movement in
+the opposite direction). Others suggested ``spewing'' which has a
+stronger visual image :-) Tell me your favorite or suggest your own. I
+will use both in this section so you can see how they work for you.</P>
+<P>Spewing a file is a much simpler operation than slurping. You don't have
+context issues to worry about and there is no efficiency problem with
+returning a buffer. Here is a simple burp subroutine:</P>
+<PRE>
+        sub burp {
+                my( $file_name ) = shift ;
+                open( my $fh, &quot;&gt;$file_name&quot; ) || 
+                                 die &quot;can't create $file_name $!&quot; ;
+                print $fh @_ ;
+        }</PRE>
+<P>Note that it doesn't copy the input text but passes @_ directly to
+print. We will look at faster variations of that later on.</P>
+<P>
+<H2><A NAME="slurp on the cpan">Slurp on the CPAN</A></H2>
+<P>As you would expect there are modules in the CPAN that will slurp files
+for you. The two I found are called Slurp.pm (by Rob Casey - ROBAU on
+CPAN) and File::Slurp.pm (by David Muir Sharnoff - MUIR on CPAN).</P>
+<P>Here is the code from Slurp.pm:</P>
+<PRE>
+    sub slurp { 
+        local( $/, @ARGV ) = ( wantarray ? $/ : undef, @_ ); 
+        return &lt;ARGV&gt;;
+    }</PRE>
+<PRE>
+    sub to_array {
+        my @array = slurp( @_ );
+        return wantarray ? @array : \@array;
+    }</PRE>
+<PRE>
+    sub to_scalar {
+        my $scalar = slurp( @_ );
+        return $scalar;
+    }</PRE>
+<P>+The subroutine <CODE>slurp()</CODE> uses the magic undefined value of <CODE>$/</CODE> and
+the magic file +handle <CODE>ARGV</CODE> to support slurping into a scalar or
+array. It also provides two wrapper subs that allow the caller to
+control the context of the slurp. And the <CODE>to_array()</CODE> subroutine will
+return the list of slurped lines or a anonymous array of them according
+to its caller's context by checking <CODE>wantarray</CODE>. It has 'slurp' in
+<CODE>@EXPORT</CODE> and all three subroutines in <CODE>@EXPORT_OK</CODE>.</P>
+<P>&lt;Footnote: Slurp.pm is poorly named and it shouldn't be in the top level
+namespace.&gt;</P>
+<P>The original File::Slurp.pm has this code:</P>
+<P>sub read_file
+{
+       my ($file) = @_;</P>
+<PRE>
+        local($/) = wantarray ? $/ : undef;
+        local(*F);
+        my $r;
+        my (@r);</PRE>
+<PRE>
+        open(F, &quot;&lt;$file&quot;) || croak &quot;open $file: $!&quot;;
+        @r = &lt;F&gt;;
+        close(F) || croak &quot;close $file: $!&quot;;</PRE>
+<PRE>
+        return $r[0] unless wantarray;
+        return @r;
+}</PRE>
+<P>This module provides several subroutines including <CODE>read_file()</CODE> (more
+on the others later). <CODE>read_file()</CODE> behaves simularly to
+<CODE>Slurp::slurp()</CODE> in that it will slurp a list of lines or a single
+scalar depending on the caller's context. It also uses the magic
+undefined value of <CODE>$/</CODE> for scalar slurping but it uses an explicit
+open call rather than using a localized <CODE>@ARGV</CODE> and the other module
+did. Also it doesn't provide a way to get an anonymous array of the
+lines but that can easily be rectified by calling it inside an anonymous
+array constuctor <CODE>[]</CODE>.</P>
+<P>Both of these modules make it easier for Perl coders to slurp in
+files. They both use the magic <CODE>$/</CODE> to slurp in scalar mode and the
+natural behavior of <CODE>&lt;&gt;</CODE> in list context to slurp as lines. But
+neither is optmized for speed nor can they handle <CODE>binmode()</CODE> to
+support binary or unicode files. See below for more on slurp features
+and speedups.</P>
+<P>
+<H2><A NAME="slurping api design">Slurping API Design</A></H2>
+<P>The slurp modules on CPAN are have a very simple API and don't support
+<CODE>binmode()</CODE>. This section will cover various API design issues such as
+efficient return by reference, <CODE>binmode()</CODE> and calling variations.</P>
+<P>Let's start with the call variations. Slurped files can be returned in
+four formats: as a single scalar, as a reference to a scalar, as a list
+of lines or as an anonymous array of lines. But the caller can only
+provide two contexts: scalar or list. So we have to either provide an
+API with more than one subroutine (as Slurp.pm did) or just provide one
+subroutine which only returns a scalar or a list (not an anonymous
+array) as File::Slurp does.</P>
+<P>I have used my own <CODE>read_file()</CODE> subroutine for years and it has the
+same API as File::Slurp: a single subroutine that returns a scalar or a
+list of lines depending on context. But I recognize the interest of
+those that want an anonymous array for line slurping. For one thing, it
+is easier to pass around to other subs and for another, it eliminates
+the extra copying of the lines via <CODE>return</CODE>. So my module provides only
+one slurp subroutine that returns the file data based on context and any
+format options passed in. There is no need for a specific
+slurp-in-as-a-scalar or list subroutine as the general <CODE>read_file()</CODE>
+sub will do that by default in the appropriate context. If you want
+<CODE>read_file()</CODE> to return a scalar reference or anonymous array of lines,
+you can request those formats with options. You can even pass in a
+reference to a scalar (e.g. a previously allocated buffer) and have that
+filled with the slurped data (and that is one of the fastest slurp
+modes. see the benchmark section for more on that).  If you want to
+slurp a scalar into an array, just select the desired array element and
+that will provide scalar context to the <CODE>read_file()</CODE> subroutine.</P>
+<P>The next area to cover is what to name the slurp sub. I will go with
+<CODE>read_file()</CODE>. It is descriptive and keeps compatibilty with the
+current simple and don't use the 'slurp' nickname (though that nickname
+is in the module name). Also I decided to keep the  File::Slurp
+namespace which was graciously handed over to me by its current owner,
+David Muir.</P>
+<P>Another critical area when designing APIs is how to pass in
+arguments. The <CODE>read_file()</CODE> subroutine takes one required argument
+which is the file name. To support <CODE>binmode()</CODE> we need another optional
+argument. A third optional argument is needed to support returning a
+slurped scalar by reference. My first thought was to design the API with
+3 positional arguments - file name, buffer reference and binmode. But if
+you want to set the binmode and not pass in a buffer reference, you have
+to fill the second argument with <CODE>undef</CODE> and that is ugly. So I decided
+to make the filename argument positional and the other two named. The
+subroutine starts off like this:</P>
+<PRE>
+        sub read_file {</PRE>
+<PRE>
+                my( $file_name, %args ) = @_ ;</PRE>
+<PRE>
+                my $buf ;
+                my $buf_ref = $args{'buf'} || \$buf ;</PRE>
+<P>The other sub (<CODE>read_file_lines()</CODE>) will only take an optional binmode
+(so you can read files with binary delimiters). It doesn't need a buffer
+reference argument since it can return an anonymous array if the called
+in a scalar context. So this subroutine could use positional arguments,
+but to keep its API similar to the API of <CODE>read_file()</CODE>, it will also
+use pass by name for the optional arguments. This also means that new
+optional arguments can be added later without breaking any legacy
+code. A bonus with keeping the API the same for both subs will be seen
+how the two subs are optimized to work together.</P>
+<P>Write slurping (or spewing or burping :-)) needs to have its API
+designed as well. The biggest issue is not only needing to support
+optional arguments but a list of arguments to be written is needed. Perl
+6 will be able to handle that with optional named arguments and a final
+slurp argument. Since this is Perl 5 we have to do it using some
+cleverness. The first argument is the file name and it will be
+positional as with the <CODE>read_file</CODE> subroutine. But how can we pass in
+the optional arguments and also a list of data? The solution lies in the
+fact that the data list should never contain a reference.
+Burping/spewing works only on plain data. So if the next argument is a
+hash reference, we can assume it cointains the optional arguments and
+the rest of the arguments is the data list. So the <CODE>write_file()</CODE>
+subroutine will start off like this:</P>
+<PRE>
+        sub write_file {</PRE>
+<PRE>
+                my $file_name = shift ;</PRE>
+<PRE>
+                my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;</PRE>
+<P>Whether or not optional arguments are passed in, we leave the data list
+in <CODE>@_</CODE> to minimize any more copying. You call <CODE>write_file()</CODE> like this:</P>
+<PRE>
+        write_file( 'foo', { binmode =&gt; ':raw' }, @data ) ;
+        write_file( 'junk', { append =&gt; 1 }, @more_junk ) ;
+        write_file( 'bar', @spew ) ;</PRE>
+<P>
+<H2><A NAME="fast slurping">Fast Slurping</A></H2>
+<P>Somewhere along the line, I learned about a way to slurp files faster
+than by setting $/ to undef. The method is very simple, you do a single
+read call with the size of the file (which the -s operator provides).
+This bypasses the I/O loop inside perl that checks for EOF and does all
+sorts of processing. I then decided to experiment and found that
+sysread is even faster as you would expect. sysread bypasses all of
+Perl's stdio and reads the file from the kernel buffers directly into a
+Perl scalar. This is why the slurp code in File::Slurp uses
+sysopen/sysread/syswrite. All the rest of the code is just to support
+the various options and data passing techniques.</P>
+<P></P>
+<PRE>
+
+=head2 Benchmarks</PRE>
+<P>Benchmarks can be enlightening, informative, frustrating and
+deceiving. It would make no sense to create a new and more complex slurp
+module unless it also gained signifigantly in speed. So I created a
+benchmark script which compares various slurp methods with differing
+file sizes and calling contexts. This script can be run from the main
+directory of the tarball like this:</P>
+<PRE>
+        perl -Ilib extras/slurp_bench.pl</PRE>
+<P>If you pass in an argument on the command line, it will be passed to
+<CODE>timethese()</CODE> and it will control the duration. It defaults to -2 which
+makes each benchmark run to at least 2 seconds of cpu time.</P>
+<P>The following numbers are from a run I did on my 300Mhz sparc. You will
+most likely get much faster counts on your boxes but the relative speeds
+shouldn't change by much. If you see major differences on your
+benchmarks, please send me the results and your Perl and OS
+versions. Also you can play with the benchmark script and add more slurp
+variations or data files.</P>
+<P>The rest of this section will be discussing the results of the
+benchmarks. You can refer to extras/slurp_bench.pl to see the code for
+the individual benchmarks. If the benchmark name starts with cpan_, it
+is either from Slurp.pm or File::Slurp.pm. Those starting with new_ are
+from the new File::Slurp.pm. Those that start with file_contents_ are
+from a client's code base. The rest are variations I created to
+highlight certain aspects of the benchmarks.</P>
+<P>The short and long file data is made like this:</P>
+<PRE>
+        my @lines = ( 'abc' x 30 . &quot;\n&quot;)  x 100 ;
+        my $text = join( '', @lines ) ;</PRE>
+<PRE>
+        @lines = ( 'abc' x 40 . &quot;\n&quot;)  x 1000 ;
+        $text = join( '', @lines ) ;</PRE>
+<P>So the short file is 9,100 bytes and the long file is 121,000
+bytes.</P>
+<P>
+<H3><A NAME="scalar slurp of short file">Scalar Slurp of Short File</A></H3>
+<PRE>
+        file_contents        651/s
+        file_contents_no_OO  828/s
+        cpan_read_file      1866/s
+        cpan_slurp          1934/s
+        read_file           2079/s
+        new                 2270/s
+        new_buf_ref         2403/s
+        new_scalar_ref      2415/s
+        sysread_file        2572/s</PRE>
+<P>
+<H3><A NAME="scalar slurp of long file">Scalar Slurp of Long File</A></H3>
+<PRE>
+        file_contents_no_OO 82.9/s
+        file_contents       85.4/s
+        cpan_read_file       250/s
+        cpan_slurp           257/s
+        read_file            323/s
+        new                  468/s
+        sysread_file         489/s
+        new_scalar_ref       766/s
+        new_buf_ref          767/s</PRE>
+<P>The primary inference you get from looking at the mumbers above is that
+when slurping a file into a scalar, the longer the file, the more time
+you save by returning the result via a scalar reference. The time for
+the extra buffer copy can add up. The new module came out on top overall
+except for the very simple sysread_file entry which was added to
+highlight the overhead of the more flexible new module which isn't that
+much. The file_contents entries are always the worst since they do a
+list slurp and then a join, which is a classic newbie and cargo culted
+style which is extremely slow. Also the OO code in file_contents slows
+it down even more (I added the file_contents_no_OO entry to show this).
+The two CPAN modules are decent with small files but they are laggards
+compared to the new module when the file gets much larger.</P>
+<P>
+<H3><A NAME="list slurp of short file">List Slurp of Short File</A></H3>
+<PRE>
+        cpan_read_file          589/s
+        cpan_slurp_to_array     620/s
+        read_file               824/s
+        new_array_ref           824/s
+        sysread_file            828/s
+        new                     829/s
+        new_in_anon_array       833/s
+        cpan_slurp_to_array_ref 836/s</PRE>
+<P>
+<H3><A NAME="list slurp of long file">List Slurp of Long File</A></H3>
+<PRE>
+        cpan_read_file          62.4/s
+        cpan_slurp_to_array     62.7/s
+        read_file               92.9/s
+        sysread_file            94.8/s
+        new_array_ref           95.5/s
+        new                     96.2/s
+        cpan_slurp_to_array_ref 96.3/s
+        new_in_anon_array       97.2/s</PRE>
+<P>This is perhaps the most interesting result of this benchmark. Five
+different entries have effectively tied for the lead. The logical
+conclusion is that splitting the input into lines is the bounding
+operation, no matter how the file gets slurped. This is the only
+benchmark where the new module isn't the clear winner (in the long file
+entries - it is no worse than a close second in the short file
+entries).</P>
+<P>Note: In the benchmark information for all the spew entries, the extra
+number at the end of each line is how many wallclock seconds the whole
+entry took. The benchmarks were run for at least 2 CPU seconds per
+entry. The unusually large wallclock times will be discussed below.</P>
+<P>
+<H3><A NAME="scalar spew of short file">Scalar Spew of Short File</A></H3>
+<PRE>
+        cpan_write_file 1035/s  38
+        print_file      1055/s  41
+        syswrite_file   1135/s  44
+        new             1519/s  2
+        print_join_file 1766/s  2
+        new_ref         1900/s  2
+        syswrite_file2  2138/s  2</PRE>
+<P>
+<H3><A NAME="scalar spew of long file">Scalar Spew of Long File</A></H3>
+<PRE>
+        cpan_write_file 164/s   20
+        print_file      211/s   26
+        syswrite_file   236/s   25
+        print_join_file 277/s   2
+        new             295/s   2
+        syswrite_file2  428/s   2
+        new_ref         608/s   2</PRE>
+<P>In the scalar spew entries, the new module API wins when it is passed a
+reference to the scalar buffer. The <CODE>syswrite_file2</CODE> entry beats it
+with the shorter file due to its simpler code. The old CPAN module is
+the slowest due to its extra copying of the data and its use of print.</P>
+<P>
+<H3><A NAME="list spew of short file">List Spew of Short File</A></H3>
+<PRE>
+        cpan_write_file  794/s  29
+        syswrite_file   1000/s  38
+        print_file      1013/s  42
+        new             1399/s  2
+        print_join_file 1557/s  2</PRE>
+<P>
+<H3><A NAME="list spew of long file">List Spew of Long File</A></H3>
+<PRE>
+        cpan_write_file 112/s   12
+        print_file      179/s   21
+        syswrite_file   181/s   19
+        print_join_file 205/s   2
+        new             228/s   2</PRE>
+<P>Again, the simple <CODE>print_join_file</CODE> entry beats the new module when
+spewing a short list of lines to a file. But is loses to the new module
+when the file size gets longer. The old CPAN module lags behind the
+others since it first makes an extra copy of the lines and then it calls
+<CODE>print</CODE> on the output list and that is much slower than passing to
+<CODE>print</CODE> a single scalar generated by join. The <CODE>print_file</CODE> entry
+shows the advantage of directly printing <CODE>@_</CODE> and the
+<CODE>print_join_file</CODE> adds the join optimization.</P>
+<P>Now about those long wallclock times. If you look carefully at the
+benchmark code of all the spew entries, you will find that some always
+write to new files and some overwrite existing files. When I asked David
+Muir why the old File::Slurp module had an <CODE>overwrite</CODE> subroutine, he
+answered that by overwriting a file, you always guarantee something
+readable is in the file. If you create a new file, there is a moment
+when the new file is created but has no data in it. I feel this is not a
+good enough answer. Even when overwriting, you can write a shorter file
+than the existing file and then you have to truncate the file to the new
+size. There is a small race window there where another process can slurp
+in the file with the new data followed by leftover junk from the
+previous version of the file. This reinforces the point that the only
+way to ensure consistant file data is the proper use of file locks.</P>
+<P>But what about those long times? Well it is all about the difference
+between creating files and overwriting existing ones. The former have to
+allocate new inodes (or the equivilent on other file systems) and the
+latter can reuse the exising inode. This mean the overwrite will save on
+disk seeks as well as on cpu time. In fact when running this benchmark,
+I could hear my disk going crazy allocating inodes during the spew
+operations. This speedup in both cpu and wallclock is why the new module
+always does overwriting when spewing files. It also does the proper
+truncate (and this is checked in the tests by spewing shorter files
+after longer ones had previously been written). The <CODE>overwrite</CODE>
+subroutine is just an typeglob alias to <CODE>write_file</CODE> and is there for
+backwards compatibilty with the old File::Slurp module.</P>
+<P>
+<H3><A NAME="benchmark conclusion">Benchmark Conclusion</A></H3>
+<P>Other than a few cases where a simpler entry beat it out, the new
+File::Slurp module is either the speed leader or among the leaders. Its
+special APIs for passing buffers by reference prove to be very useful
+speedups. Also it uses all the other optimizations including using
+<CODE>sysread/syswrite</CODE> and joining output lines. I expect many projects
+that extensively use slurping will notice the speed improvements,
+especially if they rewrite their code to take advantage of the new API
+features. Even if they don't touch their code and use the simple API
+they will get a significant speedup.</P>
+<P>
+<H2><A NAME="error handling">Error Handling</A></H2>
+<P>Slurp subroutines are subject to conditions such as not being able to
+open the file, or I/O errors. How these errors are handled, and what the
+caller will see, are important aspects of the design of an API. The
+classic error handling for slurping has been to call <CODE>die()</CODE> or even
+better, <CODE>croak()</CODE>. But sometimes you want the slurp to either
+<CODE>warn()</CODE>/<CODE>carp()</CODE> or allow your code to handle the error. Sure, this
+can be done by wrapping the slurp in a <CODE>eval</CODE> block to catch a fatal
+error, but not everyone wants all that extra code. So I have added
+another option to all the subroutines which selects the error
+handling. If the 'err_mode' option is 'croak' (which is also the
+default), the called subroutine will croak. If set to 'carp' then carp
+will be called. Set to any other string (use 'quiet' when you want to
+be explicit) and no error handler is called. Then the caller can use the
+error status from the call.</P>
+<P><CODE>write_file()</CODE> doesn't use the return value for data so it can return a
+false status value in-band to mark an error. <CODE>read_file()</CODE> does use its
+return value for data, but we can still make it pass back the error
+status. A successful read in any scalar mode will return either a
+defined data string or a reference to a scalar or array. So a bare
+return would work here. But if you slurp in lines by calling it in a
+list context, a bare <CODE>return</CODE> will return an empty list, which is the
+same value it would get from an existing but empty file. So now,
+<CODE>read_file()</CODE> will do something I normally strongly advocate against,
+i.e., returning an explicit <CODE>undef</CODE> value. In the scalar context this
+still returns a error, and in list context, the returned first value
+will be <CODE>undef</CODE>, and that is not legal data for the first element. So
+the list context also gets a error status it can detect:</P>
+<PRE>
+        my @lines = read_file( $file_name, err_mode =&gt; 'quiet' ) ;
+        your_handle_error( &quot;$file_name can't be read\n&quot; ) unless
+                                        @lines &amp;&amp; defined $lines[0] ;</PRE>
+<P>
+<H2><A NAME="file::fastslurp">File::FastSlurp</A></H2>
+<PRE>
+        sub read_file {</PRE>
+<PRE>
+                my( $file_name, %args ) = @_ ;</PRE>
+<PRE>
+                my $buf ;
+                my $buf_ref = $args{'buf_ref'} || \$buf ;</PRE>
+<PRE>
+                my $mode = O_RDONLY ;
+                $mode |= O_BINARY if $args{'binmode'} ;</PRE>
+<PRE>
+                local( *FH ) ;
+                sysopen( FH, $file_name, $mode ) or
+                                        carp &quot;Can't open $file_name: $!&quot; ;</PRE>
+<PRE>
+                my $size_left = -s FH ;</PRE>
+<PRE>
+                while( $size_left &gt; 0 ) {</PRE>
+<PRE>
+                        my $read_cnt = sysread( FH, ${$buf_ref},
+                                        $size_left, length ${$buf_ref} ) ;</PRE>
+<PRE>
+                        unless( $read_cnt ) {</PRE>
+<PRE>
+                                carp &quot;read error in file $file_name: $!&quot; ;
+                                last ;
+                        }</PRE>
+<PRE>
+                        $size_left -= $read_cnt ;
+                }</PRE>
+<PRE>
+        # handle void context (return scalar by buffer reference)</PRE>
+<PRE>
+                return unless defined wantarray ;</PRE>
+<PRE>
+        # handle list context</PRE>
+<PRE>
+                return split m|?&lt;$/|g, ${$buf_ref} if wantarray ;</PRE>
+<PRE>
+        # handle scalar context</PRE>
+<PRE>
+                return ${$buf_ref} ;
+        }</PRE>
+<PRE>
+        sub write_file {</PRE>
+<PRE>
+                my $file_name = shift ;</PRE>
+<PRE>
+                my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;
+                my $buf = join '', @_ ;</PRE>
+<PRE>
+                my $mode = O_WRONLY ;
+                $mode |= O_BINARY if $args-&gt;{'binmode'} ;
+                $mode |= O_APPEND if $args-&gt;{'append'} ;</PRE>
+<PRE>
+                local( *FH ) ;
+                sysopen( FH, $file_name, $mode ) or
+                                        carp &quot;Can't open $file_name: $!&quot; ;</PRE>
+<PRE>
+                my $size_left = length( $buf ) ;
+                my $offset = 0 ;</PRE>
+<PRE>
+                while( $size_left &gt; 0 ) {</PRE>
+<PRE>
+                        my $write_cnt = syswrite( FH, $buf,
+                                        $size_left, $offset ) ;</PRE>
+<PRE>
+                        unless( $write_cnt ) {</PRE>
+<PRE>
+                                carp &quot;write error in file $file_name: $!&quot; ;
+                                last ;
+                        }</PRE>
+<PRE>
+                        $size_left -= $write_cnt ;
+                        $offset += $write_cnt ;
+                }</PRE>
+<PRE>
+                return ;
+        }</PRE>
+<P>
+<H2><A NAME="slurping in perl 6">Slurping in Perl 6</A></H2>
+<P>As usual with Perl 6, much of the work in this article will be put to
+pasture. Perl 6 will allow you to set a 'slurp' property on file handles
+and when you read from such a handle, the file is slurped. List and
+scalar context will still be supported so you can slurp into lines or a
+&lt;scalar. I would expect that support for slurping in Perl 6 will be
+optimized and bypass the stdio subsystem since it can use the slurp
+property to trigger a call to special code. Otherwise some enterprising
+individual will just create a File::FastSlurp module for Perl 6. The
+code in the Perl 5 module could easily be modified to Perl 6 syntax and
+semantics. Any volunteers?</P>
+<P>
+<H2><A NAME="in summary">In Summary</A></H2>
+<P>We have compared classic line by line processing with munging a whole
+file in memory. Slurping files can speed up your programs and simplify
+your code if done properly. You must still be aware to not slurp
+humongous files (logs, DNA sequences, etc.) or STDIN where you don't
+know how much data you will read in. But slurping megabyte sized files
+is not an major issue on today's systems with the typical amount of RAM
+installed. When Perl was first being used in depth (Perl 4), slurping
+was limited by the smaller RAM size of 10 years ago. Now, you should be
+able to slurp almost any reasonably sized file, whether it contains
+configuration, source code, data, etc.</P>
+<P>
+<H2><A NAME="acknowledgements">Acknowledgements</A></H2>
+
+</BODY>
+
+</HTML>
diff --git a/slurp_talk/slurp_article.pod b/slurp_talk/slurp_article.pod

new file mode 100644 (file)

index 0000000..e602bcc
--- /dev/null
+++ b/slurp_talk/slurp_article.pod
@@ -0,0 +1,743 @@
+=head1 Perl Slurp Ease
+
+=head2 Introduction
+ 
+
+One of the common Perl idioms is processing text files line by line:
+
+       while( <FH> ) {
+               do something with $_
+       }
+
+This idiom has several variants, but the key point is that it reads in
+only one line from the file in each loop iteration. This has several
+advantages, including limiting memory use to one line, the ability to
+handle any size file (including data piped in via STDIN), and it is
+easily taught and understood to Perl newbies. In fact newbies are the
+ones who do silly things like this:
+
+       while( <FH> ) {
+               push @lines, $_ ;
+       }
+
+       foreach ( @lines ) {
+               do something with $_
+       }
+
+Line by line processing is fine, but it isn't the only way to deal with
+reading files. The other common style is reading the entire file into a
+scalar or array, and that is commonly known as slurping. Now, slurping has
+somewhat of a poor reputation, and this article is an attempt at
+rehabilitating it. Slurping files has advantages and limitations, and is
+not something you should just do when line by line processing is fine.
+It is best when you need the entire file in memory for processing all at
+once. Slurping with in memory processing can be faster and lead to
+simpler code than line by line if done properly.
+
+The biggest issue to watch for with slurping is file size. Slurping very
+large files or unknown amounts of data from STDIN can be disastrous to
+your memory usage and cause swap disk thrashing.  You can slurp STDIN if
+you know that you can handle the maximum size input without
+detrimentally affecting your memory usage. So I advocate slurping only
+disk files and only when you know their size is reasonable and you have
+a real reason to process the file as a whole.  Note that reasonable size
+these days is larger than the bad old days of limited RAM. Slurping in a
+megabyte is not an issue on most systems. But most of the
+files I tend to slurp in are much smaller than that. Typical files that
+work well with slurping are configuration files, (mini-)language scripts,
+some data (especially binary) files, and other files of known sizes
+which need fast processing.
+
+Another major win for slurping over line by line is speed. Perl's IO
+system (like many others) is slow. Calling C<< <> >> for each line
+requires a check for the end of line, checks for EOF, copying a line,
+munging the internal handle structure, etc. Plenty of work for each line
+read in. Whereas slurping, if done correctly, will usually involve only
+one I/O call and no extra data copying. The same is true for writing
+files to disk, and we will cover that as well (even though the term
+slurping is traditionally a read operation, I use the term ``slurp'' for
+the concept of doing I/O with an entire file in one operation).
+
+Finally, when you have slurped the entire file into memory, you can do
+operations on the data that are not possible or easily done with line by
+line processing. These include global search/replace (without regard for
+newlines), grabbing all matches with one call of C<//g>, complex parsing
+(which in many cases must ignore newlines), processing *ML (where line
+endings are just white space) and performing complex transformations
+such as template expansion.
+
+=head2 Global Operations
+
+Here are some simple global operations that can be done quickly and
+easily on an entire file that has been slurped in. They could also be
+done with line by line processing but that would be slower and require
+more code.
+
+A common problem is reading in a file with key/value pairs. There are
+modules which do this but who needs them for simple formats? Just slurp
+in the file and do a single parse to grab all the key/value pairs.
+
+       my $text = read_file( $file ) ;
+       my %config = $test =~ /^(\w+)=(.+)$/mg ;
+
+That matches a key which starts a line (anywhere inside the string
+because of the C</m> modifier), the '=' char and the text to the end of the
+line (again, C</m> makes that work). In fact the ending C<$> is not even needed
+since C<.> will not normally match a newline. Since the key and value are
+grabbed and the C<m//> is in list context with the C</g> modifier, it will
+grab all key/value pairs and return them. The C<%config>hash will be
+assigned this list and now you have the file fully parsed into a hash.
+
+Various projects I have worked on needed some simple templating and I
+wasn't in the mood to use a full module (please, no flames about your
+favorite template module :-). So I rolled my own by slurping in the
+template file, setting up a template hash and doing this one line:
+
+       $text =~ s/<%(.+?)%>/$template{$1}/g ;
+
+That only works if the entire file was slurped in. With a little
+extra work it can handle chunks of text to be expanded:
+
+       $text =~ s/<%(\w+)_START%>(.+?)<%\1_END%>/ template($1, $2)/sge ;
+
+Just supply a C<template> sub to expand the text between the markers and
+you have yourself a simple system with minimal code. Note that this will
+work and grab over multiple lines due the the C</s> modifier. This is
+something that is much trickier with line by line processing.
+
+Note that this is a very simple templating system, and it can't directly
+handle nested tags and other complex features. But even if you use one
+of the myriad of template modules on the CPAN, you will gain by having
+speedier ways to read and write files.
+
+Slurping in a file into an array also offers some useful advantages. 
+One simple example is reading in a flat database where each record has
+fields separated by a character such as C<:>:
+
+       my @pw_fields = map [ split /:/ ], read_file( '/etc/passwd' ) ;
+
+Random access to any line of the slurped file is another advantage. Also
+a line index could be built to speed up searching the array of lines.
+
+
+=head2 Traditional Slurping
+
+Perl has always supported slurping files with minimal code. Slurping of
+a file to a list of lines is trivial, just call the C<< <> >> operator
+in a list context:
+
+       my @lines = <FH> ;
+
+and slurping to a scalar isn't much more work. Just set the built in
+variable C<$/> (the input record separator to the undefined value and read
+in the file with C<< <> >>:
+
+       {
+               local( $/, *FH ) ;
+               open( FH, $file ) or die "sudden flaming death\n"
+               $text = <FH>
+       }
+
+Notice the use of C<local()>. It sets C<$/> to C<undef> for you and when
+the scope exits it will revert C<$/> back to its previous value (most
+likely "\n").
+
+Here is a Perl idiom that allows the C<$text> variable to be declared,
+and there is no need for a tightly nested block. The C<do> block will
+execute C<< <FH> >> in a scalar context and slurp in the file named by
+C<$text>:
+
+       local( *FH ) ;
+       open( FH, $file ) or die "sudden flaming death\n"
+       my $text = do { local( $/ ) ; <FH> } ;
+
+Both of those slurps used localized filehandles to be compatible with
+5.005. Here they are with 5.6.0 lexical autovivified handles:
+
+       {
+               local( $/ ) ;
+               open( my $fh, $file ) or die "sudden flaming death\n"
+               $text = <$fh>
+       }
+
+       open( my $fh, $file ) or die "sudden flaming death\n"
+       my $text = do { local( $/ ) ; <$fh> } ;
+
+And this is a variant of that idiom that removes the need for the open
+call:
+
+       my $text = do { local( @ARGV, $/ ) = $file ; <> } ;
+
+The filename in C<$file> is assigned to a localized C<@ARGV> and the
+null filehandle is used which reads the data from the files in C<@ARGV>.
+
+Instead of assigning to a scalar, all the above slurps can assign to an
+array and it will get the file but split into lines (using C<$/> as the
+end of line marker).
+
+There is one common variant of those slurps which is very slow and not
+good code. You see it around, and it is almost always cargo cult code:
+
+       my $text = join( '', <FH> ) ;
+
+That needlessly splits the input file into lines (C<join> provides a
+list context to C<< <FH> >>) and then joins up those lines again. The
+original coder of this idiom obviously never read I<perlvar> and learned
+how to use C<$/> to allow scalar slurping.
+
+=head2 Write Slurping
+
+While reading in entire files at one time is common, writing out entire
+files is also done. We call it ``slurping'' when we read in files, but
+there is no commonly accepted term for the write operation. I asked some
+Perl colleagues and got two interesting nominations. Peter Scott said to
+call it ``burping'' (rhymes with ``slurping'' and suggests movement in
+the opposite direction). Others suggested ``spewing'' which has a
+stronger visual image :-) Tell me your favorite or suggest your own. I
+will use both in this section so you can see how they work for you.
+
+Spewing a file is a much simpler operation than slurping. You don't have
+context issues to worry about and there is no efficiency problem with
+returning a buffer. Here is a simple burp subroutine:
+
+       sub burp {
+               my( $file_name ) = shift ;
+               open( my $fh, ">$file_name" ) || 
+                                die "can't create $file_name $!" ;
+               print $fh @_ ;
+       }
+
+Note that it doesn't copy the input text but passes @_ directly to
+print. We will look at faster variations of that later on.
+
+=head2 Slurp on the CPAN
+
+As you would expect there are modules in the CPAN that will slurp files
+for you. The two I found are called Slurp.pm (by Rob Casey - ROBAU on
+CPAN) and File::Slurp.pm (by David Muir Sharnoff - MUIR on CPAN).
+
+Here is the code from Slurp.pm:
+
+    sub slurp { 
+       local( $/, @ARGV ) = ( wantarray ? $/ : undef, @_ ); 
+       return <ARGV>;
+    }
+
+    sub to_array {
+       my @array = slurp( @_ );
+       return wantarray ? @array : \@array;
+    }
+
+    sub to_scalar {
+       my $scalar = slurp( @_ );
+       return $scalar;
+    }
+
++The subroutine C<slurp()> uses the magic undefined value of C<$/> and
+the magic file +handle C<ARGV> to support slurping into a scalar or
+array. It also provides two wrapper subs that allow the caller to
+control the context of the slurp. And the C<to_array()> subroutine will
+return the list of slurped lines or a anonymous array of them according
+to its caller's context by checking C<wantarray>. It has 'slurp' in
+C<@EXPORT> and all three subroutines in C<@EXPORT_OK>.
+
+<Footnote: Slurp.pm is poorly named and it shouldn't be in the top level
+namespace.>
+
+The original File::Slurp.pm has this code:
+
+sub read_file
+{
+       my ($file) = @_;
+
+       local($/) = wantarray ? $/ : undef;
+       local(*F);
+       my $r;
+       my (@r);
+
+       open(F, "<$file") || croak "open $file: $!";
+       @r = <F>;
+       close(F) || croak "close $file: $!";
+
+       return $r[0] unless wantarray;
+       return @r;
+}
+
+This module provides several subroutines including C<read_file()> (more
+on the others later). C<read_file()> behaves simularly to
+C<Slurp::slurp()> in that it will slurp a list of lines or a single
+scalar depending on the caller's context. It also uses the magic
+undefined value of C<$/> for scalar slurping but it uses an explicit
+open call rather than using a localized C<@ARGV> and the other module
+did. Also it doesn't provide a way to get an anonymous array of the
+lines but that can easily be rectified by calling it inside an anonymous
+array constuctor C<[]>.
+
+Both of these modules make it easier for Perl coders to slurp in
+files. They both use the magic C<$/> to slurp in scalar mode and the
+natural behavior of C<< <> >> in list context to slurp as lines. But
+neither is optmized for speed nor can they handle C<binmode()> to
+support binary or unicode files. See below for more on slurp features
+and speedups.
+
+=head2 Slurping API Design
+
+The slurp modules on CPAN are have a very simple API and don't support
+C<binmode()>. This section will cover various API design issues such as
+efficient return by reference, C<binmode()> and calling variations.
+
+Let's start with the call variations. Slurped files can be returned in
+four formats: as a single scalar, as a reference to a scalar, as a list
+of lines or as an anonymous array of lines. But the caller can only
+provide two contexts: scalar or list. So we have to either provide an
+API with more than one subroutine (as Slurp.pm did) or just provide one
+subroutine which only returns a scalar or a list (not an anonymous
+array) as File::Slurp does.
+
+I have used my own C<read_file()> subroutine for years and it has the
+same API as File::Slurp: a single subroutine that returns a scalar or a
+list of lines depending on context. But I recognize the interest of
+those that want an anonymous array for line slurping. For one thing, it
+is easier to pass around to other subs and for another, it eliminates
+the extra copying of the lines via C<return>. So my module provides only
+one slurp subroutine that returns the file data based on context and any
+format options passed in. There is no need for a specific
+slurp-in-as-a-scalar or list subroutine as the general C<read_file()>
+sub will do that by default in the appropriate context. If you want
+C<read_file()> to return a scalar reference or anonymous array of lines,
+you can request those formats with options. You can even pass in a
+reference to a scalar (e.g. a previously allocated buffer) and have that
+filled with the slurped data (and that is one of the fastest slurp
+modes. see the benchmark section for more on that).  If you want to
+slurp a scalar into an array, just select the desired array element and
+that will provide scalar context to the C<read_file()> subroutine.
+
+The next area to cover is what to name the slurp sub. I will go with
+C<read_file()>. It is descriptive and keeps compatibilty with the
+current simple and don't use the 'slurp' nickname (though that nickname
+is in the module name). Also I decided to keep the  File::Slurp
+namespace which was graciously handed over to me by its current owner,
+David Muir.
+
+Another critical area when designing APIs is how to pass in
+arguments. The C<read_file()> subroutine takes one required argument
+which is the file name. To support C<binmode()> we need another optional
+argument. A third optional argument is needed to support returning a
+slurped scalar by reference. My first thought was to design the API with
+3 positional arguments - file name, buffer reference and binmode. But if
+you want to set the binmode and not pass in a buffer reference, you have
+to fill the second argument with C<undef> and that is ugly. So I decided
+to make the filename argument positional and the other two named. The
+subroutine starts off like this:
+
+       sub read_file {
+
+               my( $file_name, %args ) = @_ ;
+
+               my $buf ;
+               my $buf_ref = $args{'buf'} || \$buf ;
+
+The other sub (C<read_file_lines()>) will only take an optional binmode
+(so you can read files with binary delimiters). It doesn't need a buffer
+reference argument since it can return an anonymous array if the called
+in a scalar context. So this subroutine could use positional arguments,
+but to keep its API similar to the API of C<read_file()>, it will also
+use pass by name for the optional arguments. This also means that new
+optional arguments can be added later without breaking any legacy
+code. A bonus with keeping the API the same for both subs will be seen
+how the two subs are optimized to work together.
+
+Write slurping (or spewing or burping :-)) needs to have its API
+designed as well. The biggest issue is not only needing to support
+optional arguments but a list of arguments to be written is needed. Perl
+6 will be able to handle that with optional named arguments and a final
+slurp argument. Since this is Perl 5 we have to do it using some
+cleverness. The first argument is the file name and it will be
+positional as with the C<read_file> subroutine. But how can we pass in
+the optional arguments and also a list of data? The solution lies in the
+fact that the data list should never contain a reference.
+Burping/spewing works only on plain data. So if the next argument is a
+hash reference, we can assume it cointains the optional arguments and
+the rest of the arguments is the data list. So the C<write_file()>
+subroutine will start off like this:
+
+       sub write_file {
+
+               my $file_name = shift ;
+
+               my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;
+
+Whether or not optional arguments are passed in, we leave the data list
+in C<@_> to minimize any more copying. You call C<write_file()> like this:
+
+       write_file( 'foo', { binmode => ':raw' }, @data ) ;
+       write_file( 'junk', { append => 1 }, @more_junk ) ;
+       write_file( 'bar', @spew ) ;
+
+=head2 Fast Slurping
+
+Somewhere along the line, I learned about a way to slurp files faster
+than by setting $/ to undef. The method is very simple, you do a single
+read call with the size of the file (which the -s operator provides).
+This bypasses the I/O loop inside perl that checks for EOF and does all
+sorts of processing. I then decided to experiment and found that
+sysread is even faster as you would expect. sysread bypasses all of
+Perl's stdio and reads the file from the kernel buffers directly into a
+Perl scalar. This is why the slurp code in File::Slurp uses
+sysopen/sysread/syswrite. All the rest of the code is just to support
+the various options and data passing techniques.
+
+ 
+=head2 Benchmarks
+
+Benchmarks can be enlightening, informative, frustrating and
+deceiving. It would make no sense to create a new and more complex slurp
+module unless it also gained signifigantly in speed. So I created a
+benchmark script which compares various slurp methods with differing
+file sizes and calling contexts. This script can be run from the main
+directory of the tarball like this:
+
+       perl -Ilib extras/slurp_bench.pl
+
+If you pass in an argument on the command line, it will be passed to
+timethese() and it will control the duration. It defaults to -2 which
+makes each benchmark run to at least 2 seconds of cpu time.
+
+The following numbers are from a run I did on my 300Mhz sparc. You will
+most likely get much faster counts on your boxes but the relative speeds
+shouldn't change by much. If you see major differences on your
+benchmarks, please send me the results and your Perl and OS
+versions. Also you can play with the benchmark script and add more slurp
+variations or data files.
+
+The rest of this section will be discussing the results of the
+benchmarks. You can refer to extras/slurp_bench.pl to see the code for
+the individual benchmarks. If the benchmark name starts with cpan_, it
+is either from Slurp.pm or File::Slurp.pm. Those starting with new_ are
+from the new File::Slurp.pm. Those that start with file_contents_ are
+from a client's code base. The rest are variations I created to
+highlight certain aspects of the benchmarks.
+
+The short and long file data is made like this:
+
+       my @lines = ( 'abc' x 30 . "\n")  x 100 ;
+       my $text = join( '', @lines ) ;
+
+       @lines = ( 'abc' x 40 . "\n")  x 1000 ;
+       $text = join( '', @lines ) ;
+
+So the short file is 9,100 bytes and the long file is 121,000
+bytes. 
+
+=head3         Scalar Slurp of Short File
+
+       file_contents        651/s
+       file_contents_no_OO  828/s
+       cpan_read_file      1866/s
+       cpan_slurp          1934/s
+       read_file           2079/s
+       new                 2270/s
+       new_buf_ref         2403/s
+       new_scalar_ref      2415/s
+       sysread_file        2572/s
+
+=head3         Scalar Slurp of Long File
+
+       file_contents_no_OO 82.9/s
+       file_contents       85.4/s
+       cpan_read_file       250/s
+       cpan_slurp           257/s
+       read_file            323/s
+       new                  468/s
+       sysread_file         489/s
+       new_scalar_ref       766/s
+       new_buf_ref          767/s
+
+The primary inference you get from looking at the mumbers above is that
+when slurping a file into a scalar, the longer the file, the more time
+you save by returning the result via a scalar reference. The time for
+the extra buffer copy can add up. The new module came out on top overall
+except for the very simple sysread_file entry which was added to
+highlight the overhead of the more flexible new module which isn't that
+much. The file_contents entries are always the worst since they do a
+list slurp and then a join, which is a classic newbie and cargo culted
+style which is extremely slow. Also the OO code in file_contents slows
+it down even more (I added the file_contents_no_OO entry to show this).
+The two CPAN modules are decent with small files but they are laggards
+compared to the new module when the file gets much larger.
+
+=head3         List Slurp of Short File
+
+       cpan_read_file          589/s
+       cpan_slurp_to_array     620/s
+       read_file               824/s
+       new_array_ref           824/s
+       sysread_file            828/s
+       new                     829/s
+       new_in_anon_array       833/s
+       cpan_slurp_to_array_ref 836/s
+
+=head3         List Slurp of Long File
+
+       cpan_read_file          62.4/s
+       cpan_slurp_to_array     62.7/s
+       read_file               92.9/s
+       sysread_file            94.8/s
+       new_array_ref           95.5/s
+       new                     96.2/s
+       cpan_slurp_to_array_ref 96.3/s
+       new_in_anon_array       97.2/s
+
+This is perhaps the most interesting result of this benchmark. Five
+different entries have effectively tied for the lead. The logical
+conclusion is that splitting the input into lines is the bounding
+operation, no matter how the file gets slurped. This is the only
+benchmark where the new module isn't the clear winner (in the long file
+entries - it is no worse than a close second in the short file
+entries). 
+
+
+Note: In the benchmark information for all the spew entries, the extra
+number at the end of each line is how many wallclock seconds the whole
+entry took. The benchmarks were run for at least 2 CPU seconds per
+entry. The unusually large wallclock times will be discussed below.
+
+=head3         Scalar Spew of Short File
+
+       cpan_write_file 1035/s  38
+       print_file      1055/s  41
+       syswrite_file   1135/s  44
+       new             1519/s  2
+       print_join_file 1766/s  2
+       new_ref         1900/s  2
+       syswrite_file2  2138/s  2
+
+=head3         Scalar Spew of Long File
+
+       cpan_write_file 164/s   20
+       print_file      211/s   26
+       syswrite_file   236/s   25
+       print_join_file 277/s   2
+       new             295/s   2
+       syswrite_file2  428/s   2
+       new_ref         608/s   2
+
+In the scalar spew entries, the new module API wins when it is passed a
+reference to the scalar buffer. The C<syswrite_file2> entry beats it
+with the shorter file due to its simpler code. The old CPAN module is
+the slowest due to its extra copying of the data and its use of print.
+
+=head3 List Spew of Short File
+
+       cpan_write_file  794/s  29
+       syswrite_file   1000/s  38
+       print_file      1013/s  42
+       new             1399/s  2
+       print_join_file 1557/s  2
+
+=head3         List Spew of Long File
+
+       cpan_write_file 112/s   12
+       print_file      179/s   21
+       syswrite_file   181/s   19
+       print_join_file 205/s   2
+       new             228/s   2
+
+Again, the simple C<print_join_file> entry beats the new module when
+spewing a short list of lines to a file. But is loses to the new module
+when the file size gets longer. The old CPAN module lags behind the
+others since it first makes an extra copy of the lines and then it calls
+C<print> on the output list and that is much slower than passing to
+C<print> a single scalar generated by join. The C<print_file> entry
+shows the advantage of directly printing C<@_> and the
+C<print_join_file> adds the join optimization.
+
+Now about those long wallclock times. If you look carefully at the
+benchmark code of all the spew entries, you will find that some always
+write to new files and some overwrite existing files. When I asked David
+Muir why the old File::Slurp module had an C<overwrite> subroutine, he
+answered that by overwriting a file, you always guarantee something
+readable is in the file. If you create a new file, there is a moment
+when the new file is created but has no data in it. I feel this is not a
+good enough answer. Even when overwriting, you can write a shorter file
+than the existing file and then you have to truncate the file to the new
+size. There is a small race window there where another process can slurp
+in the file with the new data followed by leftover junk from the
+previous version of the file. This reinforces the point that the only
+way to ensure consistant file data is the proper use of file locks.
+
+But what about those long times? Well it is all about the difference
+between creating files and overwriting existing ones. The former have to
+allocate new inodes (or the equivilent on other file systems) and the
+latter can reuse the exising inode. This mean the overwrite will save on
+disk seeks as well as on cpu time. In fact when running this benchmark,
+I could hear my disk going crazy allocating inodes during the spew
+operations. This speedup in both cpu and wallclock is why the new module
+always does overwriting when spewing files. It also does the proper
+truncate (and this is checked in the tests by spewing shorter files
+after longer ones had previously been written). The C<overwrite>
+subroutine is just an typeglob alias to C<write_file> and is there for
+backwards compatibilty with the old File::Slurp module.
+
+=head3 Benchmark Conclusion
+
+Other than a few cases where a simpler entry beat it out, the new
+File::Slurp module is either the speed leader or among the leaders. Its
+special APIs for passing buffers by reference prove to be very useful
+speedups. Also it uses all the other optimizations including using
+C<sysread/syswrite> and joining output lines. I expect many projects
+that extensively use slurping will notice the speed improvements,
+especially if they rewrite their code to take advantage of the new API
+features. Even if they don't touch their code and use the simple API
+they will get a significant speedup.
+
+=head2 Error Handling
+
+Slurp subroutines are subject to conditions such as not being able to
+open the file, or I/O errors. How these errors are handled, and what the
+caller will see, are important aspects of the design of an API. The
+classic error handling for slurping has been to call C<die()> or even
+better, C<croak()>. But sometimes you want the slurp to either
+C<warn()>/C<carp()> or allow your code to handle the error. Sure, this
+can be done by wrapping the slurp in a C<eval> block to catch a fatal
+error, but not everyone wants all that extra code. So I have added
+another option to all the subroutines which selects the error
+handling. If the 'err_mode' option is 'croak' (which is also the
+default), the called subroutine will croak. If set to 'carp' then carp
+will be called. Set to any other string (use 'quiet' when you want to
+be explicit) and no error handler is called. Then the caller can use the
+error status from the call.
+
+C<write_file()> doesn't use the return value for data so it can return a
+false status value in-band to mark an error. C<read_file()> does use its
+return value for data, but we can still make it pass back the error
+status. A successful read in any scalar mode will return either a
+defined data string or a reference to a scalar or array. So a bare
+return would work here. But if you slurp in lines by calling it in a
+list context, a bare C<return> will return an empty list, which is the
+same value it would get from an existing but empty file. So now,
+C<read_file()> will do something I normally strongly advocate against,
+i.e., returning an explicit C<undef> value. In the scalar context this
+still returns a error, and in list context, the returned first value
+will be C<undef>, and that is not legal data for the first element. So
+the list context also gets a error status it can detect:
+
+       my @lines = read_file( $file_name, err_mode => 'quiet' ) ;
+       your_handle_error( "$file_name can't be read\n" ) unless
+                                       @lines && defined $lines[0] ;
+
+
+=head2 File::FastSlurp
+
+       sub read_file {
+
+               my( $file_name, %args ) = @_ ;
+
+               my $buf ;
+               my $buf_ref = $args{'buf_ref'} || \$buf ;
+
+               my $mode = O_RDONLY ;
+               $mode |= O_BINARY if $args{'binmode'} ;
+
+               local( *FH ) ;
+               sysopen( FH, $file_name, $mode ) or
+                                       carp "Can't open $file_name: $!" ;
+
+               my $size_left = -s FH ;
+
+               while( $size_left > 0 ) {
+
+                       my $read_cnt = sysread( FH, ${$buf_ref},
+                                       $size_left, length ${$buf_ref} ) ;
+
+                       unless( $read_cnt ) {
+
+                               carp "read error in file $file_name: $!" ;
+                               last ;
+                       }
+
+                       $size_left -= $read_cnt ;
+               }
+
+       # handle void context (return scalar by buffer reference)
+
+               return unless defined wantarray ;
+
+       # handle list context
+
+               return split m|?<$/|g, ${$buf_ref} if wantarray ;
+
+       # handle scalar context
+
+               return ${$buf_ref} ;
+       }
+
+       sub write_file {
+
+               my $file_name = shift ;
+
+               my $args = ( ref $_[0] eq 'HASH' ) ? shift : {} ;
+               my $buf = join '', @_ ;
+
+
+               my $mode = O_WRONLY ;
+               $mode |= O_BINARY if $args->{'binmode'} ;
+               $mode |= O_APPEND if $args->{'append'} ;
+
+               local( *FH ) ;
+               sysopen( FH, $file_name, $mode ) or
+                                       carp "Can't open $file_name: $!" ;
+
+               my $size_left = length( $buf ) ;
+               my $offset = 0 ;
+
+               while( $size_left > 0 ) {
+
+                       my $write_cnt = syswrite( FH, $buf,
+                                       $size_left, $offset ) ;
+
+                       unless( $write_cnt ) {
+
+                               carp "write error in file $file_name: $!" ;
+                               last ;
+                       }
+
+                       $size_left -= $write_cnt ;
+                       $offset += $write_cnt ;
+               }
+
+               return ;
+       }
+
+=head2 Slurping in Perl 6
+
+As usual with Perl 6, much of the work in this article will be put to
+pasture. Perl 6 will allow you to set a 'slurp' property on file handles
+and when you read from such a handle, the file is slurped. List and
+scalar context will still be supported so you can slurp into lines or a
+<scalar. I would expect that support for slurping in Perl 6 will be
+optimized and bypass the stdio subsystem since it can use the slurp
+property to trigger a call to special code. Otherwise some enterprising
+individual will just create a File::FastSlurp module for Perl 6. The
+code in the Perl 5 module could easily be modified to Perl 6 syntax and
+semantics. Any volunteers?
+
+=head2 In Summary
+
+We have compared classic line by line processing with munging a whole
+file in memory. Slurping files can speed up your programs and simplify
+your code if done properly. You must still be aware to not slurp
+humongous files (logs, DNA sequences, etc.) or STDIN where you don't
+know how much data you will read in. But slurping megabyte sized files
+is not an major issue on today's systems with the typical amount of RAM
+installed. When Perl was first being used in depth (Perl 4), slurping
+was limited by the smaller RAM size of 10 years ago. Now, you should be
+able to slurp almost any reasonably sized file, whether it contains
+configuration, source code, data, etc.
+
+=head2 Acknowledgements
+
+
+
+
+
diff --git a/slurp_talk/slurp_bench.pl b/slurp_talk/slurp_bench.pl

new file mode 100755 (executable)

index 0000000..376a75b
--- /dev/null
+++ b/slurp_talk/slurp_bench.pl
@@ -0,0 +1,351 @@
+#!/usr/local/bin/perl
+
+use strict ;
+
+use Benchmark qw( timethese cmpthese ) ;
+use Carp ;
+use FileHandle ;
+use Fcntl qw( :DEFAULT :seek );
+
+use File::Slurp () ;
+
+my $dur = shift || -2 ;
+
+my $file = 'slurp_data' ;
+
+my @lines = ( 'abc' x 30 . "\n")  x 100 ;
+my $text = join( '', @lines ) ;
+
+bench_list_spew( 'SHORT' ) ;
+bench_scalar_spew( 'SHORT' ) ;
+
+File::Slurp::write_file( $file, $text ) ;
+
+bench_scalar_slurp( 'SHORT' ) ;
+bench_list_slurp( 'SHORT' ) ;
+
+@lines = ( 'abc' x 40 . "\n")  x 1000 ;
+$text = join( '', @lines ) ;
+
+bench_list_spew( 'LONG' ) ;
+bench_scalar_spew( 'LONG' ) ;
+
+File::Slurp::write_file( $file, $text ) ;
+
+bench_scalar_slurp( 'LONG' ) ;
+bench_list_slurp( 'LONG' ) ;
+
+exit ;
+
+sub bench_list_spew {
+
+       my ( $size ) = @_ ;
+
+       print "\n\nList Spew of $size file\n\n" ;
+
+       my $result = timethese( $dur, {
+
+               new =>
+                       sub { File::Slurp::write_file( $file, @lines ) },
+
+               print_file =>
+                       sub { print_file( $file, @lines ) },
+
+               print_join_file =>
+                       sub { print_join_file( $file, @lines ) },
+
+               syswrite_file =>
+                       sub { syswrite_file( $file, @lines ) },
+
+               cpan_write_file =>
+                       sub { cpan_write_file( $file, @lines ) },
+
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+sub bench_scalar_spew {
+
+       my ( $size ) = @_ ;
+
+       print "\n\nScalar Spew of $size file\n\n" ;
+
+       my $result = timethese( $dur, {
+
+               new =>
+                       sub { File::Slurp::write_file( $file, $text ) },
+
+               new_ref =>
+                       sub { File::Slurp::write_file( $file, \$text ) },
+
+               print_file =>
+                       sub { print_file( $file, $text ) },
+
+               print_join_file =>
+                       sub { print_join_file( $file, $text ) },
+
+               syswrite_file =>
+                       sub { syswrite_file( $file, $text ) },
+
+               syswrite_file2 =>
+                       sub { syswrite_file2( $file, $text ) },
+
+               cpan_write_file =>
+                       sub { cpan_write_file( $file, $text ) },
+
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+sub bench_scalar_slurp {
+
+       my ( $size ) = @_ ;
+
+       print "\n\nScalar Slurp of $size file\n\n" ;
+
+       my $buffer ;
+
+       my $result = timethese( $dur, {
+
+               new =>
+                       sub { my $text = File::Slurp::read_file( $file ) },
+
+               new_buf_ref =>
+                       sub { my $text ;
+                          File::Slurp::read_file( $file, buf_ref => \$text ) },
+               new_buf_ref2 =>
+                       sub { 
+                          File::Slurp::read_file( $file, buf_ref => \$buffer ) },
+               new_scalar_ref =>
+                       sub { my $text =
+                           File::Slurp::read_file( $file, scalar_ref => 1 ) },
+
+               read_file =>
+                       sub { my $text = read_file( $file ) },
+
+               sysread_file =>
+                       sub { my $text = sysread_file( $file ) },
+
+               cpan_read_file =>
+                       sub { my $text = cpan_read_file( $file ) },
+
+               cpan_slurp =>
+                       sub { my $text = cpan_slurp_to_scalar( $file ) },
+
+               file_contents =>
+                       sub { my $text = file_contents( $file ) },
+
+               file_contents_no_OO =>
+                       sub { my $text = file_contents_no_OO( $file ) },
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+sub bench_list_slurp {
+
+       my ( $size ) = @_ ;
+
+       print "\n\nList Slurp of $size file\n\n" ;
+
+       my $result = timethese( $dur, {
+
+               new =>
+                       sub { my @lines = File::Slurp::read_file( $file ) },
+
+               new_array_ref =>
+                       sub { my $lines_ref =
+                            File::Slurp::read_file( $file, array_ref => 1 ) },
+
+               new_in_anon_array =>
+                       sub { my $lines_ref =
+                            [ File::Slurp::read_file( $file ) ] },
+
+               read_file =>
+                       sub { my @lines = read_file( $file ) },
+
+               sysread_file =>
+                       sub { my @lines = sysread_file( $file ) },
+
+               cpan_read_file =>
+                       sub { my @lines = cpan_read_file( $file ) },
+
+               cpan_slurp_to_array =>
+                       sub { my @lines = cpan_slurp_to_array( $file ) },
+
+               cpan_slurp_to_array_ref =>
+                       sub { my $lines_ref = cpan_slurp_to_array( $file ) },
+       } ) ;
+
+       cmpthese( $result ) ;
+}
+
+######################################
+# uri's old fast slurp
+
+sub read_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+       open( FH, $file_name ) || carp "can't open $file_name $!" ;
+
+       return <FH> if wantarray ;
+
+       my $buf ;
+
+       read( FH, $buf, -s FH ) ;
+       return $buf ;
+}
+
+sub sysread_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+       open( FH, $file_name ) || carp "can't open $file_name $!" ;
+
+       return <FH> if wantarray ;
+
+       my $buf ;
+
+       sysread( FH, $buf, -s FH ) ;
+       return $buf ;
+}
+
+######################################
+# from File::Slurp.pm on cpan
+
+sub cpan_read_file
+{
+       my ($file) = @_;
+
+       local($/) = wantarray ? $/ : undef;
+       local(*F);
+       my $r;
+       my (@r);
+
+       open(F, "<$file") || croak "open $file: $!";
+       @r = <F>;
+       close(F) || croak "close $file: $!";
+
+       return $r[0] unless wantarray;
+       return @r;
+}
+
+sub cpan_write_file
+{
+       my ($f, @data) = @_;
+
+       local(*F);
+
+       open(F, ">$f") || croak "open >$f: $!";
+       (print F @data) || croak "write $f: $!";
+       close(F) || croak "close $f: $!";
+       return 1;
+}
+
+
+######################################
+# from Slurp.pm on cpan
+
+sub slurp { 
+    local( $/, @ARGV ) = ( wantarray ? $/ : undef, @_ ); 
+    return <ARGV>;
+}
+
+sub cpan_slurp_to_array {
+    my @array = slurp( @_ );
+    return wantarray ? @array : \@array;
+}
+
+sub cpan_slurp_to_scalar {
+    my $scalar = slurp( @_ );
+    return $scalar;
+}
+
+######################################
+# very slow slurp code used by a client
+
+sub file_contents {
+    my $file = shift;
+    my $fh = new FileHandle $file or
+        warn("Util::file_contents:Can't open file $file"), return '';
+    return join '', <$fh>;
+}
+
+# same code but doesn't use FileHandle.pm
+
+sub file_contents_no_OO {
+    my $file = shift;
+
+       local( *FH ) ;
+       open( FH, $file ) || carp "can't open $file $!" ;
+
+    return join '', <FH>;
+}
+
+##########################
+
+sub print_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       open( FH, ">$file_name" ) || carp "can't create $file_name $!" ;
+
+       print FH @_ ;
+}
+
+sub print_file2 {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       my $mode = ( -e $file_name ) ? '<' : '>' ;
+
+       open( FH, "+$mode$file_name" ) || carp "can't create $file_name $!" ;
+
+       print FH @_ ;
+}
+
+sub print_join_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       my $mode = ( -e $file_name ) ? '<' : '>' ;
+
+       open( FH, "+$mode$file_name" ) || carp "can't create $file_name $!" ;
+
+       print FH join( '', @_ ) ;
+}
+
+
+sub syswrite_file {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       open( FH, ">$file_name" ) || carp "can't create $file_name $!" ;
+
+       syswrite( FH, join( '', @_ ) ) ;
+}
+
+sub syswrite_file2 {
+
+       my( $file_name ) = shift ;
+
+       local( *FH ) ;
+
+       sysopen( FH, $file_name, O_WRONLY | O_CREAT ) ||
+                               carp "can't create $file_name $!" ;
+
+       syswrite( FH, join( '', @_ ) ) ;
+}
diff --git a/t/append_null.t b/t/append_null.t

new file mode 100644 (file)

index 0000000..1a3a5e4
--- /dev/null
+++ b/t/append_null.t
@@ -0,0 +1,28 @@
+#!/usr/local/bin/perl -w
+
+use strict ;
+
+use Test::More tests => 2 ;
+
+BEGIN{ 
+       use_ok( 'File::Slurp', qw( read_file write_file append_file ) ) ;
+}
+
+my $data = <<TEXT ;
+line 1
+more text
+TEXT
+
+my $file = 'xxx' ;
+
+unlink $file ;
+
+my $err = write_file( $file, $data ) ;
+append_file( $file, '' ) ;
+
+my $read_data = read_file( $file ) ;
+
+is( $data, $read_data ) ;
+
+
+unlink $file ;
diff --git a/t/data_list.t b/t/data_list.t

new file mode 100644 (file)

index 0000000..c793a5d
--- /dev/null
+++ b/t/data_list.t
@@ -0,0 +1,65 @@
+#!/usr/local/bin/perl -w
+
+use strict ;
+
+use Carp ;
+use POSIX qw( :fcntl_h ) ;
+use Test::More tests => 2 ;
+
+# in case SEEK_SET isn't defined in older perls. it seems to always be 0
+
+BEGIN {
+
+       *SEEK_SET = sub { 0 } unless eval { SEEK_SET() } ;
+}
+
+BEGIN{ 
+       use_ok( 'File::Slurp', ) ;
+}
+
+SKIP: {
+
+       eval { require B } ;
+
+       skip <<TEXT, 1 if $@ ;
+B.pm not found in this Perl. This will cause slurping of
+the DATA handle to fail.
+TEXT
+
+       test_data_list_slurp() ;
+}
+
+exit ;
+
+
+sub test_data_list_slurp {
+
+       my $data_seek = tell( \*DATA );
+
+# first slurp in the lines
+ 
+       my @slurp_lines = read_file( \*DATA ) ;
+
+# now seek back and read all the lines with the <> op and we make
+# golden data sets
+
+       seek( \*DATA, $data_seek, SEEK_SET ) || die "seek $!" ;
+       my @data_lines = <DATA> ;
+
+# test the array slurp
+
+       ok( eq_array( \@data_lines, \@slurp_lines ), 'list slurp of DATA' ) ;
+}
+
+__DATA__
+line one
+second line
+more lines
+still more
+
+enough lines
+
+we can't test long handle slurps from DATA since i would have to type
+too much stuff
+
+so we will stop here
diff --git a/t/data_scalar.t b/t/data_scalar.t

new file mode 100644 (file)

index 0000000..0a841ed
--- /dev/null
+++ b/t/data_scalar.t
@@ -0,0 +1,65 @@
+#!/usr/local/bin/perl -w
+
+use strict ;
+
+use Carp ;
+use POSIX qw( :fcntl_h ) ;
+use Test::More tests => 2 ;
+
+# in case SEEK_SET isn't defined in older perls. it seems to always be 0
+
+BEGIN {
+
+       *SEEK_SET = sub { 0 } unless eval { SEEK_SET() } ;
+}
+
+BEGIN{ 
+       use_ok( 'File::Slurp', ) ;
+}
+
+eval { require B } ;
+
+SKIP: {
+
+       skip <<TEXT, 1 if $@ ;
+B.pm not found in this Perl. Note this will cause slurping of
+the DATA handle to fail.
+TEXT
+
+       test_data_scalar_slurp() ;
+}
+
+exit ;
+
+
+
+exit ;
+
+sub test_data_scalar_slurp {
+
+       my $data_seek = tell( \*DATA );
+
+# first slurp in the text
+ 
+       my $slurp_text = read_file( \*DATA ) ;
+
+# now we need to get the golden data
+
+       seek( \*DATA, $data_seek, SEEK_SET ) || die "seek $!" ;
+       my $data_text = join( '', <DATA> ) ;
+
+       is( $slurp_text, $data_text, 'scalar slurp of DATA' ) ;
+}
+
+__DATA__
+line one
+second line
+more lines
+still more
+
+enough lines
+
+we can't test long handle slurps from DATA since i would have to type
+too much stuff
+
+so we will stop here
diff --git a/t/error.t b/t/error.t

new file mode 100644 (file)

index 0000000..7333d09
--- /dev/null
+++ b/t/error.t
@@ -0,0 +1,65 @@
+##!/usr/local/bin/perl -w
+
+use strict ;
+
+use Test::More ;
+use Carp ;
+
+BEGIN{ 
+       use_ok( 'File::Slurp', ) ;
+}
+use File::Slurp ;
+
+
+my $file = 'missing/file' ;
+unlink $file ;
+
+plan tests => 9 ;
+
+my %modes = (
+       'croak' => \&test_croak,
+       'carp' => \&test_carp,
+       'quiet' => \&test_quiet,
+) ;
+
+while( my( $mode, $sub ) = each %modes ) {
+
+       $sub->( 'read_file', \&read_file, $file, err_mode => $mode ) ;
+       $sub->( 'write_file', \&write_file, $file,
+                                       { err_mode => $mode }, 'junk' ) ;
+       $sub->( 'read_dir', \&read_dir, $file, err_mode => $mode ) ;
+}
+
+
+sub test_croak {
+
+       my ( $name, $sub, @args ) = @_ ;
+
+       eval {
+               $sub->( @args ) ;
+       } ;
+
+       ok( $@, "$name can croak" ) ;
+}
+
+sub test_carp {
+
+       my ( $name, $sub, @args ) = @_ ;
+
+       local $SIG{__WARN__} = sub { ok( 1, "$name can carp" ) } ;
+
+       $sub->( @args ) ;
+}
+
+sub test_quiet {
+
+       my ( $name, $sub, @args ) = @_ ;
+
+       local $SIG{__WARN__} = sub { ok( 0, "$name can be quiet" ) } ;
+
+       eval {
+               $sub->( @args ) ;
+       } ;
+
+       ok( !$@, "$name can be quiet" ) ;
+}
diff --git a/t/foo.pl b/t/foo.pl

new file mode 100644 (file)

index 0000000..bdf59a8
--- /dev/null
+++ b/t/foo.pl
@@ -0,0 +1,30 @@
+#!/usr/local/bin/perl
+
+
+                sub pipe_from_fork ($) {
+                     my $parent = shift;
+                     pipe $parent, my $child or die;
+                     my $pid = fork();
+                     die "fork() failed: $!" unless defined $pid;
+                     if ($pid) {
+                         close $child;
+                     }
+                     else {
+                         close $parent;
+                         open(STDOUT, ">&=" . fileno($child)) or die;
+                     }
+                     $pid;
+                 }
+
+                 if (pipe_from_fork('BAR')) {
+                     # parent
+                     while (<BAR>) { print; }
+                     close BAR;
+                 }
+                 else {
+                     # child
+                     print "pipe_from_fork\n";
+                     close STDOUT;
+                     exit(0);
+                 }
+
diff --git a/t/foo2.pl b/t/foo2.pl

new file mode 100644 (file)

index 0000000..00b1740
--- /dev/null
+++ b/t/foo2.pl
@@ -0,0 +1,27 @@
+#!/usr/local/bin/perl
+ sub pipe_to_fork ($) {
+                     my $parent = shift;
+                     pipe my $child, $parent or die;
+                     my $pid = fork();
+                     die "fork() failed: $!" unless defined $pid;
+                     if ($pid) {
+                         close $child;
+                     }
+                     else {
+                         close $parent;
+                         open(STDIN, "<&=" . fileno($child)) or die;
+                     }
+                     $pid;
+                 }
+
+                 if (pipe_to_fork('FOO')) {
+                     # parent
+                     print FOO "pipe_to_fork\n";
+                     close FOO;
+                 }
+                 else {
+                     # child
+                     while (<STDIN>) { print; }
+                     close STDIN;
+                     exit(0);
+                 }
diff --git a/t/handle.t b/t/handle.t

new file mode 100644 (file)

index 0000000..8987375
--- /dev/null
+++ b/t/handle.t
@@ -0,0 +1,227 @@
+#!/usr/local/bin/perl -w
+
+use strict ;
+
+use Carp ;
+use POSIX qw( :fcntl_h ) ;
+use Socket ;
+use Symbol ;
+use Test::More ;
+
+# in case SEEK_SET isn't defined in older perls. it seems to always be 0
+
+BEGIN {
+       *SEEK_SET = sub { 0 } unless eval { SEEK_SET() } ;
+}
+
+my @pipe_data = (
+       '',
+       'abc',
+       'abc' x 100_000,
+       'abc' x 1_000_000,
+) ;
+
+#plan( tests => 2 + @pipe_data ) ;
+plan( tests => scalar @pipe_data ) ;
+
+
+BEGIN{ 
+       use_ok( 'File::Slurp', )  ;
+}
+
+#test_data_slurp() ;
+
+#test_fork_pipe_slurp() ;
+
+SKIP: {
+
+       eval { test_socketpair_slurp() } ;
+
+       skip "socketpair not found in this Perl", scalar( @pipe_data ) if $@ ;
+}
+
+sub test_socketpair_slurp {
+
+       foreach my $data ( @pipe_data ) {
+
+               my $size = length( $data ) ;
+
+               my $read_fh = gensym ;
+               my $write_fh = gensym ;
+
+               socketpair( $read_fh, $write_fh,
+                               AF_UNIX, SOCK_STREAM, PF_UNSPEC);
+                
+               if ( fork() ) {
+
+#warn "PARENT SOCKET\n" ;
+                       close( $write_fh ) ;
+                       my $read_buf = read_file( $read_fh ) ;
+
+                       is( $read_buf, $data,
+                               "socket slurp/spew of $size bytes" ) ;
+
+               }
+               else {
+
+#child
+#warn "CHILD SOCKET\n" ;
+                       close( $read_fh ) ;
+                       write_file( $write_fh, $data ) ;
+                       exit() ;
+               }
+       }
+}
+
+sub test_data_slurp {
+
+       my $data_seek = tell( \*DATA );
+
+# first slurp in the lines 
+       my @slurp_lines = read_file( \*DATA ) ;
+
+# now seek back and read all the lines with the <> op and we make
+# golden data sets
+
+       seek( \*DATA, $data_seek, SEEK_SET ) || die "seek $!" ;
+       my @data_lines = <DATA> ;
+       my $data_text = join( '', @data_lines ) ;
+
+# now slurp in as one string and test
+
+       sysseek( \*DATA, $data_seek, SEEK_SET ) || die "seek $!" ;
+       my $slurp_text = read_file( \*DATA ) ;
+       is( $slurp_text, $data_text, 'scalar slurp DATA' ) ;
+
+# test the array slurp
+
+       ok( eq_array( \@data_lines, \@slurp_lines ), 'list slurp of DATA' ) ;
+}
+
+sub test_fork_pipe_slurp {
+
+       foreach my $data ( @pipe_data ) {
+
+               test_to_pipe( $data ) ;
+               test_from_pipe( $data ) ;
+       }
+}
+
+
+sub test_from_pipe {
+
+       my( $data ) = @_ ;
+
+       my $size = length( $data ) ;
+
+       if ( pipe_from_fork( \*READ_FH ) ) {
+
+# parent
+               my $read_buf = read_file( \*READ_FH ) ;
+warn "PARENT read\n" ;
+
+               is( $read_buf, $data, "pipe slurp/spew of $size bytes" ) ;
+
+               close \*READ_FH ;
+#              return ;
+       }
+       else {
+# child
+warn "CHILD write\n" ;
+       #       write_file( \*STDOUT, $data ) ;
+               syswrite( \*STDOUT, $data, length( $data ) ) ;
+
+               close \*STDOUT;
+               exit(0);
+       }
+}
+
+
+sub pipe_from_fork {
+
+       my ( $parent_fh ) = @_ ;
+
+       my $child = gensym ;
+
+       pipe( $parent_fh, $child ) or die;
+
+       my $pid = fork();
+       die "fork() failed: $!" unless defined $pid;
+
+       if ($pid) {
+
+warn "PARENT\n" ;
+               close $child;
+               return $pid ;
+       }
+
+warn "CHILD FILENO ", fileno($child), "\n" ;
+       close $parent_fh ;
+       open(STDOUT, ">&=" . fileno($child)) or die "no fileno" ;
+
+       return ;
+}
+
+
+sub test_to_pipe {
+
+       my( $data ) = @_ ;
+
+       my $size = length( $data ) ;
+
+       if ( pipe_to_fork( \*WRITE_FH ) ) {
+
+# parent
+               syswrite( \*WRITE_FH, $data, length( $data ) ) ;
+#              write_file( \*WRITE_FH, $data ) ;
+warn "PARENT write\n" ;
+
+#              is( $read_buf, $data, "pipe slurp/spew of $size bytes" ) ;
+
+               close \*WRITE_FH ;
+#              return ;
+       }
+       else {
+# child
+warn "CHILD read FILENO ", fileno(\*STDIN), "\n" ;
+
+               my $read_buf = read_file( \*STDIN ) ;
+               is( $read_buf, $data, "pipe slurp/spew of $size bytes" ) ;
+               close \*STDIN;
+               exit(0);
+       }
+}
+
+sub pipe_to_fork {
+       my ( $parent_fh ) = @_ ;
+
+       my $child = gensym ;
+
+       pipe( $child, $parent_fh ) or die ;
+
+       my $pid = fork();
+       die "fork() failed: $!" unless defined $pid;
+
+       if ( $pid ) {
+               close $child;
+               return $pid ;
+       }
+
+       close $parent_fh ;
+       open(STDIN, "<&=" . fileno($child)) or die;
+
+       return ;
+}
+
+__DATA__
+line one
+second line
+more lines
+still more
+
+enough lines
+
+we don't test long handle slurps from DATA since i would have to type
+too much stuff :-)
+
+so we will stop here
diff --git a/t/inode.t b/t/inode.t

new file mode 100644 (file)

index 0000000..878e841
--- /dev/null
+++ b/t/inode.t
@@ -0,0 +1,46 @@
+#!/usr/local/bin/perl -w
+
+use strict ;
+
+use Carp ;
+use Socket ;
+use Symbol ;
+use Test::More ;
+
+BEGIN{ 
+
+       if( $^O =~ '32' ) {
+               plan skip_all => 'skip inode test on windows';
+               exit ;
+       }
+       else {
+               plan tests => 3 ;
+       }
+
+       use_ok( 'File::Slurp', ) ;
+}
+
+my $data = <<TEXT ;
+line 1
+more text
+TEXT
+
+my $file = 'inode' ;
+
+write_file( $file, $data ) ;
+my $inode_num = (stat $file)[1] ;
+write_file( $file, $data ) ;
+my $inode_num2 = (stat $file)[1] ;
+
+#print "I1 $inode_num I2 $inode_num2\n" ;
+
+ok( $inode_num == $inode_num2, 'same inode' ) ;
+
+write_file( $file, {atomic => 1}, $data ) ;
+$inode_num2 = (stat $file)[1] ;
+
+#print "I1 $inode_num I2 $inode_num2\n" ;
+
+ok( $inode_num != $inode_num2, 'different inode' ) ;
+
+unlink $file ;
diff --git a/t/large.t b/t/large.t

new file mode 100644 (file)

index 0000000..aaa8c27
--- /dev/null
+++ b/t/large.t
@@ -0,0 +1,169 @@
+#!/usr/local/bin/perl -w
+
+use strict ;
+
+use Test::More ;
+use Carp ;
+
+BEGIN{ 
+       use_ok( 'File::Slurp', ) ;
+}
+
+my $file = 'slurp.data' ;
+unlink $file ;
+
+my @text_data = (
+       [],
+       [ 'a' x 8 ],
+       [ ("\n") x 5 ],
+       [ map( "aaaaaaaa\n", 1 .. 3 ) ],
+       [ map( "aaaaaaaa\n", 1 .. 3 ), 'aaaaaaaa' ],
+       [ map ( 'a' x 100 . "\n", 1 .. 1024 ) ],
+       [ map ( 'a' x 100 . "\n", 1 .. 1024 ), 'a' x 100 ],
+       [ map ( 'a' x 1024 . "\n", 1 .. 1024 ) ],
+       [ map ( 'a' x 1024 . "\n", 1 .. 1024 ), 'a' x 10240 ],
+       [],
+) ;
+
+my @bin_sizes = ( 1000, 1024 * 1024 ) ;
+
+my @bin_stuff = ( "\012", "\015", "\012\015", "\015\012",
+               map chr, 0 .. 32 ) ;
+
+my @bin_data ;
+
+foreach my $size ( @bin_sizes ) {
+
+       my $data = '' ;
+
+       while ( length( $data ) < $size ) {
+
+               $data .= $bin_stuff[ rand @bin_stuff ] ;
+       }
+
+       push @bin_data, $data ;
+}
+
+plan( tests => 16 * @text_data + 8 * @bin_data ) ;
+
+#print "# text slurp\n" ;
+
+foreach my $data ( @text_data ) {
+
+       test_text_slurp( $data ) ;
+}
+
+#print "# BIN slurp\n" ;
+
+foreach my $data ( @bin_data ) {
+
+       test_bin_slurp( $data ) ;
+}
+
+unlink $file ;
+
+exit ;
+
+sub test_text_slurp {
+
+       my( $data_ref ) = @_ ;
+
+       my @data_lines = @{$data_ref} ;
+       my $data_text = join( '', @data_lines ) ;
+
+
+       my $err = write_file( $file, $data_text ) ;
+       ok( $err, 'write_file - ' . length $data_text ) ;
+       my $text = read_file( $file ) ;
+       ok( $text eq $data_text, 'scalar read_file - ' . length $data_text ) ;
+
+       $err = write_file( $file, \$data_text ) ;
+       ok( $err, 'write_file ref arg - ' . length $data_text ) ;
+       $text = read_file( $file ) ;
+       ok( $text eq $data_text, 'scalar read_file - ' . length $data_text ) ;
+
+       $err = write_file( $file, { buf_ref => \$data_text } ) ;
+       ok( $err, 'write_file buf ref opt - ' . length $data_text ) ;
+       $text = read_file( $file ) ;
+       ok( $text eq $data_text, 'scalar read_file - ' . length $data_text ) ;
+
+       my $text_ref = read_file( $file, scalar_ref => 1 ) ;
+       ok( ${$text_ref} eq $data_text,
+               'scalar ref read_file - ' . length $data_text ) ;
+
+       read_file( $file, buf_ref => \my $buffer ) ;
+       ok( $buffer eq $data_text,
+               'buf_ref read_file - ' . length $data_text ) ;
+
+#      my @data_lines = split( m|(?<=$/)|, $data_text ) ;
+
+       $err = write_file( $file, \@data_lines ) ;
+       ok( $err, 'write_file list ref arg - ' . length $data_text ) ;
+       $text = read_file( $file ) ;
+       ok( $text eq $data_text, 'scalar read_file - ' . length $data_text ) ;
+
+#print map "[$_]\n", @data_lines ;
+#print "DATA <@data_lines>\n" ;
+
+       my @array = read_file( $file ) ;
+
+#print map "{$_}\n", @array ;
+#print "ARRAY <@array>\n" ;
+
+       ok( eq_array( \@array, \@data_lines ),
+                       'array read_file - ' . length $data_text ) ;
+
+       print "READ:\n", map( "[$_]\n", @array ),
+                "EXP:\n", map( "[$_]\n", @data_lines )
+                       unless eq_array( \@array, \@data_lines ) ;
+
+       my $array_ref = read_file( $file, array_ref => 1 ) ;
+       ok( eq_array( $array_ref, \@data_lines ),
+                       'array ref read_file - ' . length $data_text ) ;
+
+       $err = write_file( $file, { append => 1 }, $data_text ) ;
+       ok( $err, 'write_file append - ' . length $data_text ) ;
+
+       my $text2 = read_file( $file ) ;
+       ok( $text2 eq $data_text x 2, 'read_file append - ' . length $data_text ) ;
+
+       $err = append_file( $file, $data_text ) ;
+       ok( $err, 'append_file - ' . length $data_text ) ;
+
+       my $bin3 = read_file( $file ) ;
+       ok( $bin3 eq $data_text x 3, 'read_file append_file - ' . length $data_text ) ;
+
+       return ;
+}
+
+sub test_bin_slurp {
+
+       my( $data ) = @_ ;
+
+       my $err = write_file( $file, {'binmode' => ':raw'}, $data ) ;
+       ok( $err, 'write_file bin - ' . length $data ) ;
+
+       my $bin = read_file( $file, 'binmode' => ':raw' ) ;
+       ok( $bin eq $data, 'scalar read_file bin - ' . length $data ) ;
+
+       my $bin_ref = read_file( $file, scalar_ref => 1, 'binmode' => ':raw' ) ;
+       ok( ${$bin_ref} eq $data,
+                       'scalar ref read_file bin - ' . length $data ) ;
+
+       read_file( $file, buf_ref => \(my $buffer), 'binmode' => ':raw'  ) ;
+       ok( $buffer eq $data, 'buf_ref read_file bin - ' . length $data ) ;
+
+       $err = write_file( $file, { append => 1, 'binmode' => ':raw' }, $data ) ;
+       ok( $err, 'write_file append bin - ' . length $data ) ;
+
+       my $bin2 = read_file( $file, 'binmode' => ':raw' ) ;
+       ok( $bin2 eq $data x 2, 'read_file append bin - ' . length $data ) ;
+
+       $err = append_file( $file, { 'binmode' => ':raw' }, $data ) ;
+       ok( $err, 'append_file bin - ' . length $data ) ;
+
+       my $bin3 = read_file( $file, 'binmode' => ':raw' ) ;
+       ok( $bin3 eq $data x 3, 'read_file bin - ' . length $data ) ;
+
+       return ;
+}
diff --git a/t/newline.t b/t/newline.t

new file mode 100755 (executable)

index 0000000..eb6dc4c
--- /dev/null
+++ b/t/newline.t
@@ -0,0 +1,57 @@
+use Test::More tests => 2 + 2 ;
+
+use strict;
+
+BEGIN {
+    require_ok( 'File::Slurp' ) ;
+    use_ok('File::Slurp', qw(write_file read_file) ) ;
+}
+
+
+my $data = "\r\n\r\n\r\n" ;
+my $file_name = 'newline.txt' ;
+
+stdio_write_file( $file_name, $data ) ;
+my $slurped_data = read_file( $file_name ) ; 
+
+my $stdio_slurped_data = stdio_read_file( $file_name ) ; 
+
+
+print 'data ', unpack( 'H*', $data), "\n",
+'slurp ', unpack('H*',  $slurped_data), "\n",
+'stdio slurp ', unpack('H*',  $stdio_slurped_data), "\n";
+
+is( $data, $slurped_data, 'slurp' ) ;
+
+write_file( $file_name, $data ) ;
+$slurped_data = stdio_read_file( $file_name ) ; 
+
+is( $data, $slurped_data, 'spew' ) ;
+
+unlink $file_name ;
+
+sub stdio_write_file {
+
+       my( $file_name, $data ) = @_ ;
+
+       local( *FH ) ;
+
+       open( FH, ">$file_name" ) || die "Couldn't create $file_name: $!";
+
+       print FH $data ;
+}
+
+sub stdio_read_file {
+
+       my( $file_name ) = @_ ;
+
+       open( FH, $file_name ) || die "Couldn't open $file_name: $!";
+
+       local( $/ ) ;
+
+       my $data = <FH> ;
+
+       return $data ;
+}
+
+
diff --git a/t/no_clobber.t b/t/no_clobber.t

new file mode 100644 (file)

index 0000000..4941fea
--- /dev/null
+++ b/t/no_clobber.t
@@ -0,0 +1,28 @@
+#!/usr/local/bin/perl -w
+
+use strict ;
+
+use Test::More tests => 3 ;
+
+BEGIN{ 
+       use_ok( 'File::Slurp', qw( write_file ) ) ;
+}
+
+my $data = <<TEXT ;
+line 1
+more text
+TEXT
+
+my $file = 'xxx' ;
+
+unlink $file ;
+
+
+my $err = write_file( $file, { no_clobber => 1 }, $data ) ;
+ok( $err, 'new write_file' ) ;
+
+$err = write_file( $file, { no_clobber => 1, err_mode => 'quiet' }, $data ) ;
+
+ok( !$err, 'no_clobber write_file' ) ;
+
+unlink $file ;
diff --git a/t/original.t b/t/original.t

new file mode 100644 (file)

index 0000000..aa2a98f
--- /dev/null
+++ b/t/original.t
@@ -0,0 +1,55 @@
+#!/usr/bin/perl -I.
+
+# try to honor possible tempdirs
+$tmp = "file_$$";
+
+$short = <<END;
+small
+file
+END
+
+$long = <<END;
+This is a much longer bit of contents
+to store in a file.
+END
+
+print "1..7\n";
+
+use File::Slurp;
+
+&write_file($tmp, $long);
+if (&read_file($tmp) eq $long) {print "ok 1\n";} else {print "not ok 1\n";}
+
+@x = &read_file($tmp);
+@y = grep( $_ ne '', split(/(.*?\n)/, $long));
+while (@x && @y) {
+       last unless $x[0] eq $y[0];
+       shift @x;
+       shift @y;
+}
+if (@x == @y && (@x ? $x[0] eq $y[0] : 1)) { print "ok 2\n";} else {print "not ok 2\n"}
+
+&append_file($tmp, $short);
+if (&read_file($tmp) eq "$long$short") {print "ok 3\n";} else {print "not ok 3\n";}
+
+$iold = (stat($tmp))[1];
+&overwrite_file($tmp, $short);
+$inew = (stat($tmp))[1];
+
+if (&read_file($tmp) eq $short) {print "ok 4\n";} else {print "not ok 4\n";}
+
+if ($inew == $iold) {print "ok 5\n";} else {print "not ok 5\n";}
+
+unlink($tmp);
+
+&overwrite_file($tmp, $long);
+if (&read_file($tmp) eq $long) {print "ok 6\n";} else {print "not ok 6\n";}
+
+unlink($tmp);
+
+&append_file($tmp, $short);
+if (&read_file($tmp) eq $short) {print "ok 7\n";} else {print "not ok 7\n";}
+
+unlink($tmp);
+
+
diff --git a/t/paragraph.t b/t/paragraph.t

new file mode 100644 (file)

index 0000000..ddaa2b5
--- /dev/null
+++ b/t/paragraph.t
@@ -0,0 +1,66 @@
+#!/usr/local/bin/perl -w
+
+use strict ;
+
+use Test::More ;
+use Carp ;
+
+BEGIN{ 
+       use_ok( 'File::Slurp', ) ;
+}
+
+my $file = 'slurp.data' ;
+unlink $file ;
+
+my @text_data = (
+       [],
+       [ 'a' x 8 ],
+       [ "\n" x 5 ],
+       [ map( "aaaaaaaa\n\n", 1 .. 3 ) ],
+       [ map( "aaaaaaaa\n\n", 1 .. 3 ), 'aaaaaaaa' ],
+       [ map( "aaaaaaaa" . ( "\n"  x (2 + rand 3) ), 1 .. 100 ) ],
+       [ map( "aaaaaaaa" . ( "\n"  x (2 + rand 3) ), 1 .. 100 ), 'aaaaaaaa' ],
+       [],
+) ;
+
+plan( tests => 3 * @text_data ) ;
+
+#print "# text slurp\n" ;
+
+foreach my $data ( @text_data ) {
+
+       test_text_slurp( $data ) ;
+}
+
+
+unlink $file ;
+
+exit ;
+
+sub test_text_slurp {
+
+       my( $data_ref ) = @_ ;
+
+       my @data_lines = @{$data_ref} ;
+       my $data_text = join( '', @data_lines ) ;
+
+       local( $/ ) = '' ;
+
+       my $err = write_file( $file, $data_text ) ;
+       ok( $err, 'write_file - ' . length $data_text ) ;
+
+
+       my @array = read_file( $file ) ;
+       ok( eq_array( \@array, \@data_lines ),
+                       'array read_file - ' . length $data_text ) ;
+
+       print "READ:\n", map( "[$_]\n", @array ),
+                "EXP:\n", map( "[$_]\n", @data_lines )
+                       unless eq_array( \@array, \@data_lines ) ;
+
+       my $array_ref = read_file( $file, array_ref => 1 ) ;
+       ok( eq_array( $array_ref, \@data_lines ),
+                       'array ref read_file - ' . length $data_text ) ;
+
+       return ;
+}
diff --git a/t/pod.t b/t/pod.t

new file mode 100644 (file)

index 0000000..01bf016
--- /dev/null
+++ b/t/pod.t
@@ -0,0 +1,13 @@
+#!/usr/local/bin/perl
+
+use Test::More;
+
+eval 'use Test::Pod 1.14' ;
+plan skip_all =>
+       'Test::Pod 1.14 required for testing PODe' if $@ ;
+
+all_pod_files_ok(
+#      {
+#              trustme =>      [ qr/slurp/ ]
+#      }
+) ;
diff --git a/t/pod_coverage.t b/t/pod_coverage.t

new file mode 100644 (file)

index 0000000..0026d96
--- /dev/null
+++ b/t/pod_coverage.t
@@ -0,0 +1,24 @@
+#!/usr/local/bin/perl
+
+use Test::More;
+
+eval 'use Test::Pod::Coverage 1.04' ;
+plan skip_all =>
+       'Test::Pod::Coverage 1.04 required for testing POD coverage' if $@ ;
+
+all_pod_coverage_ok(
+       {
+               trustme =>      [
+                       'slurp',
+                       'O_APPEND',
+                       'O_BINARY',
+                       'O_CREAT',
+                       'O_EXCL',
+                       'O_RDONLY',
+                       'O_WRONLY',
+                       'SEEK_CUR',
+                       'SEEK_END',
+                       'SEEK_SET',
+               ],
+       }
+) ;
diff --git a/t/pseudo.t b/t/pseudo.t

new file mode 100644 (file)

index 0000000..9653277
--- /dev/null
+++ b/t/pseudo.t
@@ -0,0 +1,33 @@
+#!/usr/local/bin/perl -w
+
+use strict ;
+
+use Carp ;
+use Test::More ;
+
+plan( tests => 1 ) ; 
+
+my $proc_file = "/proc/$$/auxv" ;
+
+BEGIN{ 
+       use_ok( 'File::Slurp' ) ;
+}
+
+SKIP: {
+
+       unless ( -r $proc_file ) {
+
+               skip "can't find pseudo file $proc_file", 1 ;
+       }
+
+       test_pseudo_file() ;
+}
+
+sub test_pseudo_file {
+
+       my $data_do = do{ local( @ARGV, $/ ) = $proc_file; <> } ;
+
+       my $data_slurp = read_file( $proc_file ) ;
+
+       is( $data_do, $data_slurp, 'pseudo' ) ;
+}
diff --git a/t/read_dir.t b/t/read_dir.t

new file mode 100644 (file)

index 0000000..d0b372c
--- /dev/null
+++ b/t/read_dir.t
@@ -0,0 +1,57 @@
+#!/usr/bin/perl -w -I.
+
+use strict ;
+use Test::More tests => 7 ;
+
+use File::Slurp ;
+
+# try to honor possible tempdirs
+
+my $test_dir = "read_dir_$$" ;
+
+mkdir( $test_dir, 0700) || die "mkdir $test_dir: $!" ;
+
+my @dir_entries = read_dir( $test_dir );
+
+ok( @dir_entries == 0, 'empty dir' ) ;
+
+@dir_entries = read_dir( $test_dir, keep_dot_dot => 1 ) ;
+
+ok( @dir_entries == 2, 'empty dir with . ..' ) ;
+
+write_file( "$test_dir/x", "foo\n" ) ;
+
+@dir_entries = read_dir( $test_dir ) ;
+
+ok( @dir_entries == 1, 'dir with 1 file' ) ;
+
+ok( $dir_entries[0] eq 'x', 'dir with file x' ) ;
+
+my $file_cnt = 23 ;
+
+my @expected_entries = sort( 'x', 1 .. $file_cnt ) ;
+
+for ( 1 .. $file_cnt ) {
+
+       write_file( "$test_dir/$_", "foo\n" ) ;
+}
+
+@dir_entries = read_dir( $test_dir ) ;
+@dir_entries = sort @dir_entries ;
+
+ok( eq_array( \@dir_entries, \@expected_entries ),
+       "dir with $file_cnt files" ) ;
+
+my $dir_entries_ref = read_dir( $test_dir ) ;
+@{$dir_entries_ref} = sort @{$dir_entries_ref} ;
+
+ok( eq_array( $dir_entries_ref, \@expected_entries ),
+       "dir in array ref" ) ;
+
+# clean up
+
+unlink map "$test_dir/$_", @dir_entries ;
+rmdir( $test_dir ) || die "rmdir $test_dir: $!";
+ok( 1, 'cleanup' ) ;
+
+__END__
diff --git a/t/slurp.t b/t/slurp.t

new file mode 100644 (file)

index 0000000..2d4f741
--- /dev/null
+++ b/t/slurp.t
@@ -0,0 +1,22 @@
+#!/usr/local/bin/perl -w
+
+use strict ;
+
+use Test::More tests => 2 ;
+
+BEGIN{ 
+       use_ok( 'File::Slurp', qw( write_file slurp ) ) ;
+}
+
+my $data = <<TEXT ;
+line 1
+more text
+TEXT
+
+my $file = 'xxx' ;
+
+write_file( $file, $data ) ;
+my $read_buf = slurp( $file ) ;
+is( $read_buf, $data, 'slurp alias' ) ;
+
+unlink $file ;
diff --git a/t/stdin.t b/t/stdin.t

new file mode 100644 (file)

index 0000000..9786118
--- /dev/null
+++ b/t/stdin.t
@@ -0,0 +1,26 @@
+#!/usr/local/bin/perl -w
+
+use strict ;
+
+use Carp ;
+use Socket ;
+use Symbol ;
+use Test::More tests => 7 ;
+
+BEGIN{ 
+       use_ok( 'File::Slurp', ) ;
+}
+
+my $data = <<TEXT ;
+line 1
+more text
+TEXT
+
+foreach my $file ( qw( stdin STDIN stdout STDOUT stderr STDERR ) ) {
+
+       write_file( $file, $data ) ;
+       my $read_buf = read_file( $file ) ;
+       is( $read_buf, $data, 'read/write of file [$file]' ) ;
+
+       unlink $file ;
+}
diff --git a/t/write_file_win32.t b/t/write_file_win32.t

new file mode 100644 (file)

index 0000000..f0d8638
--- /dev/null
+++ b/t/write_file_win32.t
@@ -0,0 +1,31 @@
+use Test::More tests => 2 + 1;
+use strict;
+BEGIN { $^W = 1 }
+
+BEGIN {
+    require_ok('File::Slurp');
+    use_ok('File::Slurp', 'write_file');
+}
+
+
+sub simple_write_file {
+    open my $fh, '>', $_[0] or die "Couldn't open $_[0] for write: $!";
+    print $fh $_[1];
+}
+
+sub newline_size {
+    my ($code) = @_;
+
+    my $file = __FILE__ . '.tmp';
+
+    local $\ = '';
+    $code->($file, "\n" x 3);
+
+    my $size = -s $file;
+
+    unlink $file;
+
+    return $size;
+}
+
+is(newline_size(\&write_file), newline_size(\&simple_write_file), 'newline');
Changes	[new file with mode: 0644]	patch \| blob
MANIFEST	[new file with mode: 0644]	patch \| blob
META.yml	[new file with mode: 0644]	patch \| blob
Makefile	[new file with mode: 0644]	patch \| blob
Makefile.PL	[new file with mode: 0644]	patch \| blob
Makefile.old	[new file with mode: 0644]	patch \| blob
README	[new file with mode: 0644]	patch \| blob
Slurp.pm	[new file with mode: 0755]	patch \| blob
TODO	[new file with mode: 0644]	patch \| blob
experiment/DATA_taint_check	[new file with mode: 0644]	patch \| blob
experiment/carp.pl	[new file with mode: 0644]	patch \| blob
experiment/seek.pl	[new file with mode: 0644]	patch \| blob
experiment/split.pl	[new file with mode: 0644]	patch \| blob
experiment/sysread.pl	[new file with mode: 0644]	patch \| blob
extras/new_text	[new file with mode: 0644]	patch \| blob
extras/slurp2.pod	[new file with mode: 0644]	patch \| blob
extras/slurp_article.pod	[new file with mode: 0644]	patch \| blob
extras/slurp_bench.pl	[new file with mode: 0755]	patch \| blob
extras/slurp_bench.pl.~1.2.~	[new file with mode: 0755]	patch \| blob
extras/slurp_data	[new file with mode: 0644]	patch \| blob
lib/File/Slurp.pm	[new file with mode: 0755]	patch \| blob
lib/File/Slurp.pm.~1.26.~	[new file with mode: 0755]	patch \| blob
pm_to_blib	[new file with mode: 0644]	patch \| blob
slurp_talk/Slurp.pm	[new file with mode: 0755]	patch \| blob
slurp_talk/Slurp.pm.html	[new file with mode: 0644]	patch \| blob
slurp_talk/slurp_article.html	[new file with mode: 0644]	patch \| blob
slurp_talk/slurp_article.pod	[new file with mode: 0644]	patch \| blob
slurp_talk/slurp_bench.pl	[new file with mode: 0755]	patch \| blob
t/append_null.t	[new file with mode: 0644]	patch \| blob
t/data_list.t	[new file with mode: 0644]	patch \| blob
t/data_scalar.t	[new file with mode: 0644]	patch \| blob
t/error.t	[new file with mode: 0644]	patch \| blob
t/foo.pl	[new file with mode: 0644]	patch \| blob
t/foo2.pl	[new file with mode: 0644]	patch \| blob
t/handle.t	[new file with mode: 0644]	patch \| blob
t/inode.t	[new file with mode: 0644]	patch \| blob
t/large.t	[new file with mode: 0644]	patch \| blob
t/newline.t	[new file with mode: 0755]	patch \| blob
t/no_clobber.t	[new file with mode: 0644]	patch \| blob
t/original.t	[new file with mode: 0644]	patch \| blob
t/paragraph.t	[new file with mode: 0644]	patch \| blob
t/pod.t	[new file with mode: 0644]	patch \| blob
t/pod_coverage.t	[new file with mode: 0644]	patch \| blob
t/pseudo.t	[new file with mode: 0644]	patch \| blob
t/read_dir.t	[new file with mode: 0644]	patch \| blob
t/slurp.t	[new file with mode: 0644]	patch \| blob
t/stdin.t	[new file with mode: 0644]	patch \| blob
t/write_file_win32.t	[new file with mode: 0644]	patch \| blob