.rn '' }`
''' $Header: a2p.man,v 3.0 89/10/18 15:34:22 lwall Locked $
'''
''' $Log: a2p.man,v $
''' Revision 3.0 89/10/18 15:34:22 lwall
''' 3.0 baseline
'''
''' Revision 2.0.1.1 88/07/11 23:16:25 root
''' patch2: changes related to 1985 awk
'''
''' Revision 2.0 88/06/05 00:15:36 root
''' Baseline version 2.0.
'''
'''
.de Sh
.br
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp
.if t .sp .5v
.if n .sp
..
.de Ip
.br
.ie \\n.$>=3 .ne \\$3
.el .ne 3
.IP "\\$1" \\$2
..
'''
''' Set up \*(-- to give an unbreakable dash;
''' string Tr holds user defined translation string.
''' Bell System Logo is used as a dummy character.
'''
.tr \(*W-|\(bv\*(Tr
.ie n \{\
.ds -- \(*W-
.if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.ds L" ""
.ds R" ""
.ds L' '
.ds R' '
'br\}
.el\{\
.ds -- \(em\|
.tr \*(Tr
.ds L" ``
.ds R" ''
.ds L' `
.ds R' '
'br\}
.TH A2P 1 LOCAL
.SH NAME
a2p - Awk to Perl translator
.SH SYNOPSIS
.B a2p [options] filename
.SH DESCRIPTION
.I A2p
takes an awk script specified on the command line (or from standard input)
and produces a comparable
.I perl
script on the standard output.
.Sh "Options"
Options include:
.TP 5
.B \-D<number>
sets debugging flags.
.TP 5
.B \-F<character>
tells a2p that this awk script is always invoked with this -F switch.
.TP 5
.B \-n<fieldlist>
specifies the names of the input fields if input does not have to be split into
an array.
If you were translating an awk script that processes the password file, you
might say:
.sp
    a2p -7 -nlogin.password.uid.gid.gcos.home.shell
.sp
Any delimiter can be used to separate the field names.
.TP 5
.B \-<number>
causes a2p to assume that input will always have that many fields.
.Sh "Considerations"
A2p cannot do as good a job translating as a human would, but it usually
does pretty well.
There are some areas where you may want to examine the perl script produced
and tweak it some.
Here are some of them, in no particular order.
.PP
There is an awk idiom of putting int() around a string expression to force
numeric interpretation, even though the argument is always integer anyway.
This is generally unneeded in perl, but a2p can't tell if the argument
is always going to be integer, so it leaves it in.
You may wish to remove it.
.PP
Perl differentiates numeric comparison from string comparison.
Awk has one operator for both that decides at run time which comparison
to do.
A2p does not try to do a complete job of awk emulation at this point.
Instead it guesses which one you want.
It's almost always right, but it can be spoofed.
All such guesses are marked with the comment \*(L"#???\*(R".
You should go through and check them.
You might want to run at least once with the \-w switch to perl, which
will warn you if you use == where you should have used eq.
.PP
Perl does not attempt to emulate the behavior of awk in which nonexistent
array elements spring into existence simply by being referenced.
If somehow you are relying on this mechanism to create null entries for
a subsequent for...in, they won't be there in perl.
.PP
If a2p makes a split line that assigns to a list of variables that looks
like (Fld1, Fld2, Fld3...) you may want
to rerun a2p using the \-n option mentioned above.
This will let you name the fields throughout the script.
If it splits to an array instead, the script is probably referring to the number
of fields somewhere.
.PP
The exit statement in awk doesn't necessarily exit; it goes to the END
block if there is one.
Awk scripts that do contortions within the END block to bypass the block under
such circumstances can be simplified by removing the conditional
in the END block and just exiting directly from the perl script.
.PP
Perl has two kinds of array, numerically-indexed and associative.
Awk arrays are usually translated to associative arrays, but if you happen
to know that the index is always going to be numeric you could change
the {...} to [...].
Iteration over an associative array is done using the keys() function, but
iteration over a numeric array is NOT.
You might need to modify any loop that is iterating over the array in question.
.PP
Awk starts by assuming OFMT has the value %.6g.
Perl starts by assuming its equivalent, $#, to have the value %.20g.
You'll want to set $# explicitly if you use the default value of OFMT.
.PP
Near the top of the line loop will be the split operation that is implicit in
the awk script.
There are times when you can move this down past some conditionals that
test the entire record so that the split is not done as often.
.PP
For aesthetic reasons you may wish to change the array base $[ from 1 back
to perl's default of 0, but remember to change all array subscripts AND
all substr() and index() operations to match.
.PP
Cute comments that say "# Here is a workaround because awk is dumb" are passed
through unmodified.
.PP
Awk scripts are often embedded in a shell script that pipes stuff into and
out of awk.
Often the shell script wrapper can be incorporated into the perl script, since
perl can start up pipes into and out of itself, and can do other things that
awk can't do by itself.
.PP
Scripts that refer to the special variables RSTART and RLENGTH can often
be simplified by referring to the variables $`, $& and $', as long as they
are within the scope of the pattern match that sets them.
.PP
The produced perl script may have subroutines defined to deal with awk's
semantics regarding getline and print.
Since a2p usually picks correctness over efficiency,
it is almost always possible to rewrite such code to be more efficient by
discarding the semantic sugar.
.PP
For efficiency, you may wish to remove the keyword from any return statement
that is the last statement executed in a subroutine.
A2p catches the most common case, but doesn't analyze embedded blocks for
subtler cases.
.PP
ARGV[0] translates to $ARGV0, but ARGV[n] translates to $ARGV[$n].
A loop that tries to iterate over ARGV[0] won't find it.
.SH ENVIRONMENT
A2p uses no environment variables.
.SH AUTHOR
Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov>
.SH FILES
.SH SEE ALSO
perl	The perl compiler/interpreter
.br
s2p	sed to perl translator
.SH DIAGNOSTICS
.SH BUGS
It would be possible to emulate awk's behavior in selecting string versus
numeric operations at run time by inspection of the operands, but it would
be gross and inefficient.
Besides, a2p almost always guesses right.
.PP
Storage for the awk syntax tree is currently static, and can run out.
.rn }` ''
191.PP
192Storage for the awk syntax tree is currently static, and can run out.
193.rn }` ''