Commit | Line | Data |
8d063cd8 |
1 | .rn '' }` |
fe14fcc3 |
2 | ''' $Header: a2p.man,v 4.0 91/03/20 01:57:11 lwall Locked $ |
8d063cd8 |
3 | ''' |
4 | ''' $Log: a2p.man,v $ |
fe14fcc3 |
5 | ''' Revision 4.0 91/03/20 01:57:11 lwall |
6 | ''' 4.0 baseline. |
7 | ''' |
a687059c |
8 | ''' Revision 3.0 89/10/18 15:34:22 lwall |
9 | ''' 3.0 baseline |
10 | ''' |
11 | ''' Revision 2.0.1.1 88/07/11 23:16:25 root |
12 | ''' patch2: changes related to 1985 awk |
13 | ''' |
378cc40b |
14 | ''' Revision 2.0 88/06/05 00:15:36 root |
15 | ''' Baseline version 2.0. |
8d063cd8 |
16 | ''' |
17 | ''' |
18 | .de Sh |
19 | .br |
20 | .ne 5 |
21 | .PP |
22 | \fB\\$1\fR |
23 | .PP |
24 | .. |
25 | .de Sp |
26 | .if t .sp .5v |
27 | .if n .sp |
28 | .. |
29 | .de Ip |
30 | .br |
31 | .ie \\n.$>=3 .ne \\$3 |
32 | .el .ne 3 |
33 | .IP "\\$1" \\$2 |
34 | .. |
35 | ''' |
36 | ''' Set up \*(-- to give an unbreakable dash; |
37 | ''' string Tr holds user defined translation string. |
38 | ''' Bell System Logo is used as a dummy character. |
39 | ''' |
378cc40b |
40 | .tr \(*W-|\(bv\*(Tr |
8d063cd8 |
41 | .ie n \{\ |
378cc40b |
42 | .ds -- \(*W- |
43 | .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch |
44 | .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch |
8d063cd8 |
45 | .ds L" "" |
46 | .ds R" "" |
47 | .ds L' ' |
48 | .ds R' ' |
49 | 'br\} |
50 | .el\{\ |
51 | .ds -- \(em\| |
52 | .tr \*(Tr |
53 | .ds L" `` |
54 | .ds R" '' |
55 | .ds L' ` |
56 | .ds R' ' |
57 | 'br\} |
58 | .TH A2P 1 LOCAL |
59 | .SH NAME |
60 | a2p - Awk to Perl translator |
61 | .SH SYNOPSIS |
62 | .B a2p [options] filename |
63 | .SH DESCRIPTION |
64 | .I A2p |
65 | takes an awk script specified on the command line (or from standard input) |
66 | and produces a comparable |
67 | .I perl |
68 | script on the standard output. |
69 | .Sh "Options" |
70 | Options include: |
71 | .TP 5 |
72 | .B \-D<number> |
73 | sets debugging flags. |
74 | .TP 5 |
75 | .B \-F<character> |
76 | tells a2p that this awk script is always invoked with this -F switch. |
77 | .TP 5 |
78 | .B \-n<fieldlist> |
79 | specifies the names of the input fields if input does not have to be split into |
80 | an array. |
81 | If you were translating an awk script that processes the password file, you |
82 | might say: |
83 | .sp |
84 | a2p -7 -nlogin.password.uid.gid.gcos.home.shell
85 | .sp |
a687059c |
86 | Any delimiter can be used to separate the field names. |
8d063cd8 |
87 | .TP 5 |
88 | .B \-<number> |
89 | causes a2p to assume that input will always have that many fields. |
90 | .Sh "Considerations" |
91 | A2p cannot do as good a job translating as a human would, but it usually |
92 | does pretty well. |
93 | There are some areas where you may want to examine the perl script produced |
94 | and tweak it some. |
95 | Here are some of them, in no particular order. |
96 | .PP |
8d063cd8 |
97 | There is an awk idiom of putting int() around a string expression to force |
98 | numeric interpretation, even though the argument is always integer anyway. |
99 | This is generally unneeded in perl, but a2p can't tell if the argument |
100 | is always going to be integer, so it leaves it in. |
101 | You may wish to remove it. |
102 | .PP |
103 | Perl differentiates numeric comparison from string comparison. |
104 | Awk has one operator for both that decides at run time which comparison |
105 | to do. |
106 | A2p does not try to do a complete job of awk emulation at this point. |
107 | Instead it guesses which one you want. |
108 | It's almost always right, but it can be spoofed. |
109 | All such guesses are marked with the comment \*(L"#???\*(R". |
110 | You should go through and check them. |
a687059c |
111 | You might want to run at least once with the \-w switch to perl, which |
112 | will warn you if you use == where you should have used eq. |
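The run-time choice that awk makes can be seen with a small experiment. This is only a sketch, assuming a POSIX-style awk on the PATH; the shell variable names are made up for illustration. Fields that look numeric are compared as numbers, while concatenating an empty string forces a string comparison, which is the distinction a2p has to guess at when it picks between == and eq.

```shell
# Fields that look like numbers are compared numerically: 10 < 9 is false.
numeric=$(echo "10 9" | awk '{ print ($1 < $2) }')

# Appending "" turns both operands into strings: "10" < "9" is true.
string=$(echo "10 9" | awk '{ print ($1 "" < $2 "") }')

echo "numeric comparison: $numeric"
echo "string comparison:  $string"
```

If the two results differ for your data, that is a spot where a "#???" guess deserves a second look.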
8d063cd8 |
113 | .PP |
114 | Perl does not attempt to emulate the behavior of awk in which nonexistent |
115 | array elements spring into existence simply by being referenced. |
116 | If somehow you are relying on this mechanism to create null entries for |
117 | a subsequent for...in, they won't be there in perl. |
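The awk behavior described above can be checked directly. The following is a minimal sketch, assuming a POSIX-style awk; the array name seen and the shell variable count are hypothetical. Merely referencing an element in a comparison creates it, so the later for...in loop finds one key.

```shell
# Referencing seen["x"] in a comparison is enough to create the element.
count=$(awk 'BEGIN {
    if (seen["x"] == "") { }     # reference only, no assignment
    n = 0
    for (k in seen) n++          # the loop still finds the key "x"
    print n
}')

echo "elements after a bare reference: $count"
```

A translated script that depends on such entries would need to assign them explicitly.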
118 | .PP |
119 | If a2p makes a split line that assigns to a list of variables that looks |
120 | like (Fld1, Fld2, Fld3...) you may want |
121 | to rerun a2p using the \-n option mentioned above. |
122 | This will let you name the fields throughout the script. |
123 | If it splits to an array instead, the script is probably referring to the number |
124 | of fields somewhere. |
125 | .PP |
126 | The exit statement in awk doesn't necessarily exit; it goes to the END |
127 | block if there is one. |
128 | Awk scripts that do contortions within the END block to bypass the block under |
129 | such circumstances can be simplified by removing the conditional |
130 | in the END block and just exiting directly from the perl script. |
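To see why such contortions exist in the first place, here is a small sketch, assuming a POSIX-style awk; the variable name result is made up for illustration. The exit in the main rule stops input processing after the first record, but the END block still runs.

```shell
# exit leaves the main loop after the first record, then END is executed.
result=$(printf 'a\nb\nc\n' | awk '{ exit } END { print "END ran after " NR " record(s)" }')

echo "$result"
```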
131 | .PP |
132 | Perl has two kinds of array, numerically-indexed and associative. |
133 | Awk arrays are usually translated to associative arrays, but if you happen |
134 | to know that the index is always going to be numeric you could change |
135 | the {...} to [...]. |
a687059c |
136 | Iteration over an associative array is done using the keys() function, but |
8d063cd8 |
137 | iteration over a numeric array is NOT. |
a687059c |
138 | You might need to modify any loop that is iterating over the array in question. |
8d063cd8 |
139 | .PP |
140 | Awk starts by assuming OFMT has the value %.6g. |
141 | Perl starts by assuming its equivalent, $#, to have the value %.20g. |
142 | You'll want to set $# explicitly if you use the default value of OFMT. |
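The awk side of this difference is easy to observe. The sketch below assumes a POSIX-style awk; the shell variable names are hypothetical. With the default OFMT a non-integer number is printed with six significant digits, and assigning OFMT changes the output.

```shell
# Default OFMT is %.6g, so only six significant digits are printed.
default_fmt=$(awk 'BEGIN { print 3.14159265358979 }')

# Setting OFMT explicitly changes how print formats non-integer numbers.
custom_fmt=$(awk 'BEGIN { OFMT = "%.2f"; print 3.14159265358979 }')

echo "default OFMT: $default_fmt"
echo "OFMT=%.2f:    $custom_fmt"
```

Output that relied on the six-digit default is what needs an explicit format in the translated script.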
143 | .PP |
144 | Near the top of the line loop will be the split operation that is implicit in |
145 | the awk script. |
146 | There are times when you can move this down past some conditionals that |
147 | test the entire record so that the split is not done as often. |
148 | .PP |
8d063cd8 |
149 | For aesthetic reasons you may wish to change the array base $[ from 1 back |
a687059c |
150 | to perl's default of 0, but remember to change all array subscripts AND |
8d063cd8 |
151 | all substr() and index() operations to match. |
152 | .PP |
a687059c |
153 | Cute comments that say "# Here is a workaround because awk is dumb" are passed |
154 | through unmodified. |
8d063cd8 |
155 | .PP |
156 | Awk scripts are often embedded in a shell script that pipes stuff into and |
157 | out of awk. |
158 | Often the shell script wrapper can be incorporated into the perl script, since |
159 | perl can start up pipes into and out of itself, and can do other things that |
160 | awk can't do by itself. |
a687059c |
161 | .PP |
162 | Scripts that refer to the special variables RSTART and RLENGTH can often |
163 | be simplified by referring to the variables $`, $& and $', as long as they |
164 | are within the scope of the pattern match that sets them. |
165 | .PP |
166 | The produced perl script may have subroutines defined to deal with awk's |
167 | semantics regarding getline and print. |
168 | Since a2p usually picks correctness over efficiency,
169 | it is almost always possible to rewrite such code to be more efficient by |
170 | discarding the semantic sugar. |
171 | .PP |
172 | For efficiency, you may wish to remove the keyword from any return statement |
173 | that is the last statement executed in a subroutine. |
174 | A2p catches the most common case, but doesn't analyze embedded blocks for |
175 | subtler cases. |
176 | .PP |
177 | ARGV[0] translates to $ARGV0, but ARGV[n] translates to $ARGV[$n]. |
178 | A loop that tries to iterate over ARGV[0] won't find it. |
8d063cd8 |
179 | .SH ENVIRONMENT |
180 | A2p uses no environment variables. |
181 | .SH AUTHOR |
a687059c |
182 | Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov> |
8d063cd8 |
183 | .SH FILES |
184 | .SH SEE ALSO |
185 | perl The perl compiler/interpreter |
186 | .br |
187 | s2p sed to perl translator |
188 | .SH DIAGNOSTICS |
189 | .SH BUGS |
190 | It would be possible to emulate awk's behavior in selecting string versus |
191 | numeric operations at run time by inspection of the operands, but it would |
192 | be gross and inefficient. |
193 | Besides, a2p almost always guesses right. |
194 | .PP |
195 | Storage for the awk syntax tree is currently static, and can run out. |
196 | .rn }` '' |