Commit | Line | Data |
8d063cd8 |
1 | .rn '' }` |
a687059c |
2 | ''' $Header: a2p.man,v 3.0 89/10/18 15:34:22 lwall Locked $ |
8d063cd8 |
3 | ''' |
4 | ''' $Log: a2p.man,v $ |
a687059c |
5 | ''' Revision 3.0 89/10/18 15:34:22 lwall |
6 | ''' 3.0 baseline |
7 | ''' |
8 | ''' Revision 2.0.1.1 88/07/11 23:16:25 root |
9 | ''' patch2: changes related to 1985 awk |
10 | ''' |
378cc40b |
11 | ''' Revision 2.0 88/06/05 00:15:36 root |
12 | ''' Baseline version 2.0. |
8d063cd8 |
13 | ''' |
14 | ''' |
15 | .de Sh |
16 | .br |
17 | .ne 5 |
18 | .PP |
19 | \fB\\$1\fR |
20 | .PP |
21 | .. |
22 | .de Sp |
23 | .if t .sp .5v |
24 | .if n .sp |
25 | .. |
26 | .de Ip |
27 | .br |
28 | .ie \\n.$>=3 .ne \\$3 |
29 | .el .ne 3 |
30 | .IP "\\$1" \\$2 |
31 | .. |
32 | ''' |
33 | ''' Set up \*(-- to give an unbreakable dash; |
34 | ''' string Tr holds user defined translation string. |
35 | ''' Bell System Logo is used as a dummy character. |
36 | ''' |
378cc40b |
37 | .tr \(*W-|\(bv\*(Tr |
8d063cd8 |
38 | .ie n \{\ |
378cc40b |
39 | .ds -- \(*W- |
40 | .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch |
41 | .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch |
8d063cd8 |
42 | .ds L" "" |
43 | .ds R" "" |
44 | .ds L' ' |
45 | .ds R' ' |
46 | 'br\} |
47 | .el\{\ |
48 | .ds -- \(em\| |
49 | .tr \*(Tr |
50 | .ds L" `` |
51 | .ds R" '' |
52 | .ds L' ` |
53 | .ds R' ' |
54 | 'br\} |
55 | .TH A2P 1 LOCAL |
56 | .SH NAME |
57 | a2p - Awk to Perl translator |
58 | .SH SYNOPSIS |
59 | .B a2p [options] filename |
60 | .SH DESCRIPTION |
61 | .I A2p |
62 | takes an awk script specified on the command line (or from standard input) |
63 | and produces a comparable |
64 | .I perl |
65 | script on the standard output. |
66 | .Sh "Options" |
67 | Options include: |
68 | .TP 5 |
69 | .B \-D<number> |
70 | sets debugging flags. |
71 | .TP 5 |
72 | .B \-F<character> |
73 | tells a2p that this awk script is always invoked with this \-F switch. |
74 | .TP 5 |
75 | .B \-n<fieldlist> |
76 | specifies the names of the input fields if input does not have to be split into |
77 | an array. |
78 | If you were translating an awk script that processes the password file, you |
79 | might say: |
80 | .sp |
81 | a2p -7 -nlogin.password.uid.gid.gcos.home.shell |
82 | .sp |
a687059c |
83 | Any delimiter can be used to separate the field names. |
8d063cd8 |
84 | .TP 5 |
85 | .B \-<number> |
86 | causes a2p to assume that input will always have that many fields. |
87 | .Sh "Considerations" |
88 | A2p cannot do as good a job translating as a human would, but it usually |
89 | does pretty well. |
90 | There are some areas where you may want to examine the perl script produced |
91 | and tweak it some. |
92 | Here are some of them, in no particular order. |
93 | .PP |
8d063cd8 |
94 | There is an awk idiom of putting int() around a string expression to force |
95 | numeric interpretation, even though the argument is always integer anyway. |
96 | This is generally unneeded in perl, but a2p can't tell if the argument |
97 | is always going to be integer, so it leaves it in. |
98 | You may wish to remove it. |
99 | .PP |
100 | Perl differentiates numeric comparison from string comparison. |
101 | Awk has one operator for both that decides at run time which comparison |
102 | to do. |
103 | A2p does not try to do a complete job of awk emulation at this point. |
104 | Instead it guesses which one you want. |
105 | It's almost always right, but it can be spoofed. |
106 | All such guesses are marked with the comment \*(L"#???\*(R". |
107 | You should go through and check them. |
a687059c |
108 | You might want to run at least once with the \-w switch to perl, which |
109 | will warn you if you use == where you should have used eq. |
8d063cd8 |
110 | .PP |
111 | Perl does not attempt to emulate the behavior of awk in which nonexistent |
112 | array elements spring into existence simply by being referenced. |
113 | If somehow you are relying on this mechanism to create null entries for |
114 | a subsequent for...in, they won't be there in perl. |
115 | .PP |
116 | If a2p makes a split line that assigns to a list of variables that looks |
117 | like (Fld1, Fld2, Fld3...) you may want |
118 | to rerun a2p using the \-n option mentioned above. |
119 | This will let you name the fields throughout the script. |
120 | If it splits to an array instead, the script is probably referring to the number |
121 | of fields somewhere. |
122 | .PP |
123 | The exit statement in awk doesn't necessarily exit; it goes to the END |
124 | block if there is one. |
125 | Awk scripts that do contortions within the END block to bypass the block under |
126 | such circumstances can be simplified by removing the conditional |
127 | in the END block and just exiting directly from the perl script. |
128 | .PP |
129 | Perl has two kinds of array, numerically-indexed and associative. |
130 | Awk arrays are usually translated to associative arrays, but if you happen |
131 | to know that the index is always going to be numeric you could change |
132 | the {...} to [...]. |
a687059c |
133 | Iteration over an associative array is done using the keys() function, but |
8d063cd8 |
134 | iteration over a numeric array is NOT. |
a687059c |
135 | You might need to modify any loop that is iterating over the array in question. |
8d063cd8 |
136 | .PP |
137 | Awk starts by assuming OFMT has the value %.6g. |
138 | Perl starts by assuming its equivalent, $#, to have the value %.20g. |
139 | You'll want to set $# explicitly if you use the default value of OFMT. |
140 | .PP |
141 | Near the top of the line loop will be the split operation that is implicit in |
142 | the awk script. |
143 | There are times when you can move this down past some conditionals that |
144 | test the entire record so that the split is not done as often. |
145 | .PP |
8d063cd8 |
146 | For aesthetic reasons you may wish to change the array base $[ from 1 back |
a687059c |
147 | to perl's default of 0, but remember to change all array subscripts AND |
8d063cd8 |
148 | all substr() and index() operations to match. |
149 | .PP |
a687059c |
150 | Cute comments that say "# Here is a workaround because awk is dumb" are passed |
151 | through unmodified. |
8d063cd8 |
152 | .PP |
153 | Awk scripts are often embedded in a shell script that pipes stuff into and |
154 | out of awk. |
155 | Often the shell script wrapper can be incorporated into the perl script, since |
156 | perl can start up pipes into and out of itself, and can do other things that |
157 | awk can't do by itself. |
a687059c |
158 | .PP |
159 | Scripts that refer to the special variables RSTART and RLENGTH can often |
160 | be simplified by referring to the variables $`, $& and $', as long as they |
161 | are within the scope of the pattern match that sets them. |
162 | .PP |
163 | The produced perl script may have subroutines defined to deal with awk's |
164 | semantics regarding getline and print. |
165 | Since a2p usually picks correctness over efficiency, |
166 | it is almost always possible to rewrite such code to be more efficient by |
167 | discarding the semantic sugar. |
168 | .PP |
169 | For efficiency, you may wish to remove the keyword from any return statement |
170 | that is the last statement executed in a subroutine. |
171 | A2p catches the most common case, but doesn't analyze embedded blocks for |
172 | subtler cases. |
173 | .PP |
174 | ARGV[0] translates to $ARGV0, but ARGV[n] translates to $ARGV[$n]. |
175 | A loop that tries to iterate over ARGV[0] won't find it. |
8d063cd8 |
176 | .SH ENVIRONMENT |
177 | A2p uses no environment variables. |
178 | .SH AUTHOR |
a687059c |
179 | Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov> |
8d063cd8 |
180 | .SH FILES |
181 | .SH SEE ALSO |
182 | perl The perl compiler/interpreter |
183 | .br |
184 | s2p sed to perl translator |
185 | .SH DIAGNOSTICS |
186 | .SH BUGS |
187 | It would be possible to emulate awk's behavior in selecting string versus |
188 | numeric operations at run time by inspection of the operands, but it would |
189 | be gross and inefficient. |
190 | Besides, a2p almost always guesses right. |
191 | .PP |
192 | Storage for the awk syntax tree is currently static, and can run out. |
193 | .rn }` '' |