Commit | Line | Data |
8d063cd8 |
1 | .rn '' }` |
79072805 |
2 | ''' $RCSfile: a2p.man,v $$Revision: 4.1 $$Date: 92/08/07 18:29:10 $ |
8d063cd8 |
3 | ''' |
4 | ''' $Log: a2p.man,v $ |
8d063cd8 |
5 | .de Sh |
6 | .br |
7 | .ne 5 |
8 | .PP |
9 | \fB\\$1\fR |
10 | .PP |
11 | .. |
12 | .de Sp |
13 | .if t .sp .5v |
14 | .if n .sp |
15 | .. |
16 | .de Ip |
17 | .br |
18 | .ie \\n.$>=3 .ne \\$3 |
19 | .el .ne 3 |
20 | .IP "\\$1" \\$2 |
21 | .. |
22 | ''' |
23 | ''' Set up \*(-- to give an unbreakable dash; |
24 | ''' string Tr holds user defined translation string. |
25 | ''' Bell System Logo is used as a dummy character. |
26 | ''' |
378cc40b |
27 | .tr \(*W-|\(bv\*(Tr |
8d063cd8 |
28 | .ie n \{\ |
378cc40b |
29 | .ds -- \(*W- |
30 | .if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch |
31 | .if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch |
8d063cd8 |
32 | .ds L" "" |
33 | .ds R" "" |
34 | .ds L' ' |
35 | .ds R' ' |
36 | 'br\} |
37 | .el\{\ |
38 | .ds -- \(em\| |
39 | .tr \*(Tr |
40 | .ds L" `` |
41 | .ds R" '' |
42 | .ds L' ` |
43 | .ds R' ' |
44 | 'br\} |
45 | .TH A2P 1 LOCAL |
46 | .SH NAME |
47 | a2p - Awk to Perl translator |
48 | .SH SYNOPSIS |
49 | .B a2p [options] filename |
50 | .SH DESCRIPTION |
51 | .I A2p |
52 | takes an awk script specified on the command line (or from standard input) |
53 | and produces a comparable |
54 | .I perl |
55 | script on the standard output. |
56 | .Sh "Options" |
57 | Options include: |
58 | .TP 5 |
59 | .B \-D<number> |
60 | sets debugging flags. |
61 | .TP 5 |
62 | .B \-F<character> |
63 | tells a2p that this awk script is always invoked with this -F switch. |
64 | .TP 5 |
65 | .B \-n<fieldlist> |
66 | specifies the names of the input fields if input does not have to be split into |
67 | an array. |
68 | If you were translating an awk script that processes the password file, you |
69 | might say: |
70 | .sp |
71 | a2p -7 -nlogin.password.uid.gid.gcos.home.shell |
72 | .sp |
a687059c |
73 | Any delimiter can be used to separate the field names. |
8d063cd8 |
74 | .TP 5 |
75 | .B \-<number> |
76 | causes a2p to assume that input will always have that many fields. |
77 | .Sh "Considerations" |
78 | A2p cannot do as good a job translating as a human would, but it usually |
79 | does pretty well. |
80 | There are some areas where you may want to examine the perl script produced |
81 | and tweak it some. |
82 | Here are some of them, in no particular order. |
83 | .PP |
8d063cd8 |
84 | There is an awk idiom of putting int() around a string expression to force |
85 | numeric interpretation, even though the argument is always integer anyway. |
86 | This is generally unneeded in perl, but a2p can't tell if the argument |
87 | is always going to be integer, so it leaves it in. |
88 | You may wish to remove it. |
89 | .PP |
90 | Perl differentiates numeric comparison from string comparison. |
91 | Awk has one operator for both that decides at run time which comparison |
92 | to do. |
93 | A2p does not try to do a complete job of awk emulation at this point. |
94 | Instead it guesses which one you want. |
95 | It's almost always right, but it can be spoofed. |
96 | All such guesses are marked with the comment \*(L"#???\*(R". |
97 | You should go through and check them. |
a687059c |
98 | You might want to run at least once with the \-w switch to perl, which |
99 | will warn you if you use == where you should have used eq. |
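As a minimal sketch of the run-time decision described above, the two awk one-liners below compare the same values first as numeric-looking input fields and then as string constants.
The function names compare_fields and compare_strings are hypothetical wrappers added for illustration; they are not part of a2p.

```shell
# Input fields that look like numbers are compared numerically: 10 > 9 is true.
compare_fields() {
  echo "10 9" | awk '{ print ($1 > $2) }'
}

# String constants are compared as strings: "10" sorts before "9".
compare_strings() {
  awk 'BEGIN { print ("10" > "9") }'
}
```

A translated perl script has to commit to one of == or eq at each such comparison, which is why the guesses are worth checking.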
8d063cd8 |
100 | .PP |
101 | Perl does not attempt to emulate the behavior of awk in which nonexistent |
102 | array elements spring into existence simply by being referenced. |
103 | If somehow you are relying on this mechanism to create null entries for |
104 | a subsequent for...in, they won't be there in perl. |
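The awk side of this behavior can be seen in a short sketch: the array element below is only referenced, never assigned, yet the later for...in loop still finds it.
The function name count_after_reference and the array name seen are hypothetical, chosen for illustration.

```shell
# Referencing seen["x"] in a comparison creates the element in awk,
# so the for...in loop counts one key even though nothing was assigned.
count_after_reference() {
  awk 'BEGIN {
    if (seen["x"] == "") { }   # reference only, no assignment
    n = 0
    for (k in seen) n++
    print n
  }'
}
```

A script that depends on this needs the entries created explicitly after translation.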
105 | .PP |
106 | If a2p makes a split line that assigns to a list of variables that looks |
107 | like (Fld1, Fld2, Fld3...) you may want |
108 | to rerun a2p using the \-n option mentioned above. |
109 | This will let you name the fields throughout the script. |
110 | If it splits to an array instead, the script is probably referring to the number |
111 | of fields somewhere. |
112 | .PP |
113 | The exit statement in awk doesn't necessarily exit; it goes to the END |
114 | block if there is one. |
115 | Awk scripts that do contortions within the END block to bypass the block under |
116 | such circumstances can be simplified by removing the conditional |
117 | in the END block and just exiting directly from the perl script. |
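A small sketch of the awk behavior in question: the main rule exits on the first record, and the END block still runs.
The function name exit_runs_end is hypothetical.

```shell
# exit in the main rule stops reading input but still runs the END block.
exit_runs_end() {
  printf 'one\ntwo\n' | awk '{ print "saw " $1; exit } END { print "END ran" }'
}
```

The second record is never processed, but the END output still appears after the first line.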
118 | .PP |
119 | Perl has two kinds of array, numerically-indexed and associative. |
120 | Awk arrays are usually translated to associative arrays, but if you happen |
121 | to know that the index is always going to be numeric you could change |
122 | the {...} to [...]. |
a687059c |
123 | Iteration over an associative array is done using the keys() function, but |
8d063cd8 |
124 | iteration over a numeric array is NOT. |
a687059c |
125 | You might need to modify any loop that is iterating over the array in question. |
8d063cd8 |
126 | .PP |
127 | Awk starts by assuming OFMT has the value %.6g. |
128 | Perl starts by assuming its equivalent, $#, to have the value %.20g. |
129 | You'll want to set $# explicitly if you use the default value of OFMT. |
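To see what the awk default amounts to, the sketch below prints OFMT and then a non-integer number formatted with it.
The function names show_ofmt and default_number_output are hypothetical.

```shell
# The initial value of OFMT in awk.
show_ofmt() {
  awk 'BEGIN { print OFMT }'
}

# A non-integer number printed with the default OFMT keeps six significant digits.
default_number_output() {
  awk 'BEGIN { x = 3.14159265358979; print x }'
}
```

An awk script that relies on this rounding will show more digits after translation unless the output format is set to match.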
130 | .PP |
131 | Near the top of the line loop will be the split operation that is implicit in |
132 | the awk script. |
133 | There are times when you can move this down past some conditionals that |
134 | test the entire record so that the split is not done as often. |
135 | .PP |
8d063cd8 |
136 | For aesthetic reasons you may wish to change the array base $[ from 1 back |
a687059c |
137 | to perl's default of 0, but remember to change all array subscripts AND |
8d063cd8 |
138 | all substr() and index() operations to match. |
139 | .PP |
a687059c |
140 | Cute comments that say "# Here is a workaround because awk is dumb" are passed |
141 | through unmodified. |
8d063cd8 |
142 | .PP |
143 | Awk scripts are often embedded in a shell script that pipes stuff into and |
144 | out of awk. |
145 | Often the shell script wrapper can be incorporated into the perl script, since |
146 | perl can start up pipes into and out of itself, and can do other things that |
147 | awk can't do by itself. |
a687059c |
148 | .PP |
149 | Scripts that refer to the special variables RSTART and RLENGTH can often |
150 | be simplified by referring to the variables $`, $& and $', as long as they |
151 | are within the scope of the pattern match that sets them. |
152 | .PP |
153 | The produced perl script may have subroutines defined to deal with awk's |
154 | semantics regarding getline and print. |
155 | Since a2p usually picks correctness over efficiency, |
156 | it is almost always possible to rewrite such code to be more efficient by |
157 | discarding the semantic sugar. |
158 | .PP |
159 | For efficiency, you may wish to remove the keyword from any return statement |
160 | that is the last statement executed in a subroutine. |
161 | A2p catches the most common case, but doesn't analyze embedded blocks for |
162 | subtler cases. |
163 | .PP |
164 | ARGV[0] translates to $ARGV0, but ARGV[n] translates to $ARGV[$n]. |
165 | A loop that tries to iterate over ARGV[0] won't find it. |
8d063cd8 |
166 | .SH ENVIRONMENT |
167 | A2p uses no environment variables. |
168 | .SH AUTHOR |
a687059c |
169 | Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov> |
8d063cd8 |
170 | .SH FILES |
171 | .SH SEE ALSO |
172 | perl The perl compiler/interpreter |
173 | .br |
174 | s2p sed to perl translator |
175 | .SH DIAGNOSTICS |
176 | .SH BUGS |
177 | It would be possible to emulate awk's behavior in selecting string versus |
178 | numeric operations at run time by inspection of the operands, but it would |
179 | be gross and inefficient. |
180 | Besides, a2p almost always guesses right. |
181 | .PP |
182 | Storage for the awk syntax tree is currently static, and can run out. |
183 | .rn }` '' |