# Format:
# NAME \t TYPE, arg-description [num-args] [longjump-len] \t DESCRIPTION

# Empty rows and #-comment rows are ignored.
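#
# As an illustration (read off the format line above, not taken from the
# script that consumes this file), the row
#   IFMATCH \t BRANCHJ,off 1 2 \t Succeeds if the following matches.
# breaks down as NAME = IFMATCH, TYPE = BRANCHJ, arg-description = off,
# num-args = 1, longjump-len = 2, with the trailing text as DESCRIPTION.
# A row such as "END \t END, no \t End of program." omits both optional
# numeric fields.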

# Exit points
END	END, no	End of program.
SUCCEED	END, no	Return from a subroutine, basically.

# Anchors:
BOL	BOL, no	Match "" at beginning of line.
MBOL	BOL, no	Same, assuming multiline.
SBOL	BOL, no	Same, assuming singleline.
EOS	EOL, no	Match "" at end of string.
EOL	EOL, no	Match "" at end of line.
MEOL	EOL, no	Same, assuming multiline.
SEOL	EOL, no	Same, assuming singleline.
BOUND	BOUND, no	Match "" at any word boundary
BOUNDUTF8	BOUND, no	Match "" at any word boundary
BOUNDL	BOUND, no	Match "" at any word boundary
BOUNDLUTF8	BOUND, no	Match "" at any word boundary
NBOUND	NBOUND, no	Match "" at any word non-boundary
NBOUNDUTF8	NBOUND, no	Match "" at any word non-boundary
NBOUNDL	NBOUND, no	Match "" at any word non-boundary
NBOUNDLUTF8	NBOUND, no	Match "" at any word non-boundary
GPOS	GPOS, no	Matches where last m//g left off.

# [Special] alternatives
REG_ANY	REG_ANY, no	Match any one character (except newline).
ANYUTF8	REG_ANY, no	Match any one Unicode character (except newline).
SANY	REG_ANY, no	Match any one character.
SANYUTF8	REG_ANY, no	Match any one Unicode character.
ANYOF	ANYOF, sv	Match character in (or not in) this class.
ANYOFUTF8	ANYOF, sv 1	Match character in (or not in) this class.
ALNUM	ALNUM, no	Match any alphanumeric character
ALNUMUTF8	ALNUM, no	Match any alphanumeric character
ALNUML	ALNUM, no	Match any alphanumeric char in locale
ALNUMLUTF8	ALNUM, no	Match any alphanumeric char in locale
NALNUM	NALNUM, no	Match any non-alphanumeric character
NALNUMUTF8	NALNUM, no	Match any non-alphanumeric character
NALNUML	NALNUM, no	Match any non-alphanumeric char in locale
NALNUMLUTF8	NALNUM, no	Match any non-alphanumeric char in locale
SPACE	SPACE, no	Match any whitespace character
SPACEUTF8	SPACE, no	Match any whitespace character
SPACEL	SPACE, no	Match any whitespace char in locale
SPACELUTF8	SPACE, no	Match any whitespace char in locale
NSPACE	NSPACE, no	Match any non-whitespace character
NSPACEUTF8	NSPACE, no	Match any non-whitespace character
NSPACEL	NSPACE, no	Match any non-whitespace char in locale
NSPACELUTF8	NSPACE, no	Match any non-whitespace char in locale
DIGIT	DIGIT, no	Match any numeric character
DIGITUTF8	DIGIT, no	Match any numeric character
NDIGIT	NDIGIT, no	Match any non-numeric character
NDIGITUTF8	NDIGIT, no	Match any non-numeric character
CLUMP	CLUMP, no	Match any combining character sequence

# BRANCH	The set of branches constituting a single choice are hooked
#		together with their "next" pointers, since precedence prevents
#		anything being concatenated to any individual branch.  The
#		"next" pointer of the last BRANCH in a choice points to the
#		thing following the whole choice.  This is also where the
#		final "next" pointer of each individual branch points; each
#		branch starts with the operand node of a BRANCH node.
#
BRANCH	BRANCH, node	Match this alternative, or the next...

# BACK	Normal "next" pointers all implicitly point forward; BACK
#		exists to make loop structures possible.
# not used
BACK	BACK, no	Match "", "next" ptr points backward.

# Literals
EXACT	EXACT, sv	Match this string (preceded by length).
EXACTF	EXACT, sv	Match this string, folded (prec. by length).
EXACTFL	EXACT, sv	Match this string, folded in locale (w/len).

# Do nothing
NOTHING	NOTHING,no	Match empty string.
# A variant of above which delimits a group, thus stops optimizations
TAIL	NOTHING,no	Match empty string. Can jump here from outside.

# STAR,PLUS	'?', and complex '*' and '+', are implemented as circular
#		BRANCH structures using BACK.  Simple cases (one character
#		per match) are implemented with STAR and PLUS for speed
#		and to minimize recursive plunges.
#
STAR	STAR, node	Match this (simple) thing 0 or more times.
PLUS	PLUS, node	Match this (simple) thing 1 or more times.

CURLY	CURLY, sv 2	Match this simple thing {n,m} times.
CURLYN	CURLY, no 2	Match next-after-this simple thing
#		{n,m} times, set parens.
CURLYM	CURLY, no 2	Match this medium-complex thing {n,m} times.
CURLYX	CURLY, sv 2	Match this complex thing {n,m} times.

# This terminator creates a loop structure for CURLYX
WHILEM	WHILEM, no	Do curly processing and see if rest matches.

# OPEN,CLOSE,GROUPP	...are numbered at compile time.
OPEN	OPEN, num 1	Mark this point in input as start of #n.
CLOSE	CLOSE, num 1	Analogous to OPEN.

REF	REF, num 1	Match some already matched string
REFF	REF, num 1	Match already matched string, folded
REFFL	REF, num 1	Match already matched string, folded in loc.

# grouping assertions
IFMATCH	BRANCHJ,off 1 2	Succeeds if the following matches.
UNLESSM	BRANCHJ,off 1 2	Fails if the following matches.
SUSPEND	BRANCHJ,off 1 1	"Independent" sub-RE.
IFTHEN	BRANCHJ,off 1 1	Switch, should be preceded by switcher.
GROUPP	GROUPP, num 1	Whether the group matched.

# Support for long RE
LONGJMP	LONGJMP,off 1 1	Jump far away.
BRANCHJ	BRANCHJ,off 1 1	BRANCH with long offset.

# The heavy worker
EVAL	EVAL, evl 1	Execute some Perl code.

# Modifiers
MINMOD	MINMOD, no	Next operator is not greedy.
LOGICAL	LOGICAL,no	Next opcode should set the flag only.

# This is not used yet
RENUM	BRANCHJ,off 1 1	Group with independently numbered parens.

# This is not really a node, but an optimized away piece of a "long" node.
# To simplify debugging output, we mark it as if it were a node
OPTIMIZED	NOTHING,off	Placeholder for dump.