P.MATCH - perform pattern-matching.

Usage:

B:
  %b/manif/pmatch
  length = p.match(string,pcode,[offset,prevchar,
                   optbits,strleng,morefunc]);
  start = .null();
C:
  #include <pmatch.h>
  pm_match_result res;
  res = p_match(const char *string,const pm_code *pcode,
                [int offset, int prevchar, int optbits,
                 int strleng, char *(*morefunc)(int *)]);

Examples:

pm_code *pcode;
p.compile(&pcode, "^abc$");
if (0 > p.match(string, pcode))
     error("Pattern not found*n");
p.free(pcode);

Where:

string
points to the first byte of data to be scanned.
pcode
is the handle for a compiled pattern, as produced by P.COMPILE.
offset
is the logical offset of the byte pointed at by "string". This would be used when scanning for the second match in a line. A non-zero "offset" prevents the "^" pattern from matching anything. The value of "offset" also determines what an "@(N)" pattern will match. The default "offset" is zero.
prevchar
must be supplied if "offset" is non-zero and the pattern contains the "<" special character. The "<" stands for the beginning of a word, and the pattern matcher may want to check the character immediately before "offset" to see if the string itself begins in the middle of a word. To get around this, "prevchar" should contain the character at logical offset "offset-1". (For a C program, this is usually "string[-1]".)
optbits
specifies bit options controlling how P.MATCH behaves. Each option has a symbolic name, and names are OR'ed together to combine options. Possible names are:
PM_Short_Match (0)
terminates P.MATCH as soon as any match is found.
PM_Long_Match (1)
looks for the first maximal match. See below for more explanation.
PM_C_String (0)
indicates that "string" points to a normal string (i.e. one whose end is marked by a NUL byte '*0').
PM_Random_String (2)
indicates that "string" points to the beginning of a chunk of data that is "strleng" bytes long.

The default value is "PM_Short_Match | PM_C_String" (zero).

strleng
is relevant when PM_Random_String is turned on in "optbits". "strleng" gives the number of bytes in the first chunk of data.
morefunc
is relevant when PM_Random_String is turned on in "optbits". "morefunc" identifies the function P.MATCH should call to obtain subsequent chunks of data.
res
P.MATCH returns a two-word C structure in the AQ register.
res.length
is the number of bytes of data in the matched string; it is returned in the Q register. This is the value directly returned to B programs. A length of -1 indicates no match was found. A length of -2 indicates P.MATCH rejected the call because of a mismatch between the compiled pattern and random mode.
res.start
is the logical offset of the first byte matched; it is returned in the A register. B programs can obtain this value with a call to .NULL() immediately after the call to P.MATCH.

Description:

P.MATCH scans a string and determines if the string contains a substring matching a particular pattern. Before you can use P.MATCH, you must compile the pattern using P.COMPILE.

You use the "optbits" parameter when you want to do more than the simple job of determining if a pattern appears in a string. For example, if you have the pattern "cdef|de" and the string "abcdefgh*n", a short match (PM_Short_Match) would match "de" because it is the first match successfully completed: scanning left to right, "de" ends at the "e", one byte before "cdef" ends. A long match (PM_Long_Match) would find "cdef" because it starts earlier in the line.

Normally, you call P.MATCH to search for a match within a regular string corresponding to a line of data. In this case, all the data is present and the total length of the string can be determined. However, you can also use P.MATCH to search an arbitrary stream of bytes by specifying PM_Random_String in "optbits". With this option, you don't have to have all the data directly present. Instead, P.MATCH examines the first chunk of data, then calls "morefunc" to obtain the next chunk of data. P.MATCH passes "morefunc" one argument: a C integer pointer (int *) telling where the length of the new chunk should be stored. "morefunc" should return a pointer to the new data as its value. "morefunc" should return a NULL value (0) to indicate the end of all data.

In order to use this type of "random string" matching, you must pass the PM_Random_String option to P.COMPILE when you compile the pattern, as well as passing PM_Random_String to P.MATCH when you actually search for the pattern. With random string matching, '*0' gets no special treatment (it's not taken as the end of the string), so you can match '*0' with a '\000' pattern. In addition, '$' only matches the end of data; it does not match the null string before a '*n' character. Since P.MATCH doesn't know the number of bytes in the string until all the data has been scanned, you cannot use the "@(-N)" pattern in random string matching.

See Also:

expl b lib p.compile
for more information about pattern matching.

Copyright © 1996, Thinkage Ltd.