OPTAB - definitions for parsing command lines.

Usage:

#include <optab.h>

Description:

This file contains a number of definitions that can be used for sophisticated parsing of a C program's command line.

To make use of C's parsing facilities, you must create an array called "optable" which describes the various options that may appear on the program's command line. The elements of this array are structures with the defined type "opt_def" (defined in <optab.h>). Each of these structures has a string (named "opt_name") giving the associated option name and an unsigned integer (named "opt_flags") describing the form of the option. Each possible form has an associated symbolic name that is #defined in <optab.h>.

Below we give the symbolic names and the option form associated with each.

COMM_KWD
Command name keyword: this stands for the first token on the command line.
DASH_KWD
This stands for arguments of the form -Keyword.
PLUS_KWD
This stands for arguments of the form +Keyword.
TOGL_KWD
This stands for arguments that can be either +Keyword or -Keyword.
BLNK_KWD
This stands for arguments of the form Keyword (not marked by any special characters).
DVAL_KWD
This stands for an argument of the form Keyword=N. N must be an integer, and it will be interpreted as a decimal number.
OVAL_KWD
This is the same as DVAL_KWD, except that N will be interpreted as an octal number.
NVAL_KWD
This is similar to DVAL_KWD and OVAL_KWD. N will be interpreted as an octal number if it begins with a zero (e.g. 003); otherwise, it will be interpreted as a decimal number.
MNVL_KWD
Multiple numeric value keyword: this matches keywords of the forms
Keyword=number,number,number...
       or
Keyword="number number number ..."

For example,

Keyword=1,2,3,4,5
Keyword="1 2 3 4 5"

are equivalent. The numbers are interpreted as octal or decimal according to the first digit rule described above.

MDVL_KWD
Multiple decimal value keyword: this is similar to MNVL_KWD, but the numbers are interpreted as decimal.
MOVL_KWD
Multiple octal value keyword: this is similar to MNVL_KWD, but the numbers are interpreted as octal.
SVAL_KWD
String-valued keyword: this matches keywords of the form
Keyword="string"
      or
Keyword='string'

The quotes enclosing the string will be stripped as the command line is parsed.

The end of the "optable" array should be marked by a pair of zeroes.
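
The first-digit rule used by NVAL_KWD and MNVL_KWD can be illustrated with a short sketch. The helper names below are hypothetical and are not part of <optab.h>; they only show how the three numeric forms differ in the base they assume. How the set-up routines treat signs or malformed digits is not stated here, so the sketch leaves that to "strtol".

```c
#include <stdlib.h>

/* Hypothetical helper: convert the N in Keyword=N the way NVAL_KWD is
   described. A leading zero selects octal; anything else is decimal. */
long nval_to_long(const char *text)
{
    int base = (text[0] == '0') ? 8 : 10;
    return strtol(text, NULL, base);
}

/* DVAL_KWD always reads N as decimal, whatever the first digit is. */
long dval_to_long(const char *text)
{
    return strtol(text, NULL, 10);
}

/* OVAL_KWD always reads N as octal, whatever the first digit is. */
long oval_to_long(const char *text)
{
    return strtol(text, NULL, 8);
}
```

Under these assumptions the text "010" gives 8 for an NVAL_KWD or OVAL_KWD option and 10 for a DVAL_KWD option.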

Receiving the Parsed Command Line:

When a C program is invoked, a number of "set-up" routines perform various initialization operations before the user-written "main" function begins execution. One of the things these set-up routines do is parse the command line that invoked the program.

If there is no "optable" array (or if it is empty), the set-up routines pass two arguments to "main": "argc", an integer giving the number of tokens on the command line (including the command name itself); "argv", an array of char pointers that point to the individual tokens on the command line.

If there IS a non-empty "optable" array, the set-up routines create the same "argc" as above, but the form of "argv" is different. The "argv" vector is an array of structures of the type "opt_desc". Each of these structures contains the following fields.

unsigned opt_keynum;
indicates the keyword in "optable" that the corresponding command line token matched. Essentially, the Nth token on the command line matches
optable[ argv[N].opt_keynum - 1 ]
union {
char *opt_cp;
int *opt_ip;
} opt_data;
points to the data specified for Keyword=Value option forms. The type of the pointer depends on the type of the data (integer pointer for numeric values, character pointer for string values). If there are multiple values (as in Keyword=number,number,number), the pointer points to the beginning of an array that contains the list of values.
unsigned opt_type;
contains characters indicating the type of option this was. Possible values for "opt_type" are
'+'     -- +Keyword
'-'     -- -Keyword
'='     -- Keyword=value
'<'     -- <string
'>'     -- >string
'>>'    -- >>string

If a command line token does not match any of the possible keywords in "optable", the set-up routines look at the token and see if it is in one of the recognized token formats. If it is (e.g. +Something, even though "Something" is not recognized as a keyword in "optable"), "opt_keynum" will point one entry past the last element of "optable" and "opt_data" will point to the token (with quote marks stripped if necessary). If the token does not have a recognized form, "opt_keynum" will be zero and "opt_data" will point to the token (again with quotes stripped).
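
The three cases for "opt_keynum" can be told apart with a small test such as the one sketched below. The enum and function names are hypothetical. The sketch assumes that "one entry past the last element" means any value greater than the number of real entries in "optable" (not counting the terminating pair of zeroes).

```c
/* Hypothetical names for the three outcomes; not defined in <optab.h>. */
enum match_kind { NO_FORM, KNOWN_KEYWORD, UNKNOWN_KEYWORD };

/* nkeys is the number of real entries in "optable", excluding the
   terminating pair of zeroes. */
enum match_kind classify_keynum(unsigned keynum, unsigned nkeys)
{
    if (keynum == 0)
        return NO_FORM;          /* token had no recognized form */
    if (keynum > nkeys)
        return UNKNOWN_KEYWORD;  /* recognized form, keyword not in table */
    return KNOWN_KEYWORD;        /* token matched optable[keynum - 1] */
}
```

In the first and third cases "opt_data" points to the token text itself, so a program would normally read it through "opt_cp".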

An Example:

To show how this all goes together, let's define the following "optable".

opt_def optable[] = {
    "Verbose", TOGL_KWD,
    "Level", DVAL_KWD,
    "Tabs", MNVL_KWD,
    0,0    /* marks end of array */
};

This indicates that the program recognizes three keyword options. The first can be either +Verbose or -Verbose. The second has the form Level=N where N is taken as a decimal number. The third has one of the two forms

Tabs=N1,N2,N3,...
Tabs="N1 N2 N3 ..."

Suppose that the user invokes this program with a command line of the form

prog tabs=4,8 -verbose

Then the set-up routines would pass "main" an "argc" value of 3 (since there are three tokens on the command line) and an "argv" value that consists of 3 "opt_desc" structures. Below we show what each of these structures contains.

argv[0]:
    opt_keynum = 0; /* "prog" not in command list */
    opt_data.opt_cp = "prog";
    opt_type = COMM_KWD | BLNK_KWD;
      /* COMM_KWD because first on line;
         BLNK_KWD because no special format;
         OR'd together to show both */
argv[1]:
    opt_keynum = 3; /* third option in "optable" */
    opt_data.opt_ip = {4,8};
                    /* array containing values */
    opt_type = MNVL_KWD;
argv[2]:
    opt_keynum = 1; /* first option in "optable" */
    opt_data.opt_cp = NULL;
                   /* no value for keyword */
    opt_type = DASH_KWD;
                   /* indicates -Verbose */

In this way, the set-up routines have already done most of the work required to identify the tokens on the command line and extract the information that they contain.
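
A program might consume these structures roughly as sketched below. This is an illustration only. The "opt_desc" declaration and the value of DASH_KWD are stand-ins so that the fragment compiles by itself; a real program would take both from <optab.h>. The "settings" structure and the "apply_options" function are hypothetical. This description does not say how the number of values in a list such as Tabs=4,8 is reported, so the sketch only keeps the pointer.

```c
#include <stddef.h>

/* Stand-in declarations; the real ones come from <optab.h>.
   The numeric value of DASH_KWD here is a placeholder. */
#define DASH_KWD 0x2u

typedef struct {
    unsigned opt_keynum;
    union {
        char *opt_cp;
        int  *opt_ip;
    } opt_data;
    unsigned opt_type;
} opt_desc;

/* Hypothetical container for the example program's settings. */
struct settings {
    int  verbose;   /* 1 for +Verbose, 0 for -Verbose */
    int  level;     /* value from Level=N */
    int *tabs;      /* list from Tabs=..., built by the set-up routines */
};

/* Walk the parsed tokens, skipping argv[0] (the command name). */
void apply_options(int argc, const opt_desc *argv, struct settings *s)
{
    int i;

    for (i = 1; i < argc; i++) {
        switch (argv[i].opt_keynum) {
        case 1:  /* "Verbose" */
            s->verbose = (argv[i].opt_type != DASH_KWD);
            break;
        case 2:  /* "Level" */
            s->level = *argv[i].opt_data.opt_ip;
            break;
        case 3:  /* "Tabs" */
            s->tabs = argv[i].opt_data.opt_ip;
            break;
        default: /* zero or past the table: not one of our keywords */
            break;
        }
    }
}
```

For the command line shown above, this would leave "verbose" at 0, "tabs" pointing at the values 4 and 8, and "level" unchanged.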

Keyword Abbreviations:

When the set-up routines are determining whether a command line token matches a keyword in "optable", they use the "_abbrv" routine to see if the token is a valid abbreviation of the keyword. "expl c lib _abbrv" describes this in more detail, but we will give a rough outline here.

When you specify the keywords in "optable", you may put some letters in upper case and some in lower case. The letters in upper case must appear in the matching token, but any or all of the letters in lower case may be omitted. For example, the "Verbose" option mentioned above could be abbreviated to

v  vrb  vbose  vb

and so on. Similarly,

t=5,10

would be recognized as an abbreviation for the keyword "Tabs". The case of letters in the input token is ignored.
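
The rule outlined above can be sketched in a few lines of C. This is not the "_abbrv" routine itself, whose exact behaviour is described in "expl c lib _abbrv"; the function name "abbrev_matches" is hypothetical, and the sketch assumes that the letters of the token must appear in the same order as in the keyword. For a Keyword=value token, only the part before the "=" would be passed in.

```c
#include <ctype.h>

/* Hypothetical sketch of the abbreviation rule. Returns 1 if "token"
   is an acceptable abbreviation of "keyword", 0 otherwise. */
int abbrev_matches(const char *keyword, const char *token)
{
    /* Keyword exhausted: succeed only if the token is used up too. */
    if (*keyword == '\0')
        return *token == '\0';

    /* Try to pair the current keyword letter with the current token
       letter, ignoring the case of the token. */
    if (*token != '\0'
        && tolower((unsigned char)*keyword) == tolower((unsigned char)*token)
        && abbrev_matches(keyword + 1, token + 1))
        return 1;

    /* A lower case keyword letter may be omitted from the token. */
    if (islower((unsigned char)*keyword))
        return abbrev_matches(keyword + 1, token);

    /* An upper case (or non-letter) keyword character is required. */
    return 0;
}
```

With the keyword "Verbose", this sketch accepts "v", "vrb", "vbose" and "vb", and rejects "erbose" because the upper case "V" is missing.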

See Also:

expl b lib .bset

expl c lib _abbrv

Copyright © 1996, Thinkage Ltd.