YAA Assembler Reference Manual

Thinkage Ltd.
85 McIntyre Drive
Kitchener, Ontario
Canada N2R 1H6
Copyright © 2007 by Thinkage Ltd.

1. Introduction

The YAA assembler is a portable assembler, in the sense that it is not strongly tied to a particular machine or operating system. Its structure and its pseudo-ops are intended to support assembler programs for a variety of hardware types. Of course, the actual assembler operands will be those of a particular machine, but program structure is intended to be as system-independent as possible.

This version of the manual describes YAA as implemented for GCOS-8 running on DPS-8 machines. The output of YAA on this system is object code in a format recognized by the LD Link Editor (created by Thinkage Ltd.). LD can translate this object code into a format suitable for loading in the multi-segment GCOS-8 environment (i.e. into an OM) or into a format suitable for loading in the single-segment GCOS-8 environment (i.e. into a Bstar file). Note that some assembler instructions are unique to the multi-segment environment and may not be used in single-segment programs. The LD Link Editor is explained in "expl ld".

2. Basic Concepts

This chapter discusses the basic concepts for programming with YAA. We begin by describing the lexical elements of YAA, i.e. the pieces that go together to make up a YAA program. These pieces are often known as the tokens of the language.

2.1 Tokens

Tokens are the smallest meaningful pieces of input that the assembler recognizes. Tokens can be identifiers like

          x   lda   .data
constants like
          14    2.56    'ab'   4E-2
and string literals like
          "hello there"
Punctuation characters and operators like
          *    =    ++    !    (    )    [    ]
are each considered tokens as well.

2.2 Identifiers

Identifiers serve as names for items in a YAA program (e.g. data objects, opcodes, macros, etc.). Identifiers may be formed from the upper case letters 'A'-'Z', the lower case letters 'a'-'z', the digits '0'-'9', the underscore character '_', the dollar sign '$', and the dot '.'. The first character of an identifier may not be a digit.

Identifiers may be arbitrarily long. The case of letters in an identifier is significant; thus NAME, name, and Name are all different identifiers.

2.3 Integer Constants

There are several types of integer constants: decimal, octal, hexadecimal, binary, BCD, and ASCII. YAA's internal arithmetic is always performed with the same precision as the longest integer format on a particular machine. On the DPS-8, this means that integer arithmetic uses 36-bit integers. Of course, integer data objects in an assembled program may be given any format recognized by the machine.

2.3.1 Decimal Integers

A decimal integer is written as a sequence of digits with no leading zero, as in

          2400    5    87    1340932
The minus sign ('-') may be used to create negative integers, but it is actually considered an operator on the integer, not part of the constant itself.

2.3.2 Octal Integers

An octal integer is written as a sequence of digits with at least one leading zero. Only octal digits may be used (i.e. the digits '0' through '7'), as in

          01    007    000400    0777
Each octal digit represents three bits.

2.3.3 Hexadecimal Integers

A hexadecimal integer is written as a 0x or 0X followed by a sequence of hexadecimal digits. The hexadecimal digits are '0' through '9', plus the letters 'A' through 'F' standing for the (decimal) values from 10 to 15. Digits represented by letters may appear in either upper or lower case, as in

          0X10    0x10    0xc0BB    0XFFFF    0xFf
Each hexadecimal digit represents four bits.

2.3.4 Binary Integers

A binary integer is written as a 0b or 0B followed by a sequence of ones and zeroes, as in

          0b0000  0B1111  0B101010
Each binary digit represents one bit.
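
The prefix rules in Sections 2.3.1 through 2.3.4 can be summarized with a small Python sketch. This is only an illustrative model of the notation described above, not YAA's own code; the name parse_integer_constant is hypothetical.

```python
import re

# One pattern per notation, tried in order. Each pattern must match the
# whole constant. The number base is paired with its pattern.
_INTEGER_FORMS = [
    (re.compile(r"0[xX]([0-9a-fA-F]+)"), 16),  # 0x or 0X prefix
    (re.compile(r"0[bB]([01]+)"), 2),          # 0b or 0B prefix
    (re.compile(r"(0[0-7]*)"), 8),             # leading zero: octal
    (re.compile(r"([1-9][0-9]*)"), 10),        # no leading zero: decimal
]


def parse_integer_constant(text):
    """Return the value of a decimal, octal, hexadecimal, or binary constant."""
    for pattern, base in _INTEGER_FORMS:
        match = pattern.fullmatch(text)
        if match:
            return int(match.group(1), base)
    raise ValueError("not an integer constant: %r" % text)
```

For instance, parse_integer_constant("0777") and parse_integer_constant("511") yield the same value, while "089" is rejected because a leading zero admits only octal digits.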

2.3.5 ASCII Character Constants

An ASCII character constant consists of one or more ASCII characters enclosed in single quotes, as in

          'a'    '0'    'ab'    '.'

ASCII character constants are integer constants. They occupy the same amount of memory as other integer constants. Therefore the maximum number of characters that may be specified in an ASCII character constant is the maximum number of ASCII characters that may be stored in an integer. On the DPS-8, this is four ASCII characters.

When an ASCII character constant contains fewer characters than can be stored in an integer, the characters that are specified are right-justified within the integer and padded on the left with 0-bits.
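
The right-justification rule can be pictured with the following Python sketch. It assumes that each ASCII character occupies 9 bits (36 bits divided among four characters) and that the first character written ends up in the most significant position; both are assumptions made for illustration, and pack_ascii_constant is a hypothetical name.

```python
WORD_BITS = 36  # DPS-8 integer size, as stated above
CHAR_BITS = 9   # assumption: 36 bits shared by four ASCII characters


def pack_ascii_constant(chars):
    """Pack one to four characters, right-justified, into a 36-bit integer."""
    max_chars = WORD_BITS // CHAR_BITS
    if not 1 <= len(chars) <= max_chars:
        raise ValueError("expected 1 to %d characters" % max_chars)
    value = 0
    for ch in chars:
        # Shift earlier characters left and place the new one at the bottom.
        # Unused high-order positions simply stay zero.
        value = (value << CHAR_BITS) | ord(ch)
    return value
```

Under these assumptions, 'a' packs to octal 141 and 'ab' packs to octal 141142, with zero bits filling the rest of the word on the left.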

2.3.6 BCD Character Constants

A BCD character constant consists of one or more BCD characters enclosed in grave accents, as in

          `a`    `ab`    `012345`    `......`

BCD character constants are integer constants and occupy the same amount of memory as other integer constants. Therefore the maximum number of characters that may be specified in a BCD character constant is the maximum number of characters that may be stored in an integer. On the DPS-8, this is six BCD characters.

When a BCD character constant contains fewer characters than can be stored in an integer, the characters that are specified are right-justified within the integer and padded on the left with 0-bits.

2.4 Floating Point Constants

A floating point constant represents a number that may have a fractional part or exponent.

A floating point constant is written as a sequence of digits, optionally followed by a decimal point, optionally followed by more digits, optionally followed by an exponent consisting of E or e followed by a signed decimal integer. If the decimal integer in the exponent is positive, the '+' sign may be omitted. Either the decimal point or the exponent (or both) must be present for a constant to be interpreted as a floating point number. Here are some examples of floating point constants.

          2.3    1.    3E5    4.E-2    0.34e7
Notice that the first character of a floating point constant must be a digit; input like .5 is not a valid floating point constant.
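
As a rough check of the syntax just described, the rule can be written as a Python regular expression. This is a sketch of the description in this section only; YAA's actual scanner may differ in details, and the names FLOAT_CONSTANT and is_float_constant are hypothetical.

```python
import re

FLOAT_CONSTANT = re.compile(r"""
    [0-9]+                      # the constant must begin with a digit
    (?:
        \.[0-9]*                # decimal point, optionally more digits
        (?:[eE][+-]?[0-9]+)?    # optional exponent after the fraction
      |
        [eE][+-]?[0-9]+         # exponent with no decimal point
    )
""", re.VERBOSE)


def is_float_constant(text):
    """True if text has the floating point form described in this section."""
    return FLOAT_CONSTANT.fullmatch(text) is not None
```

All five examples above are accepted by this pattern, while .5 (no leading digit) and 14 (neither decimal point nor exponent) are not.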

Internally, YAA uses the floating point format with the longest precision available on a particular machine. On the DPS-8, this means double precision. Of course, floating point data objects in an assembled program may be given any format recognized by the machine.

If you attempt to store a floating point constant in a memory area that is not long enough to hold a floating point value, the rightmost bits of the constant will be discarded in order to truncate the constant to the required size.

2.5 ASCII Strings

An ASCII string consists of zero or more ASCII characters enclosed in double quotes, as in

          "This is a string."
          ""
          "The above is a null string."
The string consists only of the characters specified. NOTE to C programmers: there is no special character added to mark the end of the string (so no '\0').

2.5.1 Escape Sequences

Special input sequences may be used in strings and character constants to represent unusual characters. Such sequences are known as escape sequences.

An escape sequence consists of a backslash (\) followed by one or more other characters. In an ASCII string or character constant, an escape sequence represents a single ASCII character. In a BCD character constant, an escape sequence represents a single BCD character. Below we list the escape sequences recognized by YAA. Some of these are only appropriate as ASCII characters, while others can be used for both ASCII and BCD.

\a
The alert character, also known as the "bell", octal 007. When printed on a terminal, this character will make the terminal beep.
\b
The backspace character, octal 010. When printed on a terminal, this character moves the cursor backwards one position.
\f
The formfeed character, octal 014. When printed on a printer, this character moves to a new page.
\n
The new-line character, octal 012. When printed to a terminal, this character moves the cursor to the next line.
\r
The carriage return character, octal 015. When printed to a terminal, this character moves the cursor to the beginning of the current line.
\t
The horizontal tab character, octal 011. When printed to a terminal, this character moves the cursor to the next horizontal tab stop (if any).
\v
The vertical tab character, octal 013. When printed to a terminal, this character moves the cursor to the next vertical tab stop (if any).
\'
The single quote character. This is used when you want an ASCII character constant to contain a single quote character, as in '\'', a character constant consisting of a single quote. The escape sequence is needed so that the quote character is not interpreted as the quote that ends the character constant.
\"
The double quote character. This is used when you want an ASCII string to contain a double quote character, as in
        "He said, \"Hello there.\"\n"
\`
The grave accent. This is used when you want a BCD character constant to contain a grave accent.
\ (space)
The space character (backslash followed by a space). This sequence is only meaningful in continuation lines (see Section 3.4).
\ (tab)
The tab character in source code (backslash followed by a tab). This sequence is only meaningful in continuation lines (see Section 3.4).
\\
The backslash character. This is used to represent the actual backslash character inside strings and character constants, as in
        "The backslash character is \\"
\nnn
A backslash followed by one to three octal digits as in '\0', '\01', '\777'. This stands for the ASCII character whose octal representation is given by the specified digits (right-justified in a byte and padded on the left with 0-bits). For example, 'A' is equivalent to '\101'.
\xnnn
A backslash, followed by an 'x' (or 'X'), followed by one to three hexadecimal digits. This stands for the ASCII character whose hexadecimal representation is given by the specified digits (right-justified in a byte and padded on the left with 0-bits). For example, 'A' is equivalent to '\x41'.

Notice that escape sequences are written with several characters but only represent a single character. Therefore a character constant like

          '\n'
is a single character, even though it is written with two characters.
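
The following Python sketch decodes the ASCII escape sequences listed above into character codes. It is an illustration of the table, not YAA's implementation. The function name decode_escapes is hypothetical, and the sketch assumes that numeric escapes take as many digits as are available, up to three.

```python
SIMPLE_ESCAPES = {
    "a": 0o007, "b": 0o010, "f": 0o014, "n": 0o012,
    "r": 0o015, "t": 0o011, "v": 0o013,
    "'": ord("'"), '"': ord('"'), "`": ord("`"), "\\": ord("\\"),
    " ": ord(" "), "\t": ord("\t"),  # meaningful only in continuation lines
}
OCTAL_DIGITS = "01234567"
HEX_DIGITS = "0123456789abcdefABCDEF"


def _take_digits(text, start, allowed):
    """Return the index just past a run of at most three allowed digits."""
    end = start
    while end < len(text) and end - start < 3 and text[end] in allowed:
        end += 1
    return end


def decode_escapes(text):
    """Return the list of character codes denoted by text."""
    codes = []
    i = 0
    while i < len(text):
        if text[i] != "\\":
            codes.append(ord(text[i]))
            i += 1
            continue
        i += 1  # step over the backslash
        if i >= len(text):
            raise ValueError("backslash at end of text")
        ch = text[i]
        if ch in OCTAL_DIGITS:
            end = _take_digits(text, i, OCTAL_DIGITS)
            codes.append(int(text[i:end], 8))
            i = end
        elif ch in "xX":
            end = _take_digits(text, i + 1, HEX_DIGITS)
            if end == i + 1:
                raise ValueError("no hexadecimal digits after \\x")
            codes.append(int(text[i + 1:end], 16))
            i = end
        elif ch in SIMPLE_ESCAPES:
            codes.append(SIMPLE_ESCAPES[ch])
            i += 1
        else:
            raise ValueError("unknown escape sequence")
    return codes
```

With this sketch, the two-character input for the new-line escape decodes to a single code (octal 012), and the octal and hexadecimal forms of 'A' both decode to 65.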

3. Program Format

For the most part, YAA source code is "free-format". It differs from older assemblers in the following respects.

  1. More than one instruction may be given on an input line (if desired).
  2. A single instruction may be split over several input lines.
  3. Instruction fields need not begin at a particular position on the line.
  4. Extra white space may be added inside instructions to improve readability (e.g. to separate operands in an operand list).
  5. Blank lines may be inserted anywhere in the program to improve readability.
In the sections to come, we will describe the general format of YAA source code.

3.1 Terminology

YAA uses parentheses ( ) and square brackets [ ] to enclose various constructs. These will be grouped together under the name "Brackets" in this manual.

A Bracket-balanced token sequence is a sequence of tokens in which every opening Bracket has an appropriate closing Bracket of the same type, and Bracket pairs are properly nested. This rules out such constructs as

          A ( B        -- opening (, but no closing
          A (B [ C) ]  -- not properly nested

A Bracket-balanced token sequence may not contain a new-line character. Code blocks are described in Chapter 8.

The term white space refers to any sequence of blank or horizontal tab characters. Blanks and horizontal tabs are called white space characters.

Two consecutive tokens must be separated by white space if they would form a single token when joined together. For example, suppose your code had an identifier A followed by the floating point constant 1.0. They must be separated by at least one white space character, as in

          A 1.0
Without the white space character, they would read A1.0 which is a valid identifier token.

Tokens may always be separated by one or more white space characters, even if the white space is not necessary. Therefore an expression like

          A+B
could also be written
          A  +  B
if desired.

3.2 Statements

A statement consists of three fields: the label, the opcode, and the operand list.

3.2.1 Statement Labels

The statement label field consists of a Bracket-balanced token sequence, followed by a colon. The most common form of the statement label is

          identifier :
where the identifier is a normal YAA identifier. You may also have
          [ identifier,identifier, ... ] :
to put more than one label on the same statement. The labels are separated by commas and enclosed in square brackets.

The label field may also be an expression, provided that the result of the expression is a token sequence containing one or more comma-separated identifiers. For example,

          (X > 0) ? [A] : [B] :
is a conditional expression, as explained in Chapter 5. The value of this label is [A] if X is greater than 0, and [B] otherwise.

Statement labels are optional. A statement may consist only of a label, as in

          name:
This associates the given name with the current location (i.e. the value of the instruction counter).

Input of the form

          name1: name2: name3: ...
is invalid. To put several labels on the same statement, give a list enclosed in square brackets (as shown previously).

3.2.2 Opcodes

The opcode field consists of the shortest possible Bracket-balanced token sequence following the statement label (if the statement has a label). In general, this will be a machine opcode or a YAA pseudo-op, as in

          lda
          .null

However, it could also be an expression, as in

          (.unquote("lda"))
Notice that the above expression had to be enclosed in parentheses. If we had just written
          .unquote("lda")
the opcode would have been taken to be .unquote, since this is the shortest possible Bracket-balanced token sequence.

3.2.3 Operand Lists

Anything following the opcode field is taken to be part of a list of operands for the operation. Operands in the list are Bracket-balanced token sequences that are separated by commas. For example,

          A+B,NAME,[tok1,tok2]
is a list consisting of three operands:
          A+B
          NAME
          [tok1,tok2]
The comma inside the square brackets does not count as an operand separator because operands must be Bracket-balanced.

Remember that any amount of white space may be used in an operand list. Thus the above operand list could have been written as

          A + B, NAME, [tok1, tok2]
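
The splitting rule can be sketched in Python as follows. The sketch works on characters rather than tokens and ignores string and character constants, so it is a simplification for illustration only; split_operands is a hypothetical name.

```python
OPENERS = {"(": ")", "[": "]"}
CLOSERS = set(OPENERS.values())


def split_operands(operand_list):
    """Split an operand list at commas that lie outside all Brackets."""
    operands = []
    expected = []  # stack of closing Brackets still owed
    current = []
    for ch in operand_list:
        if ch in OPENERS:
            expected.append(OPENERS[ch])
        elif ch in CLOSERS:
            # A closing Bracket must match the most recent opening one.
            if not expected or expected.pop() != ch:
                raise ValueError("operand list is not Bracket-balanced")
        if ch == "," and not expected:
            operands.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if expected:
        raise ValueError("operand list is not Bracket-balanced")
    operands.append("".join(current).strip())
    return operands
```

Applied to the list above, the sketch returns the three operands A + B, NAME, and [tok1, tok2], and it rejects input such as A (B [ C) ] because the Brackets are not properly nested.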

The end of an operand list can be marked in several ways.

  1. With the semi-colon character (;). Thus
            tsx1 subroutine,modifier; tra error_rtn
    
    shows a line that contains two separate instructions. The semi-colon marks the end of the operand list of the first instruction, and therefore the end of the instruction itself. The second instruction starts after the semi-colon.
  2. At the new-line character which ends an input line. For example, in
            tra error_rtn
    
    the operand list (and therefore the instruction itself) ends at the end of the line.
  3. At a comment on the end of a line. We will discuss comments in Section 3.3.
Some types of statements require an operand list, others allow an operand list to be present or omitted, while still others result in errors if an operand list is present.

3.2.4 Empty Statements

An empty statement is a statement without opcode or operand list fields. For example, in

          tsx1 subroutine, modifier; ; tra error_rtn
there is an empty statement between the two other instructions. Such statements have no effect on generated code.

The empty statement form in the example above is not very useful. It is much more common to use empty statements which are simply blank lines between other instructions, as in


          tsx1 subroutine, modifier

          tra error_rtn
Such empty statements can be used to divide instructions into logical groups, making the program source code easier to read.

3.3 Comments

A YAA comment consists of a number sign (#), followed by zero or more other characters, followed by a new-line. In other words, a YAA comment begins at a '#' and extends to the end of the source code line. For example, you might have

          jump: tsx1  subroutine,modifier # Call subroutine
                tra   error_rtn           # Return location
                                          # if error
Notice that the final line of the above example does not have an opcode or operand list. Comments may occupy a line all on their own.

Number signs inside ASCII string or character constants do not count as the beginning of a comment. For example, in

          Str:  .data  "This song is in C#"  #Comment
the first number sign is part of the string. The comment does not begin until the second number sign.

Number signs inside token sequences may be confused with the beginning of a comment. In this case, put a backslash in front of the number sign, as in \#. This always indicates a literal number sign, not the start of a comment.

3.4 Continuation Lines

Statements may be split over several lines of source input. If a line of source input ends in a backslash, YAA will discard the backslash, the new-line character at the end of the source line, and any white space characters at the beginning of the next source line. For example,

          Str:   .data  \
                 "A string"
is changed to
          Str:   .data  "A string"
Similarly,
          Str:   .data  "A \
                 string"
is changed to
          Str:   .data  "A string"
as before. This shows how ASCII strings may be broken over more than one line of input.

In this kind of construction, the backslash doesn't have to be the last character on the line. It can be followed by any number of blanks or tabs. It can also be followed by at least one blank or tab, followed by a comment, as in

          Str:   .data  \    #Comment
                 "A string"
This is equivalent to
          Str:   .data "A string"
The continuation goes "around" the comment. However, you could not say
          Str:   .data  "A  \  #Comment
                 string"
because '#' is not recognized as the beginning of a comment when it is inside a string.

In this sort of construction, there must always be at least one blank or tab between the backslash and the '#'. Remember that the sequence \# is always interpreted as a literal number sign, not the beginning of a comment.

Notice that white space at the beginning of the second line is discarded. If you do not want YAA to discard white space at the beginning of a continued line, put a backslash in front of the first white space character that you want to be significant. For example,

          Str:   .data  "A \
          \       string"
is equivalent to
          Str:   .data  "A        string"
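
The basic joining rule of this section can be sketched in Python as shown below. The sketch handles only a backslash at the end of a line, optionally followed by blanks or tabs. Continuation around a comment and the backslash that protects leading white space are deliberately left out, and join_continuations is a hypothetical name.

```python
import re

# A backslash followed only by blanks or tabs up to the end of the line.
_CONTINUES = re.compile(r"\\[ \t]*$")


def join_continuations(lines):
    """Join source lines that end in a continuation backslash."""
    joined = []
    pending = None  # text of a statement that is still being continued
    for line in lines:
        if pending is not None:
            # Leading white space on a continued line is discarded.
            line = pending + line.lstrip(" \t")
            pending = None
        match = _CONTINUES.search(line)
        if match:
            pending = line[:match.start()]  # drop the backslash and the rest
        else:
            joined.append(line)
    if pending is not None:
        joined.append(pending)
    return joined
```

Both of the first two examples in this section reduce to the same single line under this sketch, whether the break falls before the string or inside it.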

3.5 Reserved Words

High level programming languages usually have reserved words, i.e. symbols (identifiers) which can only be used for purposes defined by the language. For example, in C, the keyword if may only be used to begin a statement; it may not be used as a variable name.

YAA is not quite this restrictive. Symbols may be reserved in some contexts, but unreserved in others. Most importantly, the machine opcodes are reserved words when they appear in the opcode field, but not when they are used in the label or operand fields. As an example, suppose the machine has an opcode named lda. This is a reserved word in the opcode field; you could not create an opcode macro with the same name. However, it is not reserved in the label or operand fields; you could create a variable with the same name, provided that you didn't intend to use the variable to stand for an opcode. This feature makes sure that code doesn't "break" if a new opcode is added to your machine's instruction set, conflicting with a variable in some existing program.

As a second example, the symbol du is a reserved word when it appears in an operand list as an instruction tag field on the DPS-8 machine. However, in other contexts, it is not reserved.

YAA is defined so that all contextually reserved words must be typed in lower case in source code. This means that all opcodes, pseudo-ops, special operands, etc. must appear in lower case. Users who prefer to write such symbols in upper or mixed case may create appropriate macros for the symbols. For example, you might create a macro named LDA which stands for the lda instruction. Macro definition is described in Chapter 8.

YAA's machine-independent reserved words (i.e. pseudo-ops and special symbols used by the assembler) all start with the dot (.) character. To avoid conflicts with such symbols, users should avoid creating names that begin with a dot. Note that other reserved words (e.g. names of opcodes on a particular machine) do not necessarily begin with a dot.

3.6 Initialization Files

The YAA assembler accepts a command line option of the form

          INITialization=file
where file is the name of a file. If this is present, YAA will read in the contents of the given file and assemble them before any other code. A typical initialization file may contain definitions for macros (see Chapter 8), options controlling listing format (see Chapter 9), and .search directives setting up search rules (see Chapter 8).

If there is no explicit initialization file specified on the command line, YAA will use a default initialization file. If you do not want to use this default, specify

          init=
on the command line (without any file name after the '='). See "expl yaa" for details.

4. Organization of Memory

The output produced by YAA is organized into sections. A section is a block of assembled material that should be considered as a unit for the purpose of linking. For example, the executable code of each function in a program might be considered to be a section. Similarly, each external data object is a section all on its own.

A section is either a data section or a code section. Data sections contain data objects and code sections contain executable code. Some operations (e.g. the .align pseudo-op described in Chapter 8) behave differently in data sections and code sections.

Sections may contain other sections. For example, a section may contain several external data objects which are themselves sections. Any level of nesting is allowed. If section A contains section B, A is said to be the parent section of B.

Sections may or may not have names. For example, a section that just contains an external data object has the name of the object.

4.1 External Symbols

For the most part, the symbols created in your program are simply for the convenience of assembly -- once the program has been assembled, they are forgotten. An external symbol is one whose name is retained until the program is linked. Special directives are needed to inform YAA that a particular symbol is external; these are explained in Chapter 8.

There are two types of external symbols.

  1. A SYMREF is an external symbol that is not defined in the code being assembled. For example, if your code calls a function that is not defined in your source code but will be linked in later, the function's name is a SYMREF. It is a reference to an external symbol.
  2. A SYMDEF is an external symbol that is defined in your program. This symbol may be referenced by other object units. For example, your source code may define a function which can then be linked in with other programs that reference the function.
External symbols are always the names of sections or of fixed offsets within sections.

4.2 Section Offset Modes

When a name is associated with a location in a section, the location is represented by an offset from the beginning of the section. Thus a location name has two associated quantities: an identifier indicating the section that contains the location; and an offset. This offset may be expressed in bits, bytes, or words.

At the time a section is created, YAA must decide whether subsequent location names should represent offsets in bits, bytes, or words. This means choosing the offset mode for the section. Possible offset modes are represented by the keywords bit, byte, and word. The default offset mode for GCOS-8 is word, meaning that subsequent location names are associated with word offsets from the beginning of the section. Different offset modes can be specified on the statement that creates the section.

The offset mode for a section can be changed part way through the section using the .usage pseudo-op (described in Chapter 8). After the offset mode has been changed, subsequent location names will be associated with offsets in the new units. However, the offsets will still be from the beginning of the section.

4.3 The Offset Mode of a Symbol

When a program defines a symbol referring to a location, the symbol is associated with an integer representing the offset of the location from the beginning of the section. The offset is measured in the units dictated by the section's current offset mode.

For example, suppose the current offset mode of a section is byte and that X is used to name a location in that section. X will be associated with an integer giving the byte offset of the location in the section. Now consider what happens if you try an instruction like

          lda X

X is just a number (giving a byte offset in a section). But the lda instruction expects a word address. To resolve the conflict here, you must convert the byte offset X into words, as in

          lda X/4

In one sense then, a symbol associated with a location in a section is just an integer. In another sense, however, the symbol has its own "offset mode" inherited from the section's offset mode -- the symbol represents an offset in specific units.

4.3.1 The Type of a Symbol

YAA keeps track of each symbol's offset mode. This is considered to be the type of the symbol. For example, if a symbol is defined in a section that has a byte offset mode, that symbol is taken to have the byte type.

YAA will warn you if you try to create expressions which mix symbols of different types. For example, consider

              .usage word
          X:  .data  0
              .usage byte
          Y:  .data 0
              lda   X+Y,du
In this example, X has the word type and Y has the byte type. YAA will therefore warn you about mixing types in the lda instruction.

Note that YAA does not warn you about using the wrong type of operand inside a machine instruction. For example, you can write

          lda   Y,du
even though Y is a byte offset and lda expects a word offset. YAA accepts this without a warning; to obtain a correct word offset, you would have to use an explicit type cast, as discussed in the next section.

4.3.2 Type Casting

You can cast symbols to different types using an expression of the form

          type :: symbol
For example,
              .usage byte
          Y:  .data  0
              lda   (word::Y),du
shows how to change a byte offset into a word offset when necessary.

You may also use the :: operator to cast the type of an expression. For example,

          word::(4+4)
represents an offset of eight words.

Cast operators associate from right to left, so that

          byte :: word :: (4+4)
is equivalent to
          byte :: (word::(4+4))
This is evaluated in two steps. word::(4+4) represents a word offset of 8. The byte:: then converts this to a byte offset, so the final result is a byte offset of 32 (which is equivalent to a word offset of 8).

4.3.3 The none Type

The reserved word none represents values with no type. For example, a constant expression would have the none type; such expressions are just numbers, not offsets. If you use such a value in a context where an offset is expected, YAA assumes that the numeric value represents an offset of the correct type. For example, consider

          lda  3+2,du
The result of 3+2 is 5 and its type is none. Since lda expects this operand to be a word offset, the expression denotes a word offset of 5.

You may use the none type in :: cast operations. For example,

          none::X
just stands for the value of X; the type of the result is none. To see what effect this has, consider
          byte :: none :: word :: 8
The expression word::8 stands for a word offset of 8; using none:: results in just the number 8; finally, applying byte:: gives a byte offset of 8. Contrast this with
          byte :: word :: 8
where word::8 is a word offset of 8 and byte:: converts this to a byte offset of 32.
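
The behaviour of the :: operator, including the none type, can be modelled with a few lines of Python. This is a sketch for illustration only. It assumes 9-bit bytes (four to a 36-bit word, consistent with the X/4 conversion shown earlier) and rounds down when a conversion is not exact, which is an assumption; the names Typed and cast are hypothetical.

```python
from collections import namedtuple

BITS_PER_UNIT = {"bit": 1, "byte": 9, "word": 36}  # byte size is an assumption

# A number tagged with its offset mode. unit None stands for the none type.
Typed = namedtuple("Typed", ["value", "unit"])


def cast(unit, operand):
    """Model `unit :: operand`, where unit is "bit", "byte", "word", or None."""
    if not isinstance(operand, Typed):
        operand = Typed(operand, None)  # plain numbers have the none type
    if unit is None or operand.unit is None:
        # Casting to or from none keeps the number and changes only the tag.
        return Typed(operand.value, unit)
    bits = operand.value * BITS_PER_UNIT[operand.unit]
    return Typed(bits // BITS_PER_UNIT[unit], unit)  # rounding down: assumption
```

Here cast("byte", cast("word", 8)) gives a byte offset of 32, while cast("byte", cast(None, cast("word", 8))) gives a byte offset of 8, matching the two expressions contrasted above.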

4.4 The Instruction Counter

As statements are assembled, the generated code is stored in a section. A value called the instruction counter or IC measures how much code has been placed into the section. (Note that YAA's instruction counter is an artificial construct that is local to a section, and has no direct relation to the hardware's instruction counter.) The offset mode of the section determines the units in which the IC measures quantity of code (bits, bytes, or words).

If the offset mode of a section is changed, the IC will measure code in the new units. For example, suppose the offset mode starts out at word. The IC indicates the present location in the section, expressed as a word offset from the beginning of the section. If the offset mode changes to byte, the IC will then indicate the present location in the section, expressed as a byte offset from the beginning of the section.

The value of the IC may be obtained via the .ic function, described in Chapter 7. This may be used anywhere in the program to represent a location in a section (represented as an integer offset into the section). The asterisk '*' can also be used to refer to the value of the IC. For example, the TRA (transfer) instruction

          tra  *+2
jumps to the location two units beyond the current value of the IC. The units used are given by the offset mode of the section. For example, if the offset mode is word, *+2 is two words beyond the current value of the IC.

TECHNICAL NOTE: internally, the IC is always kept as a bit offset. The value of the IC is converted to words or bytes (as appropriate) whenever it is used in source code.
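
A minimal Python model of this arrangement is sketched below: the counter is stored in bits and converted to the section's current offset mode whenever it is read. The class and method names are hypothetical, and the 9-bit byte is an assumption.

```python
BITS_PER_UNIT = {"bit": 1, "byte": 9, "word": 36}  # byte size is an assumption


class SectionCounter:
    """Instruction counter for one section, held internally as a bit offset."""

    def __init__(self, mode="word"):
        self.bits = 0
        self.mode = mode

    def emit(self, count, unit):
        """Record that count units of code or data were assembled."""
        self.bits += count * BITS_PER_UNIT[unit]

    def usage(self, mode):
        """Change the offset mode. The stored bit offset is unaffected."""
        self.mode = mode

    def value(self):
        """Current location in the units of the present offset mode."""
        return self.bits // BITS_PER_UNIT[self.mode]
```

After three words have been emitted in word mode, value() is 3, so an operand written as *+2 would denote word offset 5. If the mode is then changed to byte, the same location reads as 12.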

4.4.1 Instruction Alignment

All instructions automatically force the alignment they require. For example, the RPD instruction on GCOS-8 requires odd-word alignment. YAA will automatically generate NO-OP instructions to align RPD instructions in this way. For more on instruction alignment, see Chapter 8.

4.5 Relocation and the Link Editor

The link editing process groups all the sections of a program into one or more segments. On DPS-8 architecture machines, the segment is the basic organizing unit for memory. Each segment is referred to using a 12-bit quantity called a SEGID. A location in a program is completely specified by determining the SEGID of the segment that contains the location and the offset of the location from the beginning of that segment.

Since sections are not collected into segments until link editing, addresses cannot be fully determined at assembly time -- YAA cannot calculate either the SEGID of a location or the offset of that location from the beginning of the segment. Therefore in addition to assembled machine code, YAA outputs "pseudo-addresses" and relocation instructions telling the link editor how to resolve these pseudo-addresses into real address formats.

The format YAA uses for pseudo-addresses is transparent to the programmer -- the link editor does all the work of conversion. However, the programmer should be aware of the relocation instructions recognized by the link editor. Each of these is identified by a keyword.

bit
A bit relocation instruction indicates that the associated pseudo-address should be converted to a bit offset from the beginning of the segment containing the address. Negative bit offsets are allowed. The low order bits of a bit offset must fill the lower half of a machine word and may extend any number of bits into the upper half of the same word if necessary. Thus a bit offset must be at least 18 bits long and can be as much as 36 bits long.
byte
A byte relocation instruction indicates that the associated pseudo-address should be converted to a byte offset from the beginning of the segment containing the address. Negative byte offsets are allowed. The low order bits of a byte offset must fill the lower half of a machine word and may extend any number of bits into the upper half of the same word if necessary. Thus a byte offset must be at least 18 bits long and can be as much as 36 bits long.
word
A word relocation instruction indicates that the associated pseudo-address should be converted to a word offset from the beginning of the segment containing the address. Negative word offsets are allowed. There are two types of word relocation instructions: upper and lower. Upper word relocation asks the link editor to store the (relocated) word offset in the upper half of a word. This kind of offset is exactly 18 bits long. Lower word relocation asks the link editor to store the (relocated) word offset in the lower half of a word, possibly extending into the upper half of the same word if space is allotted. Thus a lower word offset must be at least 18 bits long and can be as much as 36 bits long.
pointer
A pointer relocation instruction indicates that the associated pseudo-address should be converted into a machine pointer. A machine pointer is always a single machine word, and therefore 36 bits long. The top 18 bits contain the word offset of the location (from the beginning of a segment), the next two bits give a byte offset within the word, the next four bits give a bit offset within the byte, and the bottom 12 bits give the SEGID of the segment that contains the location.
segid
A segid relocation instruction indicates that the associated pseudo-address should be converted to the SEGID of the segment that contains the location. SEGIDs are always regarded as positive numbers. There are two types of segid relocation instructions: upper and lower. Upper segid relocation asks the link editor to store the appropriate SEGID in the bottom 12 bits of the upper half of a machine word. Lower segid relocation asks the link editor to store the appropriate SEGID in the bottom 12 bits of the lower half of a machine word. More than 12 bits can be allocated to hold a SEGID, but only 12 bits are used and those must be in the bottom of a half-word.
All of the above relocation instructions can be explicitly requested by the programmer. There is another type of relocation instruction that cannot be explicitly requested. When you use an offset from an ar register, YAA generates a pseudo-address that the link editor will turn into a 15-bit word offset from whatever value is stored in the ar register. This offset will be stored in the bottom 15 bits of the upper half of a machine word. Up to 18 bits can be allocated for the offset, but only 15 bits will be used.
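
The machine pointer layout given under the pointer keyword (18-bit word offset, 2-bit byte offset, 4-bit bit offset, 12-bit SEGID) can be sketched in Python as follows. This only models the bit packing; it does not reflect how the link editor actually builds pointers, and the function names are hypothetical.

```python
def pack_pointer(word_offset, byte_offset, bit_offset, segid):
    """Pack the four fields into one 36-bit value, most significant field first."""
    fields = ((word_offset, 18), (byte_offset, 2), (bit_offset, 4), (segid, 12))
    pointer = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        pointer = (pointer << width) | value
    return pointer


def unpack_pointer(pointer):
    """Split a 36-bit pointer into (word_offset, byte_offset, bit_offset, segid)."""
    return (
        (pointer >> 18) & 0o777777,  # top 18 bits
        (pointer >> 16) & 0o3,       # next 2 bits
        (pointer >> 12) & 0o17,      # next 4 bits
        pointer & 0o7777,            # bottom 12 bits
    )
```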

In general, you do not have to specify a specific relocation type for a relocatable value -- YAA uses default types which are usually what you want. The default is dictated by the context in which the relocatable value is used. For example, the GCOS-8 lda instruction expects a word offset as an argument; therefore YAA will generate an (upper) word relocation instruction for lda's argument.

5. Expressions

YAA expressions may be used as labels, as opcodes, or as operands. While most expressions are likely to be simple (e.g. just the name of an opcode or a data object), YAA does allow the use of complicated expressions.

5.1 Expression Types

Every expression has a type. The type of an expression depends on the type of values used in the expression and the operations performed on those values. The possible types are described in the sections that follow.

5.1.1 Integer Expressions

An integer expression yields an integer value that does not stand for a location. Remember that this value will be expressed using the longest integer format available on the machine for which the program is assembled. On the DPS-8, this is a 36-bit integer.

There are a large number of expressions that yield an integer value: expressions that perform arithmetic with integer values; comparison expressions (e.g. A>B which yields 1 if A is greater than B and 0 otherwise); logical and bit manipulation operations; and a number of special operations (e.g. an operation that determines the length of a string). All of these expressions will be discussed later in the chapter.

5.1.2 Location Expressions

The result of a location expression represents a memory location. A location value may be absolute (in which case it is just an integer standing for a bit, byte, or word offset from some other location) or relocatable (in which case it consists of an integer offset and one or more linkable symbols, i.e. symbols whose locations will be determined by the link editor). The most general form for relocatable values is

          I + LS1 - LS2 + LS3 ...
where I is an integer indicating a constant positive or negative offset, and LS1, LS2, LS3, etc. are all linkable symbols.

Notice that there is only one integer in a relocatable value. All other parts are linkable symbols. When an expression combines several relocatable values in any way, all the integer parts will be gathered together to yield a single integer offset. For example, if you subtract one location value from another, YAA will subtract the two integer parts first, and then calculate the remaining "linkable" part.

When performing arithmetic on the integer offsets of relocatable values, YAA pays no attention to the units of the offsets -- they're just integers. Thus if an expression contains offsets in different units, you must do the conversions yourself.
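
The general form I + LS1 - LS2 + ... can be modelled as one integer plus a signed count for each linkable symbol. The Python sketch below is only an illustration of how the integer parts are gathered together; the class name Reloc and its representation are hypothetical and say nothing about YAA's internal data structures.

```python
class Reloc:
    """A relocatable value: an integer offset plus signed linkable symbols."""

    def __init__(self, offset=0, symbols=None):
        self.offset = offset
        # Drop symbols whose contributions have cancelled out.
        self.symbols = {n: c for n, c in (symbols or {}).items() if c != 0}

    def _combine(self, other, sign):
        if isinstance(other, int):
            other = Reloc(other)
        symbols = dict(self.symbols)
        for name, count in other.symbols.items():
            symbols[name] = symbols.get(name, 0) + sign * count
        # The integer parts are combined as plain integers; units are ignored.
        return Reloc(self.offset + sign * other.offset, symbols)

    def __add__(self, other):
        return self._combine(other, +1)

    def __sub__(self, other):
        return self._combine(other, -1)

    def is_absolute(self):
        return not self.symbols
```

With this model, subtracting Reloc(3, {"LS1": 1}) from Reloc(8, {"LS1": 1}) leaves an absolute value with offset 5, because the linkable parts cancel.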

Operations with relocatable values may be restricted by the link editor's ability to resolve such operations.

5.1.3 Floating Point Expressions

A floating point expression yields a floating point value as its result. In this version of YAA, floating point expressions may contain addition, subtraction, multiplication, division, and the .max and .min functions (described in Chapter 7).

5.1.4 String Expressions

A string expression yields a string as its result. For example,

          .concat("abc","def")
is a string expression whose result is the concatenation of the two strings (i.e. the string "abcdef"). Other expressions with string results are described later in this chapter.

5.1.5 Token Sequence Expressions

A token sequence expression yields a sequence of tokens as its result. This sequence consists of zero or more tokens. The way in which these tokens are used depends on their context in the program being assembled.

As an example of a token sequence expression, the .unquote operator takes a string expression as its argument and returns a token sequence consisting of the tokens that appear in the string. This means that

          .unquote("1+2")
yields the token sequence
          1 + 2
This could then be re-evaluated as an expression using the .eval function (described in Chapter 7); the final result would be the value 3.

5.2 Immediate Expressions

An immediate expression is one that can be evaluated as soon as YAA encounters it in source code (on YAA's "first pass"). In general, this means that the operands in an immediate expression can only be constants, YAA variables (as described in Chapter 8), and symbols whose value has been previously determined. An immediate expression cannot refer to symbols whose value cannot yet be determined (e.g. symbols defined later in the program or outside the program).

An immediate expression can have any type (integer, string, location, etc.). Some opcodes and pseudo-ops require that their arguments be immediate expressions. In addition, some of the expressions described in this chapter require that sub-expressions be immediate.

Some expressions can be evaluated immediately, simply because of their forms. For example,

          A * 0
is always zero, regardless of the value of A. Similarly,
          A - A
is always zero. YAA doesn't even bother to determine the value of A in cases like these, since the value of A is not relevant to the result. In fact, the expressions will be evaluated properly even if A is never defined.

5.3 Order of Evaluation of Expressions

When an expression consists of several sub-expressions, the operations in the expression are evaluated according to a fixed order of precedence. For example, multiplication operations take place before addition operations (as in conventional arithmetic). The standard order of evaluation may be changed using parentheses in the usual way.

Some operations share the same precedence (e.g. addition and subtraction). The set of all operations with a given precedence forms a precedence class. An operator X is said to have a higher precedence than an operator Y if operation X is performed before Y.

Each precedence class has its own binding. The binding tells the order in which operations of the class are performed. A class may bind right-to-left or left-to-right. For example, the addition/subtraction class binds left to right, which means that in the expression

          A - B + C
the left operation (subtraction) is performed before the right operation (addition).

In the sections to come, we will describe all the operations of YAA in order of precedence, from highest to lowest. Operations of equal precedence will be described as subsections of a common section.

5.4 Primary Expressions

Primary expressions have the highest precedence of evaluation. They are evaluated from left to right. Primary expressions have one of the following forms.

          identifier
          integer_constant
          floating_point_constant
          string_constant
          ( expression )
          [ token_sequence ]
          expression [ int_expression ]
          function ( expression )

The value of an identifier in an expression depends on the definition of the identifier, as described in later chapters. The value of an integer or floating point constant is just the constant's numeric value. The value of a string constant depends on the operation in which the string appears, as described in later sections. The value of a parenthesized expression is the value of the expression inside the parentheses. The values of the other expressions listed above are described in the subsections that follow.

5.4.1 The Token Sequence Operator

A sequence of tokens enclosed in square brackets is a token sequence expression whose value is the enclosed sequence of tokens. For example, the value of

          [ tok1, tok2 ]
is the sequence of three tokens
          tok1
          ,
          tok2
Notice that the comma is a separate token; it is not a delimiter separating the other tokens.

The token sequence operator may not contain a new-line character or a semicolon as one of the tokens in the sequence.

In the rest of this manual, we will usually write token sequence values by enclosing the token sequence in square brackets.

5.4.2 The Subscripting Operation

The first type of subscripting operation has the form

          string_expression [ int_expression ]
where string_expression is an expression yielding a string result and int_expression is an expression yielding an integer result. To evaluate this expression, YAA first evaluates the integer expression to get an integer. (This integer is called the subscript.) YAA then obtains the corresponding character from the string. The character at the beginning of a string has a subscript of 0, the next has a subscript of 1, and so on.

The result of subscripting a string is the character obtained from the string. This character is expressed as an integer value, and therefore subscripting a string is an integer expression. As an example, the result of

          "abc"[0]
is the integer value 'a'. The result of
          "abc"[2]
is the integer value 'c'.

Note that any string expression may precede the square brackets. For example,

          ( .concat("abc","def") )[4]
has the value 'e', since the result of the .concat operator is the concatenated string "abcdef".

The second form of the subscripting operation is

          token_sequence_expr [ int_expression ]
The expression before the square brackets is one that yields a token sequence. This token sequence should consist of a list of Bracket-balanced subsequences, separated by commas. YAA splits this into Bracket-balanced token sequences at the commas. For example, the token sequence
          [a(b,c),d[(e)],f]
would be split into the subsequences
          a(b,c)
          d[(e)]
          f
Notice that commas inside Brackets do not count as list separators.

The result of the subscripting operation is the subsequence selected by the subscript. The subsequence at the beginning of the sequence has a subscript of 0, the next subsequence has a subscript of 1, and so on. As an example, the value of the expression

          [lda y,ldq x][0]

is the token sequence

          [lda y]
The value of
          [ldx0,ldx1,ldx2,ldx3][2]
is ldx2. As a more complicated example,
          [ [a,b],[c,d] ] [1]
gives the token sequence we could write as
          [ [ c , d ] ]
This sequence consists of five tokens:
          [
          c
          ,
          d
          ]
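
The splitting rule for token sequence subscripts can be sketched in Python. This is a simplified model, not YAA's implementation: the tokenizer is a rough approximation of YAA's lexical rules, only parentheses and square brackets are treated as Brackets, and the function names are hypothetical.

```python
import re

# Rough tokenizer: identifiers, digit strings, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_$.][A-Za-z0-9_$.]*|\d+|\S")
OPENERS = {"(", "["}
CLOSERS = {")", "]"}


def tokenize(text):
    return TOKEN_RE.findall(text)


def split_at_commas(tokens):
    """Split a token list at commas that are not nested inside Brackets."""
    parts, current, depth = [], [], 0
    for tok in tokens:
        if tok in OPENERS:
            depth += 1
        elif tok in CLOSERS:
            depth -= 1
        if tok == "," and depth == 0:
            parts.append(current)
            current = []
        else:
            current.append(tok)
    parts.append(current)
    return parts


def subscript(tokens, index):
    """Return the subsequence selected by a zero-based subscript."""
    return split_at_commas(tokens)[index]
```

Under these assumptions, subscript(tokenize("[a,b],[c,d]"), 1) returns the five tokens [, c, comma, d, ] described above.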

5.4.3 Function Calls

An expression of the form

          function_name ( expression, expression, ...)
is a function call. The expressions inside the parentheses are the arguments of the function. Arguments are separated by commas and must be Bracket-balanced. The functions recognized by YAA are described in Chapter 7.

5.5 Unary Operators

Unary operators are evaluated after primary expressions. They are evaluated from right to left. Recognized unary operators are

          + numeric_expression
          - numeric_expression
          ! int_expression
          ~ int_expression
          ++ int_variable
          -- int_variable
          int_variable ++
          int_variable --

5.5.1 Unary Plus and Minus

The unary plus (+) and minus (-) operators perform the usual operations on integer, floating point, and location expressions. For example, the result of -i is the value with the same magnitude as i but the opposite sign.

5.5.2 Logical Negation

The logical negation operator '!' may be applied to any integer expression. The result of !I is 0 if I is non-zero, and 1 if I is zero.

5.5.3 Bitwise Complement

The bitwise complement operator '~' (tilde) may be applied to any integer expression. The result of ~I is an integer that has a 1-bit wherever I has a 0-bit, and a 0-bit wherever I has a 1-bit. For example, ~0777000777000 is 0000777000777.
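
The example above can be checked with a small Python model. Python integers are unbounded, so the result has to be masked to 36 bits by hand; the function name is hypothetical, and the constants are written in Python's octal notation on the assumption that a leading 0 in the manual's example denotes octal.

```python
MASK36 = (1 << 36) - 1  # thirty-six 1-bits


def complement36(value):
    """Flip every bit of a 36-bit integer."""
    return ~value & MASK36
```

With this model, complement36(0o777000777000) evaluates to 0o000777000777.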

5.5.4 Auto-increment

++ is called the auto-increment operator. It may only be applied to YAA variables (described in Chapter 8).

To evaluate the expression

          ++int_variable
YAA first adds 1 to the current value of the given integer variable. The result of the expression is the resulting value of the variable.

The result of the expression

          int_variable++
is the current value of the given integer variable. After this value is obtained, YAA adds one to the variable. In other words, when the ++ appears before its argument, the variable is incremented before the result of the expression is obtained; when the ++ appears after its argument, the variable is incremented after the result of the expression is obtained.

For example, suppose the variable X currently has a value of 3. The result of the expression

          ++X
is 4 and after the expression is evaluated, X will also have a value of 4. If Y currently has a value of 10, the result of
          Y++
is 10, but the value of Y after the expression has been evaluated will be 11.

5.5.5 Auto-decrement

The auto-decrement operator (--) works like the auto-increment operator, except that a value of 1 is subtracted from the argument instead of added.

          --int_variable
subtracts 1 from the given variable and returns the resulting value as its result.
          int_variable--
obtains the current value of the given variable as the result of the expression, then subtracts one from the variable's value.
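
The difference between the prefix and postfix forms of auto-increment and auto-decrement can be modelled in Python, which has no such operators of its own. The class and method names below are hypothetical and exist only for this illustration.

```python
class IntVariable:
    """A stand-in for a YAA integer variable."""

    def __init__(self, value):
        self.value = value

    def pre_increment(self):   # models ++X
        self.value += 1
        return self.value

    def post_increment(self):  # models X++
        old = self.value
        self.value += 1
        return old

    def pre_decrement(self):   # models --X
        self.value -= 1
        return self.value

    def post_decrement(self):  # models X--
        old = self.value
        self.value -= 1
        return old
```

Starting from IntVariable(3), pre_increment() returns 4 and leaves the variable at 4; starting from IntVariable(10), post_increment() returns 10 and leaves the variable at 11.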

5.6 Multiplicative Operators

The multiplicative operators follow the unary operators in order of precedence. They are evaluated from left to right. The multiplicative operators are

          expression * expression
          expression / expression
          int_expression % int_expression

5.6.1 Multiplication

The * operator represents normal multiplication. For example, the result of 5*2 is 10. The arguments must be integer expressions. YAA does not take note of arithmetic overflow -- for example, if the true result of an integer multiplication cannot be represented in the long integer format, the actual result returned will be reduced modulo 2**36.

YAA allows the special construction of multiplying an integer times a location expression, as in

          4*A
This is equivalent to adding the location expression the given number of times. For example, the above is equivalent to
          A + A + A + A
Multiplying by a negative integer is equivalent to subtracting the location the given number of times. For example, the following are equivalent.
          -3*A
          - A - A - A
In expressions of this form, the integer must be in the range from -10 to 10 (inclusive).
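
Multiplying a location by a small integer can be modelled as repeated addition or subtraction. In the Python sketch below, a location is represented as a pair of an integer offset and a dictionary of signed symbol counts; this representation and the function name are assumptions made for the example, not a description of YAA internals.

```python
def multiply_location(n, location):
    """Add (or, for negative n, subtract) a location to itself |n| times.

    location is a pair (offset, {symbol_name: signed_count}).
    """
    if not -10 <= n <= 10:
        raise ValueError("multiplier must be in the range -10 to 10")
    offset, symbols = location
    step = 1 if n >= 0 else -1
    total_offset, total_symbols = 0, {}
    for _ in range(abs(n)):
        total_offset += step * offset
        for name, count in symbols.items():
            total_symbols[name] = total_symbols.get(name, 0) + step * count
    return total_offset, total_symbols
```

For a location A with offset 2, multiply_location(4, (2, {"A": 1})) yields (8, {"A": 4}), which corresponds to A + A + A + A.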

5.6.2 Floating Point and Integer Division

The / operator represents normal division. If either operand is floating point, both operands will be converted to floating point and floating point division will be used. For example, the result of 3.0/2 is 1.5. If both operands are integers, integer division will be used. For example, the result of 14/7 is 2. If division of integers is not exact, the result is truncated towards zero. For example, the result of 5/3 is 1, while the result of -5/3 is -1.

5.6.3 Integer Remainder

The % operator represents the integer remainder operation, also known as the "modulo" operation. The arguments must be integer expressions. If A and B are positive, A%B is the remainder obtained when A is divided by B (A modulo B). More generally, A%B is defined so that

          ((A/B) * B) + (A%B)
is equal to A.
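
The division and remainder rules can be sketched in Python. Note that Python's own // and % operators round towards negative infinity, so they do not match the truncating behaviour described here; the helper names below are hypothetical.

```python
def trunc_div(a, b):
    """Integer division that truncates towards zero."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def trunc_rem(a, b):
    """Remainder defined so that trunc_div(a, b) * b + trunc_rem(a, b) == a."""
    return a - trunc_div(a, b) * b
```

With these definitions, trunc_div(-5, 3) is -1 and trunc_rem(-5, 3) is -2, and (-1 * 3) + (-2) gives back -5.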

5.7 Additive Operators

The additive operators follow the multiplicative operators in order of precedence. They are evaluated from left to right. The additive operators are

          expression + expression
          expression - expression

5.7.1 Addition

The binary + operator is used to indicate a variety of operations, depending on the types of its two arguments.

When both arguments are integer expressions, the result is the integer sum of the two arguments. If the addition results in overflow (greater than the largest representable integer or smaller than the most negative such integer), the result is reduced modulo 2**36.
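
Reduction modulo 2**36 can be modelled in Python, whose integers never overflow on their own. The helper names are hypothetical, and reading the wrapped value back as a signed number assumes a two's complement representation.

```python
MODULUS = 2 ** 36


def wrap36(value):
    """Reduce an arbitrary integer modulo 2**36 (result in 0 .. 2**36 - 1)."""
    return value % MODULUS


def as_signed36(value):
    """Interpret the wrapped value as a signed 36-bit integer
    (assumes two's complement)."""
    value = wrap36(value)
    return value - MODULUS if value >= MODULUS // 2 else value
```

Under this model, adding 1 to the largest positive value 2**35 - 1 wraps around to -2**35.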

If either of the arguments is a floating point value, both values will be converted to floating point and the result will be a floating point value.

Integer expressions may be added to location expressions. The integer will be taken to represent an offset in the same units as the offset mode of the location expression. For example, if you add 3 to a word offset, the 3 is taken to mean three words.

5.7.2 Subtraction

The binary - operator denotes subtraction. Both arguments may be integer expressions, in which case the result is the arithmetic difference of the two expressions. If the operation overflows, the result of the operation will be the true result of the subtraction, reduced modulo 2**36.

If either of the arguments is a floating point value, both values will be converted to floating point and the result will be a floating point value.

Subtraction operations may also have one simple integer argument and one location argument. The rules for this operation are similar to the rules for adding a simple integer and a location.

Finally, two location values representing offsets may be subtracted from one another.

5.8 Shift Operators

The shift operators follow the additive operators in order of precedence. They are evaluated from left to right. The shift operations are

          int_expression << int_expression
          int_expression >> int_expression

5.8.1 Left Bit Shifts

The arguments of the << operator must be integer expressions. The result of A<<B is the value of A with its bits shifted B positions to the left.

If the right argument is negative or greater than the number of bits in an integer value, the result of the operation is undefined.

5.8.2 Right Bit Shifts

The arguments of the >> operator must be integer expressions. The result of A>>B is the value of A with its bits shifted B positions to the right. Vacated bits are filled with zeros (i.e. the shift is always performed logically). For example, 070>>3 is 007.

If the right argument is negative or greater than the number of bits in an integer value, the result of the operation is undefined.
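
The shift operations can be modelled in Python on 36-bit values. This is only a sketch: the function names are hypothetical, the left shift assumes that bits shifted past the top of the word are discarded, and raising an error for an out-of-range shift count is a choice made for this example (YAA simply leaves the result undefined).

```python
WORD_BITS = 36
MASK36 = (1 << WORD_BITS) - 1


def _check_count(count):
    if count < 0 or count > WORD_BITS:
        raise ValueError("shift count out of range; result is undefined in YAA")


def shift_right(value, count):
    """Logical right shift: vacated high-order bits are filled with zeros."""
    _check_count(count)
    return (value & MASK36) >> count


def shift_left(value, count):
    """Left shift within 36 bits (assumes bits shifted out are discarded)."""
    _check_count(count)
    return (value << count) & MASK36
```

For example, shift_right(0o70, 3) gives 0o7.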

5.9 Relational Operators

The relational operators follow the shift operators in order of precedence. They are evaluated from left to right, but this is seldom useful. The relational operations are

          expression < expression
          expression > expression
          expression <= expression
          expression >= expression
< stands for "less than". > stands for "greater than". <= stands for "less than or equal". >= stands for "greater than or equal".

The result of every relational operation is an integer value: 1 if the relation is true and 0 if it is false. For example, the result of A>B is 1 if A is greater than B and 0 otherwise.

The arguments of any relational operator must be numbers (floating point or integer) or else have the same type. Numbers are compared with other numbers in the usual way. Strings are compared to other strings using the ASCII collating sequence; for example, the string "abc" is less than the string "abd".

Relocatable values may be compared to relocatable values if they are in the same section, in which case one value is greater than another if it has a greater integer offset from the beginning of the section. For example, this lets you compare two statement labels.

YAA does not let you compare token sequences.

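Note that expressions like A<B<C are evaluated from left to right, i.e. as (A<B)<C, so C is compared with the 0 or 1 produced by A<B.

5.10 Equality Operators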

The equality operators follow the relational operators in order of precedence. They are evaluated from left to right. The equality operations are

          expression == expression
          expression != expression
== stands for "is equal to". != stands for "is not equal to". Like the relational operators, the equality operators return the integer 1 if the relation is true and 0 if it is false.

Any two expressions of the same type can be compared for equality or inequality. In addition, you may compare any number to any other number (floating point or integer). You may also compare relocatable expressions to integers, but they are always considered to be unequal.

If you compare two SYMREFs with different names, they will always be considered to be unequal, even if the linking process eventually puts the two SYMREFs at the same memory location. The same is true for two locations expressed as offsets from the beginning of different sections.

5.11 Bitwise AND

The binary & operator is used to "AND" together bits in integers. The result of

          int_expression & int_expression
is an integer value that has a 1-bit wherever both arguments have a 1-bit, and that has a 0-bit everywhere else. For example, the result of 0101&0011 is 0001.

5.12 Bitwise Exclusive OR

The binary ^ (caret or circumflex) operator is used to obtain the exclusive "OR" of the bits in two integers. The result of

          int_expression ^ int_expression
is an integer value that has a 0-bit wherever both arguments have 1-bits or 0-bits, and that has a 1-bit everywhere else. For example, the result of 0707^0077 is 0770.

5.13 Bitwise Inclusive OR

The binary | (or-bar) operator is used to obtain the inclusive "OR" of the bits in two integers. The result of

          int_expression | int_expression
is an integer value that has a 0-bit wherever both arguments have 0-bits, and that has a 1-bit everywhere else. For example, the result of 0101|0011 is 0111.
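
The three worked examples for bitwise AND, exclusive OR, and inclusive OR can be checked in Python, whose operators for these operations behave the same way on non-negative integers. The constants are written in Python's octal notation on the assumption that a leading 0 in the manual's examples denotes octal; the function name is hypothetical.

```python
def bitwise_results(a, b):
    """Return the bitwise AND, exclusive OR, and inclusive OR of two integers."""
    return {"and": a & b, "xor": a ^ b, "or": a | b}


# The examples from the text, in octal.
and_example = bitwise_results(0o101, 0o011)["and"]  # 0o001
xor_example = bitwise_results(0o707, 0o077)["xor"]  # 0o770
or_example = bitwise_results(0o101, 0o011)["or"]    # 0o111
```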

5.14 Logical AND

The binary && operator is used to obtain the logical "AND" of two integers.

          int_expression && int_expression
has the value 1 if both arguments are non-zero, and the value 0 otherwise. The first operand (before the &&) must be an immediate expression.

In evaluating the arguments of a logical AND expression, the first argument is always evaluated before the second. If the first argument proves to be zero, YAA knows that the result of the entire && expression will be zero, so the second argument is not evaluated.

5.15 Logical OR

The binary || operator is used to obtain the logical "OR" of two integers.

          int_expression || int_expression
has the value 0 if both arguments are 0, and 1 otherwise. The first operand (before the ||) must be an immediate expression.

In evaluating the arguments of a logical OR expression, the first argument is always evaluated before the second. If the first argument proves to be non-zero, YAA knows that the result of the entire || expression will be 1, so the second argument is not evaluated.
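
The short-circuit behaviour of the logical AND and logical OR operations can be modelled in Python by passing the second operand as a function that is called only when its value is needed. The function names are hypothetical and the sketch does not model the requirement that the first operand be immediate.

```python
def logical_and(first, second):
    """first is an integer; second is a zero-argument function giving an integer."""
    if first == 0:
        return 0  # second operand is never evaluated
    return 1 if second() != 0 else 0


def logical_or(first, second):
    """first is an integer; second is a zero-argument function giving an integer."""
    if first != 0:
        return 1  # second operand is never evaluated
    return 1 if second() != 0 else 0
```

Because the second operand is skipped whenever the first one settles the result, it may safely be an expression that could not otherwise be evaluated, such as a division guarded by a non-zero test.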

5.16 Conditional Operations

Conditional operations have the form

          int_expression ? expression : expression
The last two expressions must have the same type, but any type is allowed. The first expression (before the ?) must be an immediate expression.

To evaluate this expression, YAA first finds the value of the integer expression before the ?. If this value is non-zero, the result of the entire conditional expression will be the value of the expression before the colon (:). If the value is zero, the result of the entire conditional expression will be the value of the expression after the colon. For example, the value of

          (A > B) ? A : B
is the value of A if A is greater than B; otherwise, it is the value of B. In other words, the value of the above expression is the maximum of A and B.

5.17 Assignment Operators

Assignment operators are evaluated after all other operations. They may only be applied to YAA variables, as described in Chapter 8. The assignment operations are

          variable =  expression
          variable += expression
          variable -= expression
          variable *= expression
          variable /= expression
          variable %= expression
          variable >>= expression
          variable <<= expression
          variable &= expression
          variable ^= expression
          variable |= expression

The = operator represents simple assignment. The value of the expression on the right is assigned to the variable on the left. The right hand expression may have any type. The left operand must be a single YAA variable.

The other assignment operators are called compound assignment operators because they combine assignment with another operation. For example, += combines assignment with addition.

          A += B
is precisely equivalent to
          A = A + B
(C programmers should note that this is slightly different from C: in C, the variable A would only be evaluated once in A+=B, while in YAA, it is evaluated twice.)

The types of arguments in a compound assignment must be appropriate to the actions being performed. For example, in

          A >>= B
A must be an integer variable and B must be an integer expression.

The arguments of a compound assignment must have types that are compatible with the desired operation. For example, in

          A += B
A and B must have types that can be added together.

Assignment operations are expressions, and as such they have values. The value of an assignment expression is the value that is assigned to the left hand variable. Thus the value of the expression

          X = "xyz"
is the string "xyz", while the value of
          A += 2
is the value of A+2.

Assignment expressions bind from right to left. This means that expressions like

          X = Y = Z = 0;
are valid. First, 0 is assigned to Z. The result of this assignment (the value 0) is then assigned to Y, and so on.

6. Writing Machine Instructions

The machine instructions available on the DPS-8 family of machines are described in various Bull HN hardware manuals (e.g. DPS90 Assembly Instructions, Bull HN document DX20). These hardware manuals describe the nature of the machine's instructions and how they are used.

We will not attempt to duplicate this information in this manual. However, an understanding of the instruction set is not the same as understanding how to code those instructions in a YAA program. Thus this section provides information on how to code machine instructions in YAA.

(Note: The Bull HN hardware manuals occasionally show instruction examples written using the Bull HN GMAP assembler. Appendix A discusses ways in which GMAP differs from YAA.)

6.1 Instruction Format

An instruction consists of one or more machine words, possibly followed by additional words giving arguments for the instruction. We will call the first word of an instruction the instruction word, since this dictates what kind of instruction we are dealing with. In most cases, the instruction word is the entire instruction; the exceptions are vector instructions and so-called register-register instructions. See the hardware manuals for more details.

The instruction word has the following format:

Bits 0-17:
The address field. This gives a value used in calculating the address of the operand to the instruction.
Bits 18-27:
The opcode field. This is a number identifying the opcode.
Bit 28:
The inhibit bit; see the hardware manuals for more information.
Bit 29:
The AR bit. This indicates that the instruction uses an ar register rather than an index register; see the hardware manuals for more information.
Bits 30-35:
The tag field. This is used in controlling address modification.
Note that this description is slightly different from the description in the hardware manuals. The hardware manuals say that the opcode field is contained in bits 18-26, with bit 27 used to distinguish EIS (Extended Instruction Set) instructions from other instructions. In our opinion, it is more natural to regard bit 27 as another bit in the opcode field rather than a separate flag bit.
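
The instruction word layout described above can be sketched as a pair of Python packing functions. The sketch assumes that bit 0 is the most significant bit of the 36-bit word (consistent with the address field occupying the upper half), treats the opcode as the 10-bit field used in this manual, and uses hypothetical function names. The field values in the usage note are arbitrary and are not real opcodes.

```python
# (name, width in bits), listed from bit 0 (most significant) to bit 35.
FIELDS = (("address", 18), ("opcode", 10), ("inhibit", 1), ("ar", 1), ("tag", 6))


def encode_instruction_word(address, opcode, inhibit, ar, tag):
    """Pack the five fields into a single 36-bit instruction word."""
    word = 0
    for (name, width), value in zip(FIELDS, (address, opcode, inhibit, ar, tag)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_instruction_word(word):
    """Split a 36-bit instruction word into (address, opcode, inhibit, ar, tag)."""
    values = []
    shift = 36
    for _name, width in FIELDS:
        shift -= width
        values.append((word >> shift) & ((1 << width) - 1))
    return tuple(values)
```

For example, decode_instruction_word(encode_instruction_word(0o1234, 0o472, 0, 1, 0o17)) returns the original five field values.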

6.2 Instruction Set

YAA recognizes the standard DPS-8 opcodes. All opcodes are reserved words when they appear in the opcode field. They must be entered in lower case.

6.3 Registers

Most machine instructions involve hardware registers. Instructions refer to registers by mnemonics; these must be in lower case (for example, p1 must be used for pointer register 1).

The hardware manuals describe the use and characteristics of all the hardware registers. We will not duplicate that information here. However, below we list the symbolic names used to refer to various registers in machine instructions. This is not the complete list of registers; it only discusses those registers which may be named in instructions.

a
The A register (accumulator). 36 bits, used in most integer arithmetic.
au
Upper half of the A register. 18 bits.
al
Lower half of the A register. 18 bits.
q
The Q register (quotient). 36 bits, most often used in integer divides but also as a general-purpose arithmetic register.
qu
Upper half of Q register. 18 bits.
ql
Lower half of Q register. 18 bits.
xn
Index registers 0 through 7 (e.g. x0, x1, etc.). 18 bits each. Typically used in indexing operations (e.g. offset from some location).
ic
Hardware instruction counter (not to be confused with YAA's IC). 18 bits. Holds address of the instruction currently being executed (but only gives offset within segment, not SEGID).
arn
Address registers 0 through 7 (e.g. ar0, ar1, etc.). 24 bits in NS mode, 36 bits in ES mode. Contains word offset (18 bits in NS mode, 30 bits in ES mode), character offset within words (2 bits), and bit offset within character (4 bits). Note that SEGID is not stored here.
pn
Pointer registers 0 through 7. Each pointer register n is made up of descriptor register n, SEGID register n, and address register arn. Thus changing a pointer register changes the component registers.
The above symbol names may be used as arguments to machine instructions that require register operands. For example, the following piece of code comes from the standard GCOS-8 function call sequence.
          ldp    p1,func
          ldx    x0,STBUMP,du
          eppr   p0,*+3,$
          tra    .call+S_BIAS,,p3
This shows the use of the symbols p1 (pointer register 1), x0 (index register 0), p0 (pointer register 0), and p3 (pointer register 3).

Register names are only reserved in positions where a register value is allowed or expected.

Note: Examples in the Bull HN hardware manuals are written using the GMAP assembler rather than YAA. GMAP lets you use numbers to indicate registers instead of requiring symbolic names. For example,

          lda  3,1,1     # GMAP
is accepted by GMAP, but in YAA it must be written as
          lda  3,x1,ar1  # YAA
In YAA, you must use the full symbolic name. For this reason, some of the examples given in the hardware manual will not work with YAA. See Appendix A for more information on differences between GMAP and YAA.

6.3.1 Registers as Part of Instruction Opcodes

Some instructions incorporate a register number directly in their opcode. For example, ldp1 loads a pointer value into register p1. If you prefer, you can write the register as the first argument rather than as part of the opcode itself. For example, the following instructions are equivalent

          ldp1   func
          ldp    p1,func

The second format can be more useful if you are writing macros or using synonyms (described in Chapter 8). For example, you might create a synonym with

          p_entr:  .synonym p1
and then write
          ldp   p_entr,func
This would be more difficult if you used the form of the opcode that incorporates the register number right in the opcode.

This trick can be used with any opcode that incorporates a register number into the opcode. For example, the following instructions are equivalent

          ldx1   y
          ldx    x1,y
Note that you have to give the full name of the register; you can't say
          ldx  1,y    # Incorrect!
Also note that the trick of splitting off the register from the opcode only works with numbered registers, not with registers named by a letter. For example, you cannot use
          ldx  a,value   # Incorrect!
instead of
          lda  value

6.4 Address Generation and Modification

Addresses in non-EIS machine instructions may be specified in a number of different ways. The tag field in the machine instruction indicates which approach a particular instruction is using.

There are four types of addressing in non-EIS instructions. These are:

Register (R)

Register then indirect (RI)

Indirect then register (IR)

Indirect then tally (IT)

The type of addressing is indicated by the first two bits of the tag field. The other four bits provide further information, as described in the Bull HN hardware manuals.

The sections that follow are not intended to duplicate the addressing information in the hardware manuals, but to provide enough information to write up YAA instructions once you understand the hardware addressing modes.

6.4.1 Register Addressing

In R addressing, the address of an operand is calculated using the address field and the contents of a register. The registers which can be used are the index registers (x0-x7), half-words in the aq (au, al, qu, ql), or the instruction counter ic. The contents of the address field are added to the contents of the register, giving the address of the instruction's operand.

For example,

          lda  2,x1
generates its operand address by adding 2 and the contents of the x1 register. If x1 holds the address of the beginning of an array iarray, the above instruction loads the A register with the value of iarray[2]. If the value of the constant is zero, it can be omitted as in
          lda  ,ic
This takes the current value of the ic as the operand address, and loads the value at that address into the A register. The result is that it loads the lda instruction itself into the A register.

If you omit a register, the address field is assumed to contain the operand address itself. For example,

          lda  name
assumes that name gives the address of the actual operand. The system will find the value in that address and load it into the A register. This type of instruction can also be written by adding a tag of n after the address, as in
          lda  name,n
This form of the instruction is equivalent to the previous one.
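
The address calculation described above can be modelled in a few lines of Python. This is only a sketch of the arithmetic, not of the hardware: the 18-bit wrap-around and the sample addresses are assumptions made for the illustration.

```python
ADDRESS_BITS = 18  # assumption: addresses are 18-bit offsets within a segment


def r_effective_address(address_field, register_value=0):
    """Add the address field to the register contents (R addressing)."""
    return (address_field + register_value) % (1 << ADDRESS_BITS)


# lda 2,x1 with x1 holding the (hypothetical) address of iarray
iarray = 0o1000
print(oct(r_effective_address(2, iarray)))     # address of iarray[2]

# lda ,ic at the (hypothetical) location 0o400: the operand is the
# word holding the lda instruction itself
print(oct(r_effective_address(0, 0o400)))

# lda name (or lda name,n): no register, so the address field is used as is
print(oct(r_effective_address(0o2345)))
```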

6.4.2 Immediate Constants

R addressing also allows immediate constants. In this case, the address field is treated as the actual value of the operand rather than a value used in calculating the operand's address. A tag of dl after the constant in the YAA instruction indicates that the constant represents the lower half of a word. For example,

          lda  2,dl
loads the constant 2 into the A register. A tag of du after the constant indicates that the constant represents the upper half of a word. For example,
          lda  2,du
loads the constant 2 into the upper half of the A register (and puts zeroes in the other half).
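
As a rough illustration of the difference between dl and du, the following Python sketch builds the 36-bit operand that each tag implies. The function name is hypothetical, and truncating the constant to 18 bits is an assumption of the sketch.

```python
def immediate_operand(constant, tag):
    """Return the 36-bit operand implied by a dl or du tag."""
    half = constant & 0o777777      # keep 18 bits (assumption of this sketch)
    if tag == "dl":
        return half                 # constant in the lower half of the word
    if tag == "du":
        return half << 18           # constant in the upper half, zeroes below
    raise ValueError("tag must be 'dl' or 'du'")


print(format(immediate_operand(2, "dl"), "012o"))   # lda 2,dl
print(format(immediate_operand(2, "du"), "012o"))   # lda 2,du
```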

6.4.3 Register Then Indirect

Indirect addressing specifies the address of an operand in a two-stage process. First, you construct a memory address; then you use the contents of that address to find the address of the true operand (directly or indirectly).

The Register Then Indirect (RI) method of addressing uses a register (R) address format to calculate the first memory address. This is said to be the address of the indirect word.

The contents of the indirect word have the same format as a machine instruction: an address field in the top 18 bits and a tag field in the bottom 6 bits. If the tag field specifies R addressing, the address of the true operand is calculated from the indirect word using normal R addressing. If the tag field is some other type of addressing, a new indirect word is calculated from the old one, as if the old indirect word were part of a machine instruction. This process is repeated until it finds an indirect word with R addressing.

To write an instruction that uses RI addressing, put an asterisk immediately after the register operand in an R addressing construct. For example,

          lda  3,x5*
obtains the indirect word address by taking the contents of x5 and adding 3. The instruction will use the indirect word to find the address of the true operand.

The du and dl forms of register addressing are not valid with an RI tag. However, they may be used in the R address that ends the indirect word chain.

In a previous section, we mentioned a form of R addressing where no register was actually used. For example,

          lda  Z
takes the value of Z as the operand address. The corresponding RI instruction is written
          lda  Z,*
or
          lda  Z,n*
This will follow around a chain of indirect words beginning with the address given by Z.

The arg pseudo-op is useful for constructing indirect words. arg is described in a later section of this chapter.

6.4.4 Indirect Then Register

Indirect then Register (IR) is similar to RI addressing, but there are a number of important differences. IR addressing is indicated by an asterisk * preceding the register involved in the addressing. For further information on IR addressing, see the hardware manuals.

As an example of an instruction which uses IR addressing, consider

          lda  Y,*x1
The value of x1 is cached away, and Y is used as the address of the indirect word. If this word has the format
          arg  N,x6
the final operand address is N plus the cached away contents of x1. In an IR addressing chain, the hardware ignores any register specified with the R instruction that ends the chain. The address field of the R instruction is added to the cached register value.

As another example, consider

                lda  S,*du
          S:    arg  T,x3*
The first instruction uses S as the address of the indirect word. The indirect word at S is in RI format, so a new indirect word address is calculated as T plus the contents of x3. If the new indirect word is
          arg  U,ql
then the final operand address is effectively
          arg  U,du
since the du from the original IR instruction was cached away for use at this time.

If an IR instruction has no register for modification, you must specify *n, as in

          lda  name,*n
This starts an IR indirect word chain, but no register value is cached. You cannot omit the n, since
          lda  name,*
is interpreted as
          lda  name,n*
(RI addressing).
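
The register caching performed by IR addressing may be easier to see in a small model. The Python sketch below is a simplification: it handles only R and RI indirect words after the initial IR instruction, and the addresses chosen for S, T, U, Y, and N are hypothetical.

```python
from collections import namedtuple

# A simplified indirect word: address field, addressing type, register name.
IndirectWord = namedtuple("IndirectWord", "address kind register")


def resolve_ir(address, register, memory, registers, limit=16):
    """Follow an IR chain that starts with an instruction 'address,*register'.

    Returns (address_field, cached_register) taken from the R word that
    ends the chain; the register named in that final word is ignored.
    """
    cached = register                  # cached away by the IR instruction
    where = address                    # location of the first indirect word
    for _ in range(limit):             # guard against endless chains
        word = memory[where]
        if word.kind == "R":
            return word.address, cached
        if word.kind == "RI":
            where = word.address + registers.get(word.register, 0)
            continue
        raise ValueError("addressing type not modelled: " + word.kind)
    raise RuntimeError("indirect chain too long")


# Hypothetical addresses for the symbols, and a hypothetical value for x3.
S, T, U, Y, N = 0o100, 0o200, 0o300, 0o400, 0o500
registers = {"x3": 5}
memory = {
    S: IndirectWord(T, "RI", "x3"),       # S: arg T,x3*
    T + 5: IndirectWord(U, "R", "ql"),    #    arg U,ql
    Y: IndirectWord(N, "R", "x6"),        # Y: arg N,x6
}

print(resolve_ir(S, "du", memory, registers))   # lda S,*du acts like arg U,du
print(resolve_ir(Y, "x1", memory, registers))   # lda Y,*x1 acts like arg N,x1
```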

6.4.5 Indirect Then Tally

The Indirect then Tally (IT) addressing modification combines indirection with an automatic increment or decrement of fields in the indirect word. The indirect word is broken into the following fields:

Bits 0-17
Address field.
Bits 18-29
Tally field.
Bits 30-35
Tag field.
Indirect words for IT instructions are usually created using one of the pseudo-ops tally, tallyb, tallyc, or tallyd. These simplify the creation of the three fields of a tallying indirect word. They will be discussed in a later section of this chapter.

There are many ways in which tallying instructions and indirect words can be set up. These are described in the appropriate hardware manual and will not be repeated here.

Instructions using IT addressing are created by special symbols in the tag field of the instructions. The following symbols are recognized:

i
Indirect.
id
Increment address, decrement tally.
di
Decrement address, increment tally.
sc
Sequence character.
scr
Sequence character reversed.
ci
Character from indirect.
ad
Add delta.
sd
Subtract delta.
f
Fault.
idc
Increment address, decrement tally, and continue.
dic
Decrement address, increment tally, and continue.
All of these tag symbols must be in lower case. The meanings of all these tags are explained in the appropriate hardware manual.

Below are a few instructions using IT addressing. We will not explain them here, but simply provide them as examples of the format.

          lda   Z,id
          lda   A,sc
          sta   Y,idc

6.5 Use of the Asterisk in Addresses

The asterisk stands for the current instruction counter. For example, consider

          ldq  X,dl       # load X into Q
          cmpq min,dl     # is X less than min?
          tpl  *+2        # if so...
          ldq  min,dl     #    ...use min
          # and on we go
The operand of the tpl instruction is *+2. Since the instruction counter is currently pointing to the tpl instruction itself, this operand stands for the second word after the tpl instruction. In other words, the tpl instruction will jump around the ldq instruction that immediately follows it if X is not less than min.

6.6 DPS-8 Specific Pseudo-Ops and Macros

A pseudo-op is a construct which can be used in place of a normal machine instruction. A macro is a construct that expands into one or more YAA instructions (either machine instructions or pseudo-ops).

The pseudo-ops and macros of YAA divide into two classes: ones that perform DPS-8 specific operations, and ones that perform machine-independent operations. The rest of this chapter describes DPS-8 specific pseudo-ops and macros. Chapter 8 describes the machine-independent pseudo-ops.

6.6.1 The adsc Pseudo-Ops

The adsc pseudo-ops create arguments which describe alphanumeric strings for EIS instructions. Note that these arguments are not the strings themselves; the strings are located elsewhere in memory and the arguments tell where the strings are.

adsc4 creates arguments describing strings with characters four bits long; adsc6 creates arguments describing strings with characters six bits long; and adsc9 creates arguments describing strings with characters nine bits long.

The format of an adsc pseudo-op is one of

          adsc4 address,charnum,length,ar_reg
          adsc6 address,charnum,length,ar_reg
          adsc9 address,charnum,length,ar_reg
where
address
is an address expression which specifies the word at the beginning of the memory region that the EIS instruction will work with.
charnum
is an expression specifying a character within the word. The first character is number 0. For example, if charnum is 3 in an adsc6 instruction, the memory region begins with the fourth 6-bit character within the word given by address. charnum may be negative or a relocatable value.
length
gives the number of characters in the region. There are two ways of specifying this length: as an integer expression giving the number of characters in the region; or as the name of a register that holds the length. Possible register names are:
au
length given in AU register
qu
length given in QU register
a or al
length given in A register; bits 15-35 for adsc4 and adsc6, bits 16-35 for adsc9
q or ql
length given in Q register; bits 15-35 for adsc4 and adsc6, bits 16-35 for adsc9
xn
length given in index register n
The EIS instruction itself will indicate whether the adsc's length is given in a register or as an integer.
ar_reg
may be the name of an address register arn. This argument is optional. If it is present and if the preceding EIS instruction requests address modification, the contents of the address register will be used to modify both address and charnum.

The ar_reg field of a descriptor may be omitted. Other fields may also be omitted, but you must put in extra commas to indicate fields that are missing. Trailing commas may be omitted. For example,

          adsc9   X,,24
demonstrates that you must add commas for interior missing fields, but not for one on the end.

For an example of how adsc pseudo-ops are used, see the section on EIS instructions later in this chapter.

6.6.2 The arg Pseudo-Op

The arg pseudo-op is designed for constructing indirect words for IR and RI addressing. It creates a machine word with a given address field and tag field. arg has the format

          arg  address,tag,ar_reg
where:
address
creates the address field of the indirect word. If this is not supplied, zero is assumed.
tag
creates the tag field. If this is not supplied, no index modification or indirect modification is used.
ar_reg
is an address register. This argument is optional. If it is supplied, YAA turns on bit 29 of the generated word and puts the specified register number in bits 0-2.

For example,

          arg  Z,x1
creates a machine word with an address field whose value is Z and a tag field with R addressing using x1. Suppose you have the instruction
          lda  0,x2*
This calculates the indirect word as the address contained in x2. At this address, you could have
          arg  Y,x3*
which is also in RI format. This calculates a new indirect word by adding Y and the value in x3. At this new indirect word, you could have
          arg  T,qu
which calculates yet another address by adding T and the upper half of the Q register. Since this indirect word has R addressing, the resulting value is the address of the true operand. Thus the original lda instruction loads the A register from the location whose address is T plus the contents of qu.
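
The chain just described can be traced with a short Python model. This is a sketch under simplifying assumptions: each indirect word is reduced to an address field, a register name, and a flag saying whether it is RI, and the register contents and the addresses standing for Y and T are hypothetical.

```python
def resolve_ri(address, register, memory, registers, limit=16):
    """Follow an RI chain and return the address of the true operand.

    memory maps an address to (address_field, register, indirect),
    a simplified stand-in for the words that arg generates.
    """
    where = address + registers.get(register, 0)    # first indirect word
    for _ in range(limit):                          # guard against loops
        field, reg, indirect = memory[where]
        target = field + registers.get(reg, 0)      # ordinary R calculation
        if not indirect:
            return target        # R addressing ends the chain
        where = target           # RI: this is the next indirect word
    raise RuntimeError("indirect chain too long")


registers = {"x2": 0o500, "x3": 0o10, "qu": 0o3}    # hypothetical contents
Y, T = 0o600, 0o700                                 # hypothetical addresses
memory = {
    0o500: (Y, "x3", True),         # arg Y,x3*
    Y + 0o10: (T, "qu", False),     # arg T,qu
}

# lda 0,x2*  ends up at T plus the contents of qu
print(oct(resolve_ri(0, "x2", memory, registers)))
```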

As another example,

          arg  5,,ar3
creates an argument word with 3 in bits 0-2, 5 in bits 3-17, and a tag field indicating no address modification.
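
The bit positions given above can be checked with a small Python sketch. Bits are numbered from 0 at the left to 35 at the right, as in the text. The function name is hypothetical, and the sketch assumes that a tag value of zero stands for "no address modification".

```python
def arg_word_with_ar(address, ar_number, tag=0):
    """Pack an arg word that names an address register."""
    if not 0 <= ar_number <= 7:
        raise ValueError("address register number must be 0 through 7")
    if not 0 <= address < (1 << 15):
        raise ValueError("address must fit in bits 3-17")
    word = ar_number << 33      # bits 0-2: address register number
    word |= address << 18       # bits 3-17: address
    word |= 1 << 6              # bit 29 turned on
    word |= tag & 0o77          # bits 30-35: tag field (0 assumed to mean none)
    return word


# arg 5,,ar3
print(format(arg_word_with_ar(5, 3), "012o"))
```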

6.6.3 The bdsc Pseudo-Op

bdsc is similar to the adsc pseudo-ops, in that it creates an argument for an EIS instruction. adsc describes an alphanumeric string argument, while bdsc describes a bit string. The format of bdsc is

          bdsc address,length,byte,bit,ar_reg
where
address
is an address expression giving the address of the word where the bit string begins.
length
gives the number of bits in the string. There are two ways of specifying this length: as an integer expression giving the number of bits in the string; or as the name of a register that holds the length. Possible register names are:
au
length given in AU register
qu
length given in QU register
a or al
length given in bits 12-35 of the A register
q or ql
length given in bits 12-35 of the Q register
xn
length given in index register n
The EIS instruction itself will indicate whether the bdsc's length is given in a register or as an integer.
byte
is an integer expression indicating the byte in which the bit string begins. The value can be 0, 1, 2, or 3.
bit
is an integer expression indicating the bit where the bit string begins within the byte. Bits are numbered left to right, 0 through 8.
ar_reg
may be the name of an address register arn. This argument is optional. If it is present, the contents of the address register will be used to modify address, byte, and bit, provided that the EIS instruction itself asks for this modification.

The ar_reg field of a descriptor may be omitted. Other fields may also be omitted, but you must put in extra commas to indicate fields that are missing. Trailing commas may be omitted. For example,

          bdsc    X,,3
demonstrates that you must add commas for interior missing fields, but not for one on the end.

More information on using EIS instructions is given in a later section of this chapter.

6.6.4 The edec Pseudo-Ops

Packed decimal data can be created with the edec4 pseudo-op. It has the format

          edec4  int_expression,string_exp
The int_expression gives the number of packed decimal digits in the number. The string_exp gives the value of the number, in string form. For example,
          PX:  edec4  10,"-8.3e-9"
creates a packed decimal number with ten digits and the value -8.3e-9. The string_exp can give the value in any of the following formats.
nnnn
just a sequence of digits. This gives an integer.
+nnnn
a sequence of digits with a leading sign ('+' or '-'). This gives an integer.
nnnn+
a sequence of digits with a trailing sign ('+' or '-'). This gives an integer.
+nn.nn
a sequence of digits containing a decimal point, with a leading sign ('+' or '-'). The sign may be omitted if the number is positive. This gives a floating point decimal number.
+nn.nnE+nn
a sequence of digits containing a decimal point and an exponent. The 'E' marking the exponent may be upper or lower case. The mantissa and exponent may have signs ('+' or '-'). These signs may be omitted if the quantity is positive. This gives a floating point decimal number.
+nnE+nn
a sequence of digits with an exponent. The 'E' marking the exponent may be upper or lower case. The mantissa and exponent may have signs ('+' or '-'). These signs may be omitted if the quantity is positive. This gives a floating point decimal number.
There is no provision for left-justification of the values.

The edec9 pseudo-op creates ASCII decimal data. It has the form

          edec9  int_expression,string_exp
and behaves like edec4. The instruction
          edec9  10,"1234"
creates the ASCII string
          "0000001234\0\0"
when generating code in word offset mode (see Chapter 8 for more on offset modes). This is ten ASCII characters, plus two bytes of 0-bits to pad out to the next word boundary. Note that the zeros before the 1234 are ASCII zero characters, not binary zeros.
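
The padding in this example can be reproduced with a few lines of Python. The sketch handles only plain digit strings (no signs, decimal points, or exponents) and assumes four 9-bit bytes per word; the function name is hypothetical.

```python
BYTES_PER_WORD = 4   # assumption: a 36-bit word holds four 9-bit bytes


def edec9_unsigned(ndigits, text):
    """Return the characters generated for an unsigned integer string."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("this sketch accepts plain digit strings only")
    if len(text) > ndigits:
        raise ValueError("value has more digits than requested")
    body = text.rjust(ndigits, "0")       # leading ASCII zero characters
    pad = -len(body) % BYTES_PER_WORD     # bytes needed to reach a word boundary
    return body + "\0" * pad              # trailing bytes of 0-bits


print(repr(edec9_unsigned(10, "1234")))   # edec9 10,"1234"
```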

6.6.5 The iontp Pseudo-Op

The iontp pseudo-op is used to create data control words for hardware I/O processing. iontp stands for "I/O NonTransmit and Proceed". The pseudo-op has the form

          iontp   X,Y
where X is an address that can be represented by an 18-bit quantity, and Y is a word count of data to be transferred per block (in the range 1 through 4096). The result of iontp is a machine word containing the octal value
          XXXXXX03YYYY
where XXXXXX is the value of X and YYYY is the value of Y.

6.6.6 The iotd Pseudo-Op

The iotd pseudo-op is used to create data control words for hardware I/O processing. iotd stands for "I/O Transmit and Disconnect". The pseudo-op has the form

          iotd   X,Y
where X is an address that can be represented by an 18-bit quantity, and Y is a word count of data to be transferred per block (in the range 1 through 4096). The result of iotd is a machine word containing the octal value
          XXXXXX00YYYY
where XXXXXX is the value of X and YYYY is the value of Y.

6.6.7 The iotp Pseudo-Op

The iotp pseudo-op is used to create data control words for hardware I/O processing. iotp stands for "I/O Transmit and Proceed". The pseudo-op has the form

          iotp   X,Y
where X is an address that can be represented by an 18-bit quantity, and Y is a word count of data to be transferred per block (in the range 1 through 4096). The result of iotp is a machine word containing the octal value
          XXXXXX01YYYY
where XXXXXX is the value of X and YYYY is the value of Y.
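
The three data control word pseudo-ops (iontp, iotd, and iotp) differ only in the two octal digits in the middle of the word. The following Python sketch packs the octal patterns shown above. The manual does not say how a count of 4096 fits into four octal digits; the sketch assumes it is stored as zero, and that assumption should be checked against the hardware manuals.

```python
DCW_TYPE = {"iotd": 0o00, "iotp": 0o01, "iontp": 0o03}


def dcw(op, address, count):
    """Pack XXXXXXttYYYY for iotd, iotp, or iontp."""
    if not 0 <= address < (1 << 18):
        raise ValueError("address must fit in 18 bits")
    if not 1 <= count <= 4096:
        raise ValueError("count must be in the range 1 through 4096")
    # Assumption: the count occupies 12 bits, so 4096 is stored as 0.
    return (address << 18) | (DCW_TYPE[op] << 12) | (count % 4096)


for op in ("iotd", "iotp", "iontp"):
    # e.g. iotd X,64 with X at the hypothetical address 0o1000
    print(op, format(dcw(op, 0o1000, 64), "012o"))
```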

6.6.8 The lptr Pseudo-Op

The lptr pseudo-op loads a pointer into a register. It has the format

          lptr   register,reference
where register is the register where you want to put the pointer and reference is the address that you want to put into the register.

lptr does its work by issuing a special directive to the LD loader. When LD sees this directive, it will generate an eppr instruction to load the address if the address can be directly indexed off a pointer register. Otherwise, LD generates a literal containing the address and an ldp instruction to load the address into the register.

Since the eppr form only uses one machine word while the ldp uses two (one for the instruction and one for the literal), it is better to use eppr whenever possible. By using lptr, you can have the loader decide whether a particular address is "within range" of an eppr or whether an ldp is necessary.

6.6.9 The ndsc Pseudo-Ops

The ndsc pseudo-ops are similar to the adsc pseudo-ops, in that they create an argument for an EIS instruction. adsc describes an alphanumeric string argument, while ndsc describes a numeric string. The ndsc pseudo-ops have the form

          ndsc4 address,charnum,length,type,scale,ar_reg
          ndsc9 address,charnum,length,type,scale,ar_reg
where
address
is an address expression which specifies the word at the beginning of the number string that the EIS instruction will work with.
charnum
is an expression specifying a character within the word. The first character is number 0. For example, if charnum is 3 in an ndsc9 instruction, the memory region begins with the fourth 9-bit character within the word given by address. charnum may be negative or a relocatable value.
length
gives the number of digits in the number. There are two ways of specifying this length: as an integer expression giving the number of digits in the number; or as the name of a register that holds the length. Possible register names are:
au
length given in AU register
qu
length given in QU register
a or al
length given in A register; bits 15-35 for ndsc4, bits 16-35 for ndsc9
q or ql
length given in Q register; bits 15-35 for ndsc4, bits 16-35 for ndsc9
xn
length given in index register n
The EIS instruction itself will indicate whether the ndsc's length is given in a register or as an integer.
type
is a symbol giving the type of the number string. Possible symbols are:
fs
indicates signed floating point.
ls
indicates a scaled number with a leading sign.
ts
indicates a scaled number with a trailing sign.
ns
indicates a scaled number with no sign.
If no code is specified, fs is assumed.
scale
is the scale factor for fixed point numeric strings. This is an integer expression; the actual value of the numeric string will be the apparent value multiplied by 10 to the power of the scale factor. For example, if the numeric string is 149 with a scale factor of -2, the true value of the numeric string will be 1.49. The value of scale must be in the range from -32 through +31.
ar_reg
may be the name of an address register arn. This argument is optional. If it is present, the contents of the address register will be used to modify both address and charnum, provided that the EIS instruction itself asks for this modification.

The ar_reg field of a descriptor may be omitted. Other fields may also be omitted, but you must put in extra commas to indicate fields that are missing. Trailing commas may be omitted. For example,

          ndsc9   X,,24
demonstrates that you must add commas for interior missing fields, but not for one on the end.
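
The effect of the scale argument described above is plain decimal arithmetic, which can be sketched in Python with the standard decimal module. The function name is hypothetical; the range check follows the limits given for scale.

```python
from decimal import Decimal


def scaled_value(digits, scale):
    """Return the true value of a fixed point numeric string."""
    if not -32 <= scale <= 31:
        raise ValueError("scale must be in the range -32 through +31")
    # apparent value multiplied by 10 to the power of the scale factor
    return Decimal(digits).scaleb(scale)


print(scaled_value("149", -2))   # the example from the text
```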

EIS instructions are discussed in more detail later in this chapter.

6.6.10 The pointer Pseudo-Op

The pointer pseudo-op creates a (relocatable) DPS-8 pointer, much like

          .data  pointer::X
does. (.data is described in Chapter 8).

The format of pointer is

          pointer address,char_offset,bit_offset,segid
The last three fields are optional. address gives a (word) address, char_offset a byte number within that word, bit_offset a bit offset within the byte, and segid a SEGID. If char_offset and/or bit_offset are omitted, an offset of zero is assumed. If segid is omitted, YAA uses the SEGID of the segment that contains address. pointer generates a machine pointer in the assembled code. Thus
          pointer  X
is equivalent to
          .data  pointer::X

The segid field may contain an expression yielding a value with the segid relocation instruction to override the default SEGID. In our experience, however, the segid field is seldom specified at all. More commonly, the only operand of pointer will be a symbol name from the program or a SYMREF.

6.6.11 The tally Pseudo-Op

tally generates a tally word that can be used in Indirect then Tally (IT) address modification. It is used in connection with 6-bit character strings using ci, sc or scr modification, and with word arrays using i, id, and di modification.

The format of tally is

          tally  address,count,offset
where
address
is an address expression whose value will be placed in the address field of the generated tally word.
count
is an integer expression whose value will be placed in the count field of the generated tally word. Only the low order 12 bits of this integer will be used. If no count is specified, the default is zero.
offset
is a character offset that will be placed into bits 31-35 of the generated tally word. This offset may have any value (including negative values) and may be specified with a relocatable value. If no offset is specified, the default is zero.
The tally word generated by tally will have a zero in bit 30, indicating that the word references 6-bit characters.

6.6.12 The tallyb Pseudo-Op

tallyb is almost exactly like tally, except that it generates tally words used in connection with 9-bit character strings using ci, sc, and scr address modification. It has the form

          tallyb  address,count,offset
address and count have the same meaning as in tally. offset is a byte offset which can be positive, negative, or a relocatable value.

The tally word generated by tallyb will have a one in bit 30, indicating that the word references 9-bit characters.
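
The layout of the words generated by tally and tallyb can be sketched in Python as follows. The field positions are the ones given in the text. The sketch accepts only an offset that already lies inside one word; it does not model negative or relocatable offsets, and the function name is hypothetical.

```python
def tally_word(address, count=0, offset=0, nine_bit=False):
    """Pack a tally word as described for tally (6-bit) and tallyb (9-bit)."""
    chars_per_word = 4 if nine_bit else 6
    if not 0 <= address < (1 << 18):
        raise ValueError("address must fit in 18 bits")
    if not 0 <= offset < chars_per_word:
        raise ValueError("this sketch only accepts offsets within one word")
    word = address << 18                    # bits 0-17: address field
    word |= (count & 0o7777) << 6           # bits 18-29: low 12 bits of count
    word |= (1 if nine_bit else 0) << 5     # bit 30: 0 for 6-bit, 1 for 9-bit
    word |= offset                          # character offset, within bits 31-35
    return word


# tally X,10,2 and tallyb X,10,2 with X at the hypothetical address 0o1000
print(format(tally_word(0o1000, 10, 2), "012o"))
print(format(tally_word(0o1000, 10, 2, nine_bit=True), "012o"))
```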

6.6.13 The tallyc Pseudo-Op

tallyc also generates a tally word for Indirect then Tally addressing. It is used in connection with indirect references to tag fields using idc and dic address modification.

tallyc has the form

          tallyc  address,count,modifier
where address and count are the same as for tally. The modifier is used for normal tag field references, and may have any of the symbolic values associated with IT tags:
          i  id  di  sc  scr  ci  ad  sd  f  idc  dic
The modifier may also have any of the IR address modification forms, or n or n*.

The following sequence of code shows a simple use of tallyc.

              lda     X,idc

          X:  tallyc  B,10,i
          B:  arg     U
              arg     V
              arg     W     # etc.
The lda contains an idc reference to X. At X, there is a tallyc pseudo-op which generates the corresponding tally word. This points to B. At B, there is a sequence of indirect words; addressing will run down this sequence until the tally runs out.

6.6.14 The tallyd Pseudo-Op

tallyd is similar to the other tally pseudo-ops. It generates a tally word for use in connection with indirect references using the i, ad, sd, id, or di address modifications.

tallyd has the format

          tallyd  address,count,delta
where address and count are the same as for other tally pseudo-ops. delta is an optional integer expression whose value is used in the delta field of the generated tally word by ad and sd tags. The delta value will be hashed into a value in the range 0 through 63. If delta is omitted, the default is zero.

6.6.15 The tdcw Pseudo-Op

The tdcw pseudo-op is used to create data control words for hardware I/O processing. tdcw stands for "Transfer to Data Control Word". The pseudo-op has the form

          tdcw   X
where X is an address that can be represented by an 18-bit quantity. The result of tdcw is a machine word containing the value of X in its upper half; in its lower half it has the octal value
          020000

6.6.16 The zero Pseudo-Op

The zero pseudo-op generates a machine word made up of two 18-bit fields. It has the form

          zero   upper,lower
where upper is a value to be put into the upper half of the word and lower is a value to be put into the lower half.

Either of the arguments may be omitted, as in

          zero    A
          zero    ,B
In this case, zero puts the specified argument in the appropriate half of the word and puts 0-bits in the other half.

If both arguments are omitted, as in

          zero
YAA creates a machine word that contains only 0-bits.
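
The word built by zero is simple enough to model directly, and the same packing reproduces the word described for tdcw in the previous section. This Python sketch uses hypothetical function names and truncates each argument to 18 bits, which is an assumption of the sketch.

```python
def zero_word(upper=0, lower=0):
    """Pack two 18-bit fields into one 36-bit word, as zero does."""
    return ((upper & 0o777777) << 18) | (lower & 0o777777)


def tdcw_word(address):
    """The tdcw layout: address in the upper half, 020000 in the lower half."""
    return zero_word(address, 0o020000)


print(format(zero_word(0o123), "012o"))         # zero A   (A = 0o123, hypothetical)
print(format(zero_word(lower=0o456), "012o"))   # zero ,B  (B = 0o456, hypothetical)
print(format(zero_word(), "012o"))              # zero
print(format(tdcw_word(0o1000), "012o"))        # tdcw X   (X = 0o1000, hypothetical)
```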

6.6.17 The .bias Pseudo-Op

The .bias pseudo-op is provided for backwards compatibility with GMAPV. In our opinion, it is an outmoded concept and should not be used in new programs. The pseudo-op has the form

          name: .bias  expression,ar_reg
The label name gives the name of a segment or template. If it is omitted, the current segment or template is used.

.bias indicates that the value of the expression should be added to the value of every symbol in the given segment or template, any time one of these symbols is used.

The ar_reg argument is optional. If it appears, it should be the name of an address register. This address register will be added to any references to symbols in the biased segment, unless the reference explicitly uses a different address register.

By default, segments and templates will have a bias of 0, and there is no ar_reg. To turn off an existing bias, use the instruction

          name: .bias 0
This turns off the use of the address register as well as setting the bias to zero.

If there are several .bias statements for the same segment or template, the last one appearing in the source code is the one that is used when code is actually generated.

6.6.18 The sdsc Pseudo-Op

The sdsc pseudo-op is used in conjunction with mtr and rtm instructions. The pseudo-op has the general form

          sdsc   word,byte,code,register
where word and byte give a memory location and register specifies one of the ar registers to be involved in the operation. The code argument is one of the following letters:
          s     load a signed value
          z     load an unsigned value
For example, the following could be used with an mtr instruction to load a signed value from memory into AR7.
          sdsc   addr,0,s,ar7

6.7 EIS Instructions

EIS (Extended Instruction Set) instructions perform multiword alphanumeric operations. For example, mlr is an EIS instruction that copies a number of characters from one region of memory to another. The actual mlr instruction specifies options for the operation; after this instruction come two adsc4, adsc6, or adsc9 descriptors describing the two memory regions and their contents. In the rest of this section, we will discuss the mlr instruction, with the understanding that other EIS instructions behave in a similar way.

The general format of an mlr instruction is

          mlr token_sequence,token_sequence,modifiers
          adscn address,charnum,length,modifier
          adscn address,charnum,length,modifier
As shown, the first two arguments of mlr are Bracket-balanced token sequence expressions. These token sequences are called MF's (modifier fields). The first MF provides options that are used in interpreting the first adsc descriptor (describing the memory region from which data will be copied). The second MF provides options for interpreting the second adsc descriptor (describing the memory region to which data will be copied). The modifiers on the end of the mlr instruction give additional options for the operation.

The MF arguments for mlr consist of zero or more option keywords separated by commas (and enclosed in square brackets to make a token sequence). Order of keywords is unimportant. Recognized options are:

ar
indicates that an address register should be used when calculating the memory region referenced by the corresponding descriptor. If this option is present, the corresponding descriptor should specify an address register after the final modifier. NOTE: unlike GMAP, YAA will not make sure that the required address register is specified. It is up to the programmer to make sure the descriptor specifies an address register.
id
indicates that the associated descriptor contains an indirect word that points to the true operand descriptor. If this option is not present, no indirection is used.
rl
indicates that the length field in the associated descriptor refers to an index register that contains the actual number of characters to be copied. If this option is not present, the length field gives the actual number of characters.
index register name
gives the name of an index register containing a word offset that should be used in calculating the region referenced by the corresponding descriptor.
The options may appear in any order. For example, you might have
          mlr    [ar,x2], [rl]
          adsc9  0,1,8,ar2
          adsc9  ABC,0,x5

The first MF says that address register modification is required for the first descriptor, and that index register x2 contains a byte offset. Since the address field of the first descriptor is 0, the result is that the address of the first word of the memory region is whatever value is in ar2, plus the byte offset in x2, plus one more byte (the charnum value). The length of the memory region is 8 characters.

The second MF says that the length field of the corresponding descriptor refers to an index register that holds the actual length. Looking at the corresponding descriptor we see that the memory region begins at the location of symbol ABC, and character 0 in that word. The length of the region is found in x5.

Unlike in GMAP, the second descriptor cannot be written

          adsc9  ABC,0,5
You must write x5 instead of just 5. It is never correct to refer to registers just by number, even though (in this case) the rl option of the mlr instruction suggests that the descriptor should specify a register in the length field.

Earlier versions of YAA insisted that adsc instructions had to have at least three fields. In this version, fields may be omitted; unnecessary trailing commas may also be omitted. In general, if a trailing value may be omitted, the comma may be too.

Several modifiers may be added after the two MF arguments in an EIS instruction. These are

f
indicates that a 1-bit (fill bit) should be used when combining a short bit string with a long bit string, so that the shorter string appears to be the same as the longer string.
nt1
works with the CMPCT instruction. Indicates that characters from the first operand should not be translated.
nt2
works with the CMPCT instruction. Indicates that characters from the second operand should not be translated.
p
indicates that signed 4-bit results should be stored with octal 013 as the plus sign. If this is not specified, signed 4-bit results will be stored with octal 014 as the plus sign.
r
indicates that rounding should take place as the final operation of arithmetic operations.
t
enables truncation faults (i.e. if data is truncated, it should cause a fault).
Many other EIS instructions use address modification options specified in similar forms. The options may be specified in any order.

If you specify a modifier that is not one of the characters listed above, it is interpreted as a fill character or a BOLR field, depending on the instruction. For BOLR fields, the value should be an expression that evaluates to an integer; the assembler uses the bottom four bits of this integer as the BOLR field.

6.8 Shrink Vectors

Several YAA pseudo-ops create shrink vectors for DPS-8 hardware operations. These are described in the sections that follow.

All of the pseudo-ops described here are actually implemented with macros obtained with the instruction

          .include "climb.a"
For more on macros and the .include statement, see Chapter 8.

6.8.1 The cvec Pseudo-Op

cvec generates a copy vector for use in descriptor operations. It has the format

          cvec  segid
where segid is used to fill in the SEGID field in the generated copy vector (i.e. the low-order 12 bits). This argument must be supplied.

cvec generates a two-word copy vector beginning on the next double-word boundary.

6.8.2 The fvec Pseudo-Op

fvec and fvecb generate data frame vectors for use by instructions like ldd3. They have the format

          fvec   size,[attributes]
          fvecb  size,[attributes]
where
size
is an integer expression giving the size of the frame to be created by the generated vector. With fvec, the size is given in words; with fvecb, the size is given in bytes.
[attributes]
is an optional token list giving access attributes for the frame. Attributes in the list are separated by blanks, and the entire list is surrounded by square brackets. Attributes are represented by the following symbols:
     r - read
     nr - no read
     w - write
     nw - no write
     s - save
     ns - no save
     c - cache
     nc - cache bypass
     x - extended (EI mode)
     nx - not extended (ES or NS mode)
     e - execute
     ne - no execute
     p - privileged
     np - not privileged
     b - bounded (non-zero size)
     nb - not bounded (null)
     a - accessible
     na - not accessible (missing)
For example,
          fvec  16,[nw]
is used for generating a read-only frame that is 16 words long.

If no attributes are specified, the defaults are r, w, s, c, x, e, p, b, and a (everything on).
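
The on/off convention for attribute lists can be sketched in Python as follows. This is only an illustrative model, assuming that every attribute starts out on and that an n-prefixed symbol turns the matching attribute off. The name resolve_attributes is hypothetical, and the sketch says nothing about how YAA encodes the attributes in the generated vector.

```python
# Hypothetical model of fvec/vec attribute lists (not YAA's implementation).
ATTRIBUTES = ["r", "w", "s", "c", "x", "e", "p", "b", "a"]

def resolve_attributes(token_list=""):
    """Return the set of attributes that are on after applying a list
    such as "[nw]" or "[nw ne]" to the documented defaults."""
    enabled = set(ATTRIBUTES)                 # default: everything on
    for symbol in token_list.strip("[] ").split():
        if symbol in ATTRIBUTES:
            enabled.add(symbol)               # e.g. "w" turns write on
        elif symbol.startswith("n") and symbol[1:] in ATTRIBUTES:
            enabled.discard(symbol[1:])       # e.g. "nw" turns write off
        else:
            raise ValueError("unknown attribute: " + symbol)
    return enabled

print(sorted(resolve_attributes("[nw]")))
```

Under this model the list [nw] leaves read access on and write access off, which matches the read-only frame in the fvec example.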

6.8.3 The vec Pseudo-Ops

The vec and vecb pseudo-ops generate shrink vectors. Each shrink vector is two words long and double-word aligned. They have the format

          vec  segid,offset,size,[attributes]
          vecb segid,offset,size,[attributes]
where
segid
is an unsigned integer expression with a value less than 4096. This is put into the SEGID field of the generated shrink vector. The segid value may also be given by a relocatable expression.
offset
is put into the offset field of the generated shrink vector. With vec this is a word offset; with vecb it is a byte offset. The offset value may be relocatable.
size
is an integer expression giving the size field of the generated shrink vector. With vec, the size is given in words; with vecb, the size is given in bytes.
[attributes]
is an optional token list giving access attributes for the frame. Attributes in the list are separated by blanks, and the entire list is surrounded by square brackets. Attributes are the same as those for fvec.

7. Functions

YAA's functions are primary expressions that accept zero or more arguments. All function calls consist of a function name (beginning with a dot) followed by a list of zero or more arguments enclosed in parentheses. Arguments are separated by commas and must be Bracket-balanced.

The result of a function is an expression of one of the types described in Chapter 5: integer, floating point, string, token sequence, or location. This result may be used inside other expressions. The result of a function is an immediate expression if the argument(s) are all immediate expressions.

The functions recognized by YAA are listed below.

          .concat( expression, ... )
          .defer( expression )
          .div( int_expression, int_expression )
          .eval( token_sequence_expr )
          .exists( token_sequence_expr )
          .highest()
          .highest( reloc_expression )
          .ic()
          .ic( reloc_expression )
          .length( expression )
          .list()
          .lowest()
          .lowest( expression )
          .max( int_expression, ... )
          .min( int_expression, ... )
          .mod( int_expression, int_expression )
          .quote( expression )
          .sshift( int_expression, int_expression )
          .substr( string_expression, int_expression, int_expression )
          .substr( token_seq_expr, int_expression, int_expression )
          .system()
          .system( string_expression )
          .tagval( tag_expression )
          .time()
          .udiv( int_expression, int_expression )
          .unquote( expression )
          .upto( int_expression )
          .upto( int_expression, reloc_expression )

All functions except .defer are evaluated on the first pass through the source code (provided that their arguments can be evaluated on the first pass). This means, for example, that the .highest function refers to the highest point in a section at the time the function is encountered. This may not be the absolute high point, if more code is added to the section later in the assembly. If you do not want a function evaluated on the first pass, use .defer (described later in Section 7.22).

7.1 Concatenating Strings and Token Sequences: .concat

Use:

          .concat(seq1,seq2,...)
          .concat(string1,string2,...)

Where:

seq1,seq2,...
are token sequences.
string1,string2,...
are strings.

Description:

.concat concatenates one or more token sequences or strings. For example,

          .concat([a b c],[d e f])
results in the token sequence
          [a b c d e f]
and
          .concat("abc","def")
results in the string
          "abcdef"

7.2 Integer Division: .div

Use:

          .div(A,B)

Where:

A,B
are integers.

Description:

The .div function performs integer division in a slightly different way than the / operator.

          .div(A,B)
is the quotient of the integer A divided by B. If the division is inexact, the result of .div is always the largest integer less than the true quotient. In other words, truncation always goes towards negative infinity. Contrast this with A/B, where truncation is always towards zero. Thus we have
          .div(7,-3)  == -3
          7/(-3)      == -2
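
The difference between the two rounding rules can be modelled with a short Python sketch. The function names yaa_div and slash_div are hypothetical stand-ins for .div and the / operator; the sketch only mirrors the rounding behaviour described above and is not taken from the assembler.

```python
def yaa_div(a, b):
    """Model of .div: quotient rounded toward negative infinity."""
    return a // b                      # Python's // already floors

def slash_div(a, b):
    """Model of the / operator: quotient truncated toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

print(yaa_div(7, -3))    # -3
print(slash_div(7, -3))  # -2
```

The two functions agree whenever the division is exact or the operands have the same sign; they differ by one only for inexact divisions with operands of opposite sign.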

7.3 Integer Modulus: .mod

Use:

          .mod(A,B)

Where:

A,B
are integers.

Description:

The .mod function returns the remainder from a .div operation. The relationship

          A == B*(.div(A,B)) + .mod(A,B)
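is always true. This means that .mod returns a non-negative result whenever B is positive; when B is negative, the same relationship gives a result that is zero or negative (for example, .mod(7,-3) == -2, since .div(7,-3) == -3).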
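
As a rough check of this relationship, the following Python sketch derives a model of .mod from a floor-division model of .div. The names yaa_div and yaa_mod are hypothetical, and the sketch only exercises positive divisors.

```python
def yaa_div(a, b):
    """Model of .div: quotient rounded toward negative infinity."""
    return a // b

def yaa_mod(a, b):
    """Model of .mod, defined so that a == b*yaa_div(a, b) + yaa_mod(a, b)."""
    return a - b * yaa_div(a, b)

for a, b in [(7, 3), (-7, 3), (-1, 36)]:
    assert a == b * yaa_div(a, b) + yaa_mod(a, b)
    print(a, b, yaa_div(a, b), yaa_mod(a, b))
```

For these inputs the remainders are 1, 2 and 35: with a positive divisor the remainder stays in the range 0 to B-1 even when A is negative.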

7.4 Evaluating an Expression from a Token Sequence: .eval

Use:

          .eval(tokseq)

Where:

tokseq
is a token sequence whose tokens form a valid expression.

Description:

The .eval function evaluates the tokens in the given token sequence as if the tokens formed a single expression. The result of .eval is the result of this expression. For example,

          .eval( [1 + 2] )
yields the integer result 3.
          .eval( .concat([1 +],[2]) )
also yields the integer result 3. On the other hand,
          .eval( ["a"] )
yields the string result "a".

7.5 Checking the Existence of a Token: .exists

Use:

          .exists(token)

Where:

token
is a token sequence expression consisting of a single token.

Description:

The .exists function returns the integer 1 if the given token has been defined or declared up to this point in the program; otherwise, .exists returns 0. For example,

          .exists([A])
determines if the symbol A has been defined or declared up to this point in the program.

7.6 The Size of a Section: .highest

Use:

          .highest()
          .highest(reloc_exp)

Where:

reloc_exp
is a relocatable expression.

Description:

The .highest function returns a relocatable value representing the largest value of the instruction counter inside a section. Without arguments,

          .highest()
returns the largest value of the instruction counter inside the current section.
          .highest ( reloc_exp )
returns the largest value of the instruction counter inside the section that contains the location given by the relocatable expression. The result of .highest is an immediate expression.

Note that .highest returns the greatest value of the instruction counter at the time that .highest is called. More material may be added to the section later in the program.

7.7 The Start of a Section: .lowest

Use:

          .lowest()
          .lowest(reloc_exp)

Where:

reloc_exp
is a relocatable expression.

Description:

The .lowest function returns a relocatable value representing the beginning of a section.

          .lowest()
returns the beginning of the current section.
          .lowest(reloc_expression)
returns the beginning of the section that contains the given relocatable value. The value of .highest minus .lowest is the length of the section (expressed in units dictated by the section's offset mode).

7.8 Determining the IC Value: .ic

Use:

          .ic()
          .ic(reloc_exp)

Where:

reloc_exp
is a relocatable expression.

Description:

The .ic function determines the current value of the instruction counter inside a section. Without arguments,

          .ic()
returns a relocatable value representing the current instruction counter value of the current section.
          .ic( reloc_exp )
returns the current instruction counter value of the section that contains the given relocatable expression. The result of .ic is an immediate expression. The result is expressed as an offset in the units given by the section's current offset mode.

7.9 Lengths of Strings and Token Sequences: .length

Use:

          .length(string)
          .length(tokexp)

Where:

string
is a character string.
tokexp
is a token sequence expression.

Description:

When applied to a string, .length returns the number of characters in the string. When applied to a token sequence, .length returns the number of Bracket-balanced token subsequences (separated by commas) in the token sequence. For example,

          .length("abc")
yields the integer 3, while
          .length([tok1,tok2])
yields the integer 2.
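
The counting rule for token sequences can be illustrated with a small Python sketch. It is a model only: it works on the text between the outer square brackets, and it assumes that parentheses, square brackets and braces are the bracket pairs that must balance. The name split_top_level is hypothetical.

```python
def split_top_level(body):
    """Split the text of a token sequence (without its outer brackets)
    at commas that are not nested inside any bracket pair."""
    parts, depth, current = [], 0, []
    for ch in body:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts

print(len(split_top_level("tok1,tok2")))   # 2
print(len(split_top_level("a,b,[c,d]")))   # 3
```

The second call shows that a nested sequence such as [c,d] counts as a single subsequence, because its comma is inside brackets.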

7.10 Determining Listing Options: .list

The .list function returns an integer whose bits describe the listing options that are currently in effect. For more information about listings, see Chapter 9.

7.11 Finding the Maximum in a Set of Expressions: .max

Use:

          .max(arg1,arg2,...)

Where:

arg1,arg2,...
can be integer or floating point expressions.

Description:

The .max function evaluates each of the arguments and returns the largest of the results.

7.12 Finding the Minimum in a Set of Expressions: .min

Use:

          .min(arg1,arg2,...)

Where:

arg1,arg2,...
can be integer or floating point expressions.

Description:

The .min function evaluates each of the arguments and returns the smallest of the results.

7.13 Signed Right Shift: .sshift

Use:

          .sshift(value,offset)

Where:

value
is an integer expression giving the value you want to shift.
offset
is an integer expression telling how many positions you want to shift the value to the right. If this is negative or greater than the number of bits in an integer value, the result of .sshift is undefined.

Description:

The .sshift function performs a signed right shift. Vacated bits are filled with the high-order (sign) bit of the given value (which means that the shift operation is performed arithmetically). For example,

          .sshift(0700777000777,3)
has the result
          0770077700077
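
The example can be reproduced with the Python sketch below. It assumes that integer values are 36 bits wide (the DPS-8 word size, which matches the 12 octal digits in the example) and that they are held in two's complement form. The name sshift36 is hypothetical, and the sketch raises an error for the offsets that the text above calls undefined.

```python
WORD_BITS = 36                       # assumed integer width
MASK = (1 << WORD_BITS) - 1

def sshift36(value, offset):
    """Arithmetic right shift of a 36-bit value, filling with the sign bit."""
    if offset < 0 or offset > WORD_BITS:
        raise ValueError("result of .sshift is undefined for this offset")
    value &= MASK
    if value & (1 << (WORD_BITS - 1)):   # sign bit set: treat as negative
        value -= 1 << WORD_BITS
    return (value >> offset) & MASK      # Python's >> is arithmetic on ints

print(oct(sshift36(0o700777000777, 3)))  # 0o770077700077
```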

If you want to perform a logical right shift (filling vacated bits with zeroes), use the >>>> operator described in Chapter 5.

7.14 Substrings and Token Subsequences: .substr

Use:

          .substr(string,start,end)
          .substr(tokseq,start,end)

Where:

string
is a string expression from which you want to extract a substring.
tokseq
is a token sequence expression from which you want to extract a subsequence.
start
is an integer expression giving a position in the string or token sequence. The beginning has a position of zero. Subsequent characters or tokens are numbered sequentially.
end
is an integer expression giving the last position that should be included in the substring or subsequence. If this is greater than or equal to the number of characters in the string, or number of tokens in the token sequence, .substr only goes up to the end of string or token sequence, then stops.

Description:

The .substr function returns a substring of a string, or a subsequence of a token sequence. For example,

          .substr("abcdef",0,3)
has the value "abcd" (positions 0 through 3). If end is greater than or equal to the number of characters in string, .substr goes up to the end of string and stops. Thus
          .substr("abcdef",2,.length("abcdef"))
has the value "cdef".

As an example of taking a subsequence of a token sequence expression,

          .substr([a,b,[c,d]],1,2)
has the token sequence value
          [b,[c,d]]

If the first argument is a string, the result of .substr is a string. If the first argument is a token sequence, the result of .substr is a token sequence.
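
The inclusive end position and the clamping at the end of the sequence can be modelled in Python as shown below. The function name substr is a hypothetical stand-in for .substr, a Python list stands in for a token sequence, and start is assumed to be non-negative.

```python
def substr(seq, start, end):
    """Model of .substr: positions start..end inclusive, counted from zero.
    Python slicing stops at the end of seq if end is too large."""
    return seq[start:end + 1]

print(substr("abcdef", 0, 3))                 # abcd
print(substr("abcdef", 2, len("abcdef")))     # cdef
print(substr(["a", "b", "[c,d]"], 1, 2))      # ['b', '[c,d]']
```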

7.15 Identifying the Operating System: .system

Use:

          .system()
          .system(string)

Where:

string
is a string expression to be tested against the current system name.

Description:

The .system function has two forms.

          .system()
returns a string indicating the system for which source code is being assembled. Possible values for system are currently
          "GCOS8_NS"  -- GCOS-8 multi-segment environment
          "GCOS8_SS"  -- GCOS-8 single segment environment
          "MARKIII"   -- MARKIII operating system
          "PORT"      -- PORT operating system on the PC
          "DOS"       -- DOS operating system on the PC
You may specify the option System=name on the assembler command line to specify a different system name.

The other form of the .system function is

          .system ( string )
This compares the value of the string to the string that would be returned by .system(). If the two strings are identical, .system returns the integer 1; otherwise, it returns a 0. For example, the value of
          .system("GCOS8_NS")
is 1 if the code is being assembled for the GCOS-8 NS mode environment, and 0 otherwise.

7.16 Tag Values: .tagval

Use:

          .tagval(tagexp)

Where:

tagexp
is an expression yielding a token that can be used as a tag field or address register name in a simple instruction (for example, "[x1]"). The .tagval function is particularly helpful when you have used a synonym to refer to a register.

Description:

The .tagval function returns the bit pattern associated with tagexp when it is used as a tag in a simple instruction. For example,

          eaa   lclvar,,p.fram
          ada   .dr0+.tagval([p.fram])
can be used to build a pointer to a local variable.

7.17 Date and Time: .time

Use:

          .time()

Description:

The .time function returns a string that gives the current date and time. This string has the form

          "Wed Apr 20 15:32:40 1993"
                    or
          "Thu Nov  7 03:02:01 1994"
This will always have the same number of characters. Notice that a blank is used to pad out the day of the month if it only has one digit.

7.18 Unsigned Integer Division: .udiv

Use:

          .udiv(A,B)

Where:

A,B
are integer expressions.

Description:

The .udiv function performs unsigned integer division. The result is an integer expression equal to A divided by B in unsigned integer arithmetic.

7.19 Converting Strings to Token Sequences: .unquote

Use:

          .unquote(string)

Where:

string
is a string expression.

Description:

The .unquote function returns a token sequence whose contents are the (tokenized) contents of the "string" argument. For example,

          .unquote("a,b,c")
yields the token sequence
          [a , b , c]

7.20 Converting Expressions to Strings: .quote

Use:

          .quote(exp)

Where:

exp
can be any type of expression.

Description:

The .quote function returns a string whose contents match the value of the "exp" argument. More precisely, .quote is defined so that

          .unquote( .quote(X) ) == X
for any argument X. Below we give some examples:
          .quote("abc")     == "\"abc\""
          .quote([lda b,c]) == "[lda b , c]"
          .quote(1+2)       == "3"
          .quote(.concat([lda],[b,c]))
                            == "[lda b , c]"
Notice that when .quote is applied to a token sequence, the resulting string contains opening and closing square brackets. Each token inside the brackets is separated by a single space character.

7.21 Determining Distance to the Next Alignment Boundary: .upto

Use:

          .upto(boundary)
          .upto(boundary,section)

Where:

boundary
is a number representing an alignment boundary as a number of bits. For example, on systems where words contain 36 bits,
        .upto(36)
determines the distance between the current location in the current section and the next word boundary.
section
is the name of a section defined in the current assembly.

Description:

The .upto function returns the number of bits between the current position and an alignment boundary. The form

          .upto(number)
returns an integer representing the number of bits between the current location in the current section and the next alignment boundary, as determined by the given "number". For example,
          .data  .upto(36):0
fills in the rest of the current word with zero bits. (See Chapter 8 for a description of the .data statement.)

The form

          .upto(number,section)
returns the number of bits between the current location in the specified section (as given by .ic) and the specified alignment boundary. For example,
          .upto(36,X)
returns the number of bits between the current location in section X and the next word boundary.
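
The arithmetic behind .upto can be sketched in Python as follows. The sketch assumes that the current location is available as a bit offset from the start of the section and that the result is 0 when the location is already on the boundary. The name upto and its second parameter are hypothetical.

```python
def upto(boundary, position_bits):
    """Number of bits from position_bits to the next multiple of boundary
    (0 if position_bits is already on the boundary)."""
    return -position_bits % boundary

print(upto(36, 9))    # 27 bits left in the current 36-bit word
print(upto(36, 36))   # 0, already on a word boundary
```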

7.22 Deferred Expression Evaluation: .defer

Use:

          .defer(exp)

Where:

exp
is any type of expression. This expression cannot contain YAA variables, since they are discarded once they have been used in the first pass. It also cannot contain assignments, or the operators ++ or --, since such operations can only be used with YAA variables.

Description:

The .defer function delays the evaluation of the given expression until the assembler's second pass through the source code; most other expressions are evaluated (or partly evaluated) during the first pass. As an example, the result of

          .highest()
is the highest location in the current section, up to this point in the assembly.
          .defer( .highest() )
evaluates the .highest function in the second pass, at which point the assembler knows the highest location that was ever reached.
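
The difference between the two evaluation times can be pictured with a toy Python model of a single section. It assumes that the section starts at location 0 and that the instruction counter only ever increases; the function name and its parameters are hypothetical and do not correspond to anything inside YAA.

```python
def highest_values(statement_sizes, query_index):
    """statement_sizes: size in words of each statement in one section.
    query_index: index of the statement that asks for .highest().
    Returns (value seen on the first pass, value seen when deferred)."""
    first_pass_value = sum(statement_sizes[:query_index])  # code laid down so far
    deferred_value = sum(statement_sizes)                   # whole section known
    return first_pass_value, deferred_value

print(highest_values([1, 1, 2, 4], 2))   # (2, 8)
```

Here the third statement sees only the two words laid down before it when .highest() is evaluated immediately, while the deferred form sees all eight words.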

8. Pseudo-Ops and Code Generation

If the opcode field of a statement is not empty, the field must contain a token sequence expression whose result is a single token. There are two acceptable token types.

  1. Tokens that stand for hardware opcodes on the machine for which the program is assembled.
  2. Tokens that provide information and control the way that YAA behaves, but may not generate machine code directly. Such tokens are called pseudo-ops.

We assume that you are familiar with the DPS-8 hardware opcodes. The pseudo-ops of YAA are described later in this chapter.

All pseudo-ops and hardware opcodes must be written in lower case in source code. Pseudo-ops and hardware opcodes are reserved words when they appear in the opcode field of a statement.

Before we describe the pseudo-ops of YAA, it will be helpful to cover some general principles of code generation.

8.1 Code Generation

Output code is generated by hardware opcodes and by some pseudo-ops (e.g., .data). Every time YAA has to output code, YAA first checks to see if it is at the proper alignment boundary as dictated by the current section's offset mode. For example, if the current offset mode is word, YAA checks to see if the output code is currently at a word boundary.

If the output code is not at an appropriate boundary, YAA outputs filler to get up to the required boundary. In a data section, YAA uses 0-bits for filler. In a code section, YAA tries to use NO-OP instructions; if there isn't enough room to put a complete NO-OP instruction, 0-bits are used instead.

Alignment checks and adjustment take place before YAA finds out what the next instruction is. This has side effects, as we will note in Section 8.4.2.

Some types of statements require additional alignment manipulation. For example, the GCOS-8 rpd instruction must be aligned on an odd-word boundary. When YAA sees such an instruction code, it will react, possibly by outputting more filler to reach such a boundary. The filler will be 0-bits in data sections, NO-OP instructions in code sections. Note that this means there is a difference between

          label:  .null
                 rpd
and
          label:  rpd
The first goes to the next word boundary, then may go on to an odd-word boundary for the rpd instruction. The second goes to an odd-word boundary immediately, puts down the rpd instruction, and associates label with that instruction. In the second form, label is always associated with the rpd instruction; in the first, it may be associated with the word before the rpd instruction.

8.1.1 Statement Labels

Some pseudo-op statements give a special meaning to the statement label field. For example, the .macro definition pseudo-op uses the statement label field to give the name of the macro.

If a statement does not interpret the label field in a special way, the statement label is taken as a label for the current location in the section. This will be the location after all alignment manipulation has taken place. For example, a label on an rpd instruction will refer to the location of the rpd after it has been properly aligned.

Since YAA aligns to the proper offset mode boundary for every statement, location label names are always associated with whole multiples of the current offset mode. For example, if a section currently has the word offset mode, new locations named in that section will always refer to an exact number of words. Consider the code fragment

          X:   .data 1:0   #a single bit
          Y:   .data 35:0  #35 bits

This example shows a single bit of data allocated at the location labelled X, followed by 35 bits of data at the location associated with Y. Even though these two pieces of data could fit in a single word, YAA aligns each data object on the boundary given by the section's offset mode. In a section with the word offset mode, X would be associated with one word and Y with the next word. If you want the two data objects to share the same word, you could change the section's offset mode to bit just before the definition of X. This is done with the .usage pseudo-op (described in a later section).
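
The effect of the offset mode on label placement can be modelled with the Python sketch below. It is an illustration only, assuming 36-bit words and a section that starts at bit 0; the name label_offsets is hypothetical.

```python
def label_offsets(bit_lengths, unit_bits):
    """Return the bit offset given to each statement's label when every
    statement starts on a multiple of unit_bits (36 = word, 1 = bit)."""
    offsets, position = [], 0
    for length in bit_lengths:
        position += -position % unit_bits   # filler up to the next boundary
        offsets.append(position)
        position += length
    return offsets

print(label_offsets([1, 35], 36))  # [0, 36]: X and Y land in different words
print(label_offsets([1, 35], 1))   # [0, 1]:  X and Y share one word
```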

8.1.2 Literals

YAA offers a convenient way to create literals. If the operand field of an instruction contains a sequence of one or more statements enclosed in brace brackets, the statements are assembled into an unnamed section that can serve as a literal. In addition, the bracketed code is regarded as a relocatable expression which can be used in other statements. The value of this expression is a relocatable value referring to the start of the unnamed section.

As a simple example, consider the instruction

          lda {.data "abcdefg"},du
An unnamed section is created to hold the assembled output of the .data statement (i.e. the literal string). The code contained in the brace brackets is then replaced with a single relocatable value referring to the beginning of the newly created section. The effect of this instruction is therefore to create a literal string in an unnamed section, then load a pointer to the first character of this string into the A register.

If the link editor discovers two literals with identical contents, the two will be merged ("folded") so that code is not duplicated. Thus literals should be considered non-writable.

8.1.3 Code Blocks

The brace bracket construct can also be used in the opcode field of an instruction, to create a code block. In this case, YAA makes note of the current section when it encounters the opening brace, and goes back to that section after the closing brace that ends the code block. For example, you might write

          # Instructions 1
          {
             .template
          x: .data 0
          y: .data 0
          }
          # Instructions 2
in order to define a template in the middle of other instructions. YAA takes note of the current section before entering the code block and goes back to that section after the code block is finished. This means that the second set of instructions immediately follow the first set in the original section. If you just wrote
          # Instructions 1
             .template
          x:  .data 0
          y:  .data 0
          # Instructions 2
the second set of instructions would be considered part of the template, not the section that was being assembled before the .template instruction.

When YAA goes back to a section after a code block, YAA restores the offset mode of the section (if necessary). For example, the code

          #current offset mode is "word"
          {
                 .usage byte   #now "byte" offset mode
             A:  .data  9:'a'
             B:  .data  9:'b'
             C:  .data  9:'c'
             D:  .data  9:'d'
          }
          #current offset mode back to "word"
goes to the byte offset mode within the brace brackets so that it can lay down data in byte quantities. After the closing brace, the section is returned to word offset mode.

When YAA goes back to a section after a code block, YAA does not restore the IC to what it was before the braces. Consider

          lda 1; {lda 2; lda 3} lda 4
YAA makes note of the current section when the opening brace is encountered and restores the section at the closing brace. However, the instructions inside the braces do not change sections, so the material inside the braces is assembled as normal. When the section is restored after the closing brace, the IC has the value it had after the lda 3 instruction. Therefore the above line is equivalent to
          lda 1; lda 2; lda 3; lda 4

8.2 Labelling a Location: .null

Use:

          name:             .null
          [name,name,...]:  .null
                            .null

Where:

name
is any valid name.

Description:

The .null statement just associates the given name(s) with the current location in the current section. If necessary, .null forces alignment to the next boundary as dictated by the section's offset mode.

For example, suppose that the offset mode of the current section is word.

          A:     .null
          [B,C]: .null
moves up to the next word boundary (if necessary) and associates A with the resulting location. .null does not change the instruction counter, so the next .null associates B and C with the same location.

If the .null has no label, it just forces alignment to an appropriate boundary (if necessary).

8.3 Data Allocation

YAA has two pseudo-ops that reserve storage for data objects: .space and .data. The .align pseudo-op also contributes to the way that memory is allocated.

8.3.1 Memory Alignment: .align

Use:

          .align  boundary
          .align  boundary,offset

Where:

boundary
is an integer expression giving a number. This number represents an alignment boundary, in the units dictated by the offset mode of the current section. For example, if the offset mode is word,
          .align 2
moves to the next double-word boundary.
offset
is an offset from the given boundary. For example, if the current offset mode is word,
          .align 2,1
goes to a double-word boundary and then one word more (i.e. an odd-word boundary).

Description:

With the format,

          .align  boundary
.align outputs "filler" (if necessary) until it reaches a memory location that is an exact multiple of boundary, measured in the units of the current section's offset mode. For example, if the offset mode of the current section is word,
          .align 2
outputs filler (if necessary) until reaching a double-word boundary. The filler consists of 0-bits in a data section, and NO-OP instructions in a code section.

The form

          .align  boundary,offset
works much the same way, filling to the next location that is at the given offset from the specified type of boundary.

If necessary, you can specify offset types explicitly using the format used by .highest, .lowest, and .ic. For example,

          .align byte::1
aligns to the next byte boundary.
          .align bit::9
also aligns to the next byte boundary. If the offset argument does not have an explicit offset type but the boundary argument does, the offset takes its type from the boundary argument. For example,
          .align  bit::18,9
aligns to nine bits beyond a half-word boundary.
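
The arithmetic behind .align can be sketched in Python. This is only an illustrative model, not YAA's implementation: the function name align, the bit-granular location counter, and the unit table are assumptions, with the unit sizes (9-bit bytes, 36-bit words) inferred from the examples above.

```python
# Illustrative model of .align; not YAA's actual implementation.
# Assumed unit sizes in bits, inferred from the examples in this section.
UNIT_BITS = {"bit": 1, "byte": 9, "word": 36}

def align(ic_bits, boundary, offset=0, unit="word"):
    """Return the first location >= ic_bits (measured in bits) that lies
    `offset` units past a multiple of `boundary` units."""
    step = boundary * UNIT_BITS[unit]
    target = offset * UNIT_BITS[unit]
    # For a positive step, Python's % is never negative, so this is the
    # amount of filler needed (0 when the location is already suitable).
    filler = (target - ic_bits) % step
    return ic_bits + filler

print(align(3 * 36, 2) // 36)        # .align 2 starting at word 3 -> 4
print(align(4 * 36, 2, 1) // 36)     # .align 2,1 starting at word 4 -> 5
print(align(40, 18, 9, unit="bit"))  # .align bit::18,9 starting at bit 40 -> 45
```

The model assumes that no filler is produced when the current location already satisfies the request, which matches the "(if necessary)" wording above.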

Notes:

.align does not accept a label field.

8.3.2 Uninitialized Storage Allocation: .space

Use:

          .space size

Where:

size
is an immediate integer expression, giving the number of units of memory to reserve. By default, the units are dictated by the section's offset mode, but different units may be specified explicitly (see below).

Description:

The .space pseudo-op reserves a block of memory but does not specify the contents of that block. For example, in a section with the word offset mode,

          .space 4
reserves four words of space.

Different units may be specified for the reserved space using the notation type::value. For example,

          .space byte::2
reserves two bytes of space. Similarly,
          .space bit::1
reserves the next available bit.

There is a special consideration in situations where you are using .space to allocate space for a symbol that already has an associated type. In this case, you can omit the size argument for .space, and the assembler will automatically allocate the amount of space required to hold a value of the given type. For example, in

          X:   .object    type=>>.double
          X:   .space
.space automatically allocates enough space to hold a double object.

When .space reserves memory, it does not initialize the contents of the storage.

8.3.3 Initialized Data Storage: .data

Use:

          .data  value
          .data  value,value,...
          .data  bitlength:value
          .data  bitlength:value,bitlength:value,...

Where:

value
is an expression giving the value to be stored in the allocated data area.
bitlength
is the number of bits that should be used to hold the value.

Description:

The .data pseudo-op reserves storage for a data object and places a value in that piece of storage.

With the format

          .data expression
YAA moves to an appropriate alignment boundary as dictated by the current offset mode, before looking at the .data statement. Then, if necessary, YAA adds additional 0-bits to fill out to an alignment boundary which is appropriate to the type of the given expression. The value of the expression is then stored in this memory location, taking up the amount of space appropriate to the type of the expression. For example, in
          .data 3
the value is an integer (requiring one word of storage on a word boundary). Therefore YAA goes to the next word boundary and reserves one word to hold the value 3.
          .data "abcdefgh"
reserves the amount of memory needed to hold eight characters, aligned appropriately for a character string (a byte boundary). C users should remember that YAA does not automatically put a '\0' on the end of this string.

Notice the difference between

          .data "a"
which reserves space for a single character and only requires byte alignment, and
          .data 'a'
which reserves space for an integer constant containing the ASCII character 'a' and therefore needs word alignment.

If the argument of .data refers to a location, YAA allocates a word for the value. Then YAA represents the location value with a pseudo-address and a relocation instruction based on the offset mode of the location. For example, suppose X is a location in a program and X has the word offset mode.

          .data  X
reserves a word for the location value and outputs a word relocation instruction. This will be a lower word relocation, since the data area is an entire word. The link editor eventually fills in the data area with a value indicating the word offset of X from the beginning of the segment containing X.

You can request different relocation instructions by preceding the location value with the desired relocation type followed by two colons. For example,

          .data  pointer::X
allocates a word for the location value and generates a relocation instruction that tells the link editor to fill in this value with a machine pointer to the location X.

For more complex kinds of data, you may use the form

          .data bitlength:value
YAA moves to the next word boundary, then reserves bitlength bits to hold the given value. For example,
          .data 18:5
moves to the next word boundary and then stores the simple integer value 5 in the next 18 bits of storage.

If the expression after the colon has a location value, YAA generates an appropriate relocation instruction for the value. For example, consider

          .data  18: segid::X
The location value is the 12-bit SEGID of the segment containing X. This is stored in the 18 bits reserved by the .data statement. If the alignment of these 18 bits is not suitable to hold a SEGID, YAA issues an error message. YAA generates an upper or lower segid relocation instruction, depending on whether the 18 bits reserved for the SEGID are in the upper or lower half of a machine word.

When a .data statement contains a bitlength, the type of the accompanying value determines how the data is stored in the given location. Integer values are right-justified and sign-extended to the entire width of the data. For example,

          .data  100*36:-1
extends the sign of the -1 to 100 36-bit words (filling all the words with 1-bits). String values are left-justified and padded with zero bits on the end. Floating point values are also left-justified and padded with zero bits.
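
The justification rules can be modelled with a few lines of Python. This is a sketch under stated assumptions rather than a description of YAA internals: the function names are hypothetical, and the model assumes each character of a string occupies one 9-bit byte.

```python
# Illustrative model of how .data bitlength:value fills its field.
CHAR_BITS = 9  # assumption: one character per 9-bit byte

def pack_integer(value, bitlength):
    """Right-justify and sign-extend: two's complement in `bitlength` bits."""
    return value & ((1 << bitlength) - 1)

def pack_string(text, bitlength):
    """Left-justify the characters and pad the low-order end with 0-bits."""
    used = len(text) * CHAR_BITS
    if used > bitlength:
        raise ValueError("string does not fit in the field")
    bits = 0
    for ch in text:
        bits = (bits << CHAR_BITS) | ord(ch)
    return bits << (bitlength - used)

# .data 100*36:-1 fills all 3600 bits with 1-bits
print(pack_integer(-1, 100 * 36) == (1 << 3600) - 1)  # True
# .data 18:5 leaves the value 5 at the right-hand end of the field
print(pack_integer(5, 18))                            # 5
# a hypothetical one-character string in a 36-bit field sits at the left end
print(pack_string("a", 36) == ord("a") << 27)         # True
```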

A single .data statement may initialize several consecutive memory locations. Initialization values are specified with a list of Bracket-balanced operands, separated by commas. For example,

          .data 1,2,3
initializes three consecutive words to the given values.
          .data  18:word::X,18:word::Y
initializes the upper half of a word to the relocatable word offset of X and the lower half to the relocatable word offset of Y. Two relocation instructions would be generated: an upper word relocation for X and a lower word relocation for Y. The two relocatable offsets would be put together into a single word.

Note that this is different from

          .data  18:word::X
          .data  18:word::Y
Since each .data statement automatically goes to the next word boundary before it reserves storage, the above code allocates two consecutive words. Each of these words would have a relocatable offset in its upper half.
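
The difference between the two forms comes down to where each field starts. The following Python sketch models only the layout rule stated above (each .data bitlength:value statement begins on a word boundary); the function name lay_out and the 36-bit word size are assumptions made for illustration.

```python
WORD_BITS = 36  # assumed word size

def lay_out(statements):
    """Each statement is a list of field widths in bits. Return the
    starting bit position of every field, statement by statement."""
    position = 0
    layout = []
    for fields in statements:
        # Every statement first moves to the next word boundary.
        position += (-position) % WORD_BITS
        starts = []
        for width in fields:
            starts.append(position)
            position += width
        layout.append(starts)
    return layout

print(lay_out([[18, 18]]))    # one statement, two fields -> [[0, 18]]
print(lay_out([[18], [18]]))  # two statements            -> [[0], [36]]
```

In the first case both 18-bit fields share word 0; in the second, the second field starts at bit 36, the upper half of the next word.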

8.4 YAA Variables

A YAA variable is a variable created for use while assembling a YAA program. The variable is not directly associated with any memory locations in the assembled program.

8.4.1 Creating YAA Variables: .var

Use:

          name:             .var   expression
          [name,name,...]:  .var   expression

Where:

name
is the name of the variable you want to create. This must be a valid YAA identifier.
expression
is an expression giving the initial value for the variable(s) being created.

Description:

The .var pseudo-op creates and initializes one or more YAA variables. The given names become the names of YAA variables, each of which has the value of the given expression. For example,

          [A,B,C]: .var  0
creates three variables and initializes them to zero.

Names created as YAA variables may not have been used for any other purpose earlier in the program. For example, you can't use a name as the name of a memory location, then use it as a variable name too.

YAA variables must be created by a .var statement before they can be used (which means that the .var statement must precede any references to the variables in the source code).

Once you have created a variable, you can use it in other statements. For example, in

          A:       .var  "Hello!"
          String:  .data  A
the token A is replaced by the current value of the YAA variable A. This means that the above statement is equivalent to
          String:  .data  "Hello!"
A .var statement may assign a new value to an existing YAA variable, as in
          A:  .var  0
            ...
          A:  .var  A+1

A .var statement does not change the current alignment of the output code.

8.4.2 Setting Values: .set

Use:

          name:             .set  expression
          [name,name,...]:  .set  expression

Where:

name
is the name of the item whose value you want to set. This can be an existing YAA variable (created by a previous .var statement) or another symbol.
expression
is an expression giving the value you want to assign to the given item.

Description:

The .set pseudo-op lets you change the value of YAA variables or assign values to other symbols. For example, in

          A:  .var  0
             ...
          A:  .set  A+1
the variable A is set to zero, then later incremented. In
          [X,Y,Z]: .var  'a'
               ...
          [X,Y,Z]: .set  'b'
three variables are initialized to 'a', then changed to 'b'.

You may use .set to assign values to symbols which are not YAA variables. The format of the statement is the same. However, such symbols can only be .set to a value once. YAA variables (created with an earlier .var statement) can have their values changed as often as you like.

YAA variables do not have a fixed type. Every time a variable is given a value by a .set statement, the type of the variable changes to the type of the expression in the statement. For example, you could say

          X:      .var    5
          int:    .data   X
          X:      .set    7.8
          float:  .data   X
In the first .data statement, X is an integer; in the second, it is a floating point value. If a variable is assigned a location value, the variable takes the same offset mode as the location value.

YAA variables may be associated with the addresses of literals, as in

          A:  .set   {.data  01010101}
This format is better than
          A:  .data  01010101
if you are creating a constant (i.e. a data object whose value you won't want to change). Remember that if a program has several identical literals, they will be merged into a single literal by the linker. Therefore using literals instead of explicit data objects can save memory.

8.4.3 YAA Variables vs. Other Symbols

The previous sections may have given the impression that a YAA variable is very different from other symbols used in a program. This is not really the case. There are only a few differences between YAA variables and other user-defined symbols.

Any symbol may be assigned a value with .set, even if it has not been declared as a YAA variable. However, once a normal symbol has been given a value in this way, the value cannot be changed.
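
The assignment rules for variables and ordinary symbols can be summarized in a small Python model. The class and method names are hypothetical and the sketch ignores everything except the rules stated in this section: a .var name may not already be in use for another purpose, a variable may be reassigned freely, and an ordinary symbol may be given a value with .set only once.

```python
class SymbolTable:
    """Toy model of the .var/.set rules; not YAA's real symbol table."""

    def __init__(self):
        self.values = {}
        self.variables = set()

    def var(self, name, value):
        # .var creates a variable, or assigns a new value to an existing one.
        if name in self.values and name not in self.variables:
            raise ValueError(f"{name} is already used for another purpose")
        self.variables.add(name)
        self.values[name] = value

    def set(self, name, value):
        # Variables can be changed at will; other symbols only once.
        if name in self.values and name not in self.variables:
            raise ValueError(f"{name} already has a value")
        self.values[name] = value

table = SymbolTable()
table.var("A", 0)
table.set("A", table.values["A"] + 1)  # allowed: A is a YAA variable
table.set("A", 7.8)                    # allowed: the type changes with the value
table.set("LIMIT", 10)                 # allowed: first .set of an ordinary symbol
try:
    table.set("LIMIT", 11)             # rejected: ordinary symbols are set once
except ValueError as error:
    print(error)
```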

8.4.4 Circular Definitions

Earlier we gave the example of the statement

          A:  .set  A+1
This is valid if A has already been assigned a value. However, if this is the first reference to A, the statement is what we call a circular definition, and it is invalid. If you try an instruction like
          lda A
the A will be replaced with A+1, leading to (A+1)+1, leading to ((A+1)+1)+1 and so on. Eventually YAA will stop the expansion with the message
          Expression too complex
The same sort of problem occurs if you define A in terms of B and B in terms of A, or use a longer circular chain.
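
The runaway expansion can be imitated in Python. This is a loose sketch, not YAA's algorithm: it treats definitions as text, the helper name expand is hypothetical, and the depth limit of 50 is an arbitrary assumption (the manual does not state the real limit).

```python
import re

MAX_DEPTH = 50  # assumed limit, chosen only for this illustration
IDENT = re.compile(r"[A-Za-z_$.][A-Za-z0-9_$.]*")

def expand(expression, definitions, depth=0):
    """Replace defined names by their (parenthesized) definitions,
    giving up when the expansion nests too deeply."""
    if depth > MAX_DEPTH:
        raise RecursionError("Expression too complex")

    def replace(match):
        name = match.group(0)
        if name in definitions:
            return "(" + expand(definitions[name], definitions, depth + 1) + ")"
        return name

    return IDENT.sub(replace, expression)

print(expand("A", {"A": "B+1", "B": "2"}))  # ((2)+1)
for circular in ({"A": "A+1"}, {"A": "B", "B": "A"}):
    try:
        expand("A", circular)
    except RecursionError as error:
        print(error)                        # Expression too complex
```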

8.5 Manifests and Text Macros

Manifests and macros are symbols used to edit the text of your YAA program. They are similar to symbols created with the #define preprocessor directive in C, but there are several important differences. Manifests and macros are created with the .define pseudo-op.

8.5.1 The .define Pseudo-Op

Use:

          .define  manifest,text
          .define  macro(parm1,parm2,...),text

Where:

manifest
is any valid YAA identifier; this will be used as a manifest name.
macro
is any valid YAA identifier; this will be used as the name of a text macro.
parm1,parm2,...
are a list of parameter names used in the macro definition (see below).
text
is the text assigned to the manifest, or the definition of the macro.

Description:

The .define pseudo-op defines both manifests and text macros. The following sections explain these in detail.

Manifests

A manifest is a symbol whose value is a piece of source code text. Manifests are created with

          .define manifest,text
For example,
          .define  SIZE,30
associates the text 30 with the name SIZE.

Once a manifest has been created with a .define statement, it can be used anywhere in your source code, except in ASCII string and character constants. When YAA recognizes the manifest, the assembler replaces the name with the associated text. For example,

          .space SIZE
is changed to
          .space 30

The difference between manifests and YAA variables is that a manifest is strictly textual. For example, suppose you have

          X:  .set 2+2
              .define Y,2+2
          A:  .data X*X
          B:  .data Y*Y
When X is initialized, the expression 2+2 is evaluated and X is assigned the result of 4. When Y is defined, it is associated with the text 2+2. Thus the definitions of A and B become
          A:  .data 4*4
          B:  .data 2+2*2+2

Notice that the data with the label A is given the value 16. However, the data with the label B is given the value 8, because the multiplication operation takes place before the addition.

Because of the textual nature of manifests, it is a good idea to parenthesize the text values when they are defined. For example, you might write

          .define Y,(2+2)
With this definition,
          B:  .data  Y*Y
results in the expected value of 16. Because of such complications, it is better to use symbols or YAA variables whenever possible, rather than defining manifests.
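
The textual nature of manifests is easy to reproduce in Python. The sketch below is only an analogy: the helper substitute is hypothetical, and Python's own arithmetic stands in for YAA's expression evaluation (both give multiplication precedence over addition).

```python
import re

IDENT = re.compile(r"[A-Za-z_$.][A-Za-z0-9_$.]*")

def substitute(text, manifests):
    """Replace whole identifier tokens by their manifest text."""
    return IDENT.sub(lambda m: manifests.get(m.group(0), m.group(0)), text)

plain = substitute("Y*Y", {"Y": "2+2"})
safe = substitute("Y*Y", {"Y": "(2+2)"})
print(plain, "=", eval(plain))  # 2+2*2+2 = 8
print(safe, "=", eval(safe))    # (2+2)*(2+2) = 16
```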

Text Macros

A text macro is similar to a manifest. It is created with a .define statement of the form

          .define macro(parm,parm,...),text
The macro name and parm values are normal YAA identifiers. The parm symbols are called the parameters of the macro. The text can be any source code text. For example, you might have
          .define plus(A,B),A+B

Once a text macro has been defined in this way, it may be used in any location in the program's source code. To use a macro, you specify the macro's name followed by a parenthesized list of macro arguments. This is known as a macro call. Each macro argument is a Bracket-balanced sequence of tokens. Macro arguments are separated by commas. The number of arguments must be equal to the number of macro parameters given in the original macro definition.

When YAA encounters a macro call in source code, it replaces the call with the text given in the original macro definition. Wherever that text contains a token equal to one of the macro parameters, the token is replaced by the macro argument that corresponds to the parameter. For example, if your program has

          .data plus(3,2)
the macro call plus(3,2) is replaced by the text associated with plus. The macro argument 3 replaces all occurrences of the parameter A in the macro text; similarly, the argument 2 replaces all occurrences of the parameter B. Therefore, the above .data statement becomes
          .data 3+2

As with manifests, this kind of macro substitution is strictly textual. For example, in

          .define times(X,Y),X*Y
          .data   times(1+2,3+4)
the .data statement becomes
          .data  1+2*3+4
For this reason, it is usually a good idea to parenthesize all appearances of macro parameters in the macro definition, as in
          .define times(X,Y),(X)*(Y)
With this definition, the .data statement becomes
          .data  (1+2)*(3+4)

Macro parameters are only recognized in macro text when they are separate tokens. In particular, they are not recognized when they appear as part of string constants. For example, if you define

          .define HOWMANY(WHO,N),"A WHO has N lives"
the parameters WHO and N will not be replaced inside the string constant. You could get around this problem by defining
          .define HOWMANY(WHO,N),\
                  .concat("A ",.quote([WHO]),\
                  " has ",.quote([N])," lives")

A text macro definition is a single statement, and therefore extends to the first semi-colon or to the end of the line. However, long text macros may be created by continuing the macro definition onto additional lines in the usual way (putting a backslash at the end of each continued line).
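
The token-by-token parameter substitution described in this section can be sketched in Python. The tokenizer here is a deliberately crude assumption (string constants, identifiers, and single characters), and expand_call is a hypothetical helper, not part of YAA; it is only meant to show why unparenthesized parameters and parameters inside string constants behave as described.

```python
import re

# A token is a string constant, an identifier, or any other single character.
TOKEN = re.compile(r'"[^"]*"|[A-Za-z_$.][A-Za-z0-9_$.]*|.')

def expand_call(params, text, args):
    """Replace each token of `text` that equals a parameter name
    by the corresponding argument text."""
    if len(args) != len(params):
        raise ValueError("wrong number of macro arguments")
    binding = dict(zip(params, args))
    return "".join(binding.get(tok, tok) for tok in TOKEN.findall(text))

print(expand_call(["X", "Y"], "X*Y", ["1+2", "3+4"]))      # 1+2*3+4
print(expand_call(["X", "Y"], "(X)*(Y)", ["1+2", "3+4"]))  # (1+2)*(3+4)
# A string constant is a single token, so WHO and N inside it are untouched.
print(expand_call(["WHO", "N"], '"A WHO has N lives"', ["cat", "9"]))
```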

You should be careful of the way that null token sequences affect macros. For example, consider

          .define sample(X),( ([X]==[]) ? 0 : X )
This looks like the value of sample should be X unless X is null. However, if X is null, the macro expands to
          ( ([] == []) ? 0 : )
which is syntactically incorrect. Thus it will get an error. One way to solve the problem is to change the definition to
          .define sample(X), @(([X]==[]) ? [0] : [X])
If X is non-null, this evaluates to @[X] or just X. If X is null, it evaluates to @[0], or just the integer 0. (The @ operator is described below.)

8.5.2 The @ Operator

The @ operator is related to manifests and text macros in that it is used to modify source code text. The general form of the operation is

          @ token_sequence_expression
where token_sequence_expression is the shortest possible Bracket-balanced token sequence following the @. When YAA encounters such a construct in source code, the construct is replaced by the sequence of tokens that are the result of the token sequence expression. As a simple example,
          x: .var  [1,x0]
             lda   @x

turns into

             lda   1,x0
x is a YAA variable that has been given a token sequence value. The construct @x is therefore replaced by the tokens associated with x.

As with manifests and text macros, the replacement is purely textual: the source code is altered. The token sequence expression after @ must be an immediate expression.

8.6 Opcode Macros

YAA also lets you create a second kind of macro, called an opcode macro. Creating an opcode macro is like creating a new type of statement. In source code, it appears to be a single statement. However, when the program is assembled, the opcode macro statement is replaced by the text associated with the macro, and that text may actually be made up of a number of statements.

8.6.1 Defining Opcode Macros: .macro

Use:

          macro_name:   .macro   parm,parm,...
                        # macro definition
                        .endmacro

Where:

macro_name
is any valid name. This is the name that you use to invoke the macro in any subsequent code.
parm,parm,...
is a list of other names, called the parameters of the macro. Parameter names appearing inside the definition of the macro are replaced with the corresponding argument values when the macro is invoked.

Description:

The .macro pseudo-op starts the definition of an opcode macro. It must be on a line on its own; it cannot come before or after other statements on the same source code line.

The .macro statement is followed by a sequence of zero or more other statements called the body of the macro.

The end of the macro definition is indicated by an .endmacro statement as shown above. If the .endmacro statement has a label, the label must be the same as the name on the .macro statement that began the macro definition.

Below we give an example of a simple macro definition:

          VECTOR5: .macro  A,B,C,D,E
                   .data   A
                   .data   A*B
                   .data   A*B*C
                   .data   A*B*C*D
                   .data   A*B*C*D*E
                   .endmacro
Once the macro has been defined, it can be used as a statement. For example, you might write
          label:   VECTOR5  1,2,3,4,5
The opcode field is the name associated with the opcode macro. The operand field gives a list of Bracket-balanced token sequences which serve as the arguments of the macro.

When YAA recognizes an opcode macro name in the opcode field, it inserts the body of the macro in the program where the opcode macro statement appears. Occurrences of the macro parameters in the macro body are replaced by the corresponding arguments in the operand field of the opcode macro statement. Thus the above statement turns into

          label:   .null
                   .data  1
                   .data  1*2
                   .data  1*2*3
                   .data  1*2*3*4
                   .data  1*2*3*4*5
Notice that the label of the opcode macro is associated with a .null directive that associates the label with the value of the IC before the macro expansion begins.
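
The expansion of an opcode macro call can be imitated in Python. This is a simplified sketch: the function expand_opcode_macro is hypothetical, it substitutes on identifier tokens only, and it models just the two behaviors described above (the label becomes a .null statement, and parameters in the body are replaced by argument text).

```python
import re

IDENT = re.compile(r"[A-Za-z_$.][A-Za-z0-9_$.]*")

def expand_opcode_macro(label, params, body, args):
    """Return the statements produced by one opcode macro call."""
    binding = dict(zip(params, args))
    lines = []
    if label:
        # The call's label marks the IC before the expansion begins.
        lines.append(f"{label}: .null")
    for statement in body:
        lines.append(IDENT.sub(lambda m: binding.get(m.group(0), m.group(0)),
                               statement))
    return lines

vector5_body = [".data A", ".data A*B", ".data A*B*C",
                ".data A*B*C*D", ".data A*B*C*D*E"]
for line in expand_opcode_macro("label", list("ABCDE"), vector5_body,
                                ["1", "2", "3", "4", "5"]):
    print(line)
```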

The number of arguments specified when you invoke the macro cannot be greater than the number of parameters given in the macro definition. As with text macros, arguments are strictly textual.

Omitting Arguments

When invoking an opcode macro, any or all of the (positional) arguments may be omitted. For example, if a macro is defined with

          mac:   .macro  A,B,C
you could call it with
          mac 1,2   #omits C
          mac ,2,3  #omits A
          mac 1,,3  #omits B
          mac ,,3   #omits A,B
          mac 1     #omits B,C
          mac       #omits all arguments
             # and so on
As shown above, you do not have to put trailing commas on the argument list if you leave off trailing arguments. However, you can still do it if you want, as in
          mac  1,,

When you omit an argument value, the macro is passed "emptiness", i.e. no text. This can be useful. For example,

          .if  [A] == []
tests whether a value was passed for parameter A. (The .if directive is discussed later in this chapter.)
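
One way to picture omitted arguments is as empty strings in the argument list. The Python sketch below splits an operand field on top-level commas and pads missing trailing arguments with empty text; split_arguments is a hypothetical helper, and its bracket handling is a rough stand-in for YAA's Bracket-balanced token sequences.

```python
def split_arguments(operand, parameter_count):
    """Split an operand field on top-level commas and pad omitted
    trailing arguments with empty text."""
    args, current, depth = [], "", 0
    for ch in operand.strip():
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            args.append(current.strip())
            current = ""
        else:
            current += ch
    args.append(current.strip())
    if len(args) > parameter_count:
        raise ValueError("too many arguments")
    return args + [""] * (parameter_count - len(args))

print(split_arguments("1,2", 3))   # ['1', '2', '']   omits C
print(split_arguments(",2,3", 3))  # ['', '2', '3']   omits A
print(split_arguments("1,,3", 3))  # ['1', '', '3']   omits B
print(split_arguments("", 3))      # ['', '', '']     omits all arguments
```

In this model, a test such as [A] == [] corresponds to checking whether the entry for A is the empty string.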

Keyword Parameters

In addition to the normal parameters explained above, macros may be defined to accept keyword parameters. Keyword parameters are specified on the .macro statement that begins a macro definition. They appear in the parameter list of the macro definition, after all the normal parameters have been given. A keyword parameter has the form

          keyword=>>default_value
where keyword is a normal identifier and default_value consists of one or more Bracket-balanced token sequences. Keyword parameter definitions are separated by commas. The keyword parameter list ends at a semicolon or the end of the source code line.

When a program uses an opcode macro that was defined with one or more keyword parameters, YAA checks the operand list for keyword arguments. A keyword argument has the form

          keyword=>>value
where keyword is the same as in one of the keyword parameters specified in the macro definition. If there is a keyword argument matching a particular keyword parameter, the keyword is replaced by the specified argument value wherever the keyword appears in the body of the macro. If there is no such keyword argument, the keyword is replaced by the default_value given when the macro was defined.

As an example, suppose you define

          GOSUB:  .macro  location,modifier,SIZE=>>30
                  tsx1    location,modifier
                  .space  SIZE
                  .endmacro
If you call this macro with
                 GOSUB   rtn,x3,SIZE=>>20
the macro is expanded to
                 tsx1    rtn,x3
                 .space  20
If you omit the keyword argument, as in
                 GOSUB   rtn,x3
you get the default value of SIZE as specified in the macro definition:
                 tsx1    rtn,x3
                 .space  30
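
The way keyword arguments override defaults can be sketched in Python. The helper bind_arguments is hypothetical, and the keyword operator is spelled exactly as it appears in the examples of this manual. The sketch does not enforce the ordering rule between normal and keyword arguments; it only shows how each parameter ends up with a value.

```python
def bind_arguments(positional, keywords, arguments):
    """positional: parameter names in order.
    keywords: dict mapping keyword -> default text.
    arguments: argument texts from the macro call."""
    binding = {name: "" for name in positional}
    binding.update(keywords)          # start from the default values
    position = 0
    for argument in arguments:
        key, separator, value = argument.partition("=>>")
        if separator and key.strip() in keywords:
            binding[key.strip()] = value.strip()
        else:
            if position >= len(positional):
                raise ValueError("too many positional arguments")
            binding[positional[position]] = argument.strip()
            position += 1
    return binding

params = ["location", "modifier"]
defaults = {"SIZE": "30"}
print(bind_arguments(params, defaults, ["rtn", "x3", "SIZE=>>20"]))
print(bind_arguments(params, defaults, ["rtn", "x3"]))
```

The first call binds SIZE to 20; the second leaves it at the default of 30.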

In a call to an opcode macro that has both keyword and normal parameters, the normal arguments must precede the keyword ones. The normal arguments must appear in the same order as the corresponding parameters in the macro definition, but the keyword arguments may be given in any order.

8.6.2 Deleting Manifests and Macros: .undefine

Use:

          .undefine name

Where:

name
is the name of any symbol, generally a macro or manifest.

Description:

The .undefine pseudo-op gets rid of the current definition of the symbol with the given name. For example,

          .define   SIZE,10
             ...
          .undefine SIZE
gets rid of the SIZE manifest.

Once you have "undefined" a macro or manifest, the name no longer has its special meaning. It is interpreted as a normal identifier for the rest of the program.

You can use .undefine to discard any symbol or variable. However, there are some situations in which this gets you in trouble. If YAA encounters a forward reference to an unknown symbol on the first pass through the source code, YAA simply copies that reference to the intermediate working file. The assumption is that the symbol will be defined later in the source, so that the forward reference can be resolved on the second pass. However, if you define the symbol then .undefine it again, the symbol is not defined on the second pass either and you get an error. The result is that you cannot .undefine anything that is referenced through forward references.

You must have a separate .undefine statement for each symbol you undefine.

8.6.3 Labels on Macro Calls: .label

Use:

          macro_name:  .macro  parm,parm,...
                       .label  placeholder
                        #statements
          placeholder:  #statement

Where:

placeholder
is any valid YAA identifier.

Description:

When a statement containing an opcode macro call has a label field, the label is assigned the value of the instruction counter before the beginning of the macro expansion. At times, however, you may prefer to have the statement label associated with a statement inside the macro expansion (i.e. not the first statement of the expansion). To do this, use a statement of the form

          .label placeholder_name
where placeholder_name is a normal YAA identifier. If the .label statement itself has a label, the label must be the name of the macro, as given on the .macro statement.

The .label pseudo-op states that the given placeholder_name will stand for any label that is supplied when the macro is called. When the macro is expanded, any appearance of the placeholder name within the macro body will be replaced by the macro call's statement label field, expressed as a token sequence. For example, suppose you have

          f:     .macro
                 .label LNAME
                 statement1
          LNAME: statement2
                 .endmacro
If you call this macro with the statement
          here:  f
the macro is expanded to
                   statement1
          [here]:  statement2
In other words, the appearance of LNAME is replaced by the label here that was specified when the macro was called. Notice that the label is converted to the token sequence
          [here]
even though it was only specified as a symbol name (without the square brackets).

If the label field of a macro statement contains a list of labels, the placeholder given on the .label statement is associated with the entire list. Using our above example,

          [A,B,C]: f
expands to
                   statement1
          [A,B,C]: statement2

The placeholder name specified in a .label statement does not have to be used in a label field inside the macro. It can be used anywhere, e.g. in an operand field, as in

          tra  LNAME

If a macro definition contains a .label statement, but the macro call does not have a label, the placeholder name is associated with an empty token sequence. In the above example, if you just use f without a label, the macro expands to

               statement1
          []:  statement2
This is a syntax error, since a statement can't have this kind of null label.
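
The three cases above (one label, a list of labels, no label) can be reproduced with a short Python model. The function expand_with_label is hypothetical and handles only the placeholder replacement, treating the label field as a list of names that is rendered as a bracketed token sequence.

```python
import re

IDENT = re.compile(r"[A-Za-z_$.][A-Za-z0-9_$.]*")

def expand_with_label(body, placeholder, labels):
    """labels: the names in the call's label field (possibly empty).
    Every occurrence of the placeholder becomes the token sequence [..]."""
    sequence = "[" + ",".join(labels) + "]"
    return [IDENT.sub(lambda m: sequence if m.group(0) == placeholder
                      else m.group(0), line)
            for line in body]

body = ["statement1", "LNAME: statement2"]
print(expand_with_label(body, "LNAME", ["here"]))           # [here]: statement2
print(expand_with_label(body, "LNAME", ["A", "B", "C"]))    # [A,B,C]: statement2
print(expand_with_label(body, "LNAME", []))                 # []: statement2
```

The last result shows the null label that YAA rejects as a syntax error.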

8.6.4 Local Variables: .local

Use:

          macro_name:   .macro   parm,parm,...
                        .local   var1,var2,...
             # statements
                        .endmacro

Where:

var1,var2,...
are the names of the local variables you want to declare.

Description:

The .local pseudo-op declares a set of variables which are local to a macro definition. If the .local statement has a label, the label must be the same as the name of the macro being defined.

The variables declared with .local are like normal YAA variables, except that they can only be used within the macro definition itself and they disappear at the end of each invocation of the macro. Using local variables can avoid several problems that might arise if you used normal .var variables in the macro:

Local variables avoid these problems. The following macro provides a simple example of the use of such variables.

          dbl_word: .macro  A,B
                    .local  MAX,MIN
          MAX:      .set    ((A)>(B))?(A):(B)
          MIN:      .set    ((A)<(B))?(A):(B)
                    .data   MAX
                    .data   MIN
                    .endmacro

This macro takes two arguments. The local variable MAX is set to the greater of the two arguments and the local variable MIN is set to the lesser. The macro then generates two .data statements with the first having the greater value and the second the lesser. Notice that we parenthesized every appearance of A and B to avoid problems if the arguments are expressions.

The .local statement should appear after the .macro statement that begins the macro definition, and before the first use of any of the local variables. Generally, .local statements should appear immediately after the .macro statement so that their declarations can be easily found.

A macro definition may have as many local variables and as many .local statements as required.

Local Variables in Nested Macro Definitions

A macro definition may contain the definition of another macro. This process is called "nesting" macro definitions.

Any .local and .label statements are associated with the macro that began at the most recent .macro statement. For example, in

          f:  .macro  A,B,C
              .local  X,Y,Z
              .label  LNAME
              ...
          g:  .macro  D,E,F
              .local  R,S,T
              .label  LNAME2
               ...
              .endmacro     # end of g
               ...
              .endmacro     # end of f
the variables X, Y, and Z are local to f, while R, S, and T are local to g. X, Y, and Z can be used inside both f and g; R, S, and T can only be used inside g.

Local Macros

In the above example, the macro f could only be called successfully once. If f were expanded a second time, YAA would try to redefine g and this would conflict with the definition of g given the first time f was called. Of course, the program could use .if to skip the definition of g on subsequent calls to f, but this can clutter up your code.

An alternative approach would be to declare g as local to f. This could be done with

          f:   .macro
               .local g
               g:  .macro
                  ...
                   .endmacro
                ...
               g
                ...
               .endmacro

In this instance, g may only be used inside the definition of f. Each time f is invoked, g is defined from scratch. g can be used by f; at the end of the expansion of f, however, YAA behaves as if g has been automatically undefined. In this way, f may be called as often as you like, and the definitions of g will not conflict with each other.

Defining a macro inside a macro can use up a lot of memory, since a new internal macro is defined every time the external one is used. For this reason, the technique should only be used in circumstances where no other method of implementation will work.

8.6.5 General Use of Backslash as an Escape

In Chapter 3, we showed how the backslash could be used to tell YAA to ignore a new-line character. In general, putting a backslash before a code token tells YAA to treat that token literally, without any special meaning it may usually have. This applies to keywords, manifests, text and opcode macros, and macro parameters. For example, in

          .define one,1
          lda one   # becomes lda 1
          lda \one  # remains lda one
the backslash in front of the manifest tells YAA not to replace it with the associated text.

As an example, suppose you have an opcode macro named fred and it takes a keyword argument SIZE. Suppose also that you want to define a macro named george that also has a keyword SIZE. Finally, suppose that george does some work, then calls fred with the passed arguments. If you wrote

          george:  .macro  SIZE=>>1
                   blah:    #stuff here
                            fred  SIZE=>>SIZE
                   .endmacro
the call to fred would be changed to
          fred  1=>>1
(or whatever the value of SIZE turned out to be). Obviously, this is not what you want (it's a syntax error). Instead, you must write
          george:  .macro  SIZE=>>1
                   blah:    #stuff here
                            fred  \SIZE=>>SIZE
                   .endmacro
Now the call to fred will be expanded to
          fred  SIZE=>>1
(or whatever the value of SIZE is).

Because the backslash has this special meaning, you must type two of them if you want an actual backslash character (e.g. in a string literal).
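
The escape rule can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not YAA's own implementation; the function name expand_tokens and the pre-split token list are assumptions made for illustration.

```python
def expand_tokens(tokens, bindings):
    """Replace names with their bound text unless escaped by a backslash.

    tokens   -- list of token strings (hypothetical, already split)
    bindings -- dict mapping parameter or manifest names to replacement text
    """
    out = []
    for tok in tokens:
        if tok.startswith("\\"):
            # Escaped token: drop the backslash and keep the token literally.
            out.append(tok[1:])
        else:
            # Unescaped token: substitute if it is a known name.
            out.append(bindings.get(tok, tok))
    return out


# Tokens of the body line "fred \SIZE=>>SIZE" with SIZE bound to 1.
body = ["fred", "\\SIZE", "=>>", "SIZE"]
expanded = expand_tokens(body, {"SIZE": "1"})
print(" ".join(expanded))
```

In this model the escaped SIZE survives as a keyword name while the unescaped SIZE is replaced by its value, which mirrors the george and fred example.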

8.7 Code Manipulation

YAA provides a number of pseudo-ops for manipulating the code produced by the assembler. These perform such operations as obtaining source code from another file, skipping sections of source code, and repeating sections of source code.

8.7.1 Source File Inclusion: .include

Use:

          .include filename

Where:

filename
is a string expression giving the name of the file you want to include.

Description:

The .include statement obtains source code from another file. YAA replaces the .include statement with the contents of the given file. For example,

          .include "user/cat/file"
obtains the contents of the specified file. Presumably this file contains YAA source code, e.g. macro and symbol definitions that are shared by many different source modules.

YAA remembers each string expression filename specified in an .include instruction. If an .include instruction later in the source code contains the same string expression as an earlier .include, the second inclusion will not take place. This lets you avoid including the same file twice. YAA only compares the string expressions; it does not compare actual file names resulting from those expressions. Thus

          .include "jdortmunder/incfile"
          .include "/incfile"
would both be executed, even if the two statements happen to refer to the same file.

No message is printed when YAA ignores a .include statement.
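
The duplicate check can be pictured with a small Python model. It is a sketch under the assumption that YAA keeps a simple set of the strings it has seen; the class and method names are hypothetical.

```python
class IncludeTracker:
    """Remember the literal strings given to .include (model only)."""

    def __init__(self):
        self.seen = set()

    def should_include(self, name_expr):
        # The comparison is on the string itself, not on the file it names.
        if name_expr in self.seen:
            return False  # skipped silently, no message
        self.seen.add(name_expr)
        return True


tracker = IncludeTracker()
results = [
    tracker.should_include("jdortmunder/incfile"),  # first time: included
    tracker.should_include("/incfile"),             # different string: included
    tracker.should_include("jdortmunder/incfile"),  # same string: skipped
]
print(results)
```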

If YAA cannot find the file that you want to include, YAA normally aborts the assembly. However, you can avoid this by putting a label on the .include statement, as in

          X: .include "file"

In this case, the label is treated like a variable. The variable is assigned a value of 0 if the file is read successfully, and a non-zero value if the file cannot be read. (For C programmers, the non-zero value is the same value that would be assigned to "errno" in this situation.) With this format, YAA does not abort the assembly if the file cannot be read. This gives the source code a chance to examine the value of the label X and to decide what should be done next.

If YAA reads an include file successfully, YAA completely processes the file before going on to the statement that follows the .include statement. Thus if you test the label X and it indicates a successful read, the include file has already been included at that point.
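
For readers more familiar with high-level languages, the labelled form behaves roughly like the following Python sketch. The function name and the path are hypothetical, and the mapping to errno values follows the remark for C programmers above rather than any knowledge of YAA's internals.

```python
import errno


def include_with_label(path):
    """Return (status, text): status is 0 on success, else an errno value."""
    try:
        with open(path, encoding="utf-8") as handle:
            return 0, handle.read()
    except OSError as error:
        # The assembly is not aborted; the caller inspects the status instead.
        return error.errno, None


status, text = include_with_label("no/such/catalog/file")
if status != 0:
    print("include failed with status", status)
```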

Inclusion File Search Rules

An .include statement may give either a relative or an absolute pathname for a file. If the given pathname is relative, as in

          .include "myfile"
YAA must look for the file under some catalog.

When you assemble YAA source code, you may specify

          Include=catalog
options on the command line. These options indicate catalogs that YAA should search through when attempting to find a relative .include file.

8.7.2 Setting Additional Inclusion File Search Rules: .search

Use:

          .search catname

Where:

catname
is a string expression giving the name of a catalog where YAA should search for .include files.

Description:

The .search pseudo-op lets you specify catalogs where YAA should search for files named in .include statements. For example, with

          .search "user/cat"
          .include "file"
YAA tries to satisfy the .include statement by looking for user/cat/file. More specifically, YAA searches through catalogs in this order:
  1. Catalogs named on the YAA command line, in Include= options.
  2. Catalogs named in .search statements within the source code.
  3. The current catalog.

You can have any number of .search statements in your code. .search directives are particularly useful inside initialization files, where they can set up search rules for all of the .include statements in a program.
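
The search order can be sketched in Python as follows. This is an illustrative model only: the catalog names are made up, and it assumes that catalogs within each group are tried in the order they were given, which the text above does not state. Absolute pathnames are not handled.

```python
def candidate_paths(filename, include_options, search_statements, current_catalog):
    """Return the pathnames tried for a relative .include, in search order."""
    catalogs = (
        list(include_options)      # 1. Include= options on the command line
        + list(search_statements)  # 2. .search statements in the source
        + [current_catalog]        # 3. the current catalog
    )
    return [f"{catalog}/{filename}" for catalog in catalogs]


paths = candidate_paths("file", ["proj/inc"], ["user/cat"], "me/work")
print(paths)
```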

8.7.3 Conditional Code Inclusion: .if

Use:

          .if int_expression1
              # first set of statements
          .elseif int_expression2
              # second set of statements
          .elseif int_expression3
              # third set of statements
                   ...
          .else
              # final set of statements
          .endif

Where:

int_expression1, int_expression2, int_expression3
are any integer expressions.

Description:

The .if statement tells YAA to skip zero or more source code statements if a certain condition is false. The simplest way to use .if is with a block of code of the form

          .if int_expression
               # statements here
          .endif
When YAA encounters an .if statement, it evaluates the given integer expression. If the value of the expression is non-zero, YAA proceeds normally. If, however, the value of the expression is zero (false), YAA skips all the statements that follow until it finds the .endif statement. YAA begins assembling statements again following the .endif. None of the statements between the .if and the .endif cause any actions. YAA just skips through them looking for the .endif.

A second way to use .if is with a block of code of the form

          .if int_expression
               # first set of statements
          .else
               # second set of statements
          .endif
Again, YAA evaluates the integer expression. If the value of the expression is non-zero, YAA assembles the first set of statements, but skips past the second set (between the .else and the .endif). If the value of the expression is zero, YAA skips the first set of statements, and assembles the second set.

The final way to use .if is with a block of code of the form

          .if int_expression1
              # first set of statements
          .elseif int_expression2
              # second set of statements
          .elseif int_expression3
              # third set of statements
                   ...
          .else
              # final set of statements
          .endif
If the first integer expression is non-zero, YAA assembles the first set of statements and skips the rest. If the first expression is zero but the second is non-zero, YAA assembles the second set of statements and skips the other sets. If all of the integer expressions on the .if and .elseif statements turn out to be zero, YAA assembles the final set of statements following the .else.
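
The selection rule amounts to "the first non-zero condition wins". A minimal Python sketch of that rule, with a hypothetical function name, might look like this:

```python
def select_branch(conditions, has_else):
    """Return the index of the statement set that is assembled, or None.

    conditions -- values of the .if and .elseif expressions, in order
    has_else   -- True if the block has an .else part
    """
    for index, value in enumerate(conditions):
        if value != 0:
            return index  # first non-zero condition selects its statements
    # Every condition was zero: only the .else part (if any) is assembled.
    return len(conditions) if has_else else None


print(select_branch([0, 5, 1], has_else=True))
```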

The .exists expression is particularly useful in .if statements. For example,

          .if .exists([DEBUG])
                ...
          .endif
assembles the contained code if a symbol named DEBUG has been defined. Such code could contain instructions helpful during the debugging process. If you put
          DEBUG:  .var   1
at the beginning of your program, YAA assembles the debugging code after the .if. Once the program has been debugged, you can remove the definition and YAA skips the debugging code.

The condition in an .if statement must be an expression that can be evaluated on the first pass through the code. For example, suppose A and B are location expressions. You can use

          .if (A == A)
since YAA can immediately tell that this is true. However, you cannot use
          .if (A == B)
unless YAA can immediately tell if A and B refer to the same memory location (e.g. A and B are defined at the same offset from the beginning of a section that has already been defined).

8.7.4 The .while Loop

Use:

          .while condition
               #statements
          .endwhile

Where:

condition
is an integer expression. A non-zero value is considered "true", and a zero value is considered "false".

Description:

The .while loop repeats the assembly of a block of statements. It is similar to a "while" loop in higher level programming languages, except that it works on the text of the program.

To execute a .while loop, YAA first evaluates the condition of the loop as an integer expression. If the condition expression is non-zero, the statements between the .while and .endwhile are assembled. Once they have been assembled, YAA returns to the .while statement, and evaluates the condition again to see if it is still non-zero. The statements are repeatedly assembled until the condition expression is found to be zero. If the condition has the value zero the first time the .while statement is encountered, YAA does not assemble the enclosed statements at all.

As a simple example, consider

          I:  .var   4
              .while I>>0
                    .data I
                    I: .set I-1
              .endwhile
The .while loop will produce the statements
          .data 4
          .data 3
          .data 2
          .data 1

One .while loop may be nested inside another .while loop. Each .while statement must have a corresponding .endwhile statement. An .endwhile statement is always associated with the most recent .while statement.
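
The evaluation order of a .while loop can be modelled in Python. The sketch below replays the example with I starting at 4; the helper names are hypothetical, and the loop test is modelled as "I is greater than zero", which is one reading of the example's condition.

```python
def expand_while(variables, condition, body):
    """Assemble body repeatedly while condition(variables) is non-zero."""
    lines = []
    while condition(variables) != 0:
        body(variables, lines)
    return lines


def condition(variables):
    # Model of the loop test: non-zero (true) while I is greater than zero.
    return int(variables["I"] > 0)


def body(variables, lines):
    lines.append(f".data {variables['I']}")  # .data I
    variables["I"] -= 1                      # I: .set I-1


generated = expand_while({"I": 4}, condition, body)
print("\n".join(generated))
```

If the variable starts at zero, the condition is zero on the first test and the model generates nothing, matching the rule that the enclosed statements are then not assembled at all.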

8.7.5 The Foreach Construct

Use:

          name:  .foreach tokseq
             #statements
          name:  .endforeach

Where:

tokseq
is a token sequence expression.
name
is the symbol name used to stand for each token of the given token sequence, as .foreach repeats the block.

Description:

The easiest way to understand the .foreach construct is to begin with an example.

          x: .foreach [du,dl,qu]
             lda 1,x
          x: .endforeach
generates the following instructions.
          lda 1,du
          lda 1,dl
          lda 1,qu
In other words, the statement that follows the .foreach is repeated once for each item in the token sequence on the .foreach line. On each repetition, the symbol x (the label on the .foreach line) is replaced by the token sequence item wherever x appears.

The name used to label a .foreach statement is used as a variable inside the .foreach loop. If this hasn't already been declared as a variable, it is automatically created as such.

The enclosed statements are repeated once for each item in the token sequence. Each time through these statements, the YAA variable name is replaced by the appropriate item from the token sequence. The value of the variable is a token sequence containing exactly one token from the token sequence specified on the .foreach statement.

Individual items in the token sequence must be bracket-balanced and separated with commas. The .foreach statement must have a label, and this label must be able to serve as a variable. The .endforeach statement must have the same label.
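
The splitting and substitution steps can be approximated in Python. This is a rough model, not YAA's algorithm: it assumes that "bracket-balanced" covers round and square brackets, it works on raw text rather than real tokens, and it uses a word-boundary match that does not know about '.' or '$' in identifiers.

```python
import re


def split_items(tokseq):
    """Split the text inside [...] at top-level commas."""
    inner = tokseq.strip()[1:-1]
    items, depth, current = [], 0, ""
    for ch in inner:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        if ch == "," and depth == 0:
            items.append(current.strip())  # a top-level comma ends an item
            current = ""
        else:
            current += ch
    items.append(current.strip())
    return items


def expand_foreach(name, tokseq, body_lines):
    """Repeat body_lines once per item, replacing name with the item."""
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    out = []
    for item in split_items(tokseq):
        for line in body_lines:
            out.append(pattern.sub(lambda match: item, line))
    return out


generated = expand_foreach("x", "[du,dl,qu]", ["lda 1,x"])
print("\n".join(generated))
```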

As another example of .foreach, consider the following macro definition.

          list: .macro
                .label x
                y:     .foreach x
                          .data .quote( y )
                y:     .endforeach
                .endmacro
A statement like
          [A,B,C]: list
would expand to
          y:  .foreach [A,B,C]
                  .data .quote( y )
          y:  .endforeach
which in turn becomes
          .data .quote([A])
          .data .quote([B])
          .data .quote([C])

8.7.6 Nesting Loops and Other Constructs

YAA allows .if, .while, and .foreach constructs to be nested with each other and with opcode macro definitions. There are several rules that control this nesting.

A nestable block is a set of statements beginning with an .if and ending at the corresponding .endif, or beginning at a .while and ending at the corresponding .endwhile, or beginning at a .foreach and ending at the corresponding .endforeach, or beginning at a .macro and ending at the corresponding .endmacro. The first and last statements of a nestable block may be given the same label, as in

          A:  .if  something
              ...
          A:  .endif

If both the first and last statements of a nestable block have labels, and the labels are not the same, an error message will be printed. In this way, you can use labels in your source code to show where each nestable block begins and ends.

One nestable block can be contained in another nestable block. For example, an opcode macro definition can contain an .if-.endif block, or vice versa. However, you cannot have two blocks partly overlap, as in

          A:  .if  something
          B:  .macro
              .endif
              .endmacro
One block must be wholly contained by the other.

.label and .local statements must be inside an opcode macro definition, but cannot be inside .if or .while nestable blocks. For example, you cannot say

          M:   .macro
               .if  something
               .local  X,Y,Z
               ...

When an .if or .while block appears in a macro definition, the blocks are not evaluated until the macro is expanded. For example, in

          M:   .macro
               .while A>>10 ...

YAA does not check the value of A at the time the macro is defined. The .while statement is just taken literally. The .while statement is not executed until the macro is actually used, at which time YAA will check the current value of A.

The last statement of a nestable block may use the "generic" pseudo-op .end instead of .endmacro, .endif, .endwhile, or .endforeach. For example, you might have

          A:   .macro
                   .if something
                       .while something
                             ...
                       .end
                   .end
               .end

An .end statement at the end of a nestable block may have the same label as the beginning of the nestable block, as in

          A:  .macro
              ...
          A:  .end

We recommend against using the generic .end statement, because it can make debugging very difficult. If you use .endmacro, .endif, and so on, the assembler itself will detect nesting errors for you.
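
The nesting rules lend themselves to a stack-based check. The Python sketch below is a simplified model with hypothetical names; it looks only at labels and pseudo-op names, ignores .else and .elseif, and is not a description of how YAA itself is implemented.

```python
OPENERS = {".if": ".endif", ".while": ".endwhile",
           ".foreach": ".endforeach", ".macro": ".endmacro"}
CLOSERS = set(OPENERS.values()) | {".end"}


def check_nesting(statements):
    """statements: list of (label_or_None, pseudo_op). Returns error strings."""
    errors = []
    stack = []  # (label, opener) for each block that is still open
    for number, (label, op) in enumerate(statements, start=1):
        if op in OPENERS:
            stack.append((label, op))
        elif op in CLOSERS:
            if not stack:
                errors.append(f"statement {number}: {op} without an open block")
                continue
            open_label, opener = stack.pop()
            # A specific closer must match the most recent opener.
            if op != ".end" and op != OPENERS[opener]:
                errors.append(f"statement {number}: {op} closes {opener}")
            # If both ends carry labels, the labels must agree.
            if label and open_label and label != open_label:
                errors.append(
                    f"statement {number}: label {label} does not match {open_label}")
    for open_label, opener in stack:
        errors.append(f"{opener} is never closed")
    return errors


overlap = [("A", ".if"), ("B", ".macro"), (None, ".endif"), (None, ".endmacro")]
generic = [("A", ".if"), ("B", ".macro"), (None, ".end"), (None, ".end")]
print(check_nesting(overlap))
print(check_nesting(generic))
```

In this model the overlapping blocks are reported when the specific closing statements are used, but the same mistake passes unnoticed when both are written as .end. That is the kind of error the recommendation above is meant to avoid.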

8.8 External Symbols

There are several pseudo-ops dealing with external symbols: .symdef creates symbol definitions, .symref creates symbol references, and .options specifies loader options for a list of symbols.

8.8.1 Declaring External Symbols: .symdef

Use:

          [name,name,...]:  .symdef   options

Where:

name,name,...
are the symbol names you want to declare as external.
options
are keywords indicating options for the declaration. At present, the only recognized option keyword is secondary, indicating that the symbols being declared are secondary SYMDEFs.

Description:

A .symdef statement indicates that the names given in the label field are SYMDEFs. The location of each defined symbol will be established by other statements in the program.

If a symbol is a SYMDEF, an appropriate .symdef statement may appear anywhere in the program. However, good programming style suggests that .symdef statements should appear either at the beginning of the program, or shortly before the first use of the external symbol.

A symbol may be mentioned in more than one .symdef and/or .symref statement in the same program.

8.8.2 Referencing External Symbols: .symref

Use:

          [name,name,...]:  .symref   options

Where:

name,name,...
are the external symbol names you want to reference.
options
are keywords indicating options that apply to all the referenced symbols. See below for a list of the recognized options.

Description:

A .symref statement creates SYMREFs for the names given in the label field. The possible options are

secondary
indicating that these are secondary SYMREFs.
bit
indicating that the SYMREFs have the bit relocation type.
byte
indicating that the SYMREFs have the byte relocation type.
word
indicating that the SYMREFs have the word relocation type.

If a symbol is a SYMREF, an appropriate .symref statement must appear before the first use of that symbol. Beyond that requirement, good programming style suggests that .symref statements should appear either at the beginning of the program, or shortly before the first use of the external symbol.

A symbol may be mentioned in more than one .symdef and/or .symref statement in the same program.

If a relocation type is not specified for a SYMREF, word is assumed.

8.8.3 Loader Options for SYMDEFs: .options

Use:

          [name,name,...]:  .options  opts

Where:

name,name,...
are the names of one or more symbols that are declared as SYMDEFs with .symdef somewhere in the program. If you do not specify a label field, the options are applied to the current section.
opts
is an immediate string expression giving a list of options to be passed to the LD link editor.

Description:

The .options pseudo-op specifies loader options to be applied when the assembled code is link-edited by LD. The label field gives the names of one or more symbols that are declared as SYMDEFs with .symdef statements somewhere in the program. The .options statement may come before or after the .symdef statement(s). The opts string expression gives a string that contains the options to be passed to the loader. For example,

          .setu.:  .symdef
          .setu.:  .options  "+entdef"
declares .setu. to be a SYMDEF and associates the +entdef LD option with the symbol. A list of possible options is given in Appendix D of the LD Reference Manual.

The .options directive may be applied to SYMREFs and local sections. However, this has limited use on this system.

8.9 Section Definitions: .section

Use:

          name:     .section   options

Where:

name
is the name of the section.
options
are any options for the section. See below for recognized options.

Description:

A .section statement marks the beginning of a section in YAA code. The name of the section is given as the statement label. This name is passed on to the link editor.

If you omit the section name, YAA begins an unnamed section. Such sections are treated like named sections, but cannot be referenced by external object modules.

The operand field of the .section statement provides additional information describing the section.

Data and Code Sections:

If the keyword argument data appears, the section is assumed to be a data segment (for the purposes of .align). If the keyword argument code appears, the section is assumed to be a code segment. The default is code.

Common Sections:

Common sections (as in Fortran) can be created by specifying the common option, as in

          BLOCK:  .section  common,data
Such sections must be given a name by specifying a label on the .section instruction.

Indicating a Section's Parent:

The option

          parent=>>name
gives the name of the section's parent. This is the name of the section that immediately contains the section being created. The parent name must evaluate to a relocatable expression. This can be either a SYMREF or a location defined in some section of the program. If the location is not the name of an actual section, the parent section is the section that contains the given location.

The parent option can be omitted. In this case, the location of the section is determined by the linker.

Section Alignment:

The option

          align=>>number
lets you specify the alignment of a section. The number gives the alignment as a number of bits. For example,
          X:   .section   align=>>36
aligns the section on a full word boundary. If an alignment is not specified, the default is the offset mode of the section.
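
As a worked illustration of alignment in bits, the following Python sketch rounds a bit offset up to the next multiple of the alignment. It assumes that aligning a section means placing its start at such a multiple; the function name is hypothetical and the actual placement is done by the link editor.

```python
def align_up(bit_offset, align_bits):
    """Round bit_offset up to the next multiple of align_bits."""
    if align_bits <= 0:
        raise ValueError("alignment must be a positive number of bits")
    # Ceiling division, then scale back up to a bit offset.
    return -(-bit_offset // align_bits) * align_bits


# With align=>>36, offsets are rounded up to a 36-bit word boundary.
for offset in (0, 1, 36, 37):
    print(offset, "->", align_up(offset, 36))
```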

Section Origin:

By default, the .section pseudo-op sets the instruction counter to zero. Thus all instruction offset values for the code that follows will be relative to the start of the section at offset zero. However, you can specify

          origin=>>expression
in the operand field of .section. In this case, the IC is set to the value of the given expression. The values of .lowest() and .highest() will also have this value (although .highest() will change as soon as you assemble code in the section). The value of the origin cannot be negative; it can be positive or zero.

Automatic Label Declaration:

In some types of section, you want to create a SYMDEF or secondary SYMDEF for each label defined in the section. To do this, specify one of the options

          label=>>symdef
          label=>>secondary
for the .section pseudo-op. If you specify the first, the assembler automatically generates a SYMDEF for every label in the section. Similarly, if you specify the second, the assembler automatically generates a secondary SYMDEF for every label in the section.

The default is

          label=>>local
In this case, the labels are considered local to the section, unless you explicitly use .symdef or .secondary pseudo-ops to create appropriate SYMDEFs.

Offset Mode:

The .section pseudo-op can also specify one of the keywords bit, byte, or word to indicate the beginning offset mode for the section. For example,

          BYTE_SECT: .section  byte
gives the byte offset mode to the section. If no offset mode keyword is specified, the default is word.

Notes:

If a section is to be identified with a SYMDEF, the .symdef statement should precede the .section statement. If the .section comes first, the section will have no external name; instead, the specified SYMDEF is created at offset 0 in the (unnamed) section. While this has almost the same effect as naming the section itself, it is usually not what you want.

8.9.1 Changing Offset Mode: .usage

Use:

          .usage bit
          .usage byte
          .usage word

Description:

The .usage pseudo-op changes the offset mode of the current section to the given type. A .usage statement cannot have a label.

You may specify the offset mode of a section by specifying an alignment as part of a .usage pseudo-op, as in

          .usage align=>>36
The alignment is given as a number of bits.

Automatic Label Declaration:

In some types of section, you want to create a SYMDEF or secondary SYMDEF for each label defined in the section. To do this, you may specify one of the options

          label=>>symdef
          label=>>secondary
on the .usage pseudo-op. If you specify the first, the assembler automatically generates a SYMDEF for every label in the section. Similarly, if you specify the second, the assembler automatically generates a secondary SYMDEF for every label in the section.

The default is

          label=>>local
In this case, the labels are considered local to the section, unless you explicitly use .symdef or .secondary pseudo-ops to create appropriate SYMDEFs.

8.9.2 Code Origins

Use:

          .origin  where

Where:

where
is a relocatable expression whose value was previously defined in the assembly. It cannot be a SYMREF. The value of this operand must resolve to a single linkable symbol and a single offset from the symbol.

Description:

The .origin instruction moves to the section that contains the location given by the where operand and sets the instruction counter to the value it had at that location.

The .origin statement marks the beginning of a block of code that extends to the next .section or .origin statement (if there is one). The .origin statement tells YAA that this block of code should be placed beginning at the relocatable position given by the specified relocatable expression. For example,

          X:  .section
            # statements
              .origin  X
            # code
states that the given code should be positioned at the beginning of the section called X. This overwrites any generated code at the beginning of X, with undefined results. In practice, .origin is usually used to set an origin to a location where no code has yet been created.

It is common to set origins at offsets from the current IC value, as in

          .origin  *+10
By default, YAA assumes the offset has the units indicated by the section's offset mode. For example, if the section has the word offset mode, the above instruction moves ahead ten words. Explicit offset modes may also be specified, as in
          .origin  byte::( .ic(byte::*) + 2 )
which sets the origin ahead by two bytes. Note that we had to use
          .ic(byte::*)
to obtain the current IC as a byte offset, and use byte:: as the first part of the operand field to indicate that we were setting a byte origin.

If you leave a section and come back with .origin, the section has the offset mode that it had when you left. This is true, no matter where you set the origin within that section.

8.9.3 Templates

Use:

          name:   .template   options

Where:

name
is the name of the template.
options
are any options for the template. See below for recognized options.

Description:

The .template statement is similar to the .section statement. It marks the beginning of a section and by default, it sets the IC to zero. However, a template is different from a normal section in several respects.

Template sections are intended to be useful when defining the layout of data structures. The value of the instruction counter gives an absolute offset from the beginning of a structure.

Offset Mode:

The statement

          .template offset_mode
begins a template section and sets the offset mode for the template. For example,
          .template  byte
begins a template with the byte offset mode. The offset mode of a template can be changed with .usage. If no offset mode is specified, the default is word.

Setting an Origin:

As with sections, you may use .origin to change your origin to an existing template. Thus you can build templates in stages, as in

          A:  .template
          B:  .data 0
          x:  .template
          y:  .data 0
              .origin  .ic(a)  # back to A
          C:  .data 0          # third position in A

Unlike sections, you may use .origin to set your current location to a negative offset from the beginning of the template. For example,

          x:  .template
              .origin *-10
          y:  .data   0
defines x at location 0 of the template and y at location -10. You may not use .origin to move to a negative offset inside a section.

Negative offsets may also be set using the origin=>>expression option. This is similar to the option for .section, except that the given expression may have a negative value as well as a positive one or zero.

Automatic Label Declaration:

In some types of template, you want to create a SYMDEF or secondary SYMDEF for each label defined in the template. To do this, specify one of the options

          label=>>symdef
          label=>>secondary
for the .template pseudo-op. If you specify the first, the assembler automatically generates a SYMDEF for every label in the template. Similarly, if you specify the second, the assembler automatically generates a secondary SYMDEF for every label in the template.

The default is

          label=>>local
In this case, the labels are considered local to the template, unless you explicitly use .symdef or .secondary pseudo-ops to create appropriate SYMDEFs.

Lowest and Highest Addresses:

The .lowest and .highest functions may be applied to templates. Note that .lowest may return a negative number if you use .origin to define items at negative offsets. The return value of .lowest can never be greater than zero.

Template Alignment:

Default alignments within the section can be specified with align=>>number on the .template pseudo-op. The alignment is given as a number of bits.

8.9.4 Literal Pools

Use:

          .pool
          .pool  parent=>>section

Description:

The sections and templates of an assembly program form trees. A parent section may have a number of subsections as its branches and each of those can have other branches.

Each such tree has a root: a section without a parent section. Each root has an associated literal pool which can contain literals created in the sections of that tree. When the source code refers to a string literal, YAA automatically looks for a literal pool where the data of that literal can be stored. It begins this search with the current section and proceeds up the tree until it finds a section that has an associated literal pool. YAA stores the data of the string in that pool.

The .pool statement associates a literal pool with the current section. Any subsequent literals in the code for this section or any of its branches are stored in this new pool.

YAA puts literals into literal pools as soon as they are encountered in code. For example, suppose a section has the form

              code
          .pool
              more code
If the code before the .pool statement contains literals, YAA immediately looks backward through the tree to find a suitable literal pool; the literals are stored in the pool associated with some parent section. If the code after the .pool statement contains literals, YAA stores those literals in the pool associated with the current section; the current section doesn't have its own pool until YAA finds the .pool statement. As a result, you should usually put the .pool statement at the beginning of a section.
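
The effect of where .pool is placed can be seen in a small Python model of the pool search. The class and method names are hypothetical, and the model covers only the walk up the tree described above.

```python
class Section:
    """A section in the tree, with an optional literal pool (model only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # A root (no parent) always has a pool; others get one from .pool.
        self.pool = [] if parent is None else None

    def declare_pool(self):
        """Model of a .pool statement in this section."""
        if self.pool is None:
            self.pool = []

    def add_literal(self, text):
        """Store a literal in the nearest pool, searching up the tree."""
        section = self
        while section.pool is None:
            section = section.parent
        section.pool.append(text)
        return section.name


root = Section("root")
child = Section("child", parent=root)
first = child.add_literal('"before"')   # no pool in child yet
child.declare_pool()                    # the .pool statement
second = child.add_literal('"after"')   # now stored in child's own pool
print(first, second)
```

The literal met before the .pool statement ends up in the root's pool and the one met afterwards ends up in the child's pool, which is why .pool usually belongs at the beginning of a section.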

A .pool statement may take the option

          parent=>>section
where section is the name of some other section. This says that the pool should be associated with the specified section instead of the current one.

GMAP programmers should note that .pool is the exact reverse of the GMAP .LIT pseudo-op. .LIT refers to all of the literals that appear before the statement; .pool refers to all of the literals that appear after.

8.9.5 Binding Sections

As noted in the discussion of literal pools, the sections of a program form trees.

The LD link editor decides where sections are placed in memory. These sections go together to form segments.

The first thing in a segment is the root of a tree. Inside the segment, parent sections are followed by their children, in the order that .section statements for the children appeared in the code. Of course, child sections may also be parents with children of their own. As an example, here's a simple segment layout:

          Root
              Child A
                  1st child of A
                  2nd child of A
                  3rd child of A
                  Literal pool for A
              Child B
                  1st child of B
                  2nd child of B
                      1st child of 2nd child of B
                      2nd child of 2nd child of B
                  3rd child of B
              Child C
              Main literal pool
In other words, LD lays down the tree branch by branch. On each branch, sections are laid down in the order in which they are defined in the source code.

Literal pools are always laid down at the end of the relevant branch. This is shown in the above layout diagram. Child A has a literal pool of its own, so the literal pool is the last section laid down in the branch associated with Child A. Similarly, the main literal pool associated with the root is the last thing that is laid down in the segment.
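
The ordering rule amounts to a depth-first walk of the tree in which a section's literal pool is emitted after all of its children. The Python sketch below reproduces the layout diagram under that reading. The dictionary representation and the lay_down function are invented for the illustration and say nothing about how LD is actually written.

```python
def lay_down(section, depth=0, layout=None):
    """Return (depth, name) pairs in the order the sections are laid down."""
    if layout is None:
        layout = []
    layout.append((depth, section["name"]))
    for child in section.get("children", []):     # children in source order
        lay_down(child, depth + 1, layout)
    if "pool" in section:                         # pool closes the branch
        layout.append((depth + 1, section["pool"]))
    return layout


tree = {
    "name": "Root",
    "pool": "Main literal pool",
    "children": [
        {"name": "Child A", "pool": "Literal pool for A", "children": [
            {"name": "1st child of A"},
            {"name": "2nd child of A"},
            {"name": "3rd child of A"},
        ]},
        {"name": "Child B", "children": [
            {"name": "1st child of B"},
            {"name": "2nd child of B", "children": [
                {"name": "1st child of 2nd child of B"},
                {"name": "2nd child of 2nd child of B"},
            ]},
            {"name": "3rd child of B"},
        ]},
        {"name": "Child C"},
    ],
}

layout = lay_down(tree)
for depth, name in layout:
    print("    " * depth + name)
```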

LD may put several section trees into the same segment. For example, sections without parents are usually all put into the same segment. Trees are laid down in an unspecified order (effectively random). If you want to control the order in which sections are laid down in a segment, you have to use the parent mechanism.

8.10 Marking Code

YAA offers a variety of pseudo-ops that specify important facts about your source code. These facts are also written into the object files produced by YAA if the system's object file format allows the recording of this information.

8.10.1 Naming the Source Module: .title

Use:

          .title  title_string
          .title  code,title_string

Where:

title_string
is a string expression specifying a title.
code
tells what should be given this title:

0 assigns the title to the assembly as a whole. This is the default if you don't specify a code.

1 uses the title as a heading for your listing.

2 uses the title as a subheading for your listing.

3 uses the title as a sub-subheading for your listing.

Description:

The .title statement lets you specify a title (or capsule description) of your source module. For example, you could use

          .title "sqrt"
to label a square root routine. YAA places no restriction on the length of the title string, although the linker may restrict this length.

You can also use .title to specify headings, subheadings, or sub-subheadings for the listing of an assembly. In this case, the title is not applied to the source module itself; it just appears in the listing. For more about listings, see Chapter 9.

8.10.2 Naming the Object Module: .module

Use:

          .module  name

Where:

name
is a string expression.

Description:

The .module statement lets you specify a name for the object module produced as a result of the assembly. YAA places no restriction on the length of the module name, although the linker may restrict this length.

8.10.3 Revision Names or Numbers: .revision

Use:

          .revision  name

Where:

name
is a string expression.

Description:

The .revision statement lets you specify a revision name or number for the assembled source code, in situations where you may have several versions of the same code. YAA places no restriction on the length of the revision string, although the linker may restrict this length.

8.10.4 Copyright Notices: .copyright

Use:

          .copyright  notice

Where:

notice
is a string expression.

Description:

The .copyright statement lets you place a copyright notice in the assembled source code. YAA places no restriction on the length of the copyright string, although the linker may restrict this length.

8.11 Issuing Messages

YAA has several pseudo-ops that let you issue diagnostic messages.

8.11.1 Error Messages: .error

Use:

          .error  msg

Where:

msg
is a string expression giving the error message you want to display.

Description:

The .error statement causes an assembly error. The given msg string is printed as part of the diagnostic message associated with the error. For example, you might have

          .if SYMBOL=1
             # statements
          .elseif SYMBOL=2
             # statements
          .elseif SYMBOL=3
             # statements
          .else
          .error "Invalid value for SYMBOL"
          .endif

If the value of SYMBOL is not one of the recognized ones, YAA executes the .error statement.

If YAA encounters an .error statement, the assembly is marked as erroneous and does not produce a useful object file. However, YAA continues to process the source code, and may report additional errors in the assembly. If you requested a listing, YAA continues to produce the listing after .error.

8.11.2 Abort Messages: .abort

Use:

          .abort  msg

Where:

msg
is a string expression giving the message you want to display when the assembly aborts.

Description:

The .abort statement stops assembly immediately after writing out the specified diagnostic message. Because .abort makes the assembler stop immediately, YAA does not produce a listing, no matter what options you may have specified for the assembly.

8.11.3 Warning Messages: .warning

Use:

          .warning  msg

Where:

msg
is a string expression you want to display as a warning.

Description:

The .warning statement displays the msg string as part of a diagnostic message. YAA then continues assembling source code normally.

8.11.4 Other Diagnostics: .comment

Use:

                  .comment  msg
          label:  .comment  msg

Where:

msg
is a string or expression whose value you want to display as a comment.
label
is treated as a string and is not evaluated. The specified label is printed as part of the output comment.

Description:

The .comment statement displays the msg as part of a diagnostic message, on the terminal and in any listing that might be produced. YAA then continues assembling source code normally.

The format of the output message is

          filename,linenumber: label: msg
if a label is specified. Otherwise, the format is
          filename,linenumber: comment: msg

A comment is similar to a warning (generated by .warning) but simply provides information. It does not warn of a possible problem.

8.11.5 Displaying Values (First Pass): .print

Use:

                  .print  expr
          label:  .print  expr

Where:

expr
is an expression whose value you want to display.
label
is treated as a string and is not evaluated. The specified label is printed as part of the output comment.

Description:

The .print statement evaluates expr, then displays the value on the terminal and in any listing that might be produced. This evaluation takes place on the assembler's first pass.

The format of the output message is

          filename,linenumber: label: expr
if a label is specified. Otherwise, the format is
          filename,linenumber: print: expr

If the given expr has a string value, the value is displayed inside double quotes. This is different from what's done with similar statements like .warning and .comment, where string values are displayed without double quotes.

Note that .print works on the assembler's first pass through the source code. This means that assembler variables will be evaluated with whatever value they have at the time. There is also a .print2 statement which works on the second pass through the source code; at this point, assembler variables will have whatever values they had at the end of the assembler's first pass.
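
The message formats for .comment and .print can be summarized in a small Python sketch. The function names are invented for the illustration, and the way non-string values are rendered (plain str() here) is an assumption; the manual only specifies the quoting of string values.

```python
def format_comment(filename, linenumber, msg, label=None):
    """Model of a .comment message: strings appear without quotes."""
    tag = label if label is not None else "comment"
    return f"{filename},{linenumber}: {tag}: {msg}"


def format_print(filename, linenumber, value, label=None):
    """Model of a .print message: string values appear inside double quotes."""
    tag = label if label is not None else "print"
    # Assumption: non-string values are shown with Python's default formatting.
    text = '"' + value + '"' if isinstance(value, str) else str(value)
    return f"{filename},{linenumber}: {tag}: {text}"


print(format_comment("test.y", 12, "abc"))
print(format_print("test.y", 12, "abc"))
print(format_print("test.y", 13, 5, label="X"))
```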

8.11.6 Displaying Values (Second Pass): .print2

Use:

                  .print2  expr
          label:  .print2  expr

Where:

expr
is an expression whose value you want to display.
label
is treated as a string and is not evaluated. The specified label is printed as part of the output comment.

Description:

The .print2 statement evaluates expr, then displays the value on the terminal and in any listing that might be produced. Evaluation takes place on the assembler's second pass.

The format of the output message is

          filename,linenumber: label: expr
if a label is specified. Otherwise, the format is
          filename,linenumber: print: expr

If the given expr has a string value, the value is displayed inside double quotes. This is different from what's done with similar statements like .warning and .comment, where string values are displayed without double quotes.

Note that .print2 works on the assembler's second pass through the source code. This means that assembler variables will have whatever values they had at the end of the assembler's first pass. There is also a .print statement which works on the first pass through the source code; at this point, assembler variables will have whatever value they have during the pass.
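
The difference between the two statements can be seen in a toy two-pass model written in Python. This is a deliberately simplified sketch, not a description of YAA's internals: the tuple encoding of statements and the assemble function are invented, and the model assumes that variables are assigned only during the first pass.

```python
def assemble(statements):
    """Return the values shown by .print (pass 1) and .print2 (pass 2)."""
    variables = {}
    output = []

    # Pass 1: .set takes effect immediately; .print sees the value so far.
    for statement in statements:
        if statement[0] == "set":
            _, name, value = statement
            variables[name] = value
        elif statement[0] == "print":
            output.append(("pass 1", variables[statement[1]]))

    # Pass 2: variables keep the values they had at the end of pass 1.
    for statement in statements:
        if statement[0] == "print2":
            output.append(("pass 2", variables[statement[1]]))

    return output


program = [
    ("set", "N", 1),
    ("print", "N"),
    ("print2", "N"),
    ("set", "N", 2),
    ("print", "N"),
    ("print2", "N"),
]

for line in assemble(program):
    print(line)
```

In this model the two .print statements show 1 and then 2, while both .print2 statements show 2, the value N held when the first pass ended.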

8.12 Assembly Clean-Up: .wrapup

Use:

          .wrapup macroname

Where:

macroname
is the name of an opcode macro you want to invoke after all other code has been assembled.

Description:

The .wrapup statement lets you "clean up" after an assembly. After all code has been assembled, YAA invokes the specified opcode macro without arguments, in order to perform any clean-up operations you want.

A program may have any number of .wrapup statements. Macros specified in .wrapup statements are invoked in the reverse of the order in which they were specified (last in, first out).

If there are several .wrapup statements specifying the same macro name, only the first is significant. The rest are quietly ignored.

The opcode macro specified in a .wrapup statement need not be defined at the time the .wrapup statement is encountered; YAA simply records the macro's name. The macro must be defined at the time the assembly terminates. All expressions in the macro must be immediate expressions (relative to the end of the assembly).
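
The ordering rules for .wrapup can be modelled in a few lines of Python. The WrapupRegistry class is invented for this illustration; it records names only, ignores repeated names, and invokes the macros last in, first out, as described above.

```python
class WrapupRegistry:
    """Toy model of .wrapup bookkeeping (not YAA's implementation)."""

    def __init__(self):
        self._names = []

    def wrapup(self, macroname):
        # Only the first .wrapup for a given name is significant.
        if macroname not in self._names:
            self._names.append(macroname)

    def finish(self, macros):
        """Invoke the recorded macros in reverse order; return the order used."""
        invoked = []
        for name in reversed(self._names):
            if name not in macros:
                # The macro must exist by the time the assembly terminates.
                raise NameError("wrapup macro not defined: " + name)
            macros[name]()          # invoked without arguments
            invoked.append(name)
        return invoked


registry = WrapupRegistry()
for name in ["A", "B", "A", "C"]:
    registry.wrapup(name)

# The macros are defined after the .wrapup statements, which is allowed.
macros = {"A": lambda: None, "B": lambda: None, "C": lambda: None}
order = registry.finish(macros)
print(order)
```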

8.13 Hardware Inhibits: .inhibit

Use:

          oldvalue:  .inhibit  newvalue

Where:

oldvalue
is the name of a variable. YAA assigns this variable the previous value of the hardware inhibit. This label may be omitted.
newvalue
is an integer expression giving the new hardware inhibit value you want to set.

Description:

Every section has an associated inhibit value. This is set by the .inhibit pseudo-op. For example,

          .inhibit 1
sets the inhibit value of the current section to 1.

The interpretation of an inhibit value is system-dependent. For example, on the DPS-8 an inhibit value of 1 turns on the inhibit bit in a hardware instruction, while an inhibit value of 0 turns it off.

When a section is created, its default inhibit value is 0 (implying no inhibit).

If you leave a section then return, the inhibit value of the section is the same as when you last left it. To change the inhibit value, you must issue an explicit .inhibit statement.

If a .inhibit statement has a label field, the symbol in the label field is assigned the previous value of the hardware inhibit. For example,

          X:  .inhibit  1
assigns X the old inhibit value.
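
The per-section bookkeeping can be pictured with the following Python sketch. The InhibitModel class and its method names are invented for the illustration; the sketch only mirrors the rules stated above (default of 0, value retained when a section is left and re-entered, previous value returned to the label).

```python
class InhibitModel:
    """Toy model of per-section inhibit values (not YAA's implementation)."""

    def __init__(self):
        self._inhibit = {}
        self.current = None

    def section(self, name):
        # A new section starts with inhibit 0; an existing one keeps its value.
        self._inhibit.setdefault(name, 0)
        self.current = name

    def inhibit(self, newvalue):
        """Set the current section's inhibit value and return the old one."""
        oldvalue = self._inhibit[self.current]
        self._inhibit[self.current] = newvalue
        return oldvalue

    def value(self):
        return self._inhibit[self.current]


model = InhibitModel()
model.section("code")
x = model.inhibit(1)        # like  X: .inhibit 1  in a fresh section
model.section("data")
data_value = model.value()  # a different section has its own value
model.section("code")
code_value = model.value()  # unchanged since we last left the section
print(x, data_value, code_value)
```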

8.14 Naming Difficulties: .equate

Use:

          asmname:  .equate   realname

Where:

asmname
is a valid YAA symbol name that you intend to use within the source code to stand for another symbol name.
realname
is a string expression giving the real name of a symbol. Presumably, this name does not comply with YAA's rules for symbol names; thus, within the source code, you must use asmname to stand for realname.

Description:

Names on a given system may not match the rules governing names in YAA. For example, a system may allow symbols to have names that contain a "*" character, even though YAA does not.

The .equate directive gets around this problem by letting you associate the valid YAA name asmname with the invalid name realname. For example,

          _name:  .equate  "*name"
states that your YAA source code uses _name to stand for the actual symbol *name. When YAA generates linking information, it uses *name as required.

The .equate statement must appear before the first occurrence of the asmname it defines. The realname in an .equate statement may not contain '\0' characters (octal 000, the ASCII NUL).

Note that the above .equate statement is not the same as

          .define _name,*name
The .define pseudo-op works strictly textually, so _name would immediately turn into *name and an error message would be generated.

The .equate directive can also be used in cases where a symbol name might conflict with a YAA reserved word. This sort of conflict should seldom happen, because YAA's keywords are only reserved in restricted contexts.

8.15 Keyword Synonyms and Hardware Extensions: .synonym

Use:

          asmname:  .synonym  keyword

Where:

asmname
is any valid YAA name. YAA will treat this as a synonym for the given keyword.
keyword
is any YAA contextual keyword.

Description:

The .synonym pseudo-op defines asmname as a synonym for a (contextual) keyword. For example, consider:

          DU:    .synonym  du
          P.RET: .synonym  p4

The first lets you enter the keyword du in either upper or lower case. The second lets you use the symbolic name P.RET in place of p4 (for pointer register 4).

.synonym may also be used to define synonyms for normal opcodes (e.g., lda, tra), but not for YAA pseudo-ops (e.g., .data, .if).

The difference between .synonym and .define is that .synonym only does the replacement within contexts where the keyword is reserved. For example, suppose you have a label named DU. With

          DU:   .synonym  du
the label is not changed because du is only reserved in the operand field. On the other hand,
          .define  DU,du
replaces all occurrences of DU with du, regardless of where the DU occurs.
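
The contrast can be sketched in Python by treating a statement as a tuple of fields. This is a simplified model invented for the illustration: it assumes a statement has exactly four fields (label, opcode, address, tag) and that du is reserved only in the tag position of the operand.

```python
def apply_define(statement, name, replacement):
    """Textual replacement: every field that matches is rewritten."""
    return tuple(replacement if field == name else field for field in statement)


def apply_synonym(statement, name, keyword):
    """Contextual replacement: only the position where the keyword is reserved."""
    label, opcode, address, tag = statement
    if tag == name:
        tag = keyword
    return (label, opcode, address, tag)


# Models the statement   DU:  adx2  2,DU
statement = ("DU", "adx2", "2", "DU")

defined = apply_define(statement, "DU", "du")
synonym = apply_synonym(statement, "DU", "du")
print(defined)
print(synonym)
```

With the textual rule the label is rewritten along with the tag; with the contextual rule the label DU is left alone.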

Hardware Extensions

.synonym has a feature designed to support extensions to the hardware. For example, suppose that new hardware comes out with opcodes that are not currently recognized by YAA. Ultimately, YAA will be updated to recognize the new codes; in the meantime, however, you can work around the problem using .synonym.

To use .synonym for this purpose, you use the format

          newop:    .synonym   oldop,bitpattern
where newop is the new opcode, oldop is an old opcode that has a similar format, and bitpattern is a word-length expression giving the pattern of bits in an instruction containing the new opcode. Note that the bit pattern should look like a complete instruction; YAA will extract the relevant bits of the opcode from this instruction.

Once you have specified a new opcode in this way, you can use it in source code. For example, the following declares a new opcode that is similar to a tra instruction, then uses that opcode in an assembler statement:

          tnew:   .synonym  tra,bitpattern
                  tnew      address

You should note the limitations of this operation. For example, suppose new hardware contains a new vector opcode. If the new opcode follows the same model as existing vector opcodes, you will have no problem defining it (in terms of one of those existing codes). However, if the new opcode follows a different model, the assembler will not know how to handle the arguments of the new opcode.

You can use the same technique for defining new tags:

          newtag:  .synonym   oldtag,bitpattern
This says that the new tag is patterned on an existing tag, but has the given bit pattern. In general, you can then use the new tag in any context you could use the old tag. There are, unfortunately, some exceptions. In particular, YAA does validity checking on certain EIS instructions and only recognizes old tags as valid. Even if you define a new tag with a .synonym pseudo-op, the new tag will not pass YAA's EIS validity checking.

8.16 Alters

Use:

          .alter_append number,"file"
             #statements
          .endalter
          .alter_delete  startline,endline,"file"
          .alter_change startline,endline,"file"
             #statements
          .endalter

Where:

number
is an integer giving a line number within the given file. .alter_append appends the given statements after that line.
"file"
is a string expression giving the name of the source code file you want to alter. If this is omitted, YAA assumes the file that used .include to include the source file that contains the alter pseudo-op.
startline
is an integer giving the line number of the first line you want to delete or change.
endline
is an integer giving the line number of the last line you want to delete or change.

Description:

Alters are instructions which change lines found elsewhere in the source code. They are usually used to record patches made after software has been released.

For example, suppose you make an emergency bug fix to code that is already in production use. You may want to keep the original code intact as a record of the "official" release, then supply alters that put in the bug fix. Typically, you would put all the alters into a single file, then use .include statements to apply those alterations to the relevant source files. For example, suppose that the file test.y contains

          .include "alts.y"
              ldi    0,dl
              adx2   2,du
              tra    0,xl
              stq    0,ic
and that the file alts.y contains
          .alter_change 3,4,"test.y"
              adx2   4,du
          .endalter
          .alter_append 3,"blah.y"
              adx2   5,du
          .endalter
When YAA assembles test.y it will find the .include statement and read code from alts.y. It will collect the alterations for test.y from alts.y and use those alterations to change the source code of test.y before processing. YAA ignores any alterations that don't apply to test.y (specifically, the .alter_append that applies to the file blah.y). Thus the same file can contain alterations to many different source files.
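
The effect of the alters on test.y can be worked through with a small Python model. The apply_alters function and the dictionary encoding of each alter are invented for this illustration. The sketch assumes that line numbers refer to the original, unaltered file and that the .include line counts as line 1; it is not a description of how YAA is implemented.

```python
def apply_alters(filename, lines, alters):
    """Return the lines of `filename` after applying the alters that target it."""
    appends = {}    # line number -> statements appended after that line
    replaced = {}   # first line of a range -> (last line, replacement statements)
    for alter in alters:
        if alter["file"] != filename:
            continue                      # alters for other files are ignored
        if alter["op"] == "append":
            appends.setdefault(alter["line"], []).extend(alter["body"])
        elif alter["op"] == "delete":
            replaced[alter["start"]] = (alter["end"], [])
        elif alter["op"] == "change":
            replaced[alter["start"]] = (alter["end"], list(alter["body"]))
        else:
            raise ValueError("unknown alter: " + alter["op"])

    result = []
    skip_through = 0                      # last original line being removed
    for number, line in enumerate(lines, start=1):
        if number in replaced:
            end, body = replaced[number]
            result.extend(body)
            skip_through = max(skip_through, end)
        if number > skip_through:
            result.append(line)
        result.extend(appends.get(number, []))
    return result


test_y = [
    '.include "alts.y"',
    "    ldi    0,dl",
    "    adx2   2,du",
    "    tra    0,xl",
    "    stq    0,ic",
]
alts_y = [
    {"op": "change", "file": "test.y", "start": 3, "end": 4,
     "body": ["    adx2   4,du"]},
    {"op": "append", "file": "blah.y", "line": 3,
     "body": ["    adx2   5,du"]},
]

altered = apply_alters("test.y", test_y, alts_y)
for line in altered:
    print(line)
```

Under these assumptions, lines 3 and 4 of test.y are replaced by the single adx2 statement, and the alter aimed at blah.y has no effect.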

YAA only makes alterations if the alter pseudo-ops appear before the text they are supposed to alter. As a result, you should .include a file containing alters as one of the first instructions in the source code.

It's important to note the difference between alters in YAA and patches made to machine code. For example, if you are using patches to delete machine code instructions, you usually replace the instructions with an appropriate number of no-op instructions, or else put in an instruction to jump around the code you want to get rid of; the size of the program doesn't actually change. However, if you use an alter to delete source code, the code is actually removed from the text that YAA assembles, so the program actually does get smaller.

The sections that follow describe the constructs that can be used to make alterations.

Appending New Instructions

The .alter_append construct tells YAA to append one or more statements after line number in file. For example,

          .alter_append 4,"test.y"
              ldi   0,dl
          .endalter
appends the given ldi instruction after line 4 in test.y. Any number of statements may be supplied between the .alter_append and .endalter.

The file name may be omitted from the .alter_append directive. When YAA finds a construct of the form

          .alter_append number
              statements
          .endalter
YAA assumes that the statements should be appended after the given line number in the file that included the file that contained the construct.

For example, if test.y includes alts.y and alts.y contains an .alter_append of this form, the alterations are made in test.y. If the construct is in a file that was not included (e.g. if the construct is found right in test.y) the alterations are applied in the file that contains the .alter_append instruction.

Deleting Existing Instructions

The .alter_delete directive deletes lines from a specified file, beginning at line number startline and ending with line number endline. For example,

          .alter_delete 3,10,"test.y"
deletes lines 3 through 10 from test.y. If only one line number is supplied, as in
          .alter_delete 7,"test.y"
YAA only deletes that line.

The file name may be omitted from .alter_delete. If so, the situation is similar to that of .alter_append. YAA deletes the line(s) from the file that includes the file that contains the .alter_delete instruction. If the instruction appears in a source file that has not been included by another file, YAA deletes the line(s) from the current file.

Changing Existing Instructions

The .alter_change construct substitutes statements in the specified file, beginning at line number startline and ending at line number endline. For example,

          .alter_change 3,10,"test.y"
               adx2  4,du
               stq   0,ic
          .endalter
replaces lines 3 through 10 in test.y with the two given instructions.

If only one line number is supplied, the given statements are put in place of that line.

You may omit the file name on the .alter_change instruction. In this case, the change is made in the file that includes the file that contains the .alter_change. If the current file was not included by another file, the change is made in the current file.

9. Using YAA to Produce Listings

YAA has sophisticated facilities for producing listings of source input. In this chapter, we examine these facilities.

9.1 Command Line Options

In order to produce a listing, you must specify the option

          List=file
on the command line that invokes the YAA assembler. This tells the assembler to write the listing to the specified file. If this option is not present on the command line, all other listing features will be inoperative.

The command line option

          PageSize=number
lets you control the number of lines per page in the listing. The given number must be an integer greater than or equal to 20.

The command line option

          LineSize=number
lets you control the number of characters per line in the listing. The given number must be an integer greater than or equal to 72. If your source code is using 80-character lines, the LineSize of the listing should be at least 130; otherwise, listing lines will have to be "folded" (split in the middle). Folding makes output much harder to read.

Bear in mind that all of the above are command line options, specified when you assemble your source code. They do not appear in the source code itself.

9.2 Controlling Listings: .list

Use:

          .list code

Where:

code
is an integer expression indicating what should and shouldn't appear in the listing. For further information, see below.

Description:

The .list pseudo-op controls the format of output listings. Source code may contain any number of .list pseudo-ops, to change listing options in the middle of assembly. For example, if you only want a listing for one section of a source module, you can use a .list pseudo-op to start up a listing at the beginning of the section, then use another .list pseudo-op to turn off the listing at the end of the section. When the listing is produced, it will display the code that was assembled between the two pseudo-ops.

NOTE: As we mentioned in the last section, the YAA command line must contain a List=file option to obtain a listing. .list pseudo-ops have no effect if source code is being assembled without a List=file option.

The .list pseudo-op has the form

          .list code
The code is an integer value whose bits represent listing options: if a bit in code is on, the corresponding listing option is turned on; if a bit in code is off, the corresponding listing option is turned off.

We will discuss the various listing options shortly. For the moment, we will discuss a few more features of .list. If a label is specified for the .list pseudo-op, as in

          X: .list value
the label is treated as a variable and assigned a code integer that represents the previous listing options. For example, in
          old: .list new
               ...
              .list old
the first statement saves the current options in old and then sets the options to new. The final statement restores the old listing options using the old code.

If the form of the statement is

          label: .list
without any code, the current code value is assigned to label. The listing options are not changed.

You should take care that the expression for .list does not contain any errors. For example, a typo in entering a listing option could create an undefined variable name, which is automatically given the value 0. The result of this will often be to turn off all listing options and kill the rest of your listing.

The .list Function

Use:
          .list()
Description:

The .list function returns the current listing code value. For example,

          X:  .set  .list()
assigns the current listing code value to X.

In the sections to come, we will give examples of how the .list function may be used productively.

Listing Options

Listing options are represented by bits in an integer code. Each bit may be referenced by a predefined YAA variable whose name begins with .l_. For example, the variable .l_macro represents the bit that controls whether or not the listing will show the expansions of opcode macros. The statement

          .list  (.list() | .l_macro)
uses the .list function to obtain the current listing code and then uses a bitwise OR operation (|) to turn on the .l_macro bit. The effect of the above instruction is to turn on the printing of macro expansions, and to leave all the other options the same. Similarly,
          .list  (.list() & (~.l_macro))
turns off the printing of macro expansions and leaves all the other options the same. It does this by using the bitwise complement operator (~) to produce a value that has every bit on except the .l_macro bit, then bitwise ANDing this value with the current option code.
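
The same bit manipulation can be tried out in Python. The bit positions below are made up for the illustration, since the real positions are deliberately undocumented; only the set and clear technique carries over.

```python
# Hypothetical bit assignments; YAA's real values are not documented.
L_SOURCE = 1 << 0
L_MACRO = 1 << 1
L_OPCODE = 1 << 2

code = L_SOURCE | L_OPCODE            # starting options: macro listing is off

with_macro = code | L_MACRO           # turn the macro bit on, keep the rest
without_macro = with_macro & ~L_MACRO # turn the macro bit off, keep the rest

print(bin(code), bin(with_macro), bin(without_macro))
```

Setting a bit that is already on, or clearing one that is already off, leaves the code unchanged, so these statements are safe to use without knowing the current options.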

If options indicate that any part of an input line should be printed, the whole line is printed. For example, if there are several commands on a single line and one of those commands is a type that should be printed, the entire line is printed.

As noted in an earlier chapter, a statement may be continued from one line to the next by putting a backslash on the end of the line. If any line in such a continued statement is printed in the listing, the whole continued statement is printed.

You may find that listings become hard to read if you put more than one statement on a line, especially when this is combined with continued statements. This is because of the interplay between the various options that determine what is and isn't listed. For the greatest readability of listings, avoid multiple statements on the same line.

Note that you may still run into some odd situations when a line contains a literal. A statement inside the literal may cause the whole line to be printed, even though the statement that contains the literal is not one of the types that is currently being printed.

In the descriptions that follow, we give the name associated with each code bit. Since bit positions may change from one release to the next, the actual bit positions are not documented. You should always refer to code bits using their symbolic names, not their actual values.

Listing options break up naturally into several classes. We will describe these classes separately.

Listing Source Options

The first class consists of options related to the origin of source code.

.l_source
If this bit is on, the listing shows all statements that came from the primary source file. By the primary source file, we mean the source file that was named on the command line that invoked the YAA assembler.
.l_include
If this bit is on, the listing shows all statements that came from files obtained with .include directives.
.l_std_include
If this bit is on, the listing shows all statements that came from the initialization file, plus statements from any file included by the initialization file. (As noted in Chapter 3, the initialization file is specified with an INITialize=file option on the YAA command line. If you don't specify an initialization file, a default initialization file is used.)
.l_loop
If this bit is on, the listing shows all statements that are generated by looping constructs like .while.
.l_show_def
If this bit is on, the listing shows macro definitions, and the code of looping constructs before they are expanded (the way they were "gathered" from the input). For example, if both .l_show_def and .l_loop are on, you see the original way that a loop was written up in the source code, plus the statements that resulted from executing the loop.
.l_macro_src
If this bit is on, the assembler records macro definitions as they are read. This means that later on, when the macro is invoked, the listing can show the original lines from the macro definition as well as the code generated when the macro is expanded. If this bit is off, macro definitions are not recorded; this means that the macro definition lines are not available for printing when the macro is finally invoked.
.l_macro
If this bit is on, the listing shows all statements that are generated when a macro is used and expanded in source code. For .l_macro to work, .l_macro_src must have been turned on when the macro was defined. Otherwise, YAA won't have a copy of the original lines and won't be able to show you the lines that the original lines generated.
For example, with
          .list (.l_source | .l_loop) | other_options
the listing shows all statements that appeared in the primary source file and all statements generated by looping constructs. If the primary source file contains an .include statement, the listing shows the statement itself (because that was in the primary source file) but does not show the source code obtained from the include file (because .l_include is not on).

Note that the .list statement above contained the notation other_options. As you will see in the next section, you need to turn on some "statement use" options as well as some "listing source" options; otherwise, you get no output.

When YAA is deciding whether or not to print a source line, it checks the appropriate listing source option first. If the source option is off, the input line is not printed, no matter what other options might be relevant.

Statement Use Options

The second class of options controls the kind of statements that appear in the listing.

.l_comment
If this bit is on, comments are printed in the listing. If it is off, they are not shown. For the purpose of this option, a comment is a blank line or a line that only contains a comment.
.l_if
If this bit is on, the listing shows .if-.else-.endif statements and the code they enclose. If this bit is off, the listing only shows code that was actually used, not code that was skipped because of the action of .if; you will not see the .if-.else-.endif statements that controlled the code.
.l_list
If this bit is on, the listing shows listing control directives (.list, .title, .eject, etc.). If it is off, they are not shown (although their effects are still seen).
.l_opcode
If this bit is on, the listing shows all statements that actually generate object code (machine instructions or data). If it is off, these are not shown.
.l_var
If this bit is on, the listing shows all statements that control the behavior of symbols (e.g. .var, .symdef, and so on). If this bit is off, these are not shown.
.l_invoke
If this bit is on, the listing shows every statement whose opcode field is a macro invocation. If it is off, macro invocations are not shown.
.l_pseudo
If this bit is on, the listing shows every statement whose opcode field is a pseudo-op that is not covered by one of the other statement use options. If it is off, such pseudo-op statements are not shown.
For example, suppose you have
          .list (.l_source | .l_opcode | .l_invoke)
The listing shows all statements from the primary source file that have true opcodes or macro invocations. It will not show comments or pseudo-ops.
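
The way the two classes of options combine can be sketched in Python. The bit values, the dictionary names, and the is_listed function are all invented for this illustration, and the model covers only a single statement on a single line; it simply encodes the rule that the listing source option is checked first and the statement use option second.

```python
# Hypothetical bit assignments; YAA's real values are not documented.
L_SOURCE, L_INCLUDE = 1 << 0, 1 << 1
L_COMMENT, L_OPCODE, L_INVOKE, L_PSEUDO = 1 << 2, 1 << 3, 1 << 4, 1 << 5

SOURCE_BITS = {"primary": L_SOURCE, "include": L_INCLUDE}
USE_BITS = {"comment": L_COMMENT, "opcode": L_OPCODE,
            "invoke": L_INVOKE, "pseudo": L_PSEUDO}


def is_listed(code, origin, use):
    """Decide whether a statement of the given origin and kind is listed."""
    if not code & SOURCE_BITS[origin]:
        return False                  # source option off: never printed
    return bool(code & USE_BITS[use])


code = L_SOURCE | L_OPCODE | L_INVOKE

print(is_listed(code, "primary", "opcode"))    # true opcode in primary file
print(is_listed(code, "primary", "comment"))   # comments are not selected
print(is_listed(code, "include", "opcode"))    # include files are not selected
```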

Note that there are several options that relate to opcode macros. .l_show_def shows the definition of a macro. .l_macro_src records the macro definition. .l_invoke shows statements that invoke macros. .l_macro shows lines generated by the invocation of a macro; for it to work, .l_macro_src had to be turned on when the macro was defined. For example, if both .l_invoke and .l_macro are on, you see the statement that invokes an opcode macro, followed by the code that was generated when the macro was expanded.

Output Format Options

The third class of options controls the format of the listing.

.l_continuation
If this bit is on, the listing shows continuation lines of statements. Note that it only shows continuations of statement types that are actually being printed (according to the "statement use" options).
.l_detail
If this bit is on, the listing shows all data produced by the statements listed. If this bit is off, the listing normally shows only the first word of data produced. For example, suppose you have
        .data 1,2,3
producing three words of data. With .l_detail off, the listing only shows the contents of the first data word produced; with .l_detail on, the listing shows all data produced. If .l_detail is off, YAA still tries to show as much data as possible; for example, if a line that produces many words of data is followed by a blank line or a comment, YAA puts the first word of data on the original statement, and displays more words of data on the blank lines or comments that follow. Thus in some circumstances, you see more than just one word of data.
.l_crossref
If this bit is on, cross-reference data is collected, to be shown at the end of the listing. If this bit is off, only a limited amount of cross-reference data is collected; this includes data on definitions and undefinitions, and the creation of SYMREFs and SYMDEFs. You can turn the bit on and off to get full cross-reference data on some pieces of code and abbreviated cross-references on others.

Cross-reference information takes up a great deal of memory. If the assembler runs out of memory because of the presence of cross-reference information, YAA turns off the option and discards all cross-reference information that has been collected up to this point. (YAA outputs a warning message if it is forced to do this.)

.l_not_used
If this bit is on, the cross-reference index contains entries for symbols, even if they were never used. If this bit is off, unused symbols do not appear on the cross-reference index. Note that if you turn on this bit, your cross-reference index mentions all YAA's predefined symbols (like the .l_ manifests used to turn listing options on and off). There are quite a lot of these, and not all of them are documented.
.l_relative
If this bit is on, the listing shows relative line numbers for each statement. A relative line number gives the line number of the statement within the file that contained the statement. This is useful when you have several include files.
.l_absolute
If this bit is on, the listing shows absolute line numbers for each line in the listing. The first listing line has an absolute line number of 1, and lines increase sequentially from there. Absolute line numbers are more convenient than relative ones for some reference purposes, because they are easier to find in the main body of the listing.

Absolute line numbers are only assigned to lines that would be printed because of the listing source options. For example, if you are not listing the contents of .include files, the contents are not assigned absolute line numbers. On the other hand, lines that are not displayed because of other kinds of options are still assigned absolute line numbers. For example, even if comments are not being printed, they are still assigned absolute line numbers. In this way, a jump in the absolute line numbers gives a hint that the input contained lines that are not shown in the listing.

.l_toc
If this bit is on, the listing includes a table of contents in the information sections at the end.
.l_expanded
If this bit is on, the listing shows the source code lines as they are actually assembled, after all macro and manifest expansion has taken place. Expansions are only shown in lines whose contents were changed by the expansion process. Output from this format may be cluttered and hard to read. In the left hand margin, you will see the notation Expanded text.
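
The numbering rule described under .l_absolute can be modelled with a short sketch. The Python below is only an illustration of that rule, not YAA's implementation; the tuple layout and the function name are hypothetical. Lines excluded by the listing source options receive no number, while lines hidden by other kinds of options are numbered but not printed, which produces the visible jump in the numbers.

```python
def number_lines(lines):
    """Return the (absolute_number, text) pairs that would be printed.

    Each input line is a hypothetical tuple:
        (text, passes_source_options, passes_other_options)
    """
    printed = []
    absolute = 0
    for text, passes_source, passes_other in lines:
        if not passes_source:
            # e.g. contents of an unlisted .include file: no number assigned
            continue
        absolute += 1  # numbered even if another option hides the line
        if passes_other:
            printed.append((absolute, text))
    return printed


example = [
    ("statement A", True, True),
    ("a comment line", True, False),          # comments not being printed
    ("line from unlisted include", False, True),
    ("statement B", True, True),
]
listing = number_lines(example)
```

In this example the printed numbers are 1 and 3; the missing 2 hints at the suppressed comment, while the include line leaves no trace in the numbering.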

Alter Options

The fourth class of options controls how alters appear in the listing.

.l_alter
If this bit is on, the listing shows the alters that have been made in your source code.
.l_alter_comment
If this bit is on, YAA puts comments before and after groups of lines that have been changed because of an .alter_append or .alter_change instruction. The comment before the alterations has the form
        ##alter: "name",number
where name is the name of the file that contained the pseudo-op and number is the line number where the pseudo-op appeared. The comment after the alterations has the form
        ##end alter
.l_delete
If this bit is on, the listing shows lines that have been deleted because of the .alter_delete pseudo-op. Such lines are marked with **deleted** in the left hand part of the listing.

Defaults

By default, the following options are turned on.

          .l_source         .l_macro_src       .l_show_def
          .l_comment        .l_if              .l_list
          .l_opcode         .l_var             .l_invoke
          .l_pseudo         .l_continuation    .l_crossref
          .l_relative       .l_absolute        .l_toc
          .l_alter
Other options are turned off.

9.3 Overriding List Options: .forcelist

Use:

          .forcelist  ANDlist,ORlist

Where:

ANDlist
is an integer expression. The bits turned on in ANDlist are ANDed with the current .list settings, to give new settings. In other words, bits turned off in ANDlist will be turned off in the .list settings.
ORlist
is an integer expression. The bits turned on in ORlist are ORed into the current .list settings. In other words, bits turned on in ORlist will be turned on in the .list settings.

Description:

The .forcelist pseudo-op overrides the behavior of .list. It is for very specialized debugging purposes and should be avoided in everyday use.

To understand the use of .forcelist, consider a macro that contains its own .list statements (e.g. to suppress the display of items created in the macro expansion). Debugging such a macro would be difficult, since the action of the macro itself affects the output listing.

This is where .forcelist comes in handy. .forcelist does not change the current listing code as used by the .list pseudo-op and returned by the .list function, so all such statements continue working as normal. However, when a listing is actually produced, the listing options will be those set up by .forcelist rather than those dictated by .list.

As an example of .forcelist, consider

          .forcelist -1,.l_macro | .l_macro_src
The -1 is ANDed with the current settings, leaving everything the same; the .l_macro and .l_macro_src bits are ORed with the current settings, turning those bits on. Thus the above .forcelist statement ensures that statements generated in macro expansions are printed. Even if a macro uses .list to turn off the printing of macro expansions, the expansion is still listed, because the effects of .forcelist override .list.
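
The mask arithmetic can be sketched as follows. This is a model of the description above rather than YAA's actual code: the bit values are placeholders (the manual does not give the numeric values of the .l_ manifests), the function name is hypothetical, and the sketch assumes the AND mask is applied before the OR mask.

```python
# Placeholder bit values for illustration only; the real values of the
# .l_ manifests are not stated in this manual.
L_SOURCE = 0b001
L_MACRO = 0b010
L_MACRO_SRC = 0b100


def forced_settings(list_settings, and_list, or_list):
    """Settings used when the listing is produced under .forcelist.

    Assumption: ANDlist is applied first, then ORlist. The value held by
    .list itself (list_settings) is left untouched.
    """
    return (list_settings & and_list) | or_list


# Suppose a macro has used .list to switch macro expansions off.
current = L_SOURCE
effective = forced_settings(current, -1, L_MACRO | L_MACRO_SRC)
```

Here `effective` has all three bits on, while `current` still holds only the bit set by .list, mirroring the way .forcelist overrides the listing without disturbing the .list settings.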

9.4 Starting a New Page: .eject

Use:

          .eject

Description:

The .eject statement tells YAA to start a new page in the listing (if a listing is being produced at the time). For example, you might put an .eject at the beginning of each function so that each function starts on a new page.

.eject does not start a new page if it appears in a file that is not being listed (e.g. an .include file). If there is an .eject in a macro, invoking the macro causes the .eject even if the code generated by the macro is not being listed.

9.5 The .title Pseudo-Op Revisited

In Chapter 8, we discussed the .title pseudo-op and how it could be used to record a title in the object output. It also has an effect on the listing. If you say,

          .title "string"
the listing starts a new page and prints "string" as the title at the top of that page. Each subsequent page has the same string as title, until a new .title statement is encountered.

Like .eject, .title has no effect if it appears in a file that is not being listed.

If a .title statement appears in an include file and the include file is being listed, the new title is used for the remainder of the include file. When the assembler returns to the original source file, it reverts to the previous title.

YAA lets you create subtitles with .title. This is done by putting a number before the string, as in

          .title 1,"heading"
          .title 2,"sub-heading"
          .title 3,"sub-subheading"
Numbers cannot be greater than 3. See the description of .title in Chapter 8 for further details.

9.6 Main Listing Format

The main part of the listing displays source code and the data that it generates. It is divided into a right and left part.

9.6.1 Right Part of Listing

The right hand side of the listing shows input code, as dictated by the listing options. Typically, lines may have both a relative and an absolute line number.

The right part of the listing has tab stops set every four spaces.

9.6.2 Left Part of Listing

The left hand side of the listing indicates the material that is generated by the corresponding statements on the right hand side.

The left hand side begins with a field giving the current offset within the current section. This is basically the value that would be returned by .ic at this point.

The next field shows the first word of object code generated for the given statement. Underscores are used to break this up into logical pieces. For example, a generated machine instruction might be divided into one piece for the actual opcode and separate pieces for the various operands. Generated data is broken up into pieces according to a variety of heuristic rules aimed at figuring out the most intelligent method of division. If YAA cannot decide on a good method of division, the generated object output is not split up.

After this comes a list of relocations that may be applied to the generated object. Relocations have the form

          Xnn
where X is an uppercase letter and nn is an integer.

Some generated code may have more than one associated relocation. The listing will only show the first four relocations, regardless of how many additional relocations there may be.

If the value that would normally be printed on the left part of the listing is too long to fit, it is not printed. This happens, for example, with long string constants.

9.6.3 Listing Rules

If the .l_detail listing option is turned on, the left hand part of the listing will show all the code generated for the statements on the right hand part.

If the .l_detail listing option is turned off, the left hand part of the listing will only show the first word of output generated.

If no code is generated (e.g. by a .set pseudo-op that assigns a value to a YAA variable), the listing will try to show some intelligible value. For example, with .set, it shows the value assigned to the YAA variable. If you want to display some other value, you can use the .display pseudo-op (described in Section 9.6.4).

There is a special case when a statement generates more than one word of output and the statement is followed by one or more comment lines. In this case, the additional words of output are shown on the left hand side of the comment lines.

The YAA listing facilities can "backtrack" a maximum of three lines. For example, consider input of the form

          macro(arg1,\
                arg2,\
                arg3)

When YAA reaches the end of the macro call, it can backtrack to the line where the macro started and put the appropriate left hand side on that line. However, if the macro call was

          macro(arg1,\
                arg2,\
                arg3,\
                arg4)
YAA would read down to the end of the macro call before generating any code. When it figured out what should go on the left hand side of the listing, YAA could only backtrack three lines. Thus the left hand side output for the macro would begin on the arg2 line.
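
The two examples above suggest a simple rule for where the left hand side output lands. The sketch below is inferred from those examples only: it assumes the three-line limit counts the final line of the statement, and the function name is hypothetical.

```python
MAX_BACKTRACK = 3  # lines, counting the last line of the statement (inferred)


def left_side_line(first_line, last_line):
    """Line on which the left hand side output would begin."""
    return max(first_line, last_line - (MAX_BACKTRACK - 1))
```

For a call spanning lines 1 to 3 this gives line 1 (the start of the call); for a call spanning lines 1 to 4 it gives line 2 (the arg2 line).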

The value printed on the left of an .if directive is the value of the expression on the .if. An .if or .elseif that has been skipped will not have a value on the left; by checking which do and don't have values on the left, you can follow the flow of control.

If a macro invocation actually generates code or data, the left side of the listing will show the first code or data generated. If no code was generated, YAA normally does not put anything on the left side of the macro. However, the .display pseudo-op (described below) can be used to give YAA specific instructions on what should be displayed on the left hand side.

9.6.4 Displaying Values: .display

Use:

          .display  expression

Where:

expression
is any expression. YAA displays the value of this expression on the left hand side of the listing, provided that you are producing a listing at the time.

Description:

The .display pseudo-op gives you some control over what is displayed on the left hand side of the listing. The pseudo-op tells YAA that you would like the value of the given expression displayed on the left hand side of the listing.

In situations where YAA has to decide what to display on the left hand side of the listing, .display takes priority over other values that YAA might display. For example, YAA normally uses the left hand side of the listing to show the value of an expression on an .if statement; however, if you specify a .display statement, this overrides the usual behavior.

.display is useful inside macros. If a macro actually generates code or data, the left side of the listing normally shows the generated code or data; but if the macro doesn't generate code or data, there is usually nothing on the left side of the listing. By putting a .display inside the macro, you can display a particular value beside the macro invocation.

9.7 Information Sections

After displaying the source code, the listing provides several information sections summarizing features of the source code. Information sections will all be put on one page if possible.

If you use .title statements to give titles to your listings, the information section page(s) will use the first heading created by a .title statement within the primary source file. Sub-headings and sub-subheadings are not used.

At present, the following information sections are provided:

Source File Summary
This names the source files and include files that were examined in the assembly. It gives the page and/or line number where each source file began. Line numbers may be given as absolute and/or relative, depending on options.
Table of Contents
This lists the various headings, sub-headings etc. created with .title statements in the listing. It gives page and/or line numbers (absolute and/or relative) for each .title statement.
Relocation Types
This lists the one-letter codes used to refer to relocation types in the body of the listing. It also gives a brief description of what each code means.
External Names Section
This gives the external names defined or referenced in the assembly and the reference numbers associated with those names.
Sections Section
This gives a summary of all the sections defined in the code. For each section, you are told the name of the section, its upper and lower bounds, and its type (template or real).
Options Section
This gives a summary of all the options specified in .option pseudo-ops.

9.8 Cross-Reference Index

The cross-reference index appears as the last part of the listing. It contains data that was collected during periods when the .l_crossref option was turned on.

An entry in the cross-reference index describes all the places where a particular symbol appeared. Entries are collected into groups:

  1. Symbols actually defined in the program.
  2. Keywords. The keywords section includes all recognized opcode names, and all standard symbols (e.g. register names).
  3. Built-in symbols. You will only see those which are actually used in your source code.
Within each group, symbol names are sorted according to the collating order of the underlying character set. Using the ASCII character set, this means that names beginning with uppercase letters will precede names beginning with lowercase letters.

Each entry describes the symbol's type. Possibilities include:

          argument   -- argument in macro definition
          keyword    -- opcode or standard symbol
          local      -- local variable in macro
          symbol     -- YAA variable or other symbol
          built-in   -- built-in symbol
Following the symbol's type, the listing gives the value of the symbol at the end of the assembly. This value will not be given if the symbol was undefined or out of scope at the end of the assembly. Some symbols may have a value of the form
          <X>+Y
This means that the symbol refers to the location that is an offset of Y from the symbol with reference number X. The External Names Section gives the reference number of some symbols. Negative reference numbers refer to templates. Other reference numbers may be generated artificially (e.g. to refer to literals).

Each entry also lists the lines where the symbol appeared. If absolute line numbers are being used, absolute line numbers will be given. If relative line numbers are being used, the cross-reference will show the file name and the relative line number. If both types of line numbers are being used, the cross-reference will show both.

Each line number reference is followed by a character indicating how the symbol was used on that line:

f
first occurrence of the symbol.
d
point at which the symbol was defined.
s
point at which the symbol was declared to be a SYMDEF.
x
point at which the symbol was declared to be a SYMREF.
u
point at which the symbol became undefined.
If none of the above usages can be determined, a blank character is used.

If a name is a synonym for another symbol, the name will appear in the group that is appropriate to the original symbol. If A is a synonym of B which is a synonym of C, YAA will follow through the synonym "chain" until it finds the original name. All synonyms will be listed with this original name.

9.8.1 Local Macro Variables

The local variables used in macros must be given unique names for each invocation of the macro. The reason for this is that it is possible for macros to "export" local variable names in various ways (so that the names become visible outside of the macro). Thus, every local variable is given a name of the form

          .MACRONAME_N_I
where MACRONAME is the name of the macro that contains the variable, N says that this is the Nth local variable defined within the macro, and I says that this is the Ith invocation of the macro.
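
As a small illustration of the naming scheme, the sketch below builds such a name from its three parts. The function is hypothetical, and it assumes that N and I are written as plain decimal numbers and that the macro name is used unchanged; the manual does not state either detail.

```python
def local_variable_name(macro_name, n, i):
    """Build a name of the form .MACRONAME_N_I.

    Assumptions: decimal N and I, macro name used as written.
    """
    return ".{}_{}_{}".format(macro_name, n, i)
```

Under these assumptions, the second local variable in the fifth invocation of a macro called SWAP would be named `.SWAP_2_5`.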

Names of this form will often appear in the cross-reference index. Various option listings may also make these names visible.

9.9 Error Handling

Error messages are always written to the terminal as the YAA assembler runs. They will also be written to the listing if a listing is being generated.

Errors are written to the listing at the point where the error is detected. This may not be the point at which the error actually occurred. A mistake may not become apparent until several statements after the mistake actually takes place.

YAA makes two passes through input: once to parse the source code and once to generate appropriate object code. If you are not creating a listing, YAA will stop assembling after 20 errors have been detected. If you are creating a listing, YAA can handle up to 200 errors on each pass. If you have more than 200 errors on the first pass, the listing will probably not be valid. If you have more than 200 errors on the second pass, the listing will simply stop when the error limit is exceeded; this means that you will not get a table of contents, cross-reference listing, etc.

10. Debugging Directives

When you compile a program with the system's C compiler, the compiler can store debugging directives in the object code. These directives can be examined by a symbolic debugger to obtain information about your program. Debugging directives record information that would otherwise not be present in the compiled code.

Unlike a compiler, the YAA assembler does not automatically generate debugging directives for your code. However, you can put explicit statements in your code that generate such directives. For example, you can use YAA statements to create named types that the debugger can later interpret as C data types. You might use this if you are using YAA to write a subprogram that will be called from a C program; by giving the YAA code the same kind of debugging directives that the C code has, you can analyze the YAA code with the same debugging tools that you use with C.

Debugging directives are generated using YAA pseudo-ops. The actual output created has the format of LD object code directives, explained in The LD Object Format Reference Manual. The rest of this chapter assumes that the reader is familiar with the LD object format as described in that manual.

10.1 Declaring Types: .type

Use:

          name:   .type  op op op ... known_type

Where:

name
is the name you want to give this type.
op op op ... known_type
is a series of operators defining the new type in terms of a known type.

Description:

The .type pseudo-op defines a data type by generating an LD_DEFTYPE directive. This defines a data type in terms of a type that is already known. A known type is one that has been defined by a previous .type statement, or else one of the following keywords:

        .struct      -- structure type
        .union       -- union type
        .enum        -- enumerated class
        .void        -- the C void type
        .char        -- C char (signed) type
        .lchar       -- C long char type
        .short       -- C short int type
        .int         -- C int type
        .long        -- C long type
        .uchar       -- C unsigned char type
        .ulchar      -- C unsigned long char
        .ushort      -- C unsigned short type
        .uint        -- C unsigned int type
        .ulong       -- C unsigned long type
        .float       -- C float type
        .double      -- C double type
        .ldouble     -- C long double type
        .label       -- statement label
        .block       -- block label
As an example,
          X:  .type  .int
defines a type with the name X and the type int.

A .type pseudo-op may define several names for the same simple type, as in

          [NAME1,NAME2,...]:  .type  .int
However, if the type is a typedef, a structure, a union, or an enum class, there can only be one name.

Type Operators

Type operators are used in .type pseudo-op statements to create derived types. For example, the ptr operator creates pointer types.

          IPTR:   .type    ptr .int
declares IPTR as a type which is a pointer to int values. This is similar to the C statement
          typedef int *IPTR;

All .type statements are similar in purpose to C typedef statements. However, if you want to generate a debugging directive that corresponds to a true C typedef declaration, you have to use the typedef operator (discussed below).

The following operators are currently supported:

          ptr        -- pointer
          const      -- C const modifier
          volatile   -- C volatile modifier
          far        -- C far modifier
          near       -- C near modifier
          huge       -- C huge modifier
          func       -- function (see below)
          field      -- bit field (see below)
          typedef    -- C typedef
          incomplete -- incomplete type
          "array"     -- see below
Note that "array" is not an operator; the form of an array operator is described later on.

Function Operators

Function types in .type pseudo-ops may be specified by the operator func followed by a list of argument types, in parentheses, followed by the type of the return value:

          NAME:  .type   func (argtypes) result_type
For example,
          FUNC:  .type   func (.int,.double) .int
says that the FUNC type of function takes an int argument and a double argument, and returns an int value. Argument and return value types take the usual form: op op op ... known_type, as in
          IPTR:    .type   ptr .int
          IFUNC:   .type   func (IPTR,.int) IPTR
The IFUNC function returns a pointer to an integer. It takes two arguments: one a pointer to an integer and one a normal integer. This corresponds to the C declaration
          typedef  int *IFUNC(int *,int);

If a function takes no arguments, just use empty parentheses, as in

          X:   .type   func () .int
which takes no arguments and returns an int value. If the argument types are unspecified or unknown, omit the parentheses, as in
          Y:   .type   func .int
which returns an int value and has unspecified arguments.

The "..." notation of ANSI C can be used for functions which take a list of arguments with indeterminate numbers or types. For example,

          PRINTF_TYPE:  .type  func (ptr .char,...) .int
defines a function type that takes a char pointer plus an indeterminate number of other arguments, and returns an int. The format
          func (...)
is allowed, when there are no fixed arguments.

Bit Fields

The two formats for defining a bit field type are

          field (expr) .int
          field (expr) .uint
where expr is any expression yielding an integer value. This gives the number of bits in the bit field. Only int and unsigned int bit fields are supported.

If the expr giving the length of the bit field is a single integer constant, the parentheses may be omitted.

Array Operators

Array operators may have three forms. The first form is:

          [number]
This indicates an array with the given number of elements. The lowest subscript is assumed to be zero. For example,
          IARRAY:   .type   [10] .int
defines a type named IARRAY. This type corresponds to an array of ten integers, numbered 0 through 9.

The second array operator form is

          [lower : upper]
where lower is an expression giving the lower bound of array subscripts and upper is an expression giving the upper bound of array subscripts. For example,
          FORTRAN:   .type   [1:10] .int
defines a type named FORTRAN. This type corresponds to an array of ten integers, numbered 1 through 10.

The final array operator form is

          [lower : upper : stride]
where lower is an expression giving the lower bound of array subscripts, upper is an expression giving the upper bound of array subscripts, and stride is an expression indicating the distance between the starts of two adjacent elements. For example, in a Pascal packed array of char, characters are placed in adjacent bytes, while in an unpacked array of char, characters are distributed one per machine word. Using a stride also lets you specify a "slice" of a multi-dimensional array (e.g. a column from a matrix stored in column major order).
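
The relationship between bounds, stride, and element positions can be sketched as follows. The function names are hypothetical, and the stride is treated as an abstract distance in whatever unit the .type statement uses; the strides of 1 and 4 in the usage note are purely illustrative, not values taken from YAA.

```python
def element_count(lower, upper):
    """Number of elements in an array declared as [lower : upper]."""
    return upper - lower + 1


def element_offset(index, lower, upper, stride):
    """Distance from the start of the array to the element at `index`.

    The result is in the same (unspecified) unit as `stride`.
    """
    if not lower <= index <= upper:
        raise IndexError("subscript out of bounds")
    return (index - lower) * stride
```

For example, `[10]` (subscripts 0 through 9) and `[1:10]` both give `element_count` of 10. With an illustrative stride of 1, element 3 of a `[1:10]` array sits at offset 2; with a stride of 4 the same element sits at offset 8.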

Order of Type Operators

The operators in a .type pseudo-op are parsed strictly left to right. For example,

          X:   .type   [10] ptr .char
defines a type which is an array of 10 pointers to char values. This corresponds to the C declaration
          typedef char *(X[10]);
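
One way to check how a sequence of operators will be read is to spell it out left to right in words. The sketch below does this for a small subset of the syntax: only the `[number]` array form (written without spaces), the ptr, const, and volatile operators, and a final known type. The function and table names are hypothetical, and this is not how YAA itself parses types.

```python
import re

# English wording for a few of the operators listed earlier.
OPERATOR_WORDS = {
    "ptr": "pointer to",
    "const": "const",
    "volatile": "volatile",
}


def describe(type_expr):
    """Read a .type operand strictly left to right and describe it."""
    words = []
    for token in type_expr.split():
        array = re.fullmatch(r"\[(\d+)\]", token)
        if array:
            words.append("array of {} of".format(array.group(1)))
        elif token in OPERATOR_WORDS:
            words.append(OPERATOR_WORDS[token])
        else:
            # Anything else is taken to be the known type, e.g. .char
            words.append(token.lstrip("."))
    return " ".join(words)
```

`describe("[10] ptr .char")` yields "array of 10 of pointer to char", matching the example above, while swapping the operators to `ptr [10] .char` yields "pointer to array of 10 of char".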

Defining Derived Types

To define a type as a structure, union, or enumerated class, you simply use

          NAME:   .type   .struct
          NAME:   .type   .union
          NAME:   .type   .enum
This treats the specified name as if it were the tag of a structure, union, or enumerated class. Note that no modifiers are allowed before the keywords .struct, .union, or .enum. Later in this chapter, we will show how to define the contents of a structure, union, or enumerated class.

To create a typedef type, create the type first and then create the typedef itself. For example, to do the equivalent of

          typedef int *IPTR;
use
          tmp:  .type    ptr .int
          IPTR: .type    typedef tmp
or equivalently
          IPTR: .type    typedef ptr .int

10.2 Type Functions

Several functions can be used to work with named types: .sizeof, .alignof, and .typecompare.

10.2.1 The Size of a Data Type: .sizeof

Use:

          .sizeof(type)

Where:

type
is a known data type. This can be a built-in type, a type defined with a previous .type statement, or a constructed type, as in
        .sizeof(.int)
        .sizeof(IPTR)
        .sizeof(ptr .int)

Description:

The .sizeof function returns an unsigned integer giving the size of the given type in bits. Note that this is different from the C sizeof operator, which gives size in bytes.

10.2.2 The Alignment of a Data Type: .alignof

Use:

          .alignof(type)

Where:

type
is a known data type. This can be a built-in type, a type defined with a previous .type statement, or a constructed type, as in
        .alignof(.int)
        .alignof(IPTR)
        .alignof(ptr .int)

Description:

The .alignof function returns an unsigned integer giving the alignment of the given type in bits. For example, if a type must be aligned on a double-word boundary on DPS-8 machines, the result of .alignof will be 72.
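
Because both .sizeof and .alignof report bits rather than bytes or words, it can help to convert explicitly. The sketch below assumes a 36-bit word, which is what the 72-bit figure for a double-word boundary implies; the function names are hypothetical.

```python
WORD_BITS = 36  # implied by 72 bits for a double-word boundary


def words_to_bits(words):
    """Convert a size or alignment in machine words to bits."""
    return words * WORD_BITS


def bits_to_words(bits):
    """Convert a bit count back to whole machine words."""
    if bits % WORD_BITS != 0:
        raise ValueError("not a whole number of words")
    return bits // WORD_BITS
```

A double-word alignment is therefore `words_to_bits(2)`, or 72, and a value of 72 returned by .alignof corresponds to `bits_to_words(72)`, or 2 words.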

10.2.3 Comparing Types: .typecompare

Use:

          .typecompare(type1,type2)

Where:

type1, type2
are two types. These can be named types, built-in types, or constructed types.

Description:

The .typecompare function compares two types to see if they are equal. The function returns a 1 if the types are equal and 0 if they are not. For the purposes of this function, two types are equal if they are constructed from the same sequence of operators and built-in types. Thus, in

          X:      .type   .int
          XPTR:   .type   ptr X
          IPTR:   .type   ptr .int
XPTR and IPTR would be equivalent for the purposes of .typecompare, since they can be worked back to the same sequence of operators and built-in types. A typedef is not considered equivalent to the underlying type. Thus
          X:     .type   .int
          XD:    .type   typedef .int
are considered different types for the purposes of .typecompare.
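
The comparison rule can be modelled by working each type back to its sequence of operators and built-in type, then comparing the sequences. The sketch below is an illustration under simplifying assumptions: types are written as space-separated tokens, only the final token may be a named type, and the typedef operator is kept in the sequence (which is what makes a typedef differ from its underlying type). The function names and the dictionary of named types are hypothetical.

```python
def expand(type_expr, named_types):
    """Work a type back to its operators and built-in type."""
    tokens = type_expr.split()
    last = tokens[-1]
    if last in named_types:
        # Replace the named type by its own definition and keep going.
        return tokens[:-1] + expand(named_types[last], named_types)
    return tokens


def type_compare(type1, type2, named_types):
    """Return 1 if the two types expand to the same sequence, else 0."""
    same = expand(type1, named_types) == expand(type2, named_types)
    return 1 if same else 0


# The named types from the two examples above.
named = {
    "X": ".int",
    "XPTR": "ptr X",
    "IPTR": "ptr .int",
    "XD": "typedef .int",
}
```

With these definitions, `type_compare("XPTR", "IPTR", named)` gives 1, because both expand to ptr followed by .int, and `type_compare("X", "XD", named)` gives 0, because the typedef operator remains in the expansion of XD.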

10.3 Defining Scopes: .scope

Use:

          name:  .scope   parent,options

Where:

name
is a name to be associated with the scope. Multiple names can also be specified, as in
        [X,Y,Z]: .scope  parent,options
parent
is the name of the parent scope (i.e. the scope that encloses this one). This will either be the name of a scope defined in a previous .scope directive or the built-in name .extern.

.extern stands for the external scope. This is the scope that contains all external data objects and functions. In C, this corresponds to extern data objects and non-static functions.

options
is a list of options for the scope being defined. Possible options are described in the LD Object Format Reference Manual. The possibilities include:
        root_scope=>>yes
which turns on the LF_ROOT_SCOPE flag,
        root_scope=>>no
which turns off the LF_ROOT_SCOPE flag,
        same_frame=>>yes
which turns on the LF_SAME_FRAME flag, and
        same_frame=>>no
which turns off the LF_SAME_FRAME flag.

Description:

The .scope pseudo-op defines a new scope in the source code. YAA generates appropriate LD_DEFVLIST and LD_SCOPEFLAGS directives to store this information in the object code.

The name space of scopes is separate from those of named types and normal variables. Thus you can have named types, normal variables, and scopes that all have the same name.

Defining a structure type automatically makes the structure into a scope. For example, the instruction

          X:   .type  .struct
creates a scope for X.

10.4 Defining Objects: .object

Use:

          name:   .object    options

Where:

name
is the name that is used for the object within the YAA source code.
options
are a list of options of the form keyword=>>value. These describe attributes of the object. Possible options are:
        type=>>name
which specifies a type for the object. name can be a built-in type name, a named type defined in a previous .type pseudo-op statement, or a type expression. If this option is not specified, YAA checks whether there is a named type with the same name as this object. If there is no such type, a default type of .void is used.
        class=>>keyword
specifies a storage class for the object. Possible keywords are:
        extern   -- external
        static   -- static
        auto     -- auto
        register -- register
        arg      -- function argument
        s_elem   -- structure element
        u_elem   -- union element
        e_elem   -- enumerated class element
        display  -- display
If no storage class option is specified, YAA chooses intelligent defaults depending on the context. We will discuss this in greater detail shortly.
        name=>>"string"
specifies a name to be put into the LD_DEFVAR directive. If this is different from the name that labels the .object pseudo-op, the debugging directive is given the name specified with name=>> and the YAA source code uses the name given in the label. If no name=>> option is specified, YAA uses the name given in the label.
        scope=>>parent
specifies the name of the scope that contains the object. This can be a scope named in a .scope directive, or the built-in scope .extern. If this option is not given, YAA uses the scope of the name given in the label, if the label is associated with a location that has a scope.

Description:

The .object pseudo-op describes the functions and variables of a program. It generates an LD_DEFVAR directive.

.object only generates an appropriate debugging directive. It does not generate space for the object being described. Thus you usually need a .space directive for an object as well as an .object directive generating the debugging directive.

Objects have a different name space from named types, scopes, and normal variables. In fact, there are good reasons to define objects with the same name as a named type, as we will see shortly.

Symbol Attributes

When a program issues an .object statement for a symbol that already exists in the program, the .object statement effectively associates attributes with the existing symbol. For example, consider

               .align     .alignof(.double)
          X:   .space     .sizeof(.double)
          X:   .object    type=>>.double
In this, the symbol X is given the alignment and space of a double object. The .object statement gives X the double type attribute, and generates an appropriate LD_DEFVAR directive for X.

Note that X could be used as a double value even if we didn't use .object to give it an explicit double type. However, the .object directive serves two purposes: it generates a debugging directive, and it provides type-checking information.

In a similar way, .object statements can associate other attributes with existing symbols: storage class, scope, names, etc.

Storage Class Defaults

If a storage class is not specified in the .object statement, YAA supplies a reasonable default. The default depends on the context. For example, consider the C declaration

          struct complex {
              float X;
              float Y;
          } Z1;
You could create the same sort of data object with these declarations:
          complex:   .template
          complex:   .type      .struct
          complex:   .object
          X:         .type      .float
          X:         .object
          X:         .space     .sizeof(X)/36
          Y:         .type      .float
          Y:         .object
          Y:         .space     .sizeof(Y)/36
                     .section
          Z1:        .type      complex
          Z1:        .space     .sizeof(complex)/36
          Z1:        .object
Note that each structure element is defined with a .type, .object, and .space statement. We divided the .sizeof results by 36, since .sizeof returns sizes in bits but we wanted to reserve space in terms of words. Since the .object statements do not explicitly specify a type, YAA looks for a type with the same name as the object. As a variation, here is the same thing, with types specified explicitly in the structure element definitions.
          complex:   .template
          complex:   .object    type=>>.struct
          X:         .object    type=>>.float
          X:         .space     .sizeof(X)/36
          Y:         .object    type=>>.float
          Y:         .space     .sizeof(Y)/36
                     .section
          Z1:        .space     .sizeof(complex)/36
          Z1:        .object    type=>>complex
.space directives are required to reserve space for the structure elements.

Because the definitions of X and Y occur inside a template with a struct type, they are automatically given the "structure element" storage class. If they were in a normal section and the storage class was not specified, YAA would give them the storage class extern if there is a SYMDEF or SYMREF for the symbol, and static otherwise.
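
The default storage class rule just described can be summarized as a small decision function. The following is a Python model for illustration only, not YAA's implementation; the function name and its two boolean parameters are hypothetical, and only the cases mentioned in the text are covered.

```python
def default_storage_class(in_struct_template, has_symdef_or_symref):
    """Model the storage class chosen for an .object statement that
    has no class option (all names here are hypothetical)."""
    if in_struct_template:
        # Defined inside a template that has a struct type.
        return "s_elem"
    # Defined in a normal section: external only if a SYMDEF or
    # SYMREF exists for the symbol, static otherwise.
    return "extern" if has_symdef_or_symref else "static"


print(default_storage_class(True, False))   # element X of complex
print(default_storage_class(False, True))   # symbol with a SYMDEF
print(default_storage_class(False, False))  # purely local symbol
```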

If a .space statement has a label that matches the label of a previous .object or .type statement, the default space reserved has the size and alignment of the given object. Thus we could have written

          X:         .object    type=>>.float
          X:         .space
since the .space statement automatically gives X the size and alignment of a float object.

When you omit the size in a .space statement, the type of the object must be known at the time of the .space statement, since .space needs to know how much space to reserve. Otherwise, you have freedom to arrange .type, .object, and .space statements in whatever order you choose.

Implicit Scopes

Code of the form

          NAME:  .template
          NAME:  .object   type=>>.struct
creates an implicit scope variable named NAME that holds the implicit scope created for the structure. The parent of the scope variable NAME is .extern by default. You can specify a different scope with a scope=>> option on the .object pseudo-op.

Note that the scope of the symbol NAME is the parent of the implicit scope created for the structure.

As another example, consider the code

          T:   .template
          T:   .scope     .extern
          A:   .object    type=>>.int
          A:   .space
The scope of the symbol A is not defined, so YAA looks for a scope variable named A. In the above code, there isn't one. YAA next looks to see if the section or template in which A is defined has a scope variable. This is the variable T. The scope of A will therefore be the parent of T. If you want the scope of A to be T itself, you must specify this explicitly with a scope=>>T option on the .object pseudo-op.

If the enclosing section or template does not have a scope variable either, the default scope of A is just .extern.

We emphasize that the scope of a symbol like A is the parent of the scope created by the .scope directive for the enclosing section or template.
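
The lookup order for a symbol's default scope can be modelled as follows. This is a Python sketch of the rules as stated in this section, not YAA's actual code; the function name, its parameters, and the dictionary representation of scope variables are assumptions made for illustration.

```python
def default_scope(symbol, enclosing, scope_parents, explicit=None):
    """Return the scope assigned to `symbol`.

    scope_parents maps each scope variable name to its parent scope.
    enclosing is the section or template that contains the symbol,
    or None. All names here are hypothetical.
    """
    if explicit is not None:
        # A scope option on the .object statement always wins.
        return explicit
    if symbol in scope_parents:
        # A scope variable with the symbol's own name: the symbol
        # lives in the parent of the scope it owns.
        return scope_parents[symbol]
    if enclosing in scope_parents:
        # Otherwise use the parent of the enclosing scope variable.
        return scope_parents[enclosing]
    # No scope information at all.
    return ".extern"


# T is a scope variable whose parent is a scope named "outer".
scopes = {"outer": ".extern", "T": "outer"}
print(default_scope("A", "T", scopes))                # outer
print(default_scope("A", "T", scopes, explicit="T"))  # T
print(default_scope("A", None, scopes))               # .extern
```

The second call shows why an explicit scope option is needed when the symbol should belong to T itself rather than to T's parent.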

Sample Union Declaration

The previous section showed a sample structure declaration. In this section, we show a sample union declaration. We will use the union

          union type {
              char c;
              int  i;
              double x;
          };
The corresponding YAA code is
          type:  .template  word
          type:  .object    type=>>.union

          c:     .object    type=>>.char
          c:     .space

                 .origin    type
          i:     .object    type=>>.int
          i:     .space

                 .origin    type
          x:     .object    type=>>.double
          x:     .space
Note that we used .origin statements to go back to the beginning of the type template section.

Defining Functions

As mentioned earlier, the .object statement normally generates LD_DEFVAR directives to define a variable. However, if YAA finds a scope variable defined with the same name as that on the .object statement, YAA assumes this must be a scope owned by the object and issues an LD_SCOPEVAR instead of LD_DEFVAR. For example,

          function:  .scope  .extern
          function:  .object type=>>(.int,.int) .int
is the usual way of declaring a function. It shows that the function has external scope and the prototype
          int function(int,int);
Since the scope of function is not defined on the .object statement, YAA looks for a scope variable of the same name. Since there is one, YAA issues an LD_SCOPEVAR directive for the .object statement instead of an LD_DEFVAR. The scope function refers to the scope owned by the symbol function, and the scope of the symbol function is the parent of the scope created by the .scope directive.

10.4.1 Defining Enumerated Classes: .element

Use:

          name:   .element   class_name,value

Where:

name
is the name to be given to the enumerated value.
class_name
is the name of a type or object which has the .enum type.
value
is an optional expression giving an integer value for the enumerated value. If no value is given explicitly, the first element in a class has the value 0, the next element has the value 1, and so on. These are the same rules as in the C language.

Description:

The .element pseudo-op defines an element of an enumerated class. For example, the C declaration

          enum sample {
              elem1,
              elem2 = 10
          };
would correspond to the YAA statements
          sample:   .template
          sample:   .object     type=>>.enum
          elem1:    .element    sample
          elem1:    .object
          elem2:    .element    sample,10
          elem2:    .object
Notice that the elements have both an .element directive to tell which class they belong to, and an .object directive to generate an LD_DEFVAR.
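
The default numbering of elements can be modelled with a short Python sketch. It assumes, following the C rule that the text refers to, that an element without an explicit value takes the value of the previous element plus one; the function name and the list-of-pairs input are hypothetical.

```python
def assign_enum_values(elements):
    """elements is a list of (name, value) pairs in declaration
    order; value is None when the .element statement gives none."""
    values = {}
    next_value = 0
    for name, value in elements:
        if value is None:
            value = next_value
        values[name] = value
        # Assumed from the C rule: numbering continues from the
        # most recently assigned value.
        next_value = value + 1
    return values


print(assign_enum_values([("elem1", None), ("elem2", 10)]))
# {'elem1': 0, 'elem2': 10}
```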

10.5 Line Numbers: .line

Use:

          .line   line_no,stat_type

Where:

line_no
is the line number to be assigned to the current program location. If this argument is omitted, YAA uses the current line number inside the YAA source file being assembled.
stat_type
is a symbol indicating the type of statement at the current line. Possible statement types are listed later on.

Description:

The .line pseudo-op shows how source code is broken into text lines. It generates an LD_LINETAB directive.

The possible values for the stat_type argument are:

        expr       -- expression
        assign     -- assignment
        break      -- break statement
        continue   -- continue statement
        goto       -- goto statement
        if         -- if statement
        endif      -- end of if
        else       -- else clause
        endelse    -- end of else-if
        while      -- while statement
        endwhile   -- end of while
        repeat     -- Pascal repeat statement
        do         -- do of do-while
        dowhile    -- while of do-while
        call       -- function call
        until      -- Pascal until
        forinit    -- initialization of for loop
        fortest    -- test of for loop
        forincr    -- increment of for loop
        endfor     -- end of for
        switch     -- switch statement
        endswitch  -- end of switch
        with       -- Pascal with
        endwith    -- end of with
        func       -- beginning of function def
        funcend    -- end of function definition
        return     -- return without expr
        returnexp  -- return with expression
        filename   -- file name
        innerscope -- beginning of inner scope
        write      -- Pascal write or writeln
        read       -- Pascal read or readln
        misc       -- miscellaneous
        restore    -- restore to previous file
        eofline    -- line number of end of file
        endstat    -- marks end of statement, if
                      ambiguous
        pushfile   -- pushes the current source
                      file onto a stack

10.6 Source File Control: .file

Use:

          .file  "name"

Where:

"name"
is a string expression giving a file name. If the "name" operand is omitted, YAA uses the current source file name.

Description:

The .file pseudo-op specifies the name of the source file being compiled. The pseudo-op generates an LC_FILENAME directive.

Appendix A: Conversion From GMAP to YAA

Existing GCOS-8 assembly programs will probably be written in GMAP. The fundamental tool for converting GMAP to YAA source is the FRED buffer program called gtoa (distributed as part of the YAA package). To execute the buffer, type

          fred gtoa gfile >>yfile
where gfile is a file containing GMAP source and yfile is a file that can receive the equivalent YAA source as output. GTOA can only convert one source file at a time.

In this appendix, we describe the steps that the GTOA program takes in converting GMAP code to YAA. This serves two purposes: to document how GTOA behaves; and to describe what programmers must do if they cannot use GTOA.

A.1 Conventions

YAA code is free format, meaning that instruction fields do not have to begin at any particular column on the line. However, we have found that you get good looking code by setting tab stops every four columns. This is the format used by GTOA.

GMAP uses the etc pseudo-op to allow long instructions to be broken into more than one line. Such tricks are not needed in YAA, since input lines can have any length and the backslash (\) can be used to continue a source code statement onto a new input line. If GTOA finds instructions that are longer than one line, it concatenates all the parts of the instruction into a single (long) line.

GMAP accepts symbols that begin with numeric characters, while YAA does not. Therefore GTOA changes the name of any such GMAP symbol by inserting underscores at the beginning, until the name is six characters long. For example, 1A becomes ____1A. If a name is longer than six characters, a single underscore is added.
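
The renaming rule can be sketched in Python as shown below. This models the rule as described and is not GTOA's actual code (GTOA is a FRED buffer program); the function name is hypothetical. The text does not say what happens to a name of exactly six characters, so the sketch assumes it also receives a single underscore, since it would otherwise still begin with a digit.

```python
def rename_symbol(name):
    """Make a GMAP symbol acceptable to YAA (hypothetical helper)."""
    if not name[:1].isdigit():
        # Symbols that do not start with a digit are left alone.
        return name
    if len(name) >= 6:
        # Six or more characters: add one underscore.
        # (The exactly-six case is an assumption.)
        return "_" + name
    # Shorter names are padded with underscores to six characters.
    return "_" * (6 - len(name)) + name


print(rename_symbol("1A"))        # ____1A
print(rename_symbol("2LONGNAME")) # _2LONGNAME
print(rename_symbol("ABC"))       # ABC
```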

Every section in converted GMAP code has the word offset mode, and word-addressing is used in all the code.

Since YAA demands that all registers be referenced with identifiers instead of numbers, instructions must be changed to use the proper names. For example, GTOA converts a GMAP instruction like

          lda  3,1,1
into
          lda  3,x1,ar1

A.2 Numeric Conversion

In YAA, all integers that begin with a leading 0 are considered to be written in octal. In GMAP, all integers are assumed to be in decimal unless prefixed by =o. Therefore GTOA must strip off any leading zeros on integers, and must replace the =o construct with a leading zero.

The exceptions to this rule are STCA, STCQ, STBA, and STBQ. In GMAP, the second field of such instructions is always assumed to be in octal. Therefore, the field must be given a leading zero in YAA code.
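
A Python sketch of these integer rules, applied to a single token, might look like the following. It is only a model of the rules above, not GTOA's implementation; the function name and the octal_field flag (set for fields that GMAP always reads as octal, such as the second field of STCA) are hypothetical.

```python
def convert_integer(token, octal_field=False):
    """Rewrite one GMAP integer token in YAA notation."""
    if token.lower().startswith("=o"):
        # The GMAP octal marker becomes a YAA leading zero.
        return "0" + token[2:]
    if token.isdigit():
        # Decimal in GMAP: leading zeros must go, or YAA would
        # read the number as octal.
        digits = token.lstrip("0") or "0"
        if octal_field and digits != "0":
            # Field that GMAP always treats as octal.
            return "0" + digits
        return digits
    # Anything else is not a plain integer; leave it unchanged.
    return token


print(convert_integer("0042"))                  # 42
print(convert_integer("=o17"))                  # 017
print(convert_integer("70", octal_field=True))  # 070
```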

Constructs of the form

          =ddd,dl
          =ddd,du
are converted by removing the '=' character. All other constructs of the form =ddd are handled by declaring an unnamed literal with the construct
          {.data ddd}
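
The two literal rules can be modelled with a pair of regular expressions. This Python sketch handles only the forms shown above, with ddd taken to be a run of decimal digits; the function name is hypothetical and the code is not taken from GTOA.

```python
import re


def convert_literal(operand):
    """Convert a GMAP =ddd operand to its YAA form."""
    m = re.fullmatch(r"=(\d+),(dl|du)", operand)
    if m:
        # =ddd,dl and =ddd,du simply lose the '=' character.
        return f"{m.group(1)},{m.group(2)}"
    m = re.fullmatch(r"=(\d+)", operand)
    if m:
        # Any other =ddd becomes an unnamed literal.
        return "{.data " + m.group(1) + "}"
    # Not a literal construct covered by this sketch.
    return operand


print(convert_literal("=5,dl"))  # 5,dl
print(convert_literal("=12"))    # {.data 12}
```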

A.3 Alignment

In GMAP, instructions may force a particular alignment by putting an 'e', 'o', or '8' marker in column 8. These are replaced with the pseudo-ops

          .align 2
          .align 2,1
          .align 8
respectively.
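
As a rough illustration, the marker replacement amounts to a table lookup on the character in column 8. The Python sketch below assumes that the marker may appear in either case and that the input is a raw GMAP source line; the names ALIGN_MARKERS and alignment_pseudo_op are hypothetical, and a real converter would also have to confirm that the character really is an alignment marker.

```python
# Column 8 marker -> YAA pseudo-op, as listed above.
ALIGN_MARKERS = {
    "e": ".align 2",
    "o": ".align 2,1",
    "8": ".align 8",
}


def alignment_pseudo_op(line):
    """Return the .align statement for a GMAP line, or None."""
    marker = line[7:8].lower()  # column 8 is index 7
    return ALIGN_MARKERS.get(marker)


print(alignment_pseudo_op("LABEL  e   lda  x"))  # .align 2
print(alignment_pseudo_op("short"))              # None
```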

A.4 Comments

GMAP has several kinds of comments. YAA has only one kind of comment: text following a '#' (outside of string or character constants). Thus GTOA must convert all GMAP comments into YAA comments.

A.5 Identification Pseudo-Ops

The GMAP cpr pseudo-ops allow a copyright to be specified in source code. These are converted to equivalent .copyright instructions, placed at the beginning of the YAA source.

The GMAP lbl pseudo-op is replaced by appropriate .module and .title pseudo-ops.

The first GMAP ttl pseudo-op is replaced by an appropriate .revision pseudo-op. GTOA simply prints diagnostic messages for any additional ttl pseudo-ops that are found.

The GMAP ttldat pseudo-op is not converted by GTOA -- the conversion program simply prints a diagnostic message about the use of the pseudo-op. Programmers may convert ttldat operations into appropriate YAA instructions using YAA's .time function.

A.6 DUP Statements

The GMAP DUP statement is replaced by an equivalent .while construction, provided that the argument of DUP is a constant expression. This loop uses a YAA variable named __dup_ to control repetition of instructions.

If the argument of DUP is not a constant expression, GTOA cannot figure out what the corresponding .while construction should be. Therefore it leaves the DUP unchanged.

A.7 GMAP IF Statements

The GTOA program only converts some of GMAP's IF statements. Specifically, it converts the statements

          ife    ifg    ifl    ine
provided they have numeric arguments (i.e. arguments that do not contain the single quote character). These are converted to appropriate .if statements in YAA. If the original GMAP IF statements are not properly nested, the resulting YAA statements will not have the correct nesting either. As a result, the YAA code will not have the same behaviour as the GMAP code.

A.8 Constant Data Definition

The GMAP ascii, uasci, and bci statements are all converted to appropriate YAA .data statements. For example,

          ascii 3,abcdefghijkl
becomes
          .data "abcdefghijkl"

A.8.1 VFD Instructions

GMAP VFD instructions are converted to various types of .data statements. In the simplest case,

          vfd  bits/value
becomes
          .data  bits:value

When the bits field is preceded by an 'o' (indicating an octal value), the form of the value must be changed to an octal constant by adding a leading zero. If the value is represented by an expression, the GMAP operators

          *    +    -    /
must be replaced by the YAA (bitwise) operators
          &    |    ^    ~
respectively.

When the bits field of a VFD instruction is preceded by 'a' or 'u', the conversion process is similar to that for converting ascii and uasci statements. GTOA expects that the number of bits specified in the VFD statement is a round number of ASCII characters, i.e. a multiple of 9. If it is not a multiple of 9, GTOA rounds up to the next multiple of 9 and assumes that many characters follow the '/' in the VFD.
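
The rounding rule is a ceiling division by 9, the number of bits in one ASCII character. A minimal Python sketch (the function name is hypothetical):

```python
def vfd_ascii_char_count(bits):
    """Number of 9-bit ASCII characters assumed to follow the '/'
    in a VFD field of the given width."""
    return -(-bits // 9)  # ceiling division


print(vfd_ascii_char_count(36))  # 4
print(vfd_ascii_char_count(30))  # 4 (30 rounds up to 36 bits)
```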

When the bits field of a VFD instruction is preceded by 'r' or 'h', the instruction specifies a BCD string. The steps for converting this to an appropriate .data statement are similar to the steps for converting a VFD specifying ASCII data.

A.8.2 Microps

GTOA converts MICROP instructions into special macros that perform equivalent operations. Macro definitions are obtained with

          .include "mop.a"
GTOA automatically adds this at the beginning of your source code if you use any MICROPs.

A.9 Stuffing Idioms

GMAP uses the constructs ** and *-* to represent operands that are not available at assembly time. These will presumably be filled in with values at execution time. GTOA converts such constructs to 0-0, which has the same effect.

The stuffing idiom *** in the opcode field is replaced by a zero pseudo-op.

A.10 Symdef and Symref Declaration

In GMAP, the format of SYMDEF and SYMREF declarations is

          symdef  name,name,...
          symref  name,name,...
This is converted to the YAA form
          [name,name,...]:  symdef
          [name,name,...]:  symref
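
The rewrite moves the operand list into a bracketed label. A Python sketch of this transformation is shown below; it assumes a line that contains only the pseudo-op and its operands (no label or comment), and the function name is hypothetical.

```python
def convert_symdecl(line):
    """Turn 'symdef a,b' into '[a,b]:  symdef' (same for symref)."""
    opcode, operands = line.split(None, 1)
    # Tolerate stray blanks around the commas.
    names = [name.strip() for name in operands.split(",")]
    return "[" + ",".join(names) + "]:  " + opcode.lower()


print(convert_symdecl("symdef  alpha,beta"))  # [alpha,beta]:  symdef
print(convert_symdecl("symref  gamma"))       # [gamma]:  symref
```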

A.11 End and Privit

The GMAP end and privit pseudo-ops are not needed in YAA. They are simply removed.

A.12 Use

The GMAP use pseudo-op is converted to an appropriate .section pseudo-op, or to an .origin pseudo-op that sets an origin at the end of an existing section (as given by the .highest function). By convention, the first instructions in the source code are considered to be part of a section called _blank_. Since there is no SYMDEF for _blank_, there is no conflict if several modules of the same program are converted with GTOA.

In order to implement the

          use previous
statement, GTOA maintains a YAA variable which is always associated with the highest point in the most recently defined section.

GMAP USE statements basically break up the program into chunks the same way that sections do. The difference is that USE statements are resolved by GMAP and can therefore be used to specify the order in which program chunks are arranged in memory. By default, YAA sections are ordered by the linker and therefore appear in no particular order.

By specifying parent sections judiciously, you can control the order of sections when necessary. Often, this means that you must define extra parent sections.

A.12.1 The .lit Pseudo-Op

The .LIT pseudo-op tells GMAP to position a literal pool at this point in the code being generated. The .pool pseudo-op in YAA is similar, but it is not quite the same. To see the difference, consider this situation:

          code
          .pool
          more code
If the .pool above were a GMAP .LIT, the literal pool would immediately be placed between the two chunks of code. With .pool, however, the literal pool is created as a child section of the current parent. This will not be placed between the two chunks of code, but will be relocatable within the parent section.

In addition, .pool only covers literals that occur after the .pool directive; .LIT only covers literals that occur before.

To get the effect of the GMAP .LIT, you must make both code chunks into separate sections inside the same parent as the literal pool. You can then arrange these child sections in the right order.

Because of this complication, GTOA does not convert GMAP .LIT pseudo-ops. It leaves these to be converted by hand.

A.12.2 GMAP Common Data Areas

GTOA converts GMAP common data areas into sections with the common type.

A.13 Variables

The GMAP pseudo-ops setb and set manipulate objects that can be represented by YAA variables. GTOA therefore generates .var pseudo-ops to declare such symbols as YAA variables.

GMAP bool statements are implemented in almost the same way. However, a symbol defined with bool cannot be assigned new values, so it will not be declared with a .var statement. This prevents the symbol from being given a new value once it has been .set.

In both bool and setb, the GMAP operators

          *    +    -    /
must be converted to the YAA operators
          &    |    ^    ~

A.14 Call and Return

GMAP's CALL and RETURN pseudo-ops are not converted. Appropriate diagnostic messages are printed.

A.15 ERLK

YAA has no equivalent of GMAP's ERLK instruction. On the other hand, ERLK is usually used in constructions of the form

          erlk
          org *-2

These are simply deleted -- they are not necessary in YAA code. Other ERLK instructions are not converted.

A.16 EIS Instructions

EIS instructions are complicated to convert. The most important point to note is that YAA does not remember the options field of an EIS instruction when analyzing the descriptors for the instruction. Thus GTOA must incorporate options into descriptors when necessary. In particular, GTOA must figure out when the length field of a descriptor should be a number and when it should be written as the name of an X register.

Note also that YAA uses a very different format to specify options in EIS instructions. See the description of mlr in Section 6.7 for further details.

A.17 Listing Control

GMAP has a number of instructions that control listings. These include

          list on
          list off
          list save
          list save,on
          list save,off
          list restore
and similar instructions with list replaced by pmc, pcc, and ref. If source code contains any of these, GTOA puts
          .include "gtoa.a"
at the beginning of your source file. This defines opcode macros list, pmc, pcc, and ref. In this way, a GMAP instruction like
          list off
becomes a call to the YAA macro list.

The macros reproduce the effects of the GMAP instructions by turning listing options on or off. The list below shows which options each macro controls.

          list   -- .l_source
          pmc    -- .l_macro
          pcc    -- .l_list
          ref    -- .l_crossref

All of the macros assume that you begin with the standard listing options. If you add .list pseudo-ops of your own to manipulate listing options, the macros may not work as expected. In addition, you should bear in mind that the macros are macros and will be expanded in the listing when the appropriate options are set.

A.18 Inhibits

The file gtoa.a mentioned in the previous section also includes the definition of a macro named inhib. This macro properly handles the GMAP instructions

          inhib on
          inhib off
            ...and so on
by generating a corresponding YAA .inhibit pseudo-op.

A.19 Miscellaneous Conversions

GMAP's oct and dec statements can be converted to .data statements in a straightforward way.

The GMAP date statement is replaced by

          .data .time()
Note that the time string format created by .time will be different from the GMAP date format.

The GMAP block statement is replaced by

          .section  common,data,word
An instruction of the form
          X: bfs     10
becomes
             .space  10
          X: .null
while
          X: bss
just becomes
          X:  .space

The following conversions are also made.

          GMAP                YAA
          null                .null
          org                 .origin
          even                .align 2
          odd                 .align 2,1
          eight               .align 8

A.20 B Environment Macros

GTOA attempts to convert the standard B environment macros to YAA constructs that have the same effect. This includes the following GMAP macros:

          advnce    aentry    argdef    aschar
          auto      backup    bend      bentry
          bmac      bretrn    cheap     chmac
          ifalse    iftrue    incall    scall
          sentry    switch

A.21 Unconverted Lines

In the output file produced by GTOA, lines that could not be converted are marked with a '<' character in column 1. These will cause errors if put through the YAA assembler. Programmers should look at each of these lines by hand to determine if and how they should be converted.