Original Parser

version 1 by Ron Newcomb


  • Chapter - Unpacking Grammar Lines

    [Grammar lines are sequences of tokens stored in an array built into the
    story file. Their packed format differs between virtual machines, so the
    following code unpacks each line into larger but more convenient arrays
    which are VM-independent.]

    To decide which number is the size of the grammar line's header: (- (WORDSIZE/2 + 1) -).
    To decide which number is the size of a single token in the grammar line: (- (WORDSIZE + 1) -).

    [To decide which 0-based index based rulebook producing structs is the token string:
    (-line_address-).]


    To decide which token type is the token type at (packed data - 0-based index based rulebook producing grammar tokens): (- ({packed data}->0 & $$1111) -). [byte]

    To decide which grammar token is the grammar token at (packed data - 0-based index based rulebook producing grammar tokens): (- ({packed data}+1)-->0 -).

    To decide which grammar token is the token properties at (packed data - 0-based index based rulebook producing grammar tokens): (- ({packed data}) -). [the token's address itself, not a byte read from it]

    To decide if (packed data - 0-based index based rulebook producing grammar tokens) is not at the end of the grammar line:
    (- ({packed data}->0 ~= ENDIT_TOKEN) -).



    To decide which 0-based index based rulebook producing grammar tokens is the verb's next understand-as line after (line address - a 0-based index based rulebook producing grammar tokens) (this is UnpackGrammarLine):
        unpack the action to be and whether the action's nouns swapped places from the line address;
        advance the line address by the size of the grammar line's header;
        let this be 0;
        now the number of parameters for this line is 0;
        while the line address is not at the end of the grammar line:
            change this element of the grammar line tokens to the token properties at the line address;
            change this element of the grammar line types to the token type at the line address;
            change this element of the grammar line data to the grammar token at the line address;
            if this element of the grammar line types is not '<understood word>':
                increment the number of parameters for this line;
            increment this;
            advance the line address by the size of a single token in the grammar line;
        repeat with unused running from this to 31:
            change the unused element of the grammar line tokens to the end of line token;
            change the unused element of the grammar line types to '<grammar token>';
            change the unused element of the grammar line data to the end of line token;
        advance the line address by 1;
    [ repeat with unused running from 0 to 31:
            say "[unused]: [the unused element of the grammar line tokens as a debugging number] [the unused element of the grammar line types] [if the unused element of the grammar line types is '<understood word>'][the unused element of the grammar line data as an understood word][else][the unused element of the grammar line data][end if][line break][run paragraph on]";]
        decide on the line address.

    Include(-
    [ AnalyseToken token;
        if (token == ENDIT_TOKEN) {
            found_ttype = ELEMENTARY_TT; ! token type
            found_tdata = ENDIT_TOKEN; ! grammar token
            return;
        }
        found_ttype = (token->0) & $$1111; ! low four bits of the token's first byte
        found_tdata = (token+1)-->0; ! the word that follows that byte
    ];
    -).


    To split (t - a [0-based index based rulebook producing] grammar tokens) into the current grammar token & current token type: (- AnalyseToken({t}); -).
    [ if t is not at the end of the grammar line:
            now the current token type is the token type at t;
            now the current grammar token is the grammar token at t;
        otherwise:
            now the current token type is '<grammar token>';
            now the current grammar token is the end of line token.]

    [ UnpackGrammarLine line_address i size;
        for (i=0 : i<32 : i++) {
            line_token-->i = ENDIT_TOKEN;
            line_ttype-->i = ELEMENTARY_TT;
            line_tdata-->i = ENDIT_TOKEN;
        }
        #Ifdef TARGET_ZCODE;
        action_to_be = 256*(line_address->0) + line_address->1;
        action_reversed = ((action_to_be & $400) ~= 0);
        action_to_be = action_to_be & $3ff;
        line_address--;
        size = 3;
        #Ifnot; ! GLULX
        @aloads line_address 0 action_to_be;
        action_reversed = (((line_address->2) & 1) ~= 0);
        line_address = line_address - 2;
        size = 5;
        #Endif;
        params_wanted = 0;
        for (i=0 : : i++) {
            line_address = line_address + size;
            if (line_address->0 == ENDIT_TOKEN) break;
            line_token-->i = line_address;
            AnalyseToken(line_address);
            if (found_ttype ~= PREPOSITION_TT) params_wanted++;
            line_ttype-->i = found_ttype;
            line_tdata-->i = found_tdata;
        }
        return line_address + 1;
    ]
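
    [The following is not part of the extension. It is a minimal Python model
    of the unpacking above, offered as a sketch for readers who want to see the
    packed byte layout concretely. It assumes the usual Inform 6 constant values
    (ENDIT_TOKEN = 15, ELEMENTARY_TT = 1, PREPOSITION_TT = 2), big-endian words,
    and the layout implied by the code above: a header of WORDSIZE/2 + 1 bytes
    followed by tokens of WORDSIZE + 1 bytes each. The names unpack_grammar_line,
    ZCODE_LINE and GLULX_LINE are hypothetical.]

```python
# Assumed constant values (standard Inform 6 numbering; not stated in the source).
ENDIT_TOKEN = 15
ELEMENTARY_TT = 1
PREPOSITION_TT = 2
MAX_TOKENS = 32  # the unpacked arrays above have slots 0 to 31


def unpack_grammar_line(memory, line_address, wordsize):
    """Model of UnpackGrammarLine for wordsize 2 (Z-machine) or 4 (Glulx)."""
    header_size = wordsize // 2 + 1   # 2 bytes on Z-machine, 3 on Glulx
    token_size = wordsize + 1         # 1 type byte plus one word of data

    # Both VMs keep the action number in the first two bytes, big-endian.
    action = int.from_bytes(memory[line_address:line_address + 2], "big")
    if wordsize == 2:
        # Z-machine: the "nouns swapped" flag shares the action word.
        is_reversed = (action & 0x400) != 0
        action &= 0x3FF
    else:
        # Glulx: the flag is bit 0 of a separate third header byte.
        is_reversed = (memory[line_address + 2] & 1) != 0

    # Unused slots are pre-filled with end-of-line values, as in the original.
    types = [ELEMENTARY_TT] * MAX_TOKENS
    data = [ENDIT_TOKEN] * MAX_TOKENS
    params_wanted = 0
    count = 0
    address = line_address + header_size
    while memory[address] != ENDIT_TOKEN:
        types[count] = memory[address] & 0b1111
        data[count] = int.from_bytes(
            memory[address + 1:address + 1 + wordsize], "big")
        if types[count] != PREPOSITION_TT:
            params_wanted += 1
        count += 1
        address += token_size

    return {
        "action": action,
        "reversed": is_reversed,
        "types": types,
        "data": data,
        "params_wanted": params_wanted,
        "next_line": address + 1,  # step over the end-of-line byte
    }


# Hypothetical sample lines: action 5 with nouns swapped, one preposition
# token (data $1234, with an extra flag bit set in its type byte), then one
# elementary token (data 0), then the end-of-line byte.
ZCODE_LINE = bytes([0x04, 0x05,
                    0x22, 0x12, 0x34,
                    0x01, 0x00, 0x00,
                    0x0F])
GLULX_LINE = bytes([0x00, 0x05, 0x01,
                    0x22, 0x00, 0x00, 0x12, 0x34,
                    0x01, 0x00, 0x00, 0x00, 0x00,
                    0x0F])
```

    [Called as unpack_grammar_line(ZCODE_LINE, 0, 2) and
    unpack_grammar_line(GLULX_LINE, 0, 4), the two sample lines should decode to
    the same action, flag, token types and token data. Only the address of the
    next line differs, which is the VM-independence the unpacked arrays are
    meant to provide.]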