Original Parser

version 1 by Ron Newcomb


  • Section - Noun Domain

    To decide which pattern union is the noun domain (first domain - an object) and (second domain - an object) under (context - a grammar token) (this is NounDomain):
        if trace 4, say " [bracket]NounDomain called at word [the parser's current word position][line break] seeking [the descriptor status]";
        now the number of words matched per object is 0;
        now the number of objects in the match list is 0;
        now the next word to parse's position is the parser's current word position;
        search the scope of the first domain and second domain under the context;
        if trace 4, say " [bracket]ND made [the number of objects in the match list] matches[close bracket][line break]";
        now the parser's current word position is the next word to parse's position + the number of words matched per object;
        [ If nothing worked at all, leave with the word marker skipped past the first unmatched word...]
        if the number of objects in the match list is zero:
            increment the parser's current word position;
            Decide on noun domain's no match;
        [ Suppose that there really were some words being parsed (i.e., we did not just infer). If so, and if there was only one match, it must be right and we return it...]
        if the next word to parse's position <= the number of words in the command:
            if the number of objects in the match list is 1:
                Decide on the 0th element of the match list as a successful match;
            [ ...now suppose that there was more typing to come, i.e. suppose that the user entered something beyond this noun. If nothing ought to follow, then there must be a mistake (unless what does follow is just a full stop, "and", or a comma). ]
            if the parser's current word position <= the number of words in the command:
                let the word be the next word;
                decrement the parser's current word position;
                unless the word is 'AND\THEN\BUT' or the word is the comma:
                    if the next token is the end of line token:
                        Decide on noun domain's no match;
        [ Now look for a good choice, if there's more than one choice...]
        now the number of match groups is 0;
        let the likely choice be noun domain's no match;
        if the number of objects in the match list is 1:
            now the likely choice is the 0th element of the match list as a successful match;
        if the number of objects in the match list is at least 2:
            let assume all objects indistinguishable be true;
            repeat through the match list of size the number of objects in the match list:
                if the chosen element is not indistinguishable from the 0th element of the match list:
                    we shouldn't assume all objects indistinguishable;
            if we still assume all objects indistinguishable, we needn't make inferences;
            now the likely choice is the result of adjudicating under the context;
            if the likely choice is the null pattern:
                Decide on noun domain's no match;
            if the likely choice is noun domain's bunch of objects:
                Decide on noun domain's bunch of objects;
        [ If the likely choice is not "no match" here, one of two things is happening: either
          (a) an inference has been successfully made that the likely choice is the object intended by the user's specification, or
          (b) the user finished typing some time ago, but we've decided on the likely choice because it's the only possible one.
          In either case we have to keep the pattern up to date, note that an inference has been made, and return. (Except that we don't note which of a pile of identical objects was chosen.) ]
        unless the likely choice is noun domain's no match:
            if we shouldn't make inferences, decide on the likely choice;
            if where inferring the pattern from is at the 0th position, now where inferring the pattern from is the understood command's current position;
            change (understood command's current position) element of the player's understood command to the likely choice;
            Decide on the likely choice;
        [ If we get here, there was no obvious choice of object to make. If in fact we've already gone past the end of the player's typing (which means the match list must contain every object in scope, regardless of its name), then it's foolish to give an enormous list to choose from - instead we go and ask a more suitable question...]
        if the next word to parse's position > the number of words in the command:
            Decide on the incomplete noun under the context;
        [ Otherwise, now we print up the question using the equivalence classes as worked out by Adjudicate() so as not to repeat ourselves on plural objects...]
        begin THE ASKING WHICH DO YOU MEAN ACTIVITY;
        if handling THE ASKING WHICH DO YOU MEAN ACTIVITY:
            let assume all objects people be true;
            repeat through the first item of each group:
                if the chosen element does not provide the property animate directly:
                    we shouldn't assume all objects people;
            if we still assume all objects people, issue the 45th response "Who do you mean, ";
            otherwise issue the 46th response "Which do you mean, ";
            repeat through the first item of each group:
                if we should say the group as A\AN\SOME, say "[a chosen element]";
                otherwise say "[the chosen element]";
                if the current group-together number < the number of match groups - 1, say ", ";
                if the current group-together number is the number of match groups - 1, say "[if the serial comma option is active and the number of match groups is not 2],[end if] or ";
            issue the 57th response "?[line break]";
        end THE ASKING WHICH DO YOU MEAN ACTIVITY;
        [ ...and get an answer: ]
        change the 1st word of the second parsed command to 'ALL' simply to enter the loop;
        while the 1st word of the second parsed command is 'ALL': [ another test-at-the-bottom loop ]
            fill the secondary input buffer with spaces starting at 2 for the maximum buffer size - 2 elements;
            let the answer be the number of words TYPED IN AT THE KEYBOARD into the secondary input buffer and the second parsed command;
            [ Take care of "all", because that does something too clever to do later on: ]
            if the 1st word of the second parsed command is 'ALL': [ "X HAT" Which? "ALL" Sorry, you.... ]
                if the context is either 'things' or 'things preferably held' or 'other things' or 'things inside':
                    repeat through the match list of size the number of objects in the match list:
                        add the chosen element to the multiple-object list, allowing duplicates;
                    decide on the noun domain's bunch of objects;
                issue the 47th response "Sorry, you can only have one item here. Which exactly?";
        [ Look for a comma, and interpret this as a fresh conversation command if so: ]
        if the 'comma' is listed in the second parsed command:
            copy the secondary input buffer into the player's input buffer;
        [ If the first word of the reply can be interpreted as a verb, then assume that the player has ignored the question and given a new command altogether. (This is one time when it's convenient that the directions are not themselves verbs - thus, "north" as a reply to "Which, the north or south door" is not treated as a fresh command but as an answer.) ]
        otherwise:
            let the first word be the 1st word of the second parsed command;
            [#Ifdef LanguageIsVerb; [ Non-English languages may place the verb in other positions. ]
            if the first word is a word unknown by the game:
                let the saved position be the parser's current word position;
                now the first word is LanguageIsVerb(buffer2, parse2, 1);
                now the parser's current word position is the saved position;
            #Endif; ! LanguageIsVerb]
            if the first word is not a word unknown by the game and the usages of the first word include being a verb and the first word cannot be a name or adjective:
                copy the secondary input buffer into the player's input buffer;
            otherwise:
                [ Now we insert the answer into the original typed command: TAKE HAT becomes TAKE RED HAT. ]
                replace (the insertion point at the next word to parse's position) with (the number of letters in the secondary input buffer) in letters from 1;
        now the parser's current word position is 1;
        [ consider the convert to subject–verb–object format rule; [ A hook for non-English player languages. ]]
        PARSE the player's input buffer into the player's parsed command;
        now the number of words in the command is the word count;
        now the player's command is the empty snippet lengthened by the word count;
        now the actor's scopewise location is the scope ceiling of the player;
        follow the AFTER READING A COMMAND rules;
        decide on the misunderstood command.
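
The punctuation of the "Which do you mean" question is easy to get wrong, so here is a small Python sketch of the same rule, offered as an illustration only and not as part of the extension. The function name `which_do_you_mean` and its parameters are hypothetical; each entry of `names` is assumed to be the already-articled name of the first item of one match group.

```python
def which_do_you_mean(names, all_people=False, serial_comma=False):
    """Build the disambiguation question from one display name per match group."""
    question = "Who do you mean, " if all_people else "Which do you mean, "
    last = len(names) - 1
    for index, name in enumerate(names):
        question += name
        if index < last - 1:
            # Every group before the second-to-last is followed by a comma.
            question += ", "
        elif index == last - 1:
            # The serial comma is printed only when the option is on
            # and there are not exactly two groups.
            if serial_comma and len(names) != 2:
                question += ","
            question += " or "
    return question + "?"
```

With two groups the serial comma is never printed, which matches the test on the number of match groups in the phrase above.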


    To decide which pattern union is the incomplete noun under (context - a grammar token) (this is NounDomain's Incomplete): [ can be reached by EXAMINE THE or even just EXAMINE ]
        if the context is 'someone':
            issue the 48th response "Whom do you want[if the person asked is not the player] [the person asked][end if] to [recap of command]?[line break]";
        otherwise:
            issue the 49th response "What do you want[if the person asked is not the player] [the person asked][end if] to [recap of command]?[line break]";
        fill the secondary input buffer with spaces starting at 2 for the maximum buffer size - 2 elements;
        let the answer be the number of words TYPED IN AT THE KEYBOARD into the secondary input buffer and the second parsed command;
        let the first word be the 1st word of the second parsed command;
        [#Ifdef LanguageIsVerb;
        if the first word is a word unknown by the game:
            let the saved position be the parser's current word position;
            now the first word is LanguageIsVerb(buffer2, parse2, 1);
            now the parser's current word position is the saved position;
        #Endif; ! LanguageIsVerb]
        [ Once again, if the reply looks like a command, give it to the parser to get on with and forget about the question ]
        if the first word is not a word unknown by the game and the usages of the first word include being a verb:
            copy the secondary input buffer into the player's input buffer;
            decide on the misunderstood command;
        [ ...but if we have a genuine answer, then:
         (1) we must glue in text suitable for anything that's been inferred. ]
        if where inferring the pattern from is not at the 0th position: [ reached by ASK ME; the FOR will be inferred]
            repeat with Nth running from where inferring the pattern from to the understood command's current position - 1:
                let this pattern be the Nth element of the player's understood command;
                if this pattern is the null pattern, next;
                if trace 5, say "[bracket]Gluing in inference with pattern code [this pattern as a debugging number][close bracket][line break]";
                let the inferred word be a word unknown by the game;
                if this pattern is currently an object:
                    [ Because object names are so complicated and prone to overlap, we'll set a pronoun to it, then use the pronoun. (This is imperfect, but it's very seldom needed anyway.)]
                    set pronouns from this pattern as an object;
                    if this pattern is listed as one of the antecedents in the language's pronoun list:
                        now the inferred word is the pronoun element;
                        if trace 5, say "[bracket]Using pronoun '[inferred word]'[close bracket][line break]";
                otherwise: [ it's an inferred preposition.]
                    now the inferred word is this pattern as an understood word;
                    if trace 5, say "[bracket]Using preposition '[inferred word]'[close bracket][line break]";
                unless the inferred word is a word unknown by the game:
                    append a space to the player's input buffer;
                    append the inferred word to the player's input buffer;
        [ (2) then glue in the newly-typed text onto the end.]
        append a space to the player's input buffer;
        append (the number of letters in the secondary input buffer) letters from the secondary input buffer to the player's input buffer;
        [ (3) we fill up the buffer with spaces, which is unnecessary, but may help incorrectly-written interpreters to cope.]
        fill the player's input buffer with spaces starting at (the number of letters in the player's input buffer + 1) for (the maximum buffer size - the number of letters in the player's input buffer) elements;
        decide on the misunderstood command.
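
The gluing steps of the incomplete-noun phrase above can be pictured with plain strings. This Python sketch is an assumption-laden illustration, not the extension's code: `glue_reply` is a hypothetical name, `None` stands in for a pattern that yields no usable word, and the `max_len` default of 120 is an invented stand-in for the maximum buffer size. The final padding with spaces (step 3) is not modelled.

```python
def glue_reply(command, inferred_words, reply, max_len=120):
    """Append inferred words, then the player's reply, to the original command."""
    parts = [command]
    for word in inferred_words:
        if word is not None:  # skip patterns that produced no word to glue in
            parts.append(word)
    parts.append(reply)
    # Truncate to the (hypothetical) buffer size rather than overflow it.
    return " ".join(parts)[:max_len]
```

For the ASK ME case mentioned in the comment, `glue_reply("ask me", ["for"], "sword")` yields a command that the parser can then re-read from the first word.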


    [ NounDomain domain1 domain2 context
        first_word i j k l answer_words marker;
    #Ifdef DEBUG;
    if (parser_trace >= 4) {
    print " [NounDomain called at word ", wn, "^"; ! ]
    print " ";
    if (indef_mode) {
    print "seeking indefinite object: ";
    if (indef_type & OTHER_BIT) print "other ";
    if (indef_type & MY_BIT) print "my ";
    if (indef_type & THAT_BIT) print "that ";
    if (indef_type & PLURAL_BIT) print "plural ";
    if (indef_type & LIT_BIT) print "lit ";
    if (indef_type & UNLIT_BIT) print "unlit ";
    if (indef_owner ~= 0) print "owner:", (name) indef_owner;
    new_line;
    print " number wanted: ";
    if (indef_wanted == INDEF_ALL_WANTED) print "all"; else print indef_wanted;
    new_line;
    print " most likely GNAs of names: ", indef_cases, "^";
    }
    else print "seeking definite object^";
    }
    #Endif; ! DEBUG

    match_length = 0; number_matched = 0; match_from = wn;

    SearchScope(domain1, domain2, context);

    #Ifdef DEBUG;
    if (parser_trace >= 4) print " [ND made ", number_matched, " matches]^";
    #Endif; ! DEBUG

    wn = match_from+match_length;

    ! If nothing worked at all, leave with the word marker skipped past the
    ! first unmatched word...

    if (number_matched == 0) { wn++; rfalse; }

    ! Suppose that there really were some words being parsed (i.e., we did
    ! not just infer). If so, and if there was only one match, it must be
    ! right and we return it...

    if (match_from <= num_words) {
    if (number_matched == 1) {
    i=match_list-->0;
    return i;
    }

    ! ...now suppose that there was more typing to come, i.e. suppose that
    ! the user entered something beyond this noun. If nothing ought to follow,
    ! then there must be a mistake, (unless what does follow is just a full
    ! stop, and or comma)

    if (wn <= num_words) {
    i = NextWord(); wn--;
    if (i ~= AND1__WD or AND2__WD or AND3__WD or comma_word
    or THEN1__WD or THEN2__WD or THEN3__WD
    or BUT1__WD or BUT2__WD or BUT3__WD) {
    if (lookahead == ENDIT_TOKEN) rfalse;
    }
    }
    }

    ! Now look for a good choice, if there's more than one choice...

    number_of_classes = 0;

    if (number_matched == 1) i = match_list-->0;
    if (number_matched > 1) {
         i = true;
         if (number_matched > 1) ! I removed this if statement because it does nothing.
         for (j=0 : j<number_matched-1 : j++)
                    if (Identical(match_list-->j, match_list-->(j+1)) == false)
                        i = false; ! "unless all elements are indistinguishable from each other", now i is false
            if (i) dont_infer = true; ! the meaning of dont_infer is flipped because it reads better
    i = Adjudicate(context);
    if (i == -1) rfalse;
    if (i == 1) rtrue; ! Adjudicate has made a multiple
    ! object, and we pass it on
    }

    ! If i is non-zero here, one of two things is happening: either
    ! (a) an inference has been successfully made that object i is
    ! the intended one from the user's specification, or
    ! (b) the user finished typing some time ago, but we've decided
    ! on i because it's the only possible choice.
    ! In either case we have to keep the pattern up to date,
    ! note that an inference has been made and return.
    ! (Except, we don't note which of a pile of identical objects.)

    if (i ~= 0) {
    if (dont_infer) return i; ! meaning of dont_infer toggled
    if (inferfrom == 0) inferfrom=pcount;
    pattern-->pcount = i;
    return i;
    }

    ! If we get here, there was no obvious choice of object to make. If in
    ! fact we've already gone past the end of the player's typing (which
    ! means the match list must contain every object in scope, regardless
    ! of its name), then it's foolish to give an enormous list to choose
    ! from - instead we go and ask a more suitable question...

    if (match_from > num_words) jump Incomplete;

    ! Now we print up the question, using the equivalence classes as worked
    ! out by Adjudicate() so as not to repeat ourselves on plural objects...

        BeginActivity(ASKING_WHICH_DO_YOU_MEAN_ACT);
        if (ForActivity(ASKING_WHICH_DO_YOU_MEAN_ACT)) jump SkipWhichQuestion;
        j = 1; marker = 0; ! "marker" becomes "ct_1" while "i" becomes a I7 global "the current group-together number"
        for (i=1 : i<=number_of_classes : i++) {
            while (((match_classes-->marker) ~= i) && ((match_classes-->marker) ~= -i))
                marker++;
            if (match_list-->marker hasnt animate) j = 0;
        }
        if (j) L__M(##Miscellany, 45); else L__M(##Miscellany, 46);

    j = number_of_classes; marker = 0;
    for (i=1 : i<=number_of_classes : i++) {
    while (((match_classes-->marker) ~= i) && ((match_classes-->marker) ~= -i)) marker++;
    k = match_list-->marker;

    if (match_classes-->marker > 0) print (the) k; else print (a) k;

    if (i < j-1) print (string) COMMA__TX;
    if (i == j-1) {
                #Ifdef SERIAL_COMMA;
                if (j ~= 2) print ",";
    #Endif; ! SERIAL_COMMA
    print (string) OR__TX;
    }
    }
    L__M(##Miscellany, 57);

        .SkipWhichQuestion; EndActivity(ASKING_WHICH_DO_YOU_MEAN_ACT);

    ! ...and get an answer:

    .WhichOne;
    #Ifdef TARGET_ZCODE;
    for (i=2 : i<INPUT_BUFFER_LEN : i++) buffer2->i = ' ';
    #Endif; ! TARGET_ZCODE
    answer_words=Keyboard(buffer2, parse2);

    ! Conveniently, parse2-->1 is the first word in both ZCODE and GLULX.
    first_word = (parse2-->1);

    ! Take care of "all", because that does something too clever here to do
    ! later on:

    if (first_word == ALL1__WD or ALL2__WD or ALL3__WD or ALL4__WD or ALL5__WD)
    {
    if (context == MULTI_TOKEN or MULTIHELD_TOKEN or MULTIEXCEPT_TOKEN or MULTIINSIDE_TOKEN)
        {
    l = multiple_object-->0;
    for (i=0 : i<number_matched && l+i<MATCH_LIST_WORDS : i++)
         { ! I don't check the max size in my version as I use the AddMulti function, which stops adding when it's full.
    k = match_list-->i;
    multiple_object-->(i+1+l) = k;
    }
    multiple_object-->0 = i+l;
    rtrue;
    }
    L__M(##Miscellany, 47);
    jump WhichOne;
    }

        ! Look for a comma, and interpret this as a fresh conversation command
        ! if so:

        for (i=1 : i<=answer_words : i++)
            if (WordFrom(i, parse2) == comma_word) {
    VM_CopyBuffer(buffer, buffer2);
    jump RECONSTRUCT_INPUT;
            }

    ! If the first word of the reply can be interpreted as a verb, then
    ! assume that the player has ignored the question and given a new
    ! command altogether.
    ! (This is one time when it's convenient that the directions are
    ! not themselves verbs - thus, "north" as a reply to "Which, the north
    ! or south door" is not treated as a fresh command but as an answer.)

    #Ifdef LanguageIsVerb;
    if (first_word == 0) {
    j = wn; first_word = LanguageIsVerb(buffer2, parse2, 1); wn = j;
    }
    #Endif; ! LanguageIsVerb
    if (first_word ~= 0) {
    j = first_word->#dict_par1;
    if ((0 ~= j&1) && ~~LanguageVerbMayBeName(first_word)) {
    VM_CopyBuffer(buffer, buffer2);
    jump RECONSTRUCT_INPUT;
    }
    }

    ! Now we insert the answer into the original typed command, as
    ! words additionally describing the same object
    ! (eg, > take red button
    ! Which one, ...
    ! > music
    ! becomes "take music red button". The parser will thus have three
    ! words to work from next time, not two.)

    #Ifdef TARGET_ZCODE;
    k = WordAddress(match_from) - buffer; l=buffer2->1+1;
    for (j=buffer + buffer->0 - 1 : j>=buffer+k+l : j-- ) j->0 = 0->(j-l); ! shift right original contents
    for (i=0 : i<l : i++ ) buffer->(k+i) = buffer2->(2+i); ! copy from 2nd buffer to the new hole in 1st
    buffer->(k+l-1) = ' '; ! ...plus a space, so the patched command isn't "musicred button" since a space wasn't typed.
    buffer->1 = buffer->1 + l; ! increase the size as well
    if (buffer->1 >= (buffer->0 - 1)) buffer->1 = buffer->0; ! cap the size at the maximum allowed size

    #Ifnot; ! TARGET_GLULX

    k = WordAddress(match_from) - buffer;
    l = (buffer2-->0) + 1;
    for (j=buffer+INPUT_BUFFER_LEN-1 : j>=buffer+k+l : j-- ) j->0 = j->(-l);
    for (i=0 : i<l : i++) buffer->(k+i) = buffer2->(WORDSIZE+i);
    buffer->(k+l-1) = ' ';
    buffer-->0 = buffer-->0 + l;
    if (buffer-->0 > (INPUT_BUFFER_LEN-WORDSIZE)) buffer-->0 = (INPUT_BUFFER_LEN-WORDSIZE);
    #Endif; ! TARGET_

    ! Having reconstructed the input, we warn the parser accordingly
    ! and get out.

        .RECONSTRUCT_INPUT;

        num_words = WordCount();
    wn = 1;
    #Ifdef LanguageToInformese;
    LanguageToInformese();
    ! Re-tokenise:
    VM_Tokenise(buffer,parse);
    #Endif; ! LanguageToInformese
        num_words = WordCount();
    players_command = 100 + WordCount();
    actors_location = ScopeCeiling(player);
        FollowRulebook(Activity_after_rulebooks-->READING_A_COMMAND_ACT, true);

    return REPARSE_CODE;

    !!!! NEW FUNCTION : INCOMPLETE NOUN DOMAIN. WHAT PARAMETERS DOES IT NEED?
    !!! context is needed.

    ! Now we come to the question asked when the input has run out
    ! and can't easily be guessed (eg, the player typed "take" and there
    ! were plenty of things which might have been meant).

    .Incomplete;

    if (context == CREATURE_TOKEN) L__M(##Miscellany, 48);
    else L__M(##Miscellany, 49);

    #Ifdef TARGET_ZCODE;
    for (i=2 : i<INPUT_BUFFER_LEN : i++) buffer2->i=' ';
    #Endif; ! TARGET_ZCODE
    answer_words = Keyboard(buffer2, parse2);

    first_word=(parse2-->1);
    #Ifdef LanguageIsVerb;
    if (first_word==0) {
    j = wn; first_word=LanguageIsVerb(buffer2, parse2, 1); wn = j;
    }
    #Endif; ! LanguageIsVerb

    ! Once again, if the reply looks like a command, give it to the
    ! parser to get on with and forget about the question...

    if (first_word ~= 0) {
    j = first_word->#dict_par1;
    if (0 ~= j&1) {
    VM_CopyBuffer(buffer, buffer2);
    return REPARSE_CODE;
    }
    }

    ! ...but if we have a genuine answer, then:
    !
    ! (1) we must glue in text suitable for anything that's been inferred.

    if (inferfrom ~= 0) {
    for (j=inferfrom : j<pcount : j++) {
    if (pattern-->j == PATTERN_NULL) continue;
    #Ifdef TARGET_ZCODE;
    i = 2+buffer->1; (buffer->1)++; buffer->(i++) = ' ';
    #Ifnot; ! TARGET_GLULX
    i = WORDSIZE + buffer-->0;
    (buffer-->0)++; buffer->(i++) = ' ';
    #Endif; ! TARGET_

    #Ifdef DEBUG;
    if (parser_trace >= 5)
    print "[Gluing in inference with pattern code ", pattern-->j, "]^";
    #Endif; ! DEBUG

    ! Conveniently, parse2-->1 is the first word in both ZCODE and GLULX.

    parse2-->1 = 0;

    ! An inferred object. Best we can do is glue in a pronoun.
    ! (This is imperfect, but it's very seldom needed anyway.)

    if (pattern-->j >= 2 && pattern-->j < REPARSE_CODE) {
    PronounNotice(pattern-->j);
    for (k=1 : k<=LanguagePronouns-->0 : k=k+3)
    if (pattern-->j == LanguagePronouns-->(k+2)) {
    parse2-->1 = LanguagePronouns-->k;
    #Ifdef DEBUG;
    if (parser_trace >= 5)
    print "[Using pronoun '", (address) parse2-->1, "']^";
    #Endif; ! DEBUG
    break;
    }
    }
    else {
    ! An inferred preposition.
    parse2-->1 = VM_NumberToDictionaryAddress(pattern-->j - REPARSE_CODE);
    #Ifdef DEBUG;
    if (parser_trace >= 5)
    print "[Using preposition '", (address) parse2-->1, "']^";
    #Endif; ! DEBUG
    }

    ! parse2-->1 now holds the dictionary address of the word to glue in.

    if (parse2-->1 ~= 0) {
    k = buffer + i;
    #Ifdef TARGET_ZCODE;
    @output_stream 3 k;
    print (address) parse2-->1;
    @output_stream -3;
    k = k-->0;
    for (l=i : l<i+k : l++) buffer->l = buffer->(l+2);
    i = i + k; buffer->1 = i-2;
    #Ifnot; ! TARGET_GLULX
    k = Glulx_PrintAnyToArray(buffer+i, INPUT_BUFFER_LEN-i, parse2-->1);
    i = i + k; buffer-->0 = i - WORDSIZE;
    #Endif; ! TARGET_
    }
    }
    }

    ! (2) we must glue the newly-typed text onto the end.

    #Ifdef TARGET_ZCODE;
    i = 2+buffer->1; (buffer->1)++; buffer->(i++) = ' ';
    for (j=0 : j<buffer2->1 : i++,j++) {
    buffer->i = buffer2->(j+2);
    (buffer->1)++;
    if (buffer->1 == INPUT_BUFFER_LEN) break;
    }
    #Ifnot; ! TARGET_GLULX
    i = WORDSIZE + buffer-->0;
    (buffer-->0)++; buffer->(i++) = ' ';
    for (j=0 : j<buffer2-->0 : i++,j++) {
    buffer->i = buffer2->(j+WORDSIZE);
    (buffer-->0)++;
    if (buffer-->0 == INPUT_BUFFER_LEN) break;
    }
    #Endif; ! TARGET_

    ! (3) we fill up the buffer with spaces, which is unnecessary, but may
    ! help incorrectly-written interpreters to cope.

    #Ifdef TARGET_ZCODE;
    for (: i<INPUT_BUFFER_LEN : i++) buffer->i = ' ';
    #Endif; ! TARGET_ZCODE

    return REPARSE_CODE;

    ]
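
The buffer arithmetic that inserts the player's answer into the original command is hard to follow byte by byte. A word-level Python sketch of the same effect follows; it is an illustration under simplifying assumptions (words separated by single spaces, no buffer size limit), and `insert_answer` is a hypothetical name rather than anything the extension defines.

```python
def insert_answer(command, word_index, answer):
    """Insert `answer` before the 1-based word at `word_index` in `command`."""
    words = command.split()
    # Slice assignment opens a gap at the match position and fills it,
    # which plays the role of the shift-right-then-copy loops.
    words[word_index - 1:word_index - 1] = answer.split()
    return " ".join(words)
```

Here `insert_answer("take red button", 2, "music")` gives "take music red button", the example used in the comments, so the parser has three words to work from on the next pass.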