-
Notifications
You must be signed in to change notification settings - Fork 40
Narsese Parser
Patrick Hammer edited this page Sep 19, 2024
·
2 revisions
This Narsese parser, instead of a general grammar based parsing approach, parses by first transforming the Narsese into S-expression format and then building a binary tree:
- First the Narsese is preprocessed, so that it can be tokenized with space separation.
- Then the token order is changed such that the operators in infix order appear in prefix order (the only case where token lookahead is required).
- Then a binary tree, encoded as an array is constructed from left to right in a single go through.
Compared to OpenNARS parser:
- The parser supports copulas to appear in infix order. It does not need (though supports) <> parentheses for statements, and does not need commas, so (a --> (b & c)) is fine, which I find more readable.
- The parser does not construct expensive objects for individual subterms.
Compared to ALANN parser:
- The parser is not automatically constructed from a grammar definition.
Compared to both:
- The parser encodes compound terms as an array, allowing for unmatched efficiency in processing compounds for inference.
- Only operators in infix order demand token lookahead, if they are in prefix order in the input, it is detected and there won't even be a lookahead. This means in this case, the complete parse happens in a single go-through, scanning the tokens from left to right while building the binary tree.
Average parsing speed for parsing
"<(<$sth --> (&,[furry,meowing],animal)> &| <a --> b>) =/> <$sth --> [good]>>"
tested on November 30 with i7 laptop:
2.7us in OpenNARS for Applications
15.4us in OpenNARS for Research
80us in ALANN
See https://github.com/opennars/OpenNARS-for-Applications/blob/master/src/Narsese.c for the current implementation.