Skip to content

Narsese Parser

Patrick Hammer edited this page Sep 19, 2024 · 2 revisions

Approach

This Narsese parser, instead of a general grammar based parsing approach, parses by first transforming the Narsese into S-expression format and then building a binary tree:

  1. First the Narsese is preprocessed, so that it can be tokenized with space separation.
  2. Then the token order is changed such that the operators in infix order appear in prefix order (the only case where token lookahead is required).
  3. Then a binary tree, encoded as an array is constructed from left to right in a single go through.

Compared to OpenNARS parser:

  • The parser supports copulas to appear in infix order. It does not need (though supports) <> parentheses for statements, and does not need commas, so (a --> (b & c)) is fine, which I find more readable.
  • The parser does not construct expensive objects for individual subterms.

Compared to ALANN parser:

  • The parser is not automatically constructed from a grammar definition.

Compared to both:

  • The parser encodes compound terms as an array, allowing for unmatched efficiency in processing compounds for inference.
  • Only operators in infix order demand token lookahead, if they are in prefix order in the input, it is detected and there won't even be a lookahead. This means in this case, the complete parse happens in a single go-through, scanning the tokens from left to right while building the binary tree.

Performance

Average parsing speed for parsing

"<(<$sth --> (&,[furry,meowing],animal)> &| <a --> b>) =/> <$sth --> [good]>>"

tested on November 30 with i7 laptop:

2.7us in OpenNARS for Applications

15.4us in OpenNARS for Research

80us in ALANN

Implementation

See https://github.com/opennars/OpenNARS-for-Applications/blob/master/src/Narsese.c for the current implementation.