-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Semantic actions on layout terminals are delayed #52
Comments
As a general rule, semantic actions are run when a terminal is consumed by the parser, rather than when it is originally scanned (potentially as yet-unconsumed lookahead). Since all the layout preceding a terminal is scanned at the same time as the terminal, the parser sees it all as one unit, and they are all considered "consumed" at the same time as the terminal (i.e., when the terminal is shifted). |
Would it be reasonable to consider adding a second type of semantic action for terminals that is run as soon as the terminal is scanned? This is the behavior that we want for recognizing C preprocessor tags to use them in disambiguation. As I mentioned we can achieve this in a hacky way with a bogus lexical ambiguity and disambiguation function, but this is rather suboptimal. |
That sounds like it would just be sweeping the hackiness under the rug -- adding an extra wrinkle to the tool to straighten out the representation of this grammar. We had to add disambiguation functions because Copper could not parse C without them; but in this case, since Copper can do what needs to be done, it does not seem worth adding that kind of complication. |
@schwerdf - do you think the current behavior is the correct one? I think what Copper does now makes sense given our early understanding of things, but if shifting a token, or dropping a layout one, has a semantic action that affects how the following characters are processed it seems that we should be executing these actions as the tokens are processed instead of waiting until a batch is given to the parser. Or am I missing something here? |
On terminals in traditional parsing frameworks, one has scanner actions (applied when the terminal is matched in the scanner) and parser actions (applied when the terminal is shifted in the parser). In Copper, disambiguation functions have to be scanner actions, but the semantic actions on non-layout terminals have to be parser actions, since they return result values that go into the parse tree. This leaves actions on layout terminals (normally invisible to the parser) in a gray area. Historically, we had no choice but to run these actions as a group at the next shift -- scans could potentially be re-run after a reduce action, and ambiguities were allowed between layout terminals. There are still three reasons to do it this way: (1) to prevent complications when re-running the scan in the event of a syntax error; (2) to avoid tightening the screws on the behavior of a compatible parser, by providing tighter guarantees of the relative times at which these actions will be run; and (3) to maintain the former behavior that may be relied on by existing grammars. If these concerns are surmountable, then it should not be very complicated to rearrange things to run the actions on layout terminals when they are matched. |
Thanks, August. Let me look into these concerns a bit. We at least have hacky solution to the immediate problem. |
On a closer examination of the parser engine class, scratch concern (1): the "disjoint scan" after an error used to re-scan preceding layout as well, but now starts at the exact point of failure, so the layout is never re-scanned. |
That is good. Thanks. |
So I found a reason why we don't want to simply change the behavior of semantic actions on layout terminals to be scanner actions instead of parser actions - we were using the action block on CPP tag terminals to update the location information, which should happen in-order with the actions of other non-layout terminals to ensure they receive the correct location (moving this entire action block into a disambiguation function introduced a bug with location tracking - fixed in melt-umn/ableC#171.) So it seems for this case we really do want to be able to define seperate scanner and parser actions for terminals. |
The change as I envisioned it would not alter the order in which the actions are run relative to each other; the only noticeable change would be relative to disambiguation functions. |
For context, this problem showed up in ableC, where we are trying to use the C preprocessor tags to control the disambiguation of identifiers, based on whether we are currently parsing a system header file. These preprocessor tags are recognized as a layout terminal, with a semantic action that sets a parser attribute which is later referenced in a disambiguation function.
The following is a simple example in Silver:
When run on the input
#a
, this printsThe disambiguation function is being called before the semantic action for '#' has been run, thus giving the wrong result.
As a (very hacky) workaround, one can introduce a second, ambiguous layout terminal and use a disambiguation function to update the parser attribute in the disambiguation function:
This gives the desired output of
This workaround does work in the context of ableC but is something that we would greatly wish to avoid.
Since we evidently know which layout terminal is going to be shifted as soon as the disambiguation function is run, why aren't the semantic actions executed immediately, before attempting to disambiguate the next terminal? I vaguely remember there being a reason for this, but I don't remember what it was.
The text was updated successfully, but these errors were encountered: