Fix ,- parsing #1277

Open
AlienKevin opened this issue Apr 22, 2024 · 8 comments

@AlienKevin
Contributor

AlienKevin commented Apr 22, 2024

[Screenshot 2024-04-22 at 12:12:13 PM]

Currently, a space is required to separate the , and the -.

@disconcision
Member

@dm0n3y curious how the new system handles such cases. this is an awkward one in the current arrangement, as we have a notion of operator characters, where a (possibly user-defined soon) operator can consist of any run of those characters. there are a number of ways this categorization could be made more precise, but the case of characters which can be both prefix operators and parts of infix operators like this one seems pernicious.
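
To make the problem concrete, here is a minimal Python sketch of a lexer that treats any run of operator characters as one token. It is not the project's actual lexer: `OP_CHARS` and `lex` are hypothetical names, and the character set is only a guess at what counts as an operator character.

```python
# Assumption: an illustrative operator character set, not the project's real one.
OP_CHARS = set("+-*/<>=,|&!")

def lex(src):
    """Split src into tokens, greedily grouping runs of operator characters."""
    tokens = []
    i = 0
    while i < len(src):
        c = src[i]
        if c.isspace():
            i += 1
            continue
        j = i + 1
        if c in OP_CHARS:
            # Any run of operator characters becomes a single operator token.
            while j < len(src) and src[j] in OP_CHARS:
                j += 1
        elif c.isalnum():
            # Identifiers and numbers: a run of alphanumerics.
            while j < len(src) and src[j].isalnum():
                j += 1
        # Anything else (parens, brackets) is a single-character token.
        tokens.append(src[i:j])
        i = j
    return tokens

print(lex("(1,-2)"))   # the comma and minus fuse into one token
print(lex("(1, -2)"))  # a space keeps them apart
```

Under these assumptions the first call yields a single `,-` token, which is the behavior reported in this issue, while the second yields separate `,` and `-` tokens.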

cyrus- added the bug label Apr 22, 2024
@dm0n3y
Contributor

dm0n3y commented Apr 22, 2024

First thought is that comma is special in the same way parens/braces are and would not be included in the arbitrary operator token class

@disconcision
Member

@dm0n3y solves ,- but not e.g. *-
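
A rough Python sketch of that trade-off, assuming comma is moved out of the operator class and grouped with brackets as a delimiter. The token classes and the name `lex` are hypothetical and only for illustration.

```python
import re

# Assumption: the delimiter and operator character sets are illustrative only.
TOKEN = re.compile(r"""
    (?P<num>\d+)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<delim>[(),\[\]{}])
  | (?P<op>[-+*/<>=|&!]+)
  | (?P<space>\s+)
""", re.VERBOSE)

def lex(src):
    """Return (kind, text) pairs; comma is a delimiter, never part of an operator."""
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected character {src[pos]!r} at {pos}")
        if m.lastgroup != "space":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(lex("(1,-2)"))  # comma and minus are now separate tokens
print(lex("2*-3"))    # but *- is still one (unrecognized) operator token
```

With comma treated as a delimiter, `1,-2` splits as intended, but `2*-3` still produces a single `*-` operator token.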

@dm0n3y
Contributor

dm0n3y commented Apr 23, 2024

Hard to solve in general short of doing some more elaborate context-informed lexing. ,- lexing into an unrecognized operator is esp annoying though and worth specializing. I'm more ok with *- lexing into an unrecognized operator. OCaml makes the same distinction.
[image]

@cyrus-
Member

cyrus- commented Apr 23, 2024

could try to restrict infix operators to not end in a token that can also be used as a prefix operator?

@disconcision
Member

@cyrus- could work but would involve some slightly grody intermediate states, e.g. if "-" was an operator then it goes from being one operator to two back to one again. could say more restrictively that prefix operator characters can't be used as non-initial characters in infix ops.
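
The two rules can be compared with a small Python sketch that splits a single run of operator characters. Everything here is an assumption for illustration: `PREFIX_CHARS` contains only `-`, the function names are made up, and `*-*` is a hypothetical operator used to show the intermediate states while typing.

```python
# Assumption: '-' is the only character that doubles as a prefix operator.
PREFIX_CHARS = set("-")

def split_trailing(run):
    """Rule 1: an infix operator may not END in a prefix-operator character."""
    end = len(run)
    while end > 1 and run[end - 1] in PREFIX_CHARS:
        end -= 1
    # Each trailing prefix character becomes its own token.
    return [run[:end]] + list(run[end:])

def split_strict(run):
    """Rule 2: prefix-operator characters may only be the FIRST character."""
    for i in range(1, len(run)):
        if run[i] in PREFIX_CHARS:
            # Cut before the first non-initial prefix character, then re-split the rest.
            return [run[:i]] + split_strict(run[i:])
    return [run]

# Simulate typing the hypothetical operator *-* one character at a time.
for typed in ("*", "*-", "*-*"):
    print(typed, split_trailing(typed), split_strict(typed))
```

Under rule 1 the token count goes from one to two and back to one as the characters are typed. Under the stricter rule 2 the split made at `-` is never undone, at the cost of ruling out operators such as `*-*` entirely.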

i don't find the ocaml approach fully satisfying but the fact that they're doing it suggests it's at least annoying to do better

@disconcision
Member

@dm0n3y i feel like in principle there could be something analogous to your error-counting metric at the lexing level. an invalid token gets broken up if doing so results in a state with fewer total errors
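
One way this might look, as a Python sketch under strong simplifying assumptions: an "error" is just a piece that is not in a fixed table of known operators, and ties are broken in favor of fewer pieces. `KNOWN_OPS` and `best_split` are hypothetical, and a real metric would presumably also count errors in the surrounding parse rather than only within the token.

```python
from functools import lru_cache

# Assumption: a made-up table of recognized operators.
KNOWN_OPS = {"+", "-", "*", "/", ",", "==", "<=", "->"}

def best_split(run):
    """Split an operator run into pieces with the fewest unrecognized pieces."""
    @lru_cache(maxsize=None)
    def go(i):
        # Returns (error count, piece count, pieces) for the suffix run[i:].
        if i == len(run):
            return (0, 0, ())
        best = None
        for j in range(i + 1, len(run) + 1):
            piece = run[i:j]
            errs, n, rest = go(j)
            cand = (errs + (piece not in KNOWN_OPS), n + 1, (piece,) + rest)
            # Prefer fewer errors, then fewer pieces (keeps valid long operators whole).
            if best is None or cand[:2] < best[:2]:
                best = cand
        return best
    return list(go(0)[2])

print(best_split(",-"))  # splitting removes the error
print(best_split("->"))  # already valid, stays whole
print(best_split("<>"))  # no split helps, stays one unrecognized token
```

In this sketch a token is only broken up when that strictly lowers the error count, so `,-` and `*-` split while `->` and `==` stay whole.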

@dm0n3y
Contributor

dm0n3y commented Apr 23, 2024

Yeah ultimately I think there should be character-level molding, which is what I really meant by context-informed lexing above. I agree with @disconcision that the OCaml approach is not perfect, but it is the best bang for buck short of a full solution.
