[Feature removal request] Disallow tokens after numbers without separation #987

Closed
aaaaaa123456789 opened this issue Mar 7, 2022 · 6 comments
Labels
enhancement (Typically new features; lesser priority than bugs), rgbasm (This affects RGBASM)

Comments

@aaaaaa123456789
Member

After the long discussion the other way about whether something like 123abc should be valid or not, the issue right now is that it's impossible to safely implement any feature that extends numeric syntax in any way, because 123abc parses as 123 followed by abc.
This is also confusing (unexpected by virtually everyone), almost certainly bug-inducing (consider operator precedence issues), and used by virtually nobody (only one use case found in our brief research, which could be equally or better served by separating the suffix from the number with a space, and would definitely be better served by user-defined functions).
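To make the behavior concrete, here is a minimal sketch of a toy lexer that splits input the way described above. It is not RGBASM's actual lexer; the names `TOKEN_RE` and `toy_lex` and the two token kinds are assumptions made purely for illustration.

```python
import re

# Toy token patterns (illustrative only): a run of digits is a NUMBER,
# a letter or underscore followed by word characters is an IDENT.
TOKEN_RE = re.compile(r"\s*(?:(?P<NUMBER>\d+)|(?P<IDENT>[A-Za-z_]\w*))")

def toy_lex(src):
    """Split src into (kind, text) pairs, one longest match at a time."""
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {src[pos]!r}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

print(toy_lex("123abc"))   # [('NUMBER', '123'), ('IDENT', 'abc')]
print(toy_lex("123 abc"))  # same token stream as above
```

In this sketch the space makes no difference to the token stream, which is why any future suffix syntax on numbers would change the meaning of existing input.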

While the space for expansion in terms of number syntax is large, and we could probably have a long debate about what's useful and what's not, all of that is blocked by this syntax.

Therefore, my request is that this parsing becomes deprecated in 0.5.x and removed in 0.6.0. That way we can begin talking about any features using this space after 0.6.0 comes out someday.

@Rangi42
Contributor

Rangi42 commented Mar 7, 2022

When or if we add a feature involving letters following numbers, we can just change how lexing works then, and have users add spaces. If we never do so, there's no need to change it.

That said, when I get back to #958 and add 12.34q8 syntax for fixed-point values not using the current default -Q precision, that will fix this issue.

@aaaaaa123456789
Member Author

> If we never do so, there's no need to change it.

It's a breaking change (as @ISSOtm raised on Saturday), and thus it should have an explicit deprecation/removal unless we know nobody is using it.

@Rangi42
Contributor

Rangi42 commented Mar 7, 2022

It's a breaking change on the same level as introducing a STRFMT function and breaking users' labels with that name. We don't need to delay the feature by a release while we add a deprecation warning for labels named "STRFMT".

Anyway, whether or not we add a deprecation step, should 123abc lex as a new token type?

@ISSOtm
Member

ISSOtm commented Mar 7, 2022

There's no need to extend numeric syntax right now, so this change is unnecessary. We can deprecate when the needs arise, but I don't see any downsides with the current situation.

@aaaaaa123456789
Member Author

> Anyway, whether or not we add a deprecation step, should 123abc lex as a new token type?

It should lex as a number (and then fail, like 2.2.2 would).
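One possible reading of this request, sketched as a toy function rather than RGBASM code: consume the digits, and if an identifier character follows immediately, report an error instead of starting a second token. The function name `lex_number_strict` and the error wording are hypothetical.

```python
import re

NUMBER_RE = re.compile(r"\d+")
IDENT_CHAR_RE = re.compile(r"[A-Za-z_]")

def lex_number_strict(src, pos=0):
    """Return (value, end) for the number at pos, rejecting glued-on suffixes."""
    m = NUMBER_RE.match(src, pos)
    if m is None:
        raise ValueError(f"no number at position {pos}")
    end = m.end()
    if end < len(src) and IDENT_CHAR_RE.match(src, end):
        # Hypothetical diagnostic; the wording is made up for this sketch.
        suffix = re.match(r"\w+", src[end:]).group()
        raise ValueError(f'invalid suffix "{suffix}" on number "{m.group()}"')
    return int(m.group()), end

print(lex_number_strict("123 abc"))  # (123, 3)
```

With this check in place, `123 abc` still yields the number 123, while `123abc` raises an error at lex time instead of producing two tokens.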

Rangi42 added the enhancement and rgbasm labels on Jul 9, 2022
@Rangi42
Contributor

Rangi42 commented Aug 6, 2024

2.2.2 lexes as a fixed-point number 2.2 followed by a local identifier .2. It can't possibly lex as a number and then fail, because it would have to have a numeric value to be lexed as one, with the exact source characters no longer mattering. (10, $A, and 0.00015 all lex as the same TOK_NUMBER with value 10.)
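The point that different spellings collapse to one value can be checked with a little arithmetic. The sketch below assumes 16 fractional bits (16.16 fixed point) and round-to-nearest scaling; the helper `literal_value` is hypothetical and covers only the three spellings mentioned above.

```python
FRACTION_BITS = 16  # assumption: 16 fractional bits, i.e. 16.16 fixed point

def literal_value(text):
    """Map a toy subset of literal spellings to the integer a lexer would keep."""
    if text.startswith("$"):   # hexadecimal, e.g. $A
        return int(text[1:], 16)
    if "." in text:            # fixed point: scale by 2**16, then round
        return round(float(text) * (1 << FRACTION_BITS))
    return int(text)           # plain decimal

values = [literal_value(s) for s in ("10", "$A", "0.00015")]
print(values)  # [10, 10, 10]
```

Here 0.00015 × 65536 ≈ 9.83, which rounds to 10, so once the value is computed nothing distinguishes the three source spellings.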

Big-name C compilers have special handling for this sort of input, but others act like rgbasm does, lexing two separate tokens and then getting a parse error.

Input:

int x = 123abc;
int y = 12.34pq;
int z = 12.34.56;

gcc handles it specially:

<source>:1:9: error: invalid suffix "abc" on integer constant
    1 | int x = 123abc;
      |         ^~~~~~
<source>:2:9: error: invalid suffix "pq" on floating constant
    2 | int y = 12.34pq;
      |         ^~~~~~~
<source>:3:9: error: too many decimal points in number
    3 | int z = 12.34.56;
      |         ^~~~~~~~

SDCC works like RGBASM:

<source>:1: syntax error: token -> 'abc' ; column 14
<source>:2: syntax error: token -> 'pq' ; column 15
<source>:3: syntax error: token -> '.56' ; column 16

I think having the GB assembler and the GB C compiler behave similarly here is fine.

Rangi42 closed this as not planned on Aug 7, 2024
3 participants