Remove lexbuffer in favor of sedlex #369

Merged: 21 commits, Nov 16, 2023

@lubegasimon (Contributor) commented Nov 3, 2023

Changes made:

  • Removed Lex_buffer.re because we no longer need to call into it; instead we rely on sedlex-defined functions and a custom latin1 function.
  • Created a custom latin1 function.

Owner

@davesnx davesnx left a comment

Looks good to me!

Have you tried updating sedlex to latest?

packages/parser/lib/util.re (outdated, resolved)
packages/parser/lib/Driver_.re (outdated, resolved)
@lubegasimon
Contributor Author

> Have you tried updating sedlex to latest?

Yes, in my switch v3.2 is installed (which is the latest in opam).

@davesnx
Owner

davesnx commented Nov 4, 2023

Can you add it to the .opam file?

@lubegasimon
Contributor Author

> Can you add it to the .opam file?

Done in #375

Owner

@davesnx davesnx left a comment

I would like to merge this, but before that, can you add a test for a parser error?

I believe this will be very beneficial while we change the location and container_lnum.

@lubegasimon lubegasimon force-pushed the remove-Lexbuffer-in-favor-of-sedlex branch from cba1f90 to fcc8741 Compare November 7, 2023 11:17
@lubegasimon
Contributor Author

> I would like to merge this, but before that, can you add a test for a parser error?
>
> I believe this will be very beneficial while we change the location and container_lnum.

I believe this is resolved in fcc8741.

@lubegasimon lubegasimon force-pushed the remove-Lexbuffer-in-favor-of-sedlex branch 2 times, most recently from 5c049ed to bc043cd Compare November 8, 2023 12:19
@lubegasimon lubegasimon force-pushed the remove-Lexbuffer-in-favor-of-sedlex branch from a3a519b to c0150eb Compare November 14, 2023 15:59
@lubegasimon lubegasimon force-pushed the remove-Lexbuffer-in-favor-of-sedlex branch from c0150eb to e81ae0c Compare November 14, 2023 18:31
@lubegasimon lubegasimon force-pushed the remove-Lexbuffer-in-favor-of-sedlex branch from e81ae0c to 3e40059 Compare November 16, 2023 12:21
Owner

@davesnx davesnx left a comment

Nice

- let escape = [%sedlex.regexp?
-   unicode | ('\\', Compl('\r' | '\n' | '\012' | hex_digit))
- ];
+ let escape = [%sedlex.regexp? '\\'];
Owner

Why has this regex changed?

Contributor Author

The other parts were redundant and caused issues when the input started with escape chars.

Owner

👍🏼

let char = uchar_of_int(char_code);
let _ = consume_whitespace(buf);
char_code == 0 || is_surrogate(char_code)
// U+FFFD is a character used as a substitute for an uninterpretable character from another encoding
Owner

Where is this U+FFFD being handled?

Contributor Author

It is referenced as a string on the next two lines, though I am not entirely convinced by it.
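For context, here is a minimal sketch of the substitution under discussion. It assumes the rule from the CSS Syntax specification for consuming an escaped code point: zero, a surrogate, or a value above U+10FFFF is replaced by U+FFFD. The names `replacement_code_point`, `max_code_point` and `sanitize_code_point` are hypothetical and do not come from this PR, and `is_surrogate` is only a stand-in for the project's helper of the same name.

```reason
/* Hypothetical helper names; this is not the code in this PR. */
let replacement_code_point = 0xFFFD;
let max_code_point = 0x10FFFF;

/* Surrogates occupy U+D800..U+DFFF (stand-in for the project's helper). */
let is_surrogate = (code: int) => code >= 0xD800 && code <= 0xDFFF;

/* Return the code point unchanged, or U+FFFD when it cannot be used. */
let sanitize_code_point = (code: int) =>
  if (code == 0 || is_surrogate(code) || code > max_code_point) {
    replacement_code_point;
  } else {
    code;
  };
```

Working on integer code points rather than on a string literal might make the replacement easier to test in isolation.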

Comment on lines +7 to +17
| Error((loc, msg)) =>
  let pos = loc.Css_types.loc_start;
  let curr_pos = pos.pos_cnum;
  let lnum = pos.pos_lnum;
  let pos_bol = pos.pos_bol;
  let err =
    Printf.sprintf(
      "%s on line %i at position %i",
      msg,
      lnum,
      curr_pos - pos_bol,
Owner

Do you think it is worth adding this level of detail to the error reporting done by the ppx?

Contributor Author

Yes, I think it is, for clarity.
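To illustrate what the extra detail buys, here is a small self-contained sketch of the column computation used in the snippet above. It assumes `loc_start` has the shape of the standard library's `Lexing.position`, which is used here as a stand-in for `Css_types`; `format_error` and `example` are hypothetical names that are not part of this PR.

```reason
/* Hypothetical helper; Lexing.position stands in for Css_types.loc_start. */
let format_error = (pos: Lexing.position, msg: string) => {
  /* pos_cnum is the absolute offset and pos_bol the offset of the line start,
     so their difference is the column within the line. */
  let column = pos.Lexing.pos_cnum - pos.Lexing.pos_bol;
  Printf.sprintf(
    "%s on line %i at position %i",
    msg,
    pos.Lexing.pos_lnum,
    column,
  );
};

let example =
  format_error(
    {Lexing.pos_fname: "", pos_lnum: 3, pos_bol: 20, pos_cnum: 27},
    "Parse error",
  );
```

With these made-up offsets, `example` is "Parse error on line 3 at position 7", which points at the offending column instead of only naming the failure.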

@davesnx davesnx merged commit 0c3deab into davesnx:main Nov 16, 2023
@lubegasimon lubegasimon deleted the remove-Lexbuffer-in-favor-of-sedlex branch December 6, 2023 21:04