-
-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove lexbuffer in favor of sedlex #369
Remove lexbuffer in favor of sedlex #369
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 1 Ignored Deployment
|
w.i.p: unify lexers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Have you tried updating sedlex to latest?
Yes, in my switch v3.2 is installed (which is the latest in opam). |
Can you add it on the .opam file? |
Done in #375 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to merge this, but before that can you add a tests for an error of the parser?
I believe this will be very beneficial while we change the location and container_lum.
cba1f90
to
fcc8741
Compare
I guess this is resolved here fcc8741 |
5c049ed
to
bc043cd
Compare
a3a519b
to
c0150eb
Compare
c0150eb
to
e81ae0c
Compare
e81ae0c
to
3e40059
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
let escape = [%sedlex.regexp? | ||
unicode | ('\\', Compl('\r' | '\n' | '\012' | hex_digit)) | ||
]; | ||
let escape = [%sedlex.regexp? '\\']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this regex has changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other parts were redundant and caused issues while starting with escape chars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏼
let char = uchar_of_int(char_code); | ||
let _ = consume_whitespace(buf); | ||
char_code == 0 || is_surrogate(char_code) | ||
// U+FFFD is a character used as a substitute for an uninterpretable character from another encoding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where this U+FFFD is being treated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is referenced as a string on the next two lines. Though I am not so convinced with it.
| Error((loc, msg)) => | ||
let pos = loc.Css_types.loc_start; | ||
let curr_pos = pos.pos_cnum; | ||
let lnum = pos.pos_lnum; | ||
let pos_bol = pos.pos_bol; | ||
let err = | ||
Printf.sprintf( | ||
"%s on line %i at position %i", | ||
msg, | ||
lnum, | ||
curr_pos - pos_bol, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think is worth it add this level of detail on the error reporting by the ppx?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think it is for clarity.
Changes made:
Lex_buffer.re
because we no longer have to call from it, instead we rely onsedlex
-defined functions and a customlatin1
function