Unicode code-point escape identifiers #92
Comments
AFAIK, you are correct. Keep in mind that an identifier can have more than one Unicode code point escaped character, so the only possible flag would be to say, "somewhere in this identifier, there was at least one extended escape sequence," which also isn't enough information to get back to the raw representation.
I don't think that is a goal of ESTree. Rather, you can return a representation of the AST as code, but not necessarily the representation from which the AST was generated. Since you could use either the actual character or the escape sequence, it would be up to your serializer to evaluate the identifier and determine how it should best be represented in the output.
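As a minimal sketch of that serializer-side decision: the helper name `serializeIdentifierName` and its `asciiOnly` option are hypothetical, not part of ESTree or of any particular code generator. It re-escapes every non-ASCII code point with the ES2015 `\u{...}` form, regardless of how the identifier was originally written.

```javascript
// Hypothetical helper: decide how an Identifier's name is written out.
// With asciiOnly, every non-ASCII code point becomes a \u{...} escape;
// otherwise the decoded name is emitted unchanged.
function serializeIdentifierName(name, { asciiOnly = false } = {}) {
  if (!asciiOnly) return name;
  let out = "";
  // for...of iterates by code point, so astral characters stay whole.
  for (const ch of name) {
    const cp = ch.codePointAt(0);
    out += cp < 0x80 ? ch : "\\u{" + cp.toString(16).toUpperCase() + "}";
  }
  return out;
}
```

Note that this picks one output form for the whole program; it cannot reproduce a source file that mixes escaped and unescaped spellings of the same identifier.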
Isn't this a specialized subset of #41, to be addressed by a CST plan? After all,
I suppose it is. I was just trying to understand why I can get this out of acorn for the string literal form:
{
  "start": 0,
  "value": "𠮷",
  "raw": "'\\u{20BB7}'",
  "type": "Literal",
  "end": 11
}
But for the identifier form I only get:
{
  "start": 0,
  "name": "𠮷",
  "type": "Identifier",
  "end": 9
}
Seems like a strange/inconsistent limitation. If CST is my only option here, that just adds more weight to why I really want to figure that out.
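Since both nodes carry `start` and `end` offsets, one possible workaround is to compare the original source slice with the decoded `name`. This is only a sketch: `identifierWasEscaped` is a hypothetical helper, and it assumes the offsets index into the same source string that was handed to the parser, as in the output shown here.

```javascript
// Hypothetical helper: true when the identifier's source text differs from
// its decoded name, i.e. at least one escape sequence was used.
function identifierWasEscaped(source, node) {
  return source.slice(node.start, node.end) !== node.name;
}

// Node shape copied from the Identifier output shown in this comment.
const source = "\\u{20BB7}"; // the nine characters \u{20BB7}
const node = { type: "Identifier", start: 0, end: 9, name: "𠮷" };
identifierWasEscaped(source, node); // true
```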
I would characterize "raw" as a
Indeed.
seems in most ways equivalent to:
IIUC, the tree (at least as I see it with acorn) will take the former of these two and represent it as if it'd originally been the latter, even in the raw representation. Is that correct?
Unfortunately, it is possible to have an engine that supports the latter and not the former (I have it installed right now: Chrome 43). And therein lies my problem. I am trying to parse an ES6 file to see if it uses a Unicode code-point escape form (the former) for the identifier, because that requires a different test than the symbol form itself (the latter).
Am I understanding this correctly? Is there no way via the estree format to tell the difference or to determine if the former was used? Even a flag on the Identifier node to indicate it was originally in the escaped form would be helpful. Is that possible?
On a similar note, if a tool wanted to parse a program and then recreate it exactly as written without changing this identifier, how could you go back to the former from the latter represented in the tree?
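One way to approach both questions without any change to the tree is to go back to the source text through the node's offsets. The sketch below assumes `start` and `end` are character offsets into the original source string; `classifyIdentifier` and the property names it returns are hypothetical.

```javascript
// Hypothetical helper: recover the as-written text of an Identifier and
// report which escape forms appear in it.
function classifyIdentifier(source, node) {
  const raw = source.slice(node.start, node.end);
  return {
    raw, // exact text as written, usable for faithful re-emission
    hasCodePointEscape: /\\u\{[0-9a-fA-F]+\}/.test(raw), // ES2015 \u{...}
    hasClassicEscape: /\\u[0-9a-fA-F]{4}/.test(raw), // ES5 \uXXXX
  };
}
```

Reporting the two forms separately matters because a single identifier can mix several escapes, so one boolean flag would not say which syntax an engine has to support.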