-
-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge KDL v2 #286
base: main
Are you sure you want to change the base?
Merge KDL v2 #286
Conversation
Honestly, they're just too implementation-specific
As I read the grammar in the spec, `"//"` wouldn't parse as a single-line-comment as it requires as least one non-newline character after the slashes.
/cc @CAD97 |
@@ -0,0 +1 @@ | |||
node "\\" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't there also be an output case for this, and one for forwardslash (node "/"
) then?
Seeing as the escaped form was removed
EDIT: mixing up my slashes again
I have a preference for #204, because the primary use case I can see for That preference is not terribly strong, though. Edit: I misread, I'm fine with either |
To clarify, #241 allows Argument for allowing: transliterating CSS selectors, for e.g. CSS-in-KDL. Argument against allowing: using the syntax in KQL as a selector like CSS. |
My inclination is to prefer #241 as well, as I think being able to write hashtags is neat. It also allows for doing things like writing Nix flake references as bare words, e.g. |
Can we squeeze #213 into this? The specific proposal is the addition of escaped whitespace in string literals– that |
I'm also a fan of #213, though it seems like there's some ambiguity in the discussion. Namely, does
Translate to |
@Lucretiel do you have time to put together a PR with this grammar+prose change? I'm game. |
Yes, tonight I can put that together :) should it be in the form of an amendment to SPEC.md? |
I agree there's some ambiguity in the original. That example would translate to |
yep!
Is this what Rust does? I would've expected that to at least preserve the first newline. Then again, this is consistent with KDL's existing |
[src/main.rs:2] dbg!("\
here\
is\
an\
example\
") = "hereisanexample" |
It's worth noting that bash behaves similarly as far as just dropping the newline, though it doesn't consume space afterward:
With that I think Edit: Scratch that, I'm a space cadet:
I'm more prone to emulating bash over rust, but I'm curious how others feel |
Bash's behavior is concerned with syntactic whitespace (ie, allowing commands to spread over multiple lines with line continuations). It doesn't meaningfully behave in terms of consuming or not consuming specific whitespace so much as it extends a line to the next line while retaining the separation of tokens for a command. In your
Kaydle has basically the same behavior with its own line continuation syntax, where you can use a
#213 is instead concerned with treatment of escaped whitespace in strings, where I think the plain consumption of unescaped whitespace makes the most sense
Rust does just consume all whitespace, regardless of type. The canonical way to add newlines to a whitespace-escaped string to to escape them: assert_eq!(
"line 1\n\
line 2\n\
line 3\n",
"line 1
line 2
line 3
"
); Though more commonly I use it to stretch out long sentences with simple spaces: assert_eq!(
"This is a sentence with a \
lot of words in it.",
"This is a sentence with a lot of words in it."
); |
JSON-IN-KDL.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the JiK spec mention that #inf
, #-inf
, and #nan
are invalid values in a JiK context?
It is already implied, as JSON can't represent these values, but I think it would make sense to make this explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good for a warning, yeah.
* `null`, `true`, and `false` are now `#null`, `#true`, and `#false`. Using | ||
the unprefixed versions of these values is a syntax error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry if this has been explained somewhere before, a 100+ comment discussion can be hard to follow.
Speaking purely from a user's perspective, this specific change feels a bit unnecessary. I assume it's being done to prevent ambiguity, but null
, true
and false
are keywords common enough, that I don't think anyone with the slightest experience writing code would be surprised if they have special meanings unlike normal identifiers. If anything, being a Rust user, I would come into this expecting the exact opposite: that true
means the boolean and #true
means the raw identifier, similar to how it works in Rust sans the r
.
Is there something I'm missing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We decided this would be enough of a potential footgun since the change to allow unquoted strings. Programming languages like Rust have other defenses against this kind of confusion, but kdl would need to do something different (like this prefixing) to prevent, say, the kind of things you see happen in plain JavaScript
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation, and fair enough, if that's the point of balance you've decided on for your project.
How exactly do multi-line strings interact with whitespace escapes ( That is, are these allowed?
|
Co-authored-by: Bannerets <[email protected]>
Co-authored-by: Bannerets <[email protected]>
Co-authored-by: Bannerets <[email protected]>
Co-authored-by: Bannerets <[email protected]>
@Bannerets I've made some changes to clarify this. wdyt? |
🤔 maybe it should be the other way around: it might be easier to implement to have escapes process first, and then do multiline stuff. It might also make more sense, too. But if you're using multiline escapes in multiline strings, you're asking for trouble anyway. |
I think interpreting escapes after dedenting seems intuitive, in the sense that the string is parsed exactly as if indentation were just not in the source code. Quickly checking multi-line strings in Elixir, it looks like the string is dedented before escapes, and, e.g., The kdlua implementation expands escapes before dedenting, I think (link). |
ahhh yes. I see your point. I'll change it back to be dedent-before-escape. |
There, that's done :) |
opening `"`, and a final newline plus whitespace preceding the closing `"`. | ||
* SMALL EQUALS SIGN (`U+FE66`), FULLWIDTH EQUALS SIGN (`U+FF1D`), and HEAVY | ||
EQUALS SIGN (`U+1F7F0`) are now treated the same as `=` and can be used for | ||
properties (e.g. `お名前=☜(゚ヮ゚☜)`). They are also no longer valid in bare |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example doesn't look right: surely (
and )
aren't allowed here?
Implementing the multi-line string and whitespace escaping rules is proving quite subtle.
This sounds simple enough: if there are newlines in the string, I check that the indentation is consistent and remove it. Then, I handle the various backslash escapes. That should take care of illegal strings like
(from in the spec), and legal strings like
(which is equal to This algorithm does not work for this example in the spec:
Before considering the The spec prose appears to have a solution to this conundrum: (emphasis mine)
So if all newlines in the string are escaped, it is not a multi-line string? To my mind that should imply that escaped newlines are not newlines for the purposes of dedenting, and contradicts the rule that dedenting comes before backslash escapes. Overall, not very satisfying. I would suggest:
Edit: I've added a PR - #391 |
I wonder if the more intuitive way for strings to work would be:
This should have the same result as the #391 rule for all strings valid under that rule, but also accept more cases with escaped newlines. |
Here it is! The long-awaited KDL v2, which is where we go ahead and make a handful of technically-breaking changes to address some corner cases we've run into over the past year while KDL has been getting implemented in a bunch of languages by various people.
I'd love to get feedback on what we have slated, and whether there's anything else we should definitely include when this goes out.