jsonlines.org and ndjson.org #22
This site links to ndjson from http://jsonlines.org/on_the_web/ and ndjson.org links back here from its footer, is that not sufficient? |
In the interest of converging on a single standard, would it not be beneficial for these two sites to coordinate and agree on the details, and ideally merge into a single .org? Having two similar sites, each promoting an emerging 'standard' with differences, gives the sense that it's not ready for interoperation. |
created similar issue in the ndjson repo: |
👍 for just one standard and one web site. Based on a quick look there aren't any real differences except to the extension (. |
Cross linking is not sufficient. Anyone would google with I'm sure the author was not confused. :) But I'm confused.
|
Observation: If repository issue activity is any metric for discoverability, then JSON Lines has an advantage.
I understand that, historically, there were spec differences due to potential ambiguity (UTF-8 encoding, required JSON data on every line, etc.), but it seems as though they are now aligned. At this point in time, are there any remaining spec differences? And are there any other issues which are preventing convergence (e.g. copyright credit, etc.)? I think the community will greatly benefit from a single, unified standard with an RFC, registered IANA media type, etc. The involved parties appear to be reasonable and responsive. Can we make this happen? |
I prefer the name "JSON lines" because that seemed like the obvious name to me :-) but, the ndjson folks did go the extra mile and write a spec. If we're fully aligned I like the idea of settling on a single name. Is there an unbiased measure we can use for deciding? |
@wardi Names are names and will always be arbitrary/subjective. 😅 I think it's just up to the party that submits the RFC and registers. IMO, a unified standard with either name is better than two ambiguously identical alternatives. |
The ndjson repo hasn't seen any maintainer activity in years. That makes it both impossible to pick this and have them redirect, and a bad idea to pick them and redirect from here. |
The owner of the ndjson domain seems to be fine going forward with jsonlines.org. |
This is a mess. Let's finally reach a decision. My proposal is to take the existing JSON Text Sequences spec (RFC 7464) and enrich it with additions: add a file extension A good overview of all streaming formats: https://en.wikipedia.org/wiki/JSON_streaming
Its basic idea is to have "unambiguous JSON" that is resilient to many forms of damage, such as truncation, multiple writers incorrectly configured to write to the same file, corrupted JSON, etc. An example sequence: ␞{"d":"2014-09-22T21:58:35.270Z","value":6} From the spec:
So basically, in the simplest case, when I know the data is not corrupted, I can simply use concatenated JSON. I can use line separators too, and they'll just be ignored, as in ordinary JSON. Example 1:
Example 2: two documents but formatted with a newline
If I may have corrupted JSON documents, a newline may be used as a separator. But then it may be hard to tell whether a newline was used just for formatting or to split two documents. Example 3: the first document is broken and doesn't have a closing bracket but
Example 4: first doc is broken, then newline, and the second doc is formatted with a newline
But visually we can still distinguish where the first doc ends and the second starts. If I need to have top level values then the @nicowilliams you are the author of RFC 7464. Please give us your thoughts. Is it possible to make some errata for the spec? Related: already was discussed an idea to use the The file extension: both |
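For readers unfamiliar with RFC 7464, here is a minimal Python sketch of the framing described in the comment above: each JSON text is preceded by the ASCII Record Separator (0x1E), so a reader can resynchronize after a damaged record. The function name `parse_json_seq` and the sample data are made up for illustration. This is a simplification: it does not implement the RFC's check for trailing whitespace after top-level numbers, `true`, `false`, or `null`.

```python
import json

RS = "\x1e"  # ASCII Record Separator; RFC 7464 places it before each JSON text


def parse_json_seq(text):
    """Yield (ok, value_or_raw_chunk) for each RS-delimited record in text.

    Hypothetical helper for illustration only, not a complete RFC 7464 parser.
    """
    for chunk in text.split(RS):
        if not chunk.strip():
            continue  # skip the empty piece before the first RS and empty records
        try:
            yield True, json.loads(chunk)  # json.loads tolerates the trailing newline
        except json.JSONDecodeError:
            yield False, chunk  # damaged record: report it and keep going


# Three records; the second one is truncated mid-string.
sample = (
    RS + '{"d":"2014-09-22T21:58:35.270Z","value":6}\n'
    + RS + '{"d":"2014-09-22T21:59'
    + RS + '{"value":7}\n'
)
results = list(parse_json_seq(sample))
```

Because every record starts with the separator, the truncated second record does not prevent the third from being parsed, which is the resilience property the comment refers to.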
@stokito updating RFC 7464 as you describe sounds good to me. |
Could we include the MIME type |
from the issue in #65 (comment): I don't think jsonlines is heading in any direction that allows incomplete records, empty lines, or other kinds of line breaks that don't separate valid JSON records. I am not sure amending RFC 7464 will be valid in that context. The examples you gave seem to allow that. To me, streaming JSON is a whole other problem. I think jsonlines is about a succession of valid JSON values, like a succession of API calls for batching input or reading some process results (we've been using it with Amazon Comprehend to manage training corpus, for example, or the recognition job inputs). |
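To make the contrast concrete, here is a minimal Python sketch of the strict reading described in the comment above: every line must hold one complete, valid JSON value, and empty lines are rejected. The function name `read_json_lines` and the decision to raise on the first bad line are assumptions for illustration, not wording taken from either spec.

```python
import json


def read_json_lines(text):
    """Parse text in which every line must be exactly one valid JSON value.

    Hypothetical strict reader: raises ValueError on an empty or invalid line.
    """
    if text.endswith("\n"):
        text = text[:-1]  # a final line terminator is not an empty record
    if not text:
        return []
    records = []
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            raise ValueError(f"line {number}: empty line")
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {number}: invalid JSON ({exc.msg})") from exc
    return records
```

Under this reading, a document pretty-printed across several lines is an error rather than something to recover from, which is why the newline-tolerant examples discussed earlier do not fit.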
Imagine the file extension being a format like Taking inspiration from:
|
The difference is that a So |
Point taken. I would still be for
Anyways, just throwing this out. Glad this concept has been seen. Seems like all emoji interactions like the concept, but just preferred it swapped around. I'm completely down for that. |
I would rather see an extension that specifically says which it is. We don't use |
hey I noticed http://ndjson.org/ and http://jsonlines.org/ are very similar, I was just wondering if maybe they could link to each other to reduce confusion? I like both names personally and use them interchangeably
cc @chrisdew