-
Notifications
You must be signed in to change notification settings - Fork 464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interleaving can happen if you insert elements in reverse order into a list #395
Comments
Thanks for opening this issue. Just wanted to drop in here to say that the current YATA implementation in Yjs (which is based on the same mathematical model as the original YATA algorithm) outperforms RGA. Yjs applies any update in However, it is true that YATA requires more fields (one more integer). But the size of the encoded Yjs document is usually smaller than Automerge, even without garbage collection. Furthermore, since everything is kept in a single linked list, Yjs/YATA doesn't require additional bookkeeping for potential child elements and therefore has a smaller memory footprint. If you are looking to switch algorithms, it might make sense to adapt Yjs/Yrs in the backend. You can express a custom view (e.g. using immutable data structures). Another advantage might be that you could make use of all the existing editor bindings for Yjs. Also, it would mean that we could work together :) One critique point I heard a lot from Automerge folks is that Yjs does perform garbage collection. You can turn that off to keep the complete history or configure it to keep only relevant parts of the history (e.g. information that is necessary to restore snapshots). |
Thank you @dmonad for chiming in. I have looked at this a bit further, with help from @josephg, and found that Yjs can also exhibit interleaving when list elements are inserted in reverse order. The following code demonstrates this happening: const Y = require('yjs')
let doc1 = new Y.Doc(), doc2 = new Y.Doc(), doc3 = new Y.Doc()
doc1.clientID = 1
doc2.clientID = 2
doc3.clientID = 3
// First doc3 inserts a bunch of 'a's, then doc1 prepends another bunch of 'a's
// (note these might actually be the same user in different editing sessions)
doc3.getArray().insert(0, ['a', 'a', 'a', 'a', 'a'])
Y.applyUpdateV2(doc1, Y.encodeStateAsUpdateV2(doc3, Y.encodeStateVector(doc1)))
doc1.getArray().insert(0, ['a', 'a', 'a', 'a', 'a'])
// Concurrently, doc2 inserts a bunch of 'B's
doc2.getArray().insert(0, ['B', 'B', 'B', 'B', 'B'])
// In the merged final state, the 'B's are interleaved among the 'a's
Y.applyUpdateV2(doc1, Y.encodeStateAsUpdateV2(doc2, Y.encodeStateVector(doc1)))
console.log(JSON.stringify(doc1.getArray().toArray()))
// Prints: ["a","a","a","a","a","B","B","B","B","B","a","a","a","a","a"] Interleaving happens less often in Yjs than in RGA because (as far as I can tell) it happens only if the reverse order insertions use multiple clientIDs. However, I can easily imagine scenarios in which that would be the case (e.g. a user does some work on one device, and then continues that work on another device, while concurrently another user is editing the document offline). I believe @josephg's variant of YATA, which is called Thank you for the offer of collaborating on Yjs. For me, Automerge is a research project as much as it is an open source product, and I have tons of CRDT research ideas that I want to try. For that purpose it is useful to have a self-contained implementation with minimal dependencies, giving us the freedom to change the internals in whatever way supports our research. For example, we want to explore Git-style branching workflows in which users don't just immediately apply every edit as soon as they receive it, but where users can explicitly manage and compare several versions/branches/forks of their document. This requires not only retaining the editing history, but also being able to diff document versions to produce "suggested changes", and selectively merging those changes. I'm not sure how well Yjs would be able to support such workflows currently. More generally, the CRDT space is big, and I think there is plenty of room for several CRDT libraries to coexist happily, each targeting a different point in the trade-off space. It's healthy to have several projects that try different approaches, so that we can learn what works and what doesn't. For that reason I think it's good for Automerge and Yjs to remain independent, and I strongly believe that both are good projects that can both thrive. Moreover, I hope we can still work together, even if our projects remain independent! For example, I would like to write a paper on this interleaving issue, and I'd love for you and @josephg to contribute to this if you're interested. We can discuss that privately. |
Martin thinks this might not be a big deal in practice, but it is technically a breaking API change, so if we want to change that algorithm we need to slightly change the way we record insert operations by adding an extra field to the operations.
We should either fix this before 1.0 or at least put the extra field into the data format so that we don't need to do a second migration.
(We should probably just adopt the algorithm Y-js uses for this which is marginally more expensive but not significantly, so let's just standardize.)
The text was updated successfully, but these errors were encountered: