Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propose a design for generic lists #351

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Propose a design for generic lists #351

wants to merge 2 commits into from

Conversation

jackfirth
Copy link
Collaborator

@jackfirth jackfirth commented Sep 5, 2023

This builds off #201, #221, #259, and #330, providing a more concrete design and plan for a generic interface for immutable lists. Currently still in draft status.

I'm working on this with @OxSon, who is handling most of the RRB tree implementation and benchmarking.

Rendered

@jackfirth jackfirth added the enhancement New feature or request label Sep 5, 2023
@samth
Copy link
Member

samth commented Sep 12, 2023

A few scattered thoughts:

  • translating [1,2,3] to builder notation seems like it would be a significant performance hit; can't we translate it to something like List.unsafeFromVector(racket.vector(1,2,3)) (or use quoted vectors for the argument)?
  • what does cons on various kinds of lists do? Does it copy?
  • does it make sense to have hybrid representations to avoid copying?

@jackfirth
Copy link
Collaborator Author

* translating `[1,2,3]` to builder notation seems like it would be a significant performance hit; can't we translate it to something like `List.unsafeFromVector(racket.vector(1,2,3))` (or use quoted vectors for the argument)?

Sure. Alternatively, we could translate it to List(1, 2, 3) and then do whatever optimization we do for List(x, ...). Now that I think of it that's probably better.

* what does `cons` on various kinds of lists do? Does it copy?

There is no List.cons() method. You could write List.insert(0, element), which would do whatever the underlying insert() method does. That method is expected to avoid copying in any persistent implementation of List. For ConsList specifically, there could be a cons() function that takes an element and another ConsList. That's what we've got today anyway.

* does it make sense to have hybrid representations to avoid copying?

Maybe. We've certainly got the option to try it in the future. I'd like to get more done before exploring it though, since mutable collections and non-list collections might complicate the picture significantly.

@OxSon OxSon mentioned this pull request Sep 17, 2023
7 tasks
@jackfirth jackfirth marked this pull request as ready for review September 19, 2023 01:58
@jackfirth
Copy link
Collaborator Author

This is ready for review now.

@rfindler
Copy link
Member

One question I have is about append: will that be part of the builder interface? (It seems like a common pattern to build up a list with appends, not that I've done any formal studies or anything!)

@jackfirth
Copy link
Collaborator Author

One question I have is about append: will that be part of the builder interface? (It seems like a common pattern to build up a list with appends, not that I've done any formal studies or anything!)

It will, in the form of a ListBuilder.addAll() method. I'm not planning on implementing it or the other batch methods until everything else is done though.

@mflatt
Copy link
Member

mflatt commented Sep 29, 2023

The proposal should make clearer that ConsList is intended as distinct from a Racket list (that I learned via Discord). A ConsList might have Racket pairs internally, but with a wrapper around the head pair or empty list. That's different than the current Rhombus List, which is a Racket list (so, I did not understand that facet of this proposal, at first).

In retrospect, if List can refer to something with a different representation than Racket lists (e.g., using an RBB-tree), then I have to agree that a Racket list cannot be a Rhombus List. Trying to make that correspondence work would create more problems than it solves in terms of implementing ==, bridging to Racket, and managing programmer expectations. A Racket list can be used as a sequence, I expect, but not as something that is == to any List.

This seems like a significant performance challenge right at the start, since not using Racket lists is likely to impose a cost (relative to Racket) on most programs. I had previously imagined that multiple List implementations might be handled by inlining dispatch to the common(?) case of cons cells—which would inflate code but maybe not make things much slower—but I doubt that it works out if lists have an extra wrapper.

@samth
Copy link
Member

samth commented Sep 29, 2023

More generally related to what @mflatt said, I think we need to be aware that most of the models we've considered for collections generally are in typed languages with compilers (often the JVM) that can eliminate more of the cost of dynamic dispatch and particularly wrappers relative to what Racket can do. We may need to consider different collections of tradeoffs than those systems ended up with (such as different choices about equality).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants