Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistencies in SeqUtils.transform's Spliterator.characteristics() and Spliterator.getComparator() #311

Open
tlinkowski opened this issue May 1, 2017 · 7 comments

Comments

@tlinkowski
Copy link

I think I found some inconsistencies in SeqUtils.transform concerning the method characteristics().

We have:

            @Override
            public int characteristics() {
                return delegate.characteristics() & Spliterator.ORDERED;
            }
            
            @Override
            @SuppressWarnings("unchecked")
            public Comparator<? super U> getComparator() {
                
                // This implementation works with the JDK 8, as the information
                // is really only used in 
                // java.util.stream.StreamOpFlag.fromCharacteristics(Spliterator<?> spliterator)
                // Currently, the point of this method is only to be used for
                // optimisations (e.g. to avoid sorting a stream twice in a row)
                return (Comparator) delegate.getComparator();
            }

But since in calculating the characteristics above we use bitwise AND (&) with Spliterator.ORDERED, ths Spliterator will never report Spliterator.SORTED, and so the method getComparator will never be called in StreamOpFlag.fromCharacteristics:

(characteristics & Spliterator.SORTED) != 0 && spliterator.getComparator() != null

I think characteristics() should return something like this:

            public int characteristics() {
                return (delegate.characteristics() | Spliterator.ORDERED) & ~(Spliterator.SIZED | Spliterator.SUBSIZED);
            }

which would mean "we preserve the original characteristics, adding ORDERED if absent, and removing SIZED and SUBSIZED if present".

@lukaseder
Copy link
Member

Interesting observation. Indeed, perhaps the existing operation is too agressive. Would you mind explaining the rationale behind your version (the part about SIZED and SUBSIZED)? Are none of the other characteristics affected, in your opinion?

@tlinkowski
Copy link
Author

Well, as far as I understand, SeqUtils.transform always returns a subset of elements from the source (never adding any elements) so:

  • it cannot be SIZED nor SUBSIZED (some element may have been removed)
  • it is still DISTINCT, NONNULL, and IMMUTABLE (nothing added)
  • it is still SORTED (nothing reordered)
  • I don't know about CONCURRENT - I guess it's safer to remove it too
  • as for ORDERED, I assumed we want to impose it because the docs say that Seq is seqential and ordered

@lukaseder
Copy link
Member

Thanks for taking the time to explain this (and sorry again for my laziness here. jOOQ has been keeping me rather busy, recently). You made good points. If you're interested, I think this looks good for a PR. I think we could put your explanation there as a comment, too with a reference to this issue, so we'll still remember this in the future.

@tlinkowski
Copy link
Author

Sure, I'll provide a PR.

PS. As to response time, I'm absolutely fine with responses in the spirit of "eventual consistency" ;)

@lukaseder
Copy link
Member

Yeah. "Eventually", my inbox with notifications reaches zero, right? :)

@tlinkowski
Copy link
Author

Heh, you're right - the analogy is flawed :)

tlinkowski pushed a commit to tlinkowski/jOOL that referenced this issue May 9, 2017
@tlinkowski
Copy link
Author

Hm, I took a much closer look and I found the following:

  • in Seq.onEmptyGet we cannot preserve NONNULL because the Supplier may return null
  • in Seq.cycle we cannot preserve DISTINCT nor SORTED (but I'd like to change its implementation to one using SeqBuffer so it wouldn't matter)
  • in Seq.shuffle we cannot preserve SORTED (but I've already proposed a new implementation using SeqUtils.lazy so it wouldn't matter either)
  • in Seq.scanLeft we have a mapper function so we cannot preserve DISTINCT, NONNULL, nor SORTED but it seems this implementation can be easily simplified to the following:
return Seq.of(seed).concat(stream.map(t -> value[0] = function.apply(value[0], t)));

(which seems safe unless someone calls stream().parallel() - to alleviate that we'd need some DelegatingSequentialSpliterator class that would delegate all methods except trySplit and that would not report CONCURRENT)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants