WIP: NonEmptySet and NonEmptyMap #616

Ericson2314 · 2019-04-02T02:45:57Z

One downside to this PR as writing is a made a mess of various vertical alignments in the code. I'd be happy to fix that if this looks promising.

treeowl · 2019-04-02T03:10:30Z

My personal alignment preferences are not aligned with the historical standards of the package. If there are to be major alignment shifts, I would prefer to see the style change as well. In particular, I like fairly minimal alignment when readability doesn't demand more, and I like things to stay pretty far to the left. The classic "we can make it look like traditional mathematical notation, so we should" argument doesn't carry much weight with me. That said, try not to change much more than you have to. A readable diff is a high priority when you're doing something complicated like this.

Ericson2314 · 2019-04-02T17:17:32Z

@treeowl Great! yes I think the current style is mainly a recipe for tons of merge conflicts. The "sed style" I did does keep the diff smaller, and when I changed it more it was to wrap lines and move to the left.

treeowl · 2019-04-02T17:31:36Z

There's a build failure.

Ericson2314 · 2019-04-02T18:49:14Z

Oops. Took out the pattern synonyms but left the extension.

Ericson2314 · 2019-04-03T00:49:47Z

Weird CI failures. Maybe restart it?

treeowl · 2019-04-03T01:49:31Z

Weird build error fixed. Real build errors remain.

treeowl · 2019-04-03T01:59:07Z

Looks like you still have some pattern synonyms going on.

Ericson2314 · 2019-04-03T02:19:56Z

Sorry! I removed that one import----didn't see anything else and built it locally again---hopefully it works now.

treeowl · 2019-04-03T02:35:35Z

It's a good start, but there are some significant issues.

The constructor names are now really awful. Maybe make Bin the constructor of NonEmptyMap and make NE (short!) the first constructor of Map? I'm not really sure what's best, but it's not this.
Since you want to expose the nonempty map type, it will need a bunch of functions. A great many of those are best expressed through mutual recursion with the Map equivalents. The current intermediate state is very awkward and will all have to change again, so let's skip that. There will be massive naming conflict issues (ugh!). Let's yank off the band-aid and deal with it. One likely option: give the nonempty functions horrible names in Data.Map.Internal and rename them all elsewhere for public consumption.

treeowl · 2019-04-03T03:13:59Z

Er ... My second point is only valid if we end up exporting that stuff. Is there enough demand for nonempty sets and maps to justify it? That at least needs a libraries list discussion.

Ericson2314 · 2019-04-03T13:52:28Z

Oh I made this intermediate state PR just because of the benchmarking of the type change alone. If you want to go all or nothing re exporting the stuff that's fine; I can email the list.

I'll do the constructor renames in any event.

Ericson2314 · 2019-04-19T18:54:45Z

OK fixed conflicts and did

Bin -> NE
NonEmpty* (constructor) -> Bin

as requested.

treeowl · 2019-04-19T19:07:21Z

I think it's pretty safe to assume we're going to expose the nonempty types (by some names or other). So if you have time, could you start working on breaking up the functions into mutually recursive pairs? Don't squash into the current commit yet, in case we change our minds.

treeowl · 2019-04-19T19:08:24Z

Use NE suffixes for the nonempty functions; we'll rename them in the nonempty modules.

Ericson2314 · 2019-04-19T19:50:11Z

@treeowl OK sounds good. I renamed the PR accordingly.

Ericson2314 · 2019-04-19T21:55:46Z

@treeowl OK pushed a select few Set functions just to double check the style looks good with you.

Relatedly, the wording of Note: Local 'go' functions and capturing seems to warn against capturing arguments in go, but the commits that touched it, 9be5ec2 and 78c1e52, say in the commit message that it's OK, and in the code always seem to capture more, not less. I took the liberty of capturing more to be less noisy to compensate for the noise of mutual recursion.

Ericson2314 · 2019-04-19T23:34:20Z

@treeowl hold up, github is loosing comments, ugh. (It normally just marks outdated comments as such and force pushes are fine.)

Ericson2314 · 2019-04-19T23:37:45Z

(9555fb4 some missing comments here)

Ericson2314 · 2019-04-19T23:38:25Z

I'll just not force-push for now; that seems easiest.

treeowl · 2019-04-19T23:44:27Z

Yeah, GitHub is rough. I meant member, not lookup. Sorry. Things are about to get crazy here.... Passover is starting.

Ericson2314 · 2019-04-20T00:21:01Z

Oh no worries at all. I'll keep poking at it, taking those into account.

Sync non-empty branch with upstream

alexfmpe · 2022-09-28T21:05:26Z

@treeowl I'm helping with some legwork on this PR.
Merged with master and dusting it off a bit before having a go at benchmarking for regressions.
I have some questions about the exported NE variants:

There are a good deal of functions that differ only by calling nonEmpty or fmap nonEmpty at the very end. These don't seem to add much benefit. I could see a case for those if they were very common, but not in general.

intersectionNE :: Ord k => NonEmptyMap k a -> NonEmptyMap k b -> Maybe (NonEmptyMap k a)
intersectionNE t1 t2 = nonEmpty $ intersection (NE t1) (NE t2)

traverseMaybeWithKeyNE
  :: Applicative f
  => (k -> a -> f (Maybe b)) -> NonEmptyMap k a -> f (Maybe (NonEmptyMap k b))
traverseMaybeWithKeyNE f = fmap nonEmpty . traverseMaybeWithKeyFromNE' f

Some functions are asymmetrical. For instance, insert produces a map with at least one element with possibly empty maps as children, so it's 'fundamental' type is Map k a -> NonEmptyMap k a. Currently we have
```
insert :: Ord k => k -> a -> Map k a -> Map k a
insertNE :: Ord k => k -> a -> NonEmptyMap k a -> NonEmptyMap k a
```
but I would instead expect some
```
insert' :: Ord k => k -> a -> Map k a -> NonEmptyMap k a
```
since the other two variants can be obtained by forgetting non-emptiness
```
insert k a = forget . insert' k a
insertNE = insert' k a . forget 
```
with forget being a placeholder for some exposed version of NE.
The advantage of having more functions is that some pattern-matching can be skipped by avoiding client code tacking on a NE only to have the library immediately doing a case on it. But maybe we can avoid exporting a bunch of variants by adding rewrite rules for these cases?

Now, replacing the type of insert would be one big breaking change, so one option is to provide the Map -> NonEmptyMap variant instead of current insertNE, which would be a non-breaking change and yet keep open the possibility of a future breaking change.

phadej · 2022-09-29T11:38:25Z

My personal wish is that Data.Map module doesn't mention NEMap.

E.g. there is no consNE :: a -> [a] -> NonEmpty a in Data.List and having such would be weird. There are both
(:|) :: a -> [a] -> NonEmpty a and cons :: a -> NonEmpty a -> NonEmpty a in Data.List.NonEmpty though (the naming problem is cleverly avoided)

But I agree that nonEmpty and fmap nonEmpty are better left off for NEMap functions, e.g. there is

filter :: (a -> Bool) -> NonEmpty a -> [a]

in Data.List.NonEmpty. https://hackage.haskell.org/package/base-4.17.0.0/docs/Data-List-NonEmpty.html#v:filter, and no -> Maybe (NonEmpty a) variant.

phadej · 2022-09-29T11:39:40Z

To be clear:

I expect this change to cause minimal or even no changes to the Data.Map and Data.Set module interfaces.

treeowl · 2022-09-29T11:42:37Z

@phadej Why do you think that's the right thing to expect?

konsumlamm · 2022-09-29T13:30:58Z

Has it already been decided that we really want non-empty maps/sets in containers? They don't seem particularly useful to me (I'm generally not a fan of NonEmpty* types, because now you have to duplicate every function that should work on both variants or constantly convert between the variants), especially for a core package like containers. nonempty-containers already exists, if you really want nonempty containers.

As for the names, adding NE isn't very ergonomic, or would that be renamed for exporting (if so, that would add quite a bit of bloat, wouldn't it)?

phadej · 2022-09-29T14:43:13Z

@treeowl I expect no breaking changes, which implies there won't be massive changes in Data.Map or Data.Set. If you change the type of Data.Map.insert, I'm sure people will be very angry. I will be for sure.

@konsumlamm if you look at the implementation, it's almost free:

+data Map k a  = Bin {-# UNPACK #-} !Size !k a !(Map k a) !(Map k a)
-data Map k a  = NE {-# UNPACK #-} !(NonEmptyMap k a)
              | Tip

+data NonEmptyMap k a = Bin' {-# UNPACK #-} !Size !k a !(Map k a) !(Map k a)

It's silly to duplicate the code in nonempty-contrainers, especially as the implementation there is bad:

data NEMap k a =
    NEMap { nemK0  :: !k   -- ^ invariant: must be smaller than smallest key in map
          , nemV0  :: a
          , nemMap :: !(Map k a)
          }
  deriving (Typeable)

OTOH. It might be a good idea to first only do the changes in Internal modules, and then do a companion package which would try to figure out the interfaces for non-empty modules, without forcing the major version bumps of containers version. These are very disruptive. I don't believe that any person or committee can figure out the good interface first time, so experimentation/stabilisation is better done in a separate package.

phadej · 2022-09-29T14:46:37Z

The idea of doing only internal changes first also solves the name bikeshedding problem. As soon as internals are done, you can make a release.

konsumlamm · 2022-09-29T14:58:58Z

if you look at the implementation, it's almost free:

+data Map k a  = Bin {-# UNPACK #-} !Size !k a !(Map k a) !(Map k a)
-data Map k a  = NE {-# UNPACK #-} !(NonEmptyMap k a)
              | Tip

+data NonEmptyMap k a = Bin' {-# UNPACK #-} !Size !k a !(Map k a) !(Map k a)

This PR does a lot more than just that though, it adds a bunch of new functions (more than 1000 additional LOC).

phadej · 2022-09-29T15:01:23Z

@konsumlamm that is why I suggest to keep changes to the Internal modules, to reduce the size of the change.

jwaldmann · 2022-09-29T16:01:13Z

Has it already been decided that we really want non-empty maps/sets in containers? They don't seem particularly useful to me

I think the same.

I can see that it's nice to know statically whether some collection (sequence, list) is non-empty (although it feels a bit arbitrary - perhaps I want the size to be greater than 42, or prime)

Specifically for Map and Set, we will (mostly?) ultimately look up some key, and we can't know statically whether it's there, even if we have static info on emptiness. Well, unless we only do bulk operations - but then intersection has a similar problem: the result can be empty.

On the other hand, I do follow the "curried map" motivation (see beginning of #608 (comment)). But then, what will we do with these curried maps? Probably some look-up, so .. see above.

In all, I probably won't use guaranteed-non-empty map/set much (in my code, in my teaching), so I prefer that the well-known API (Data.Map, Data.Set) does not change - as @phadej was already saying in #616 (comment)

treeowl · 2022-09-29T16:08:35Z

The existing public API can't change, but the implementation can, as long as performance doesn't suffer.

konsumlamm · 2022-09-29T16:34:45Z

The existing public API can't change, but the implementation can, as long as performance doesn't suffer.

Has anyone already done benchmarks to see how this change affects performance?

alexfmpe · 2022-10-02T21:29:04Z

Has anyone already done benchmarks to see how this change affects performance?

I've ran set-benchmarks and map-benchmarks which seem to indicate, respectively, a geometric mean of 10% and 35% slowdown, but it also seems like the slowdown is mostly concentrated on insert/delete related methods, so I suspect it pretty much boils down to slowdown creeping into primitives like link/glue. Oddly enough some things seems straight up faster, like Set.member or Set.filter.

Here's the result for set-benchmarks following https://github.com/Bodigrim/tasty-bench#comparison-against-baseline

Name,Old,New,Ratio
All.member,221875789,209289636,0.943274
All.insert,1034989168,1407236575,1.35966
All.map,200213961,224134561,1.11948
All.filter,79314771,71061466,0.895942
All.partition,166230030,182848472,1.09997
All.fold,37304,36632,0.981986
All.delete,510582421,960032359,1.88027
All.findMin,16609,17358,1.0451
All.findMax,18064,17922,0.992139
All.deleteMin,100079,99664,0.995853
All.deleteMax,98491,99977,1.01509
All.unions,171477906,195428890,1.13967
All.union,172355501,196027891,1.13735
All.difference,114564277,133476880,1.16508
All.intersection,72926835,56555242,0.775507
All.fromList,78321178,99341937,1.26839
All.fromList-desc,958907987,1326825750,1.38368
All.fromAscList,176746806,178201611,1.00823
All.fromDistinctAscList,47820229,59936910,1.25338
All.disjoint:false,30067,30159,1.00306
All.disjoint:true,165839725,160507622,0.967848
All.null.intersection:false,72714958,55893992,0.768673
All.null.intersection:true,218427448,132499310,0.606606
All.alterF:member,867611900,867927140,1.00036
All.alterF:insert,1221301962,1186877518,0.971813
All.alterF:delete,489001276,469350754,0.959815
All.alterF:four,852520328,851147737,0.99839
All.alterF:four:strings,1756882975,1694781500,0.964652
All.alterF_naive:four,1490742162,4849000625,3.25274
All.alterF_naive:four:strings,4308662200,7881103600,1.82913
Geometric mean,,,1.09951

Specifically for Map and Set, we will (mostly?) ultimately look up some key, and we can't know statically whether it's there, even if we have static info on emptiness. Well, unless we only do bulk operations - but then intersection has a similar problem: the result can be empty.

I would say it's not just lookups. A recurring use-case is to perform a bunch of Map/Set operations and then either spit out a NonEmpty list or iterate the Map/Set somehow. For instance, it is extremely common in UI code to represent an empty container with a "no results" state, rather than merely an empty list. While this is at times achievable by converting to a list and calling nonEmpty, sometimes we want to preserve the underlying structure (e.g. using Traversable NonEmptyMap) or simply avoid superfluous empty states, as per the curried map example.

My plan going forward is roughly to:

PR to master benchmarks for some pre-existing exported utils that don't have them to get a more comprehensive picture of performance impact.
Investigate the existing speed-ups and try to standalone PR them into master if applicable
Remove all external facing exports from this PR.
Remove all the utils in this PR that merely add a nonEmpty on top of other functionality
Reduce new variants as much as possible, in the spirit of (2) in WIP: NonEmptySet and NonEmptyMap #616 (comment)
Try to fix remaining regressions in the PR
Get the PR into a state where no external API changes whatsoever exist, but the internals are restructured to use NonEmpty where applicable.

Once that's all done and merged we can start thinking about external facing API.

Thoughts?

phadej · 2022-10-02T21:32:47Z

@alexfmpe that sounds great!

alexfmpe · 2022-10-12T18:23:47Z

The proliferation of split functions and combinations has been gnawing at me, so I did a little experiment on 'emptiness polymorphism'.

Looks like we can have both Set and NonEmptySet exist without having to immediately change implementations. This simple change to master compiles:
alexfmpe@83f14b3

A bunch of functions that do not produce sets or preserve emptyness can be 'generalized' by simply changing the types and adding a dummy to argument to Tip:
alexfmpe@460932f

This approach would make the diff massively smaller, and less room for regressions since the implementations are essentially the same. Benchmark for the above commit in 9.4.2:

All
  member:                    OK (0.14s)
    132  μs ±  12 μs,   0 B  allocated,   0 B  copied, 8.0 MB peak memory,       same as baseline
  insert:                    OK (0.32s)
    626  μs ±  52 μs, 2.6 MB allocated,  56 KB copied, 8.0 MB peak memory,       same as baseline
  map:                       OK (0.21s)
    110  μs ± 7.6 μs, 557 KB allocated,  16 KB copied, 8.0 MB peak memory,       same as baseline
  filter:                    OK (0.17s)
    80.6 μs ± 6.1 μs,  80 KB allocated, 1.2 KB copied, 8.0 MB peak memory,       same as baseline
  partition:                 OK (0.17s)
    158  μs ±  13 μs, 239 KB allocated, 3.7 KB copied, 8.0 MB peak memory,       same as baseline
  fold:                      OK (0.17s)
    35.4 ns ± 3.2 ns, 695 B  allocated,   0 B  copied, 8.0 MB peak memory,       same as baseline
  delete:                    OK (0.22s)
    421  μs ±  37 μs, 1.3 MB allocated, 349 B  copied, 8.0 MB peak memory,       same as baseline
  findMin:                   OK (0.58s)
    16.7 ns ± 1.5 ns,  31 B  allocated,   0 B  copied, 8.0 MB peak memory,       same as baseline
  findMax:                   OK (0.29s)
    16.7 ns ± 1.6 ns,  31 B  allocated,   0 B  copied, 8.0 MB peak memory,       same as baseline
  deleteMin:                 OK (0.23s)
    103  ns ± 9.7 ns, 471 B  allocated,   0 B  copied, 8.0 MB peak memory, 10% less than baseline
  deleteMax:                 OK (0.26s)
    121  ns ± 8.4 ns, 511 B  allocated,   0 B  copied, 8.0 MB peak memory, 13% more than baseline
  unions:                    OK (0.35s)
    167  μs ± 7.1 μs, 472 KB allocated,  10 KB copied, 8.0 MB peak memory, 20% more than baseline
  union:                     OK (0.35s)
    169  μs ± 6.9 μs, 472 KB allocated,  10 KB copied, 8.0 MB peak memory, 20% more than baseline
  difference:                OK (0.22s)
    107  μs ± 5.4 μs, 239 KB allocated, 2.4 KB copied, 8.0 MB peak memory,       same as baseline
  intersection:              OK (0.21s)
    53.3 μs ± 3.4 μs,  80 KB allocated, 834 B  copied, 8.0 MB peak memory, 16% more than baseline
  fromList:                  OK (0.29s)
    69.0 μs ± 3.4 μs, 271 KB allocated, 6.9 KB copied, 8.0 MB peak memory,       same as baseline
  fromList-desc:             OK (0.15s)
    618  μs ±  47 μs, 2.6 MB allocated,  55 KB copied, 8.0 MB peak memory,       same as baseline
  fromAscList:               OK (0.13s)
    123  μs ±  12 μs, 414 KB allocated, 8.3 KB copied, 8.0 MB peak memory,       same as baseline
  fromDistinctAscList:       OK (0.20s)
    50.5 μs ± 3.6 μs, 159 KB allocated, 3.2 KB copied, 8.0 MB peak memory,       same as baseline
  disjoint:false:            OK (0.24s)
    26.7 ns ± 1.5 ns,  31 B  allocated,   0 B  copied, 8.0 MB peak memory,       same as baseline
  disjoint:true:             OK (0.16s)
    149  μs ±  11 μs, 214 KB allocated, 130 B  copied, 8.0 MB peak memory,       same as baseline
  null.intersection:false:   OK (0.23s)
    54.3 μs ± 4.2 μs,  80 KB allocated, 834 B  copied, 8.0 MB peak memory, 16% more than baseline
  null.intersection:true:    OK (0.16s)
    163  μs ±  16 μs, 293 KB allocated, 173 B  copied, 8.0 MB peak memory,       same as baseline
  alterF:member:             OK (0.21s)
    816  μs ±  42 μs, 2.4 MB allocated, 393 B  copied, 8.0 MB peak memory,       same as baseline
  alterF:insert:             OK (0.19s)
    747  μs ±  50 μs, 3.6 MB allocated, 106 KB copied, 8.0 MB peak memory,       same as baseline
  alterF:delete:             OK (0.26s)
    493  μs ±  27 μs, 1.9 MB allocated, 416 B  copied, 8.0 MB peak memory,       same as baseline
  alterF:four:               OK (0.19s)
    787  μs ±  66 μs, 2.4 MB allocated, 378 B  copied, 8.0 MB peak memory,       same as baseline
  alterF:four:strings:       OK (0.22s)
    1.69 ms ±  99 μs, 2.5 MB allocated, 423 B  copied, 8.0 MB peak memory,       same as baseline
  alterF_naive:four:         OK (0.18s)
    1.41 ms ±  89 μs, 2.0 MB allocated, 315 B  copied, 8.0 MB peak memory,       same as baseline
  alterF_naive:four:strings: OK (0.13s)
    4.15 ms ± 391 μs, 2.0 MB allocated, 469 B  copied, 8.0 MB peak memory,       same as baseline

Notice the changed methods member/map/fold have the same time performance. Haven't yet looked into what's up with delete/union/intersection but since the only thing that actually changed for those was adding () on Tip, and given what I've seen while looking at improvements/regressions in the current PR, I expect a good deal of noise in whether optimizations kick in, and probably GHC needs some hand holding here. The unexpected one is deleteMin, but I've also seen filter get a speedup on this PR by virtue of what seems to be grotesque inlining (https://pastebin.com/AKAwCkbX vs https://pastebin.com/2GBaEbN8), so it's not too surprising.

This lets a bunch of the current functions consume nonemptys, so the variants we need to add should significantly go down in size. To be able to actually create NonEmptySet, we'd need to add singleton/insert/etc variants
alexfmpe@5debc86.
The original motivation, curried-map, also introduces nonempty-ness.

Another advantage is that since we change from "a set is either empty or an element and two sets" to "a non empty set is a set that can not be empty", that is, changed the nullary constructor instead of the recursive one, it now also works when there's structurally no element in the recursive case, as is the case for IntSet/IntMap:
alexfmpe@7ee91fd

There's a drawback though - a breaking change.
Changing Set to a type synonym means instances requires consumers that add instances of Set to use TypeSynonymInstances/FlexibleInstances (maybe generalize Set' themselves). For instance, trying to test/benchmark the above fails with

src/Test/QuickCheck/Function.hs:328:33: error:
    • Illegal instance declaration for ‘Function (Set.Set a)’
        (All instance types must be of the form (T t1 ... tn)
         where T is not a synonym.
         Use TypeSynonymInstances if you want to disable this.)
    • In the instance declaration for ‘Function (Set.Set a)’
    |
328 | instance (Ord a, Function a) => Function (Set.Set a) where

If folks think this approach makes sense and that whether it should be pursued depends on the magnitude of breakage, I'd be up for doing impact assessment.

alexfmpe · 2022-10-27T02:11:20Z

Another advantage is that we can in theory expose Set' itself so downstream consumers are also able to write emptiness-polymorphic functions/instances. However, we don't want to allow parameterizing with nonsense, (e.g. Set' Int) so we'd need to whitelist with DataKinds. Examples:

with TypeFamilies

data Restriction = Unrestricted | NonEmpty

type Foo :: Restriction -> *
type family Foo r where
  Foo Unrestricted = ()
  Foo NonEmpty = Void

data Set' (r :: Restriction) a
  = Bin {-# UNPACK #-} !Int !a !(Set a) !(Set a)
  | Tip (Foo r)

with GADTs

data Restriction e where
  Unrestricted :: Restriction ()
  NonEmpty :: Restriction Void

type Set' :: forall e. Restriction e -> * -> *
data Set' (r :: Restriction e) a
  = Bin {-# UNPACK #-} !Size !a !(Set a) !(Set a)
  | Tip e

Since this uses GHC-specific extensions, it'd require the more general definitions (type/instances/code) to be behind CPP to preserve compatibility. AFAICT it should still result in a much smaller diff than the current approach, since It would only apply to the types/signatures, rather than split every function in half.

Impact assessment

An attempt to build 2660 packages via clc-stackage with a ghc using containers 0.6.5.1 plus the Set type synonym change results in the build stopping roughly half-way through, blocked on 17 packages that won't build. So that's roughly 50% of packages indirectly affected, and (extrapolating) around 1% directly.
16 of those failures are due to requiring TypeSynonymInstances/FlexibleInstances while the last one, doclayout, uses Tip directly from the internal modules: https://github.com/jgm/doclayout/blob/ec0cdddeb828a71f8a78aedc8b2fff530534ef54/src/Text/DocLayout.hs#L1111

> cat log | grep 'requires download & build' | wc -l
2660
> cat log | grep 'Completed' | wc -l
1350
> cat log | grep 'Failed to build' | wc -l
17

Additionally, binary / Cabal / haddock also needed FlexibleInstances to build ghc itself.

FlexibleInstances can be added in advance to prevent breakage. After a release, if Set', etc is exported, downstream functions/instances can be generalized on a per-package basis where applicable, possibly behind CPP to maintain backwards compatibility. The GADTs approach requires StandaloneKindSignatures (ghc 8.10.1+) which might make for more CPP.

This encoding is technically breaking as the assessment shows, but it seems much more useful than completely separate types which is likely to also lead to function proliferation downstream, and in principle can also be applied to other types: say, to de-duplicate Data.List and Data.List.NonEmpty (though that's a trickier, possibly prohibitive, compatibility story).

@treeowl @Ericson2314

treeowl · 2022-10-27T02:42:05Z

It's been a while since I looked in on the progress here. Why do you want to make Set a type synonym? There would have to be a very, very good reason.

alexfmpe · 2022-10-27T03:16:18Z

See #616 (comment)

treeowl · 2022-10-27T03:31:08Z

Why not use a newtype instead of a type synonym? There aren't that many wrapper functions to write.

alexfmpe · 2022-11-06T17:24:03Z

Why not use a newtype instead of a type synonym? There aren't that many wrapper functions to write.

Don't think I understand what's being proposed. The attempt to parameterize emptiness had as goal unifying Foo / NonEmptyFoo whenever there's an empty constructor, so AFAICT Foo / NonEmptyFoo need to be specializations of the same type. Does "use a newtype" refer to the current approach?

treeowl · 2022-11-06T17:32:30Z

I mean instead of type Set a = Set' a a, why not newtype Set a = Set (Set' a a)?

Fix build in ghc 9.4

alexfmpe · 2022-11-06T17:59:15Z

I mean instead of type Set a = Set' a a, why not newtype Set a = Set (Set' a a)?

That'd be newtype Set a = Set (Set' () a) right?

So that'd mean

Unify Foo/NonEmptyFoo types and implementations internally - e.g. in Data.Set.Internal
Add coerce wrappers for the externally exported bindings - e.g. in Data.Set

I see. That would prevent breakage, allow also adding non-empty-ness for the Int-trees, and keep the implementations the same, up to coerce, so I do think it seems an improvement to the current approach.
Downside compared to the type-synonym is that it wouldn't allow downstream to use emptyness-agnostic code (unless they imported Internal modules I guess?), so consumers might end up writing variants both for Foo and NonEmptyFoo. I didn't want to lock out that option right out the gate, and it seems the newtype wrapper would allow that discussion in the future, so I'm happy to go with that

alexfmpe · 2022-11-06T18:02:50Z

Oh that technically still breaks at least one package - as mentioned above, there's at least one use of Tip in the wild
https://github.com/jgm/doclayout/blob/ec0cdddeb828a71f8a78aedc8b2fff530534ef54/src/Text/DocLayout.hs#L1111

Once this PR is in shape, I should run another impact analysis

treeowl · 2022-11-06T18:33:19Z

The newtype constructor would be available from the Internal module, so code using internals can be adjusted pretty easily.

alexfmpe · 2023-07-10T21:42:54Z

Trying to write the wrappers with coerce gives me

src/Data/Set.hs:183:1-27: error:
    Data.Coerce: Can't be safely imported!
    The module itself isn't safe.
    |
183 | import Data.Coerce (coerce)
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^

Do we un-mark the module as safe, or write the wrappers the old way?
I assume the newtype wrapping/unwrapping would still be optimized away.

import Data.Set.Internal as S hiding (empty, insert, member)
import qualified Data.Set.Internal as S

empty :: Set a
empty = Set S.empty

insert :: Ord a => a -> Set a -> Set a
insert a (Set s) = Set $ S.insert a s

member :: Ord a => a -> Set a -> Bool
member a (Set s) = S.member a s

treeowl · 2023-07-10T21:51:51Z

Switching to Trustworthy is fine.

Ericson2314 force-pushed the non-empty branch from 66a0549 to 68de675 Compare April 2, 2019 18:59

Ericson2314 force-pushed the non-empty branch from 68de675 to 28b1285 Compare April 3, 2019 02:08

Ericson2314 and others added 2 commits April 19, 2019 14:14

Create NonEmptyMap type

4f29d02

Create NonEmptySet type

c99b359

Ericson2314 force-pushed the non-empty branch from 28b1285 to c99b359 Compare April 19, 2019 18:52

Ericson2314 changed the title Split Map and Set types with mutually recursive NonEmpty* WIP: NonEmptySet and NonEmptyMap Apr 19, 2019

Ericson2314 force-pushed the non-empty branch from 9555fb4 to ceaf193 Compare April 19, 2019 23:04

WIP: NonEmptySet functions

54bfc62

Ericson2314 force-pushed the non-empty branch from ceaf193 to 54bfc62 Compare April 19, 2019 23:23

Merge pull request #2 from alexfmpe/non-empty

3239184

Sync non-empty branch with upstream

alexfmpe and others added 3 commits November 6, 2022 17:40

Merge branch 'master' into non-empty

375e7c2

Fix build in ghc 9.4

7a0acbb

Merge pull request #3 from alexfmpe/non-empty

3ae0075

Fix build in ghc 9.4

WIP: NonEmptySet and NonEmptyMap #616

Are you sure you want to change the base?

WIP: NonEmptySet and NonEmptyMap #616

Conversation

Ericson2314 commented Apr 2, 2019 • edited Loading

treeowl commented Apr 2, 2019

Ericson2314 commented Apr 2, 2019

treeowl commented Apr 2, 2019

Ericson2314 commented Apr 2, 2019

Ericson2314 commented Apr 3, 2019

treeowl commented Apr 3, 2019

treeowl commented Apr 3, 2019

Ericson2314 commented Apr 3, 2019 • edited Loading

treeowl commented Apr 3, 2019

treeowl commented Apr 3, 2019 • edited Loading

Ericson2314 commented Apr 3, 2019

Ericson2314 commented Apr 19, 2019 • edited Loading

treeowl commented Apr 19, 2019

treeowl commented Apr 19, 2019

Ericson2314 commented Apr 19, 2019

Ericson2314 commented Apr 19, 2019

Ericson2314 commented Apr 19, 2019

Ericson2314 commented Apr 19, 2019

Ericson2314 commented Apr 19, 2019

treeowl commented Apr 19, 2019

Ericson2314 commented Apr 20, 2019

alexfmpe commented Sep 28, 2022

phadej commented Sep 29, 2022

phadej commented Sep 29, 2022

treeowl commented Sep 29, 2022

konsumlamm commented Sep 29, 2022

phadej commented Sep 29, 2022 • edited Loading

phadej commented Sep 29, 2022

konsumlamm commented Sep 29, 2022

phadej commented Sep 29, 2022 • edited Loading

jwaldmann commented Sep 29, 2022

treeowl commented Sep 29, 2022

konsumlamm commented Sep 29, 2022

alexfmpe commented Oct 2, 2022 • edited Loading

phadej commented Oct 2, 2022

alexfmpe commented Oct 12, 2022 • edited Loading

alexfmpe commented Oct 27, 2022

Impact assessment

treeowl commented Oct 27, 2022

alexfmpe commented Oct 27, 2022

treeowl commented Oct 27, 2022

alexfmpe commented Nov 6, 2022

treeowl commented Nov 6, 2022

alexfmpe commented Nov 6, 2022 • edited Loading

alexfmpe commented Nov 6, 2022

treeowl commented Nov 6, 2022

alexfmpe commented Jul 10, 2023

treeowl commented Jul 10, 2023

Ericson2314 commented Apr 2, 2019 •

edited

Loading

Ericson2314 commented Apr 3, 2019 •

edited

Loading

treeowl commented Apr 3, 2019 •

edited

Loading

Ericson2314 commented Apr 19, 2019 •

edited

Loading

phadej commented Sep 29, 2022 •

edited

Loading

phadej commented Sep 29, 2022 •

edited

Loading

alexfmpe commented Oct 2, 2022 •

edited

Loading

alexfmpe commented Oct 12, 2022 •

edited

Loading

alexfmpe commented Nov 6, 2022 •

edited

Loading