You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This package uses KnownNat whenever it is necessary to pass length information via types. A KnownNat is backed by a Natural, an algebraic datatype with two constructors, which in general cannot be unboxed (unless the simplifier is certain which constructor it is, which is rare). On the other hand Vector stores its length in an Int, and so at all interfaces we have to convert between Natural and Int. Both passing Natural to functions, and having them convert to/from Int incurs a performance penalty.
Worse yet, the conversions are lossy. fromIntegral :: Natural -> Int will silently truncate the higher bits, leaving you with a sized vector whose underlying unsized vector doesn't match its type:
Needless to say this breaks invariants of many functions. Worse yet, the package exports a way to access the reverse conversion from Int to Natural to KnownNat:
knownLength' uses unsafeCoerce under the hood, which in this case is actually unsafe. It generates an instance dictionary for KnownNat 0x10000000000000000, but populates it with the natural 0 instead. This incoherent dictionary breaks type safety, as can be observed via Typeable or simply sameNat:
>Just r =Data.Vector.Sized.knownLength' (Data.Vector.Sized.replicate@0x10000000000000000()) (GHC.TypeNats.sameNat @0Proxy)
>:t r
r::0:~:18446744073709551616> r
Refl
With a little type family you can construct an unsafeCoerce.
As a quick and dirty workaround, consider using fromEnum :: Natural -> Int which will at least error if the argument is out of bounds (this behavior doesn't seem consistent with e.g. fromEnum @Integer). You will have to migrate from the legacy GHC.TypeLits api that uses Integer (btw incurring additional Natural<->Integer conversions), to the new GHC.TypeNats (GHC 8.2+) that uses Natural.
As a more long term solution, over at https://github.com/mniip/finite-typelits/tree/integral I've been experimenting with an extension to Finite that would allow the underlying type to be a different type other than Integer, possibly Int. I've quickly run into essentially all the same issues, and the way the API is currently shaping to be is that an instance dictionary for a "known Int n" can only be solved if n <= maxBound @Int. This would ensure invariants of things like Vector or an Int-based Finite are satisfied, and reifying an Int back into a dictionary remains type-safe.
A challenge there is that much like having to prove KnownNat (n + m) one would have to prove n + m <= maxBound, except this time the constraint might not actually hold. However with the way Vector is using KnownNat this seems like a non-issue: if all places where the vector's underlying length is taken from KnownNat are annotated with the inequality, it should be impossible to construct a vector whose underlying length has overflowed, because the components from which the vector is constructed will need to have size just under maxBound, which should be impossible due to address space limitations. concatMap would be at risk if it weren't using size-unaware concat under the hood.
I don't know how the use-cases of vector-sized would fare with such inequalities sprinkled about, and would love to hear alternative suggestions.
The text was updated successfully, but these errors were encountered:
@mniip: To be honest, I'd rather move to TypeNats anyway: the only reason we didn't were backcompat concerns, which I no longer have. I like the idea of using something like your integral branch: how close are you to a working solution? Efficiency of representation, and added safety, are two things I would really like to have.
This package uses
KnownNat
whenever it is necessary to pass length information via types. AKnownNat
is backed by aNatural
, an algebraic datatype with two constructors, which in general cannot be unboxed (unless the simplifier is certain which constructor it is, which is rare). On the other handVector
stores its length in anInt
, and so at all interfaces we have to convert betweenNatural
andInt
. Both passingNatural
to functions, and having them convert to/fromInt
incurs a performance penalty.Worse yet, the conversions are lossy.
fromIntegral :: Natural -> Int
will silently truncate the higher bits, leaving you with a sized vector whose underlying unsized vector doesn't match its type:Needless to say this breaks invariants of many functions. Worse yet, the package exports a way to access the reverse conversion from
Int
toNatural
toKnownNat
:knownLength'
usesunsafeCoerce
under the hood, which in this case is actually unsafe. It generates an instance dictionary forKnownNat 0x10000000000000000
, but populates it with the natural0
instead. This incoherent dictionary breaks type safety, as can be observed viaTypeable
or simplysameNat
:With a little type family you can construct an
unsafeCoerce
.As a quick and dirty workaround, consider using
fromEnum :: Natural -> Int
which will at least error if the argument is out of bounds (this behavior doesn't seem consistent with e.g.fromEnum @Integer
). You will have to migrate from the legacyGHC.TypeLits
api that usesInteger
(btw incurring additionalNatural
<->Integer
conversions), to the newGHC.TypeNats
(GHC 8.2+) that usesNatural
.As a more long term solution, over at https://github.com/mniip/finite-typelits/tree/integral I've been experimenting with an extension to
Finite
that would allow the underlying type to be a different type other thanInteger
, possiblyInt
. I've quickly run into essentially all the same issues, and the way the API is currently shaping to be is that an instance dictionary for a "known Int n" can only be solved ifn <= maxBound @Int
. This would ensure invariants of things likeVector
or anInt
-basedFinite
are satisfied, and reifying anInt
back into a dictionary remains type-safe.A challenge there is that much like having to prove
KnownNat (n + m)
one would have to proven + m <= maxBound
, except this time the constraint might not actually hold. However with the wayVector
is usingKnownNat
this seems like a non-issue: if all places where the vector's underlying length is taken fromKnownNat
are annotated with the inequality, it should be impossible to construct a vector whose underlying length has overflowed, because the components from which the vector is constructed will need to have size just under maxBound, which should be impossible due to address space limitations.concatMap
would be at risk if it weren't using size-unaware concat under the hood.I don't know how the use-cases of vector-sized would fare with such inequalities sprinkled about, and would love to hear alternative suggestions.
The text was updated successfully, but these errors were encountered: