Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Typeable #4097

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from
Draft

Implement Typeable #4097

wants to merge 3 commits into from

Conversation

TheMatten
Copy link

@TheMatten TheMatten commented May 29, 2021

Description of the change

Purescript currently doesn't have Typeable constraint in style of Haskell, which makes it basically impossible to implement some functionality, e.g. safe dynamically typed values or memoizing in VDOM where memoized nodes change type of argument. Generic doesn't help there, because it doesn't see difference between constructors defined in different modules, and external libraries don't help either, because they would require users to implement instances for possibly huge amount of types and would prohibit them from using it with external types (because of orphans being dissallowed).

This PR implements minimal version of Typeable dictionary generation, expecting API to be defined as follows, e.g. in prelude:

module Type.Rep (TypeRep, class Typeable, typeRep, sameTypeRep) where

import Data.Eq ((==))
import Data.Monoid ((<>))
import Data.Show (class Show, show)

newtype TypeRep a = TypeRep String

type role TypeRep nominal

instance showTypeRep :: Show (TypeRep a) where
  show (TypeRep n) = "(TypeRep " <> n <> ")"

class Typeable :: forall k. k -> Constraint
class Typeable a where
  typeRep :: TypeRep a

sameTypeRep :: forall a b. TypeRep a -> TypeRep b -> Boolean
sameTypeRep (TypeRep a) (TypeRep b) = a == b

(edit: disallowed instances in style of Coercible)
where Typeable gets resolved automatically in style of IsSymbol. TypeRep contains statically generated hash created from normalized string representation of the type (using GHC.Fingerprint).

I figured that the change is small enough that instead of just talking about it, I can create prototype that can be either dismissed or refined, so I've created PR instead of an issue. Technically, it fixes #2243, though using much more minimal interface as described above, only providing what's needed to implement functionality like Dynamic or

cast :: forall a b. Typeable a => Typeable b => a -> Maybe b

in some external library.

As far as implementation goes, things I'm not completely sure about are whether it may interact badly with redundant syntax (seems like kind annotations, parens, operators, synonyms and rows passed to rowToSortedList get normalized at that point?) and whether PolyKinds may create some unexpected problems (they're hashed too by extracting them from KindApp right now).

When it comes to compatibility guarantees, I would expect same hashes when building using the same compiler version modulo bugfixes - this means that changes to GHC.Fingeprint should probably result in minor version change.


Checklist:

  • Added the change to the changelog's "Unreleased" section with a reference to this PR (e.g. "- Made a change (#0000)")
  • Added myself to CONTRIBUTORS.md (if this is my first contribution)
  • Linked any existing issues or proposals that this pull request should close
  • Updated or added relevant documentation
  • Added a test for the contribution (if applicable)

@rhendric
Copy link
Member

rhendric commented May 29, 2021

I'm reading the conversation in #2243 and, like several of the participants there, I've never had the need for Typeable or Dynamic. The thing this most reminds me of, though, isn't Generic or Foreign as discussed over there, but Variant. Speaking as someone who's never used them, the Typeable/Dynamic approach seems to be about constructing something like a universal sum over every possible type, whereas Variant allows for open but limited disjoint sums. I think this means that you get less boilerplate when working with a Dynamic than a Variant, but you also completely decouple your producers and your consumers, such that it's easy to change one side only and get run-time errors that a Variant-based approach would catch.

Is that basically right? If so, my current weakly-held opinion is that I'd rather spend compiler code on making working with rows and Variants less painful, if possible, than on promoting the use of Typeable. If not, I'd love to hear about a compelling use case for Typeable that Variant doesn't adequately handle.

@garyb
Copy link
Member

garyb commented May 29, 2021

The Variant comparison seems apt yeah, it's pretty much what Nate was talking about here: #2243 (comment)

I think the big difference though is that variants are tagged with their constructor in the encoding, but Typeable forgoes that even?

@rhendric
Copy link
Member

The TypeRep in a Dynamic seems like it does the same job as the tag in a Variant.

@TheMatten
Copy link
Author

TheMatten commented May 29, 2021

@garyb yeah, problems with subcomponents are precisely what I'm working with myself - for context, I'm currently porting Shpadoinkle (https://shpadoinkle.org) into purescript (https://gitlab.com/thematten/purescript-fluorine).

Now, thing is, premise of it's approach is to make composition simple, and while that's easily doable with simple trees of elements, problems arise once you try to add memoization, because while patching, you have no way to know whether new memoized node isn't actually a completely different one, with different type of argument. Halogen solves this with typed approach as mentioned, but such solution degrades user experience a lot, introducing need for additional boilerplate, possibility of new, unproductive type errors and leaks internal details about components being memoized. All of that only to tell compiler which memoized types may appear in some position. And with runtime representation being basically the same, as @rhendric mentioned. This may bite programmers in pretty arbitrary situations, e.g. during refactoring of components.

In Haskell, Typeable gets misused from time to time by users which are accustomed to approaches common to dynamically typed languages. But there still are some genuine use-cases for it, mostly when some interface allows for embedding of relatively self-contained parts, like e.g. when creating APIs for caching, serializing or debugging. Plus it's safe - in Purescript, people may be inclined to implement such functionality using hacks in raw JS, but because it isn't possible to distinguish newtype's at runtime, they can easily introduce bugs by breaking safety guarantees this way. Personally I would rather see code that overuses Typeable, because there's no way to use it unsafely.

@timjs
Copy link

timjs commented May 30, 2021

I'd love to have Typeable in PureScript. I use it a lot in combination with Dynamic to parse user input, wrap it, pass it around, and match it against the type of an (Html) input field when needed. Wanted to do this in PureScript, but switched to Haskell in the end. Something along these lines:

parse :: Text -> Either Text Dynamic
parse val
  | Just v <- scan val :: Maybe Unit = okay <| Dynamic v
  | Just v <- scan val :: Maybe Bool = okay <| Dynamic v
  | ...
  | otherwise = error <| unwords ["!! Error parsing value", display val |> quote]

insert :: Component a -> Dynamic -> Component a
insert (Field b) (Dynamic b') =
  | Just Refl <- b ~= b' -> done <| Field b'
  | otherwise -> throw <| CouldNotChangeVal (someTypeOf b) (someTypeOf b')
...

I'm not aware of a trick using open variants to achieve the same thing...

@TheMatten
Copy link
Author

TheMatten commented May 30, 2021

@timjs Yeah, once again you could probably create some top level synonym used in place of Dynamic for variant with all possible types that may appear during parsing, but that would be burdensome and would provide no additional safety.

Really, Variant is in a sense "type-directed generation of local, restricted Typeable constraints" - so it basically shifts work from the compiler to users, asking them to create "lookup table" of tokens corresponding to specific types at type level. Question is, whether it's worth it to take that work from them - I would say yes, in same way as I would say yes to e.g. IsSymbol.

@timjs
Copy link

timjs commented Jun 1, 2021

@timjs Yeah, once again you could probably create some top level synonym used in place of Dynamic for variant with all possible types that may appear during parsing, but that would be burdensome and would provide no additional safety.

@TheMatten, that is indeed a possibility, but it assumes a closed world doesn't it? In my example above, the library in which the insert function lives, assumes only that a type implements Typeable. This is an open set of types.

The application, containing the parse function, chooses a particular set of types which it is able to parse. Another application could choose a different set of types. The only restriction by the library is that all types should adhere to Typeable.

As a library author, I don't want to make that choice upfront. So closed variants are not an option. Or is there a way an open variant could model that?

@TheMatten
Copy link
Author

@timjs I imagine you should be able to parameterize context you work in with some variant supplied by the user, similarly to how slots in halogen work. We're technically working with "open" world from programmer's perspective, but the other way of viewing this (which may be closer to what @rhendric is saying) is that in reality you only ever work with finite set of types in your application, and thus you can create some Variant that covers all of them.

What I'm trying to say is that while it's hard to come up with usecase which wouldn't be possible to encode with "big enough" Variant, I think it's putting extra work and mental overhead on programmer and makes APIs more complicated, just to shave off what seems to amount to roughly 60 lines of code in the compiler.

@hdgarrood
Copy link
Contributor

It does not follow that the maintenance cost is low just because this PR has only 60 lines. The potential maintenance cost of supporting an extra feature which ends up being depended on by users is very difficult to predict, but could be significantly more than you might expect. If this implementation turns out to be incorrect, we'll need to diagnose and fix the bug - and sometimes this significantly complicates the implementation. The initial Coercible PR seemed fine but it needed something like 10 followup PRs to get the feature to a stage where it worked correctly, and by the end I think we had quite a bit more code than we did originally. The feature needs documentation, it generates more questions along the lines of "when should I use feature X rather than feature Y", it could interact poorly with other features, etc.

@TheMatten
Copy link
Author

TheMatten commented Jun 1, 2021

@hdgarrood Yeah, I should probably be more concrete about maintenance costs:

  • As far as bugfixes of incorrect behaviour go, current design should be minimalistic and opaque enough to limit possibility of backwards-incompatible changes (unless we later find out that e.g. some types can't be safely covered) - basically, it provides no more than optimized equality of normalized types represented as strings, across single compiler version (maybe comparable to equality of FastStrings in GHC).
  • As far as interaction with other features goes, I think we're enjoying "confluence of design decisions" in this case - compared to GHC, purs comes with much simpler typechecker, doesn't provide features like wired-in type equalities and it's Coercible doesn't construct any interesting runtime proofs. So it should be actually very easy to use them together - something like
    eqTypeRep :: forall a b r. TypeRep a -> TypeRep b -> (Coercible a b => r) -> Maybe r
    can be implemented at library level using sameTypeRep (and unsafeCoerce) without any support from the compiler. (Edit: and as I understand it, the fact that generation happens after desugaring means that types we're working with are already normalized, making hash generation much simpler.)
  • When it comes to related documentation and guidance around it's use, I would be happy to put it together in case Typeable gets implemented. Well written comparison with Variant should be enough to make it clear where each of those makes sense in my opinion.

@f-f
Copy link
Member

f-f commented Jun 1, 2021

I am familiar with Variant, but less familiar with Typeable - since here we're talking about implementing a compiler extension, I'm wondering:

  • if implementing Typeable means that Variant can be implemented on top of it (or viceversa?)
  • would the performance improve? (both of the compiled code, and the compiler)
  • more generally, since it looks like there are two similar usecases here it's worth thinking if it's possible to cover both with the same feature

@TheMatten
Copy link
Author

TheMatten commented Jun 1, 2021

@f-f To answer your points:

  • One can implement unnamed Variant on top of Typeable, because it would basically be type-restricted version of Dynamic (that approach is actually common in Haskell, because it's the fastest safe solution). With named ones, you could use TypeReps as additional keys to access different types with same label. But because Typeable is "open" while Variant is "closed", if you were to implement function that extracts whatever field that is already used, you need to cover case of "it's neither of listed options" to be exhaustive from compiler's point of view. On the other hand, you can't really implement "open" variant using Variant - you can only approximate it by creating global list of types that may appear in variant in your application, as mentioned in the discussion above.
  • Depends on how it's used, but it would be always better or roughly-equal. E.g. if you were to avoid unsafeCoerce in your implementation of variant, you would probably end up with some inductive GADT approximation, which is necessarily going to be slower. Plus, because we don't have actual unnamed variants without Typeable (you still need some index, it doesn't matter whether it's position in type-level list found by a typeclass or name inside of row), one has to provide name of field as an argument to every field-specific operation (Proxy). With (unsafely implemented) named variant though, representation itself would be basically the same, except for length of keys.
  • I would actually say that given the description above, their usecases are actually sligthly different:
    • Variant works better with APIs oriented around extraction, because of explicit labels. It may be better fit for cases where concrete identity of selection matters instead of it's type, like e.g. components in current halogen
    • Typeable works better with APIs oriented around injection, where difference between selections matters mostly from implementation perspective, or where type itself is enough to distinguish purpose of the value. This is the case with Shpadoinkle, which is merely interested in memoization of anonymous nodes.

@hdgarrood
Copy link
Contributor

I'm 👎 because I don't think this provides enough utility/is in enough demand to justify adding a feature to the compiler, sorry.

@TheMatten
Copy link
Author

@hdgarrood no problem 🙂 Alternatively, would it be possible to expose needed information e.g. in #4092?

@hdgarrood
Copy link
Contributor

So adding type information to CoreFn? That is something that comes up from time to time, but I think it's a bigger project than it seems. I don't really know enough about it to be able to comment unfortunately.

@timjs
Copy link

timjs commented Jun 1, 2021

Sorry, I've lost the discussion on Typeable vs Variant a bit.

To make it concrete. Say I've some container type, which is empty or filled with a value of type a. Given a Dynamic, I'd like to update the value inside the container, but obviously only when the types match.

{-# LANGUAGE GADTs #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}

module Typeable where

import Data.Type.Equality
import Type.Reflection

(?=) :: TypeRep a -> TypeRep b -> Maybe (a :~: b)
(?=) = testEquality

-- | A container of type `a`
data Container a
  = Empty
  | Filled a

-- | Essentialy `Dynamic`
data Input where
  Input :: Typeable a => a -> Input

update :: forall a. Typeable a => Container a -> Input -> Container a
update (Empty) (Input y)
  | Just Refl <- alpha ?= typeOf y = Filled y
  | otherwise = Empty
  where
    alpha = typeRep :: TypeRep a
update (Filled x) (Input y)
  | Just Refl <- typeOf x ?= typeOf y = Filled y
  | otherwise = Filled x

Above is working Haskell code. I'd do the same thing in Idris by using a sigma type saving a type witness. I can't see how I would use a Variant of some row to model the same thing. Hope somebody can help me understand this.

@rhendric
Copy link
Member

rhendric commented Jun 1, 2021

@timjs

import Prelude

import Data.Maybe (Maybe(..))
import Data.Symbol (class IsSymbol)
import Data.Variant (Variant, prj)
import Prim.Row as Row
import Type.Proxy (Proxy(..))

-- | A container of type `a`
data Container a
  = Empty
  | Filled a

update :: forall s a r r'. IsSymbol s => Row.Cons s a r r' => Container a -> Variant r' -> Container a
update Empty v = case prj (Proxy :: _ s) v of
  Just y -> Filled y
  Nothing -> Empty
update (Filled x) v = case prj (Proxy :: _ s) v of
  Just y -> Filled y
  Nothing -> Filled x

@timjs
Copy link

timjs commented Jun 1, 2021

Thanks a lot @rhendric! Really appreciate it :-) Now I understand the way to use this.

Only thing left is an error on unknown type variables in the instance head of IsSymbol in update when using this solution. Any ideas?

@timjs
Copy link

timjs commented Jun 1, 2021

Really didn't know Variant was so powerful :-O

In that case I agree with @hdgarrood that Typeable is not a necessary extension we'd need to implement and maintain. Maybe focussing on the ergonomics of Variant would be a better idea!

@TheMatten
Copy link
Author

Only thing left is an error on unknown type variables in the instance head of IsSymbol in update when using this solution. Any ideas?

You will either need to fix s into something specifc, or pass in Proxy s that provides it as an argument.

@rhendric
Copy link
Member

rhendric commented Jun 1, 2021

You will either need to fix s into something specifc, or pass in Proxy s that provides it as an argument.

Yes, exactly. It can be worked around with some type-level fun, as long as everything around it is sufficiently well-typed.

Here's a more fully-worked example incorporating your previous parse example as well:

module Main where

import Prelude

import Data.Either (Either(..))
import Data.Maybe (Maybe(..))
import Data.Symbol (class IsSymbol)
import Data.Variant (Variant, inj, prj)
import Effect (Effect)
import Prim.Row as Row
import Prim.RowList (RowList, class RowToList)
import Prim.RowList as RowList
import TryPureScript (p, render, text)
import Type.Data.Boolean (False, True)
import Type.Proxy (Proxy(..))


-- Let's pretend we aren't on try.purescript.org
log :: String -> Effect Unit
log = text >>> p >>> render


-- This is something that should be in a library somewhere (maybe it is?)
class FindTypeInRowList :: forall k. Symbol -> k -> RowList k -> Boolean -> Constraint
class FindTypeInRowList s a rl b | a rl -> s b

instance ftirlNil :: FindTypeInRowList s a RowList.Nil False
else instance ftirlNext :: FindTypeInRowList s a (RowList.Cons s a rl) True
else instance ftirlRec :: FindTypeInRowList s a rl' b => FindTypeInRowList s a (RowList.Cons s' a' rl') b


data Container a
  = Empty
  | Filled a
  
instance showContainer :: Show a => Show (Container a) where
  show Empty = "Empty"
  show (Filled a) = "(Filled " <> show a <> ")"

-- This needs a proxy to determine s.
update' :: forall p s a r r'. IsSymbol s => Row.Cons s a r r' => p s -> Container a -> Variant r' -> Container a
update' p Empty v = case prj p v of
  Just y -> Filled y
  Nothing -> Empty
update' p (Filled x) v = case prj p v of
  Just y -> Filled y
  Nothing -> Filled x

-- This uses type-level shenanigans to determine s, so doesn't require a
-- proxy as long as the types are sufficiently determined at the call site.
update :: forall s a r r' rl. RowToList r rl => IsSymbol s => Row.Cons s a r' r => FindTypeInRowList s a rl True => Container a -> Variant r -> Container a
update = update' (Proxy :: _ s)


class Scan a where
  scan :: String -> Maybe a

instance scanUnit :: Scan Unit where
  scan = case _ of
    "unit" -> Just unit
    _ -> Nothing

instance scanBool :: Scan Boolean where
  scan = case _ of
    "true" -> Just true
    "false" -> Just false
    _ -> Nothing

-- purs warns about this _. You could spell out this type entirely as
-- parse :: forall r. String -> Either String (Variant (u :: Unit, b :: Boolean | r))
-- but it'd be nice if you didn't have to.
parse :: String -> Either String (Variant _)
parse val
  | Just v <- (scan val :: Maybe Unit) = Right $ inj (Proxy :: _ "u") v
  | Just v <- (scan val :: Maybe Boolean) = Right $ inj (Proxy :: _ "b") v
  | otherwise = Left $ "error scanning " <> val

main :: Effect Unit
main = do
  let v :: Either String (Variant (u :: Unit, b :: Boolean))
      v = parse "true"
  let c1 :: Container Unit
      c1 = Empty
  let c2 :: Container Boolean
      c2 = Filled false
  let result1 = update c1 <$> v
  let result2 = update c2 <$> v
  log (show result1)
  log (show result2)

There are some obvious warts here. The biggest is that we need to provide labels u and b in our row type. I could imagine a version of Variant which uses HLists instead of rows, and then the tag could be provided by the equivalent of FindTypeInRowList—this could be written with today's compiler, to my knowledge. Other issues mostly involve type inference and making it easier to elide the full row types without generating warnings. I think improving all of these could be better options than implementing Typeable.

One thing I still don't understand is what Typeable can do for memoization. @TheMatten, can you point me to a small example of what you'd want that for? I'm not familiar with Shpadoinkle, unfortunately.

@TheMatten
Copy link
Author

@rhendric sure - I'm trying to implement depending combinator: https://gitlab.com/platonic/shpadoinkle/-/merge_requests/182/diffs#927128ccef388486a73f825b27dde5f95305bc72_325_348
Basically, what it does is that it stores Typeable argument alongside the rendering function of a memoized node - this way, you can try to cast old argument into type of new one to compare them for equality, or rerender if either of those actions fails. I don't really see a way to implement this with Variant without affecting Html type, but I don't think this should appear in it in the first place - it's an optimization detail, not a safety/corectness concern.

(BTW, this ignores the fact that the rendering function itself may change. But that's problem for StaticPointers, which should be implementable with purs codegen.)

@rhendric
Copy link
Member

rhendric commented Jun 1, 2021

I might not understand that example fully because I don't have the larger context of what else is going on in that MR, but let me try anyway.

If you get/require an implementation of StaticPointer, is that enough?

data DependencyF a = DependencyF (StaticPointer (a -> a -> Boolean)) a
newtype Dependency = Dependency (Exists DependencyF)
instance eqDependency :: Eq Dependency where
  eq (Dependency d1) (Dependency d2) =
    d1 # runExists \(DependencyF staticEqA a) ->
      d2 # runExists \(DependencyF staticEqB b) ->
        staticKey staticEqA == staticKey staticEqB && any (\eq' -> eq' a (unsafeCoerce b)) (derefStaticPtr staticEqA)

depending :: forall a m c. StaticPointer (a -> a -> Boolean) -> (a -> Html m c) -> a -> Html m c
depending staticEq f x = Html (\a d b c -> d (Dependency (mkExists (DependencyF staticEq x)))
                       $ case f x of Html h' -> h' a d b c)

@TheMatten
Copy link
Author

Hmm, very cool idea, but that wouldn't work with something like \_ _ -> true or unsafeRefEq, where the function could be used with multiple nodes, would it?

@rhendric
Copy link
Member

rhendric commented Jun 1, 2021

Yeah, you'd want the production of the StaticPointer to be safe.

But if that general idea is sufficient on the consumer side of things, once you have purs codegen you could make the user side much more foolproof. Just:

class StaticEq a where
  staticEq :: StaticPointer (a -> a -> Boolean)

instance universalStaticEqInstance :: Eq a => StaticEq a where
  staticEq = unsafeThrow "not to be used without some fancy CoreFn transformations"


depending :: forall a m c. StaticEq a => (a -> Html m c) -> a -> Html m c
depending f x = Html (\a d b c -> d (Dependency (mkExists (DependencyF staticEq x)))
                $ case f x of Html h' -> h' a d b c)

Then rewrite any calls to universalStaticEqInstance(...) in CoreFn to StaticEq(static(Data_Eq.eq(...))) prior to running the StaticPointer transformation. If the StaticPointer transformation succeeds, you got safe pointers to unique equality functions. If the Eq instance contains any non-staticable subexpressions, the transformation fails (and the user has to move the StaticEq constraint around probably).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatically generate instances for Data.Typeable
6 participants