Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a "syntax AST" for typedefs #175

Open
andrevidela opened this issue Aug 12, 2019 · 9 comments · May be fixed by #204
Open

Add a "syntax AST" for typedefs #175

andrevidela opened this issue Aug 12, 2019 · 9 comments · May be fixed by #204
Labels
code:syntax feature:new Issues for brand new features

Comments

@andrevidela
Copy link
Collaborator

andrevidela commented Aug 12, 2019

All #133 #164 #163 #172 require changes to the AST but do not necessarily change the semantics of the language. This means it would be possible to write a new syntax for typedefs that compiles down to TDef n and that is more ergonomic and feature rich than our core AST TDef n.

Here is a small proposal for a new surface level syntax

bool := 1 + 1

maybe a := nothing := 1 + just := a

either a b := left := a + right := b

eitherAnon a b := a + b

byte := bool ^ 8

list a := nil := 1 + cons := a * list a

this would compile down to

(name bool (+ 1 1))

(name maybe (mu (nothing 1) (just (var 1))))

(name either (mu (left (var 1)) (right (var 2)))

(name eitherAnon (+ (var 0) (var 1)))

(name byte (* (ref bool) (ref bool) (ref bool) (ref bool) (ref bool) (ref bool) (ref bool) (ref bool)))

This would allow us to add new features to typedefs without having to change backends and making sure the semantics of the language are never changed.

I've added a branch which implement this syntax as an EDSL (without the conversion to TDef) here: https://github.com/typedefs/typedefs/tree/surface-syntax the relevant file is src/DSL.idr

This feature would also fix #23

Could you please tell me what you think @clayrat @fredrikNordvallForsberg @wires @epost ?

@epost
Copy link
Member

epost commented Aug 13, 2019

Note: this syntax is mu-free; those should be implicit.

@fredrikNordvallForsberg
Copy link
Contributor

Looks nice, especially with the translation from recursive definitions to explicit mus. This is clearly needed for wider adoption of Typedefs in the long run.

Some bikeshedding:

  • Personally I think reusing := both for named definitions and types of constructors is confusing, especially when associativity is unclear (e.g. a := b := c). How about simply : for the type of constructors (e.g. a := b : c).
  • Similarly, reusing + to separate the different constructors also seems confusing to me (e.g. compare a := b := c + d with a := b := c + d := e). How about using | instead (e.g. (with the previous item also taken into account) a := b : c + d and a := b : c | d : e)?

@andrevidela
Copy link
Collaborator Author

@fredrikNordvallForsberg Your comments make sense, I'll try updating the example so we can see how it looks like.

I currently have no intuition about how to compile recursive types defintions into TMus, for example

Tree e := Leaf : 1 | Branch : e * Tree e * Tree e

This should compile to

(name Tree (mu (Leaf 1)
               (Branch (* (var 1) (var 0) (var 0)
                       )
               )
           )
)

Is it correct to replace every recursive definition by (var 0) and stick a mu at the top level?

@andrevidela
Copy link
Collaborator Author

I've implemented your syntax @fredrikNordvallForsberg at 891c074

I couldn't use | as an operator so I implemented || instead, but the show function shows |. Tell me how it looks.

I was considering replacing * by , but it starts to look a bit strange, see

List a := 1 | a , List 1

instead of

List a := 1 | a * List 1

@fredrikNordvallForsberg
Copy link
Contributor

fredrikNordvallForsberg commented Aug 15, 2019

Is it correct to replace every recursive definition by (var 0) and stick a mu at the top level?

As a first approximation yes, but there is a subtlety with nested mus: every time you go under another mu, you need to shift the index of the variable being substituted in by one.

@fredrikNordvallForsberg
Copy link
Contributor

I've implemented your syntax @fredrikNordvallForsberg at 891c074

That looks better, in my opinion. However I think it would still make sense to keep + for representing TSum?

@andrevidela
Copy link
Collaborator Author

andrevidela commented Aug 15, 2019

@fredrikNordvallForsberg Do you have an example of a recursive definition that compiles to two nested TMu and with a free type variable? I tried to come up with something but my examples don't really work.

Regarding + and | the fact that we have both is just an artifact of me reusing the Num interface for the sake of the prototype. I don't think we should have both syntax in the language. I mean It depends what we want to achieve with the language but as a general rule I don't think it's often beneficial to have multiple ways of achieving the same thing, specially when regarding syntax.

Unless we want to enforce | when separating named constructors and + when using anonymous unions. That would make the grammar a bit more complicated but may be worth exploring

@fredrikNordvallForsberg
Copy link
Contributor

Do you have an example of a recursive definition that compiles to two nested TMu and with a free type variable?

A common example is finitarily branching trees (with data at the leaves, say):

Tree a := Leaf : a | Node : List (Tree a)

which (if we keep the list anonymous) should compile to

treeDef : TDef 1
treeDef = TMu [("Leaf", var 1),
               ("Node", TMu [("Nil", T1),
                             ("Cons", TProd [var 1, var 0])])]

unless I have messed up. The Leaf var 1 refers to the type variable, The Cons var 1 refers to the recursive reference to Tree, and the var 0 refers to the most local mu, e.g. to the list.

Unless we want to enforce | when separating name constructors and + when using anonymous unions.

Yes, that's what I had in mind, sorry if I was unclear!

@andrevidela
Copy link
Collaborator Author

So I tried to retrofit your example into the proposed syntax but since sums involving constructors are automatically converted into a TMu with no scoping rules there is no way to inline the definition of list. It's the same as you've written:

List a := Nil : 1 | Cons : a + List a

Tree a := Leaf : 1 | Node : List (Tree a)

This makes your proposal of having | for constructors a lot stronger since now we can say that every | compiles to a top-level TMu and you can't nest |

andrevidela added a commit that referenced this issue Oct 22, 2019
TDefs are now indexed by the number of free varibales as well as
whether they have references or not.

`TDef' 3 True` means there are three free variables and contains
unbounded references.

`TDef 0 False` means there are no free variables and all references
are resolved.

This allows us to recover totality when computing the Idris Type
of a TDef using the `Ty` function. Also should allow us to remove
the effectful compiling of a TDef in the haskell backend by first
resolving all references into a debrujin index and extending the
previous context with the new indices used by references.

This comes at the cost of the existing dependent parser which is
now basically broken until we find a way to fix it:

- Either only parse TDefR which makes references unusable
- Remove the dependent parse and write an index-free parser #175
andrevidela added a commit that referenced this issue Oct 22, 2019
TDefs are now indexed by the number of free varibales as well as
whether they have references or not.

`TDef' 3 True` means there are three free variables and contains
unbounded references.

`TDef 0 False` means there are no free variables and all references
are resolved.

This allows us to recover totality when computing the Idris Type
of a TDef using the `Ty` function. Also should allow us to remove
the effectful compiling of a TDef in the haskell backend by first
resolving all references into a debrujin index and extending the
previous context with the new indices used by references.

This comes at the cost of the existing dependent parser which is
now basically broken until we find a way to fix it:

- Either only parse TDefR which makes references unusable
- Remove the dependent parse and write an index-free parser #175
andrevidela added a commit that referenced this issue Dec 13, 2019
TDefs are now indexed by the number of free varibales as well as
whether they have references or not.

`TDef' 3 True` means there are three free variables and contains
unbounded references.

`TDef 0 False` means there are no free variables and all references
are resolved.

This allows us to recover totality when computing the Idris Type
of a TDef using the `Ty` function. Also should allow us to remove
the effectful compiling of a TDef in the haskell backend by first
resolving all references into a debrujin index and extending the
previous context with the new indices used by references.

This comes at the cost of the existing dependent parser which is
now basically broken until we find a way to fix it:

- Either only parse TDefR which makes references unusable
- Remove the dependent parse and write an index-free parser #175
andrevidela added a commit that referenced this issue Dec 19, 2019
TDefs are now indexed by the number of free varibales as well as
whether they have references or not.

`TDef' 3 True` means there are three free variables and contains
unbounded references.

`TDef 0 False` means there are no free variables and all references
are resolved.

This allows us to recover totality when computing the Idris Type
of a TDef using the `Ty` function. Also should allow us to remove
the effectful compiling of a TDef in the haskell backend by first
resolving all references into a debrujin index and extending the
previous context with the new indices used by references.

This comes at the cost of the existing dependent parser which is
now basically broken until we find a way to fix it:

- Either only parse TDefR which makes references unusable
- Remove the dependent parse and write an index-free parser #175
@andrevidela andrevidela added this to the Syntax improvements milestone Jan 7, 2020
@andrevidela andrevidela linked a pull request Jun 21, 2020 that will close this issue
@andrevidela andrevidela added the feature:new Issues for brand new features label Jun 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code:syntax feature:new Issues for brand new features
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants