Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement let #1426

Merged
merged 15 commits into from
Nov 1, 2017
Merged

implement let #1426

merged 15 commits into from
Nov 1, 2017

Conversation

gilch
Copy link
Member

@gilch gilch commented Sep 17, 2017

Very experimental, unpolished, and not quite finished, but this approach really seems viable--all those tests pass, including those with yield, return, and break. I thought I'd put it up at this point so you guys can look at it and offer thoughts.

Oh, and this also fixes #1420. I think. I needed it working properly to try this.

This let expands its binding symbols in its lexical scope to dict lookups in a hidden (gensymed) dict. The trick is a proper macroexpand-all with "symbol macros". Since macros are pre-expanded, we just (heh) have to account for every special form properly, to know which symbols to expand. But that means it's a monster compared to the old let.

This kind of thing is motivation to keep the special forms as simple as we reasonably can, since it makes this type of thing easier. I sure wished they were simpler when implementing this. Complexity has costs. Some of this is on Python, but the compiler should probably stick as close as reasonably possible to that, without adding features. Most extra features beyond Python belong at the macro level or higher. Any additions or changes to special forms could potentially break macros like this, so the more complex our compiler, the higher the maintenance burden.

This approach could also give us local symbol macros in Hy, since I basically had to implement them to get this far. It might be possible to factor it out of the let implementation.

@gilch
Copy link
Member Author

gilch commented Sep 17, 2017

I'm confused by that test result. It's only failing on version Python versions 3.4 and 3.5. I only tested 3.6 before I pushed, but that failing test is not testing a file I changed. Did a bug sneak through to master in a recent change?

Is it the -1 defaults in HyObject? That's a pretty deep change. I just did that to avoid repl crashes when developing. #1412. It doesn't need to be there. I'm going to remove that and see if it helps.

@gilch
Copy link
Member Author

gilch commented Sep 17, 2017

Yup. That was it.

Any bugs you guys find will need a test. Nested functions have a known issue with default values, I need to expand the value, but protect the argument symbol. That's just not implemented yet.

I'm also thinking that nonlocal and global should protect a symbol from expansion in the relevant scope. But Python is strangely flexible about where it allows you to put them, so finding them is tricky. That's not implemented yet either.

There might be other special cases of special forms that need special treatment that I haven't thought of yet. Python accepts dict lookups in place of plain identifiers in most places though, even in assignments and deletions, so most special forms just work.

@gilch gilch force-pushed the letmacro branch 2 times, most recently from 7d82d7a to 60992cf Compare September 18, 2017 02:22
@gilch
Copy link
Member Author

gilch commented Sep 18, 2017

Functions should be working now. I haven't tested &kwonly yet, but it uses the same implementation that &opitonal does, so it should work.

For now, using nonlocal or global on a let-bound symbol in a let body is an error. But the error message makes no sense unless you know the expansion.

@Kodiologist
Copy link
Member

Kodiologist commented Sep 18, 2017

This way madness lies, but if you're brave enough to try, far be it from me to stop you.

@gilch
Copy link
Member Author

gilch commented Sep 18, 2017

  1. Smile villainously and sneak off to your hydeaway and do unspeakable things.

Madness? Bwa ha ha! So many branches in a 100+ line macro makes me uneasy. This is not design. It's organic growth from the vaguest idea of how this could possibly work. Lisp is good at that. But... the tests all pass!

Once I round out the test harness a bit, I'll try to factor out the symbol macros. If I can refactor this enough it might not seem so scary to others (or me in three months). And simplifying the compiler would also help a bit. I wouldn't have to check for so many edge cases if our special forms didn't have them. We can move some of the special form functionality into macros.

@gilch
Copy link
Member Author

gilch commented Sep 18, 2017

When pondering how nonlocal should work, I realized that we don't need the dict at all. Clever symbol renaming is enough.

This means the let macro creates ordinary locals now, without the overhead of a dict lookup. It's a cost-free abstraction. You can use let without guilt in nested loops!

I also added more tests. That's almost everything I'd planned, but the global form still isn't supported properly. It will do the wrong thing in some cases.

@Kodiologist
Copy link
Member

Kodiologist commented Sep 18, 2017

Probably the reason that defmacro and deftag are special forms instead of macros is that they need to call _compile_time_hack (so you can call the newly defined macro later in the same file). It might be neater to use a generic special form that causes an expression to be evaluated at compile-time.

@gilch
Copy link
Member Author

gilch commented Sep 18, 2017

With #1414, and #1416, would we still need _compile_time_hack?

causes an expression to be evaluated at compile-time.

Does eval-when-compile or eval-and-compile do that? #1229

@gilch
Copy link
Member Author

gilch commented Sep 18, 2017

There's the global form.

The semantics were undefined, since let doesn't exist in Python. I think we're free to declare and document our own semantics, as long as they're sensible and consistent.

Now, a global statement should protect its named symbols from let-expansion thereafter. But that only applies to the rest of the current protection scope. The test demonstrates this.

@Kodiologist
Copy link
Member

With #1414, and #1416, would we still need _compile_time_hack?

I'm not sure. It would probably depend on the details of the new semantics. For example, if each top-level form were executed after being compiled, then _compile_time_hack would still be necessary if you wanted a macro defined in a form to be available later in the same form.

causes an expression to be evaluated at compile-time.

Does eval-when-compile or eval-and-compile do that?

¯\_(ツ)_/¯

Unrelatedly, if you're still interested in symbol namespacing (#277), it's probably worth thinking about how that would interact with let. There may even be a nice to way implement let in terms of it.

@gilch
Copy link
Member Author

gilch commented Sep 19, 2017

symbol namespacing (#277), it's probably worth thinking about how that would interact

This let already handles the dot-access symbols. Either of the syntax proposals from #1407 could be made to work. The root _# symbol could be protected, or symbols with a / could be. I don't foresee any problems with that.

There may even be a nice to way implement let in terms of it.

It's not obvious how. As mentioned in #1407, we could implement dynamic variables with it, but that's different than the lexical variables implemented here. We'd probably call the dynamic form binding instead of let.

@gilch
Copy link
Member Author

gilch commented Sep 19, 2017

I factored out smacrolet for local symbol macros. It doesn't have its own tests, but let is using it now, and that's tested.

@Kodiologist
Copy link
Member

If you haven't already seen it, #1216 has a list of let-related issues that could be useful.

@gilch
Copy link
Member Author

gilch commented Sep 20, 2017

I'll look through those. #180 turned up an issue--macroexpand-all was using the context of the hy.contrib.walk module, not the current module, for expansions. So require wouldn't make a macro available to it, unlike macroexpand and macroexpand-1, which use the context of the calling module.

I was able to make macroexpand-all work the same was as macroexpand, with calling-module-name, but it doesn't seem to work for let, which needs to call macroexpand-all during expansion, but in the context where the let invocation appears. It needs to know the environment it's being expanded in, or at least the name of the module.

We might need to change how defmacro is implemented a little to resolve this, but we were talking about that anyway #1416. I know Clojure macros have a hidden &env parameter. We might need something like that.

@gilch
Copy link
Member Author

gilch commented Sep 20, 2017

Looking at those other let-related issues from #1216--

#576 was about a dynamic let like in Emacs. We'd call it binding if we get around to implementing dynamic variables hylang/hyrule#51 #1407.
#664, the motivating example works.
#703 works.
#1000 seems like @paultag was trying to do what I did. That looks a lot like my original dict lookup version. But doing it correctly takes a lot more code, and I've since advanced past that point.
#1055 was my issue, it's obsolete, besides implementing named break/continue. This let doesn't break break/continue/yield, or return. And async/await (though not implemented yet). should similarly have no problems.
#1056, my issue again, and we won't need a shim if we have a working let.

#1212 is still an issue. I'm expanding a binding that doesn't exist yet.

@gilch
Copy link
Member Author

gilch commented Sep 23, 2017

That should fix #1212.

I also added an xfail for the issue about macros expanding in the wrong module context. The only way I can think of to fix that, short of deferring expansion to runtime (not worth it), is to require the user to pass the name of the current module to let. Not an expression that evaluates to it, the actual name, as a string literal, like (let "__main__" [...) I shouldn't have to do this, since that information should be known to the compiler at expansion time.

@gilch
Copy link
Member Author

gilch commented Sep 23, 2017

That last commit adds an implicit &name parameter to all macro definitions. The compiler will set &name to the name of the module it's expanding in, which allows the macro to use the proper context for preexpansions. And the let macro now does this.

I'm not entirely sure if that's the best way to handle it, but Clojure does something similar.

This makes the error messages for macroexpansions less pretty. But we need to overhaul those anyway #1429. It may be a separate issue best handled in another PR. I've xfailed the relevant test.

@gilch
Copy link
Member Author

gilch commented Sep 23, 2017

Okay @hylang/core, I missed things before, but I think the new let implementation is ready. Care to prove me wrong?

@Kodiologist
Copy link
Member

You seem to still have a few issues listed under ;;;; deferred.

The documentation should explain that the let is implemented by mangling variable names, since this breaks introspection (e.g., (let [x 1] (get (locals) "x")) does the wrong thing). Similarly, it seems worth mentioning that the variable exists at least until its parent function terminates, so leaving a let won't trigger a destructor (__del__), and let variables bound outside a function will hang around as globals. (You might want to begin mangled names with an underscore so import * won't bring in let bindings.) Finally, the complex way that let interacts with macros is not obvious.

defclass will assign the class to the Python scope, even if it shares the name of a let binding.

This is true for the class name, but class member definitions will neither assign to the unmangled name nor the variable introduced by let, but instead, to a class variable with the let-mangled name. For example, (let [x 2] (defclass C [] [x 4])) compiles to :let_1235::x = 2; class C: :let_1235::x = 4. Likewise with (defclass C [] [] (setv x 4)). Is this a bug?

Is it a bug that (let [x 2] (defn f [] (setv x 3) x) (print (f)) (print x)) creates a new local x for f? Given, that's how Python usually works, but I thought let was supposed to override the usual scoping rules. The inner x does get mangled, but without an injected nonlocal or something, it still ends up as a new variable.

@gilch
Copy link
Member Author

gilch commented Sep 24, 2017

You seem to still have a few issues listed under ;;;; deferred.

defmacro deftag defclass import eval_and_compile eval_when_compile require.

I think I should make import and requrie protected from any let mangling.

I don't understand what scope eval_and_compile/eval_when_compile use. Should I let-mangle symbols in them or not?

defclass could perhaps be made to work like defn--expand to a def and class. But this causes problems with class names, which is why I'm not mangling the class name, nor its variable.

If defmacro and deftag become macros instead of special forms, they don't need special treatment. I think I could probably do this.

documentation should explain that the let is implemented by mangling variable names, since this breaks introspection

OK, documenting this doesn't hurt. On Python3, (locals) is effectively read only. It also gets polluted with Hy "anon" vars, which isn't documented either. It's still readable enough for debugging. That's why I wasn't too concerned with introspection.

Similarly, it seems worth mentioning that the variable exists at least until its parent function terminates

Hy special forms with anon vars also have this issue, and so do a lot of other macros that create gensyms with setv. Is the let documentation really the right place to put this?

might want to begin mangled names with an underscore

I feel like the gensym implementation should already be doing that. We don't ever want to import * gensyms, since their names depend on the order of evaluation. Many other macros create these, and will have the same issue if used at toplevel. The let macro is the wrong place to fix it. If it's fixed in gensym, let-mangled names will also have the _ prefix.

Using the : prefix was a bit of a hack to begin with. Normally you can't create such symbols since it would be a keyword instead.

That doesn't stop you from clobbering a gensym with (HySymbol ":G_1235"). Maybe it's enough to tell the user not to do that (on pain of Bad Stuff Happening), but I'm not sure how to enforce that gensyms are unique short of interning all symbols.

But a : isn't the only possibility. We could make gensyms start with _(), for example. You normally can't create symbols like that either.

Finally, the complex way that let interacts with macros is not obvious.

What do you mean? let shouldn't break any macros, since it pre-expands them. The bindings are only supposed to be meaningful at runtime, and nested macros only see the original symbols.

but class member definitions will neither assign to the unmangled name nor the variable introduced by let, but instead, to a class variable with the let-mangled name. [...] Is this a bug?

That does seem undesirable. The earlier dict implementation of let was better behaved here, as it would at least update the let binding. You can still do that with a nonlocal at the top of the defclass body.

Note that a let form inside a defclass will also litter names to the class dict, but so would many other macros and special forms. The problem isn't really the extra names (in most cases--__slots__ and metaclasses might complain), but the inability to set a desired name. It's actually still possible using something like (assoc (vars) 'foo foo), but I feel like it shouldn't be that hard.

Potential options

  1. document this behavior, and explain that nonlocal or (assoc (vars) ... is required.
  2. switch let back to the earlier dict implementation. This adds some runtime overhead for the lookups, (no worse than for self.foo) and changes the scoping rules a bit, but it's easy to do. But then how should nonlocal work on a let-bound name? It could be an error, or protect it thereafter.
  3. magic a nonlocal statement at the top of a defclass for any active let bindings. This would disallow let on Python 2. I'd want to remove the [] assignment block of defclass.
  4. magic a tuple display at the top of a defclass for any active bindings. This has no side effects, so it won't affect the class dict, but it will make assignments to any mangled names an error, since it's referenced before assignment. But this breaks nonlocal too.
  5. protect defclass forms entirely from let mangling. This means you can't read let bindings either, not even in the methods. Also easy, and less undesirable, but makes some legitimate use cases awkward.
  6. protect assignment targets at the top level of a defclass. Complicated. Assignment targets may be unpacking tuples, or a dict index, which may want to use the let binding too, and nonlocal should prevent this protection.
  7. detect if assignment targets are a let binding and raise a macro expansion error. I'm not sure if this is any easier.
  8. make let bindings constants, like in Clojure. Any attempts to assign to one of them would be an error. If you want to "change" its value, you have to shadow it with a new let binding. There are various ways to do this.

I'm leaning towards option 2, with nonlocal on a let-binding an error. I'd make global on a let binding an error too, to be consistent.

Is it a bug that (let [x 2] (defn f [] (setv x 3) x) (print (f)) (print x)) creates a new local x for f?

No, that is the intended behavior. If you want to assign to the let binding, you have to use nonlocal explicitly. It does get mangled even without nonlocal, but (outside of introspection), this is harmless.

I thought let was supposed to override the usual scoping rules

The earlier dict implementation of let bindings were implicitly nonlocal in the entire let body. This version is supposed to work more like Python.

@gilch gilch force-pushed the letmacro branch 3 times, most recently from 94f30c3 to 01f3c40 Compare September 24, 2017 18:10
@gilch
Copy link
Member Author

gilch commented Sep 24, 2017

I implemented defmacro and deftag as macros. This probably conflicts with #1430.

I changed the let implementation back to a dict. Applying global or nonlocal to a let-bound name is now an error.

I changed the gensym format to start with _;.

@Kodiologist
Copy link
Member

Kodiologist commented Sep 24, 2017

The new changes sound good to me. I'm impressed you managed to make defmacro a macro. Yeah, we can kill most of #1430.

I think I should make import and requrie protected from any let mangling.

For require, agreed. let isn't supposed to touch macro names. For import, you don't want to mangle the original module name, since that would cause the import to fail, but you want (import foo) compiled to import foo as _;let|1235['foo'] or so when foo is let-bound, right? Similarly for from foo import.

I don't understand what scope eval_and_compile/eval_when_compile use. Should I let-mangle symbols in them or not?

They're pretty much the same as creating a macro and then immediately calling it. So, they're in the compile-time global scope. Your strategy, as I understand, is to expand all macros in the body of let and only then mangle. So I think the logical thing to do is expand eval-and-compile and eval-when-compile, and then mangle the results in the case of eval-and-compile (for eval-when-compile, there's nothing to mangle).

Hy special forms with anon vars also have this issue, and so do a lot of other macros that create gensyms with setv. Is the let documentation really the right place to put this?

That's a good point. It may be an important difference that no specific lifetime is implied for our other anonymous variables and gensyms, whereas for (let …), it seems to be expected that the created variables are no longer around once the let is done. Ultimately, our let isn't going to have the exact same semantics as that of any other Lisp (because so it is with Hy as a whole), so it's up to us what to do.

Finally, the complex way that let interacts with macros is not obvious.

What do you mean?

Chiefly that let expands the macros inside it before doing anything, unlike all other Hy macros, which treat macro calls like any other form. Consequently, e.g., (setv x 1) (defmacro m [] 'x) (let [x 2] (print (m))) prints 2, not 1 as you'd expect if you expected a more literal notion of lexical binding. Also, (eval-when-compile (setv x 1)) (defmacro m [] x) (let [ignored (eval-and-compile (setv x 2) "ignored")] (print (m))) prints 1, not 2 as you'd get if the (eval-and-compile …) in the binding list was expanded before (m); most of the time, macro calls and eval-X-compiles are expanded in the order they appear. I don't think any of these things are bugs, but they might be surprising.

Copy link
Contributor

@tuturto tuturto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I waded through the code (there's plenty of it) and tried breaking the new implementation without succeeding, so this looks as good as it can get to me.

Copy link
Contributor

@refi64 refi64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is insane. :O

@rcarmo You're going to like this one...

@rcarmo
Copy link

rcarmo commented Nov 1, 2017 via email

@refi64 refi64 merged commit 49d2523 into hylang:master Nov 1, 2017
@Kodiologist
Copy link
Member

Kodiologist commented Nov 1, 2017

Wait a minute, guys, are you sure this is ready to merge? import hasn't yet been tended to and the unusual way let treats macros isn't documented. I'd back this out till it's all done.

@Kodiologist
Copy link
Member

@kirbyfan64 Also, you should rebase rather than leaving a merge commit inside the branch, per CONTRIBUTING.rst.

@refi64
Copy link
Contributor

refi64 commented Nov 2, 2017

Ah, damn it...another repo I work with mandates merge commits, and I keep mixing these up...

Also it seems done enough. @gilch had already requested my review to be the second, this isn't in core, it's tagged as experimental, and it'll be easier reviewing two PRs than one mega-sized one.

@Kodiologist
Copy link
Member

The other project wants you to include a merge from master into the PR's branch, rather than the reverse? I wonder what the point of that is.

Okay, I'll open issues.

@gilch gilch mentioned this pull request Nov 4, 2017
@rcarmo
Copy link

rcarmo commented Nov 14, 2017

Hmmm. Merged already? I wonder... Need to take some time and check this out, I miss Hy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

macroexpand-all expands too much
6 participants