Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn about variable exports from subexpressions #9134

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

richcarl
Copy link
Contributor

@richcarl richcarl commented Dec 2, 2024

Except for "block structures" like case, receive, etc., there is really no good reason to allow variable bindings to be exported from a subexpression. It is very rarely done in practice, and the resulting code is always confusing, having a distinct C flavour. For example:

        case file:open(File,AllOpts=[write,{encoding,utf8}]) of

instead of

        AllOpts=[write,{encoding,utf8}],
        case file:open(File,AllOpts) of

At the same time, the requirement that the compiler must handle the exporting of variable bindings from any and every subexpression is an unnecessary tax on all compiler code and development tooling, which has to trace these possible bindings correctly. Doing this is complicated and error prone. It would be good if such exports could be phased out completely.

This patch adds source information to variables exported from non-block subexpressions, such as arguments to calls, tuples, etc., and adds an unconditional warning (irrespective of the warn_export_vars compiler flag) when such exports are used. It also fixes all such instances in the OTP code (about 20 files in the entire code base - looks like it's been done by a few individual authors only).

Copy link
Contributor

github-actions bot commented Dec 2, 2024

CT Test Results

   13 files    245 suites   4h 9m 11s ⏱️
3 958 tests 3 533 ✅ 425 💤 0 ❌
5 734 runs  5 111 ✅ 623 💤 0 ❌

Results for commit 3e0a184.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

lib/mnesia/src/mnesia_recover.erl Outdated Show resolved Hide resolved
lib/mnesia/src/mnesia_recover.erl Outdated Show resolved Hide resolved
@richcarl
Copy link
Contributor Author

richcarl commented Dec 7, 2024

See #9153 for a way of allowing the use of = in comprehensions, making some of the export fixups in this PR much nicer.

@rickard-green rickard-green added the team:LG Assigned to OTP language group label Dec 9, 2024
@josevalim
Copy link
Contributor

Unfortunately this introduces issues to code generation:

  1. If you have a macro or parse transform that rewrites an expression EXPR into another expression, which may either have a call or a variable, you may accidentally have variables as arguments inside functions, which will now fail. The parse transform author may need additional passes to extract it out or to wrap it.

  2. And because Elixir relies on code generation, we would need to do the same. So this is a breaking change from Elixir's point of view. We could move this concern into the Elixir compiler, although we would probably need time.

Luckily, there is a simple solution to both problems above, which is to simply wrap each argument into a block expression or a case, since this patch seems to check only for =? But to me that means this change is inconsistent: we can still define variables in subexpression, just not by using =. So the compiler still needs to deal with all of the complexity to make this work.

That said, I'd say this should not really be the job of the compiler, it should be the job of a linter, especially because Erlang does not distinguish between statements and expressions, so expressions should be allowed anywhere as possible. At best, leave it as an opt-in/opt-out warning.


FWIW, if I could change one thing about this feature is to make the variable defined in a previous argument always visible. This code warns because A won't ever match:

foo(1, A = 2, A = 3).

But this code does not compile:

foo(1, A = 2, A).

I think I would rather prefer if A was always available for consistency. And I think it would make the Elixir compiler simpler but I am not sure if that's true for Erlang.

@richcarl
Copy link
Contributor Author

@josevalim Note that you would only get a warning if there is an actual use of the variable outside the subexpression context where it is defined, as in foo(X=1), bar(X). You don't generate code like that, do you?

@josevalim
Copy link
Contributor

I think that would be rare but it can still happen. My biggest concern is that macros and parse transforms may traverse AST to perform substitutions and transformations. Except for patterns/guards, they don't really need to worry about placement, this pull request would change that potentially pushing corner cases or harder to understand errors.

@richcarl
Copy link
Contributor Author

FWIW, if I could change one thing about this feature is to make the variable defined in a previous argument always visible. This code warns because A won't ever match:

foo(1, A = 2, A = 3).

But this code does not compile:

foo(1, A = 2, A).

I think I would rather prefer if A was always available for consistency.

It's not that A is available in the second argument in the first example, it's that both argument subexpressions define and export the same new variable (neither subexpression depends on the other), and then a match assertion gets inserted to ensure that they are the same value. It is unlikely that anyone does this sort of thing in practice, but it could happen by mistake, and the compiler must ensure that the variable has a consistent value, not randomly choosing either side. If exporting from subexpressions was dropped, there would be no need for that assertion. Instinctively I don't like the idea of starting to pass on the environment from left to right inside the argument list; it feels like it will just cause more complications.

@josevalim
Copy link
Contributor

Thank you, that's a very good interpretation. I assume if Erlang were to introduce a pin operator, it would have to track the variables from before the call? e.g. foo(1, A = 2, ^A) fail to compile if A was not defined before.

@richcarl
Copy link
Contributor Author

Thank you, that's a very good interpretation. I assume if Erlang were to introduce a pin operator, it would have to track the variables from before the call? e.g. foo(1, A = 2, ^A) fail to compile if A was not defined before.

Right now, both arguments are already evaluated in the environment as it is before the call. E.g. A=0, foo(1, A = 2, A) compiles but the inner = is now a match, and the value of A is 0 everywhere. If you'd add left-to-right binding then if A is a new variable, it would become visible in the second argument, as in foo(1, A = 2, A), without any pinning operator. With pinning added, you could writeA=0, foo(1, A = 2, ^A), but the A = 2 is still a match, not a shadowing new variable, and the ^ annotation is pointless. If you'd also add that bindings would shadow in a sequence, pinning would work, yes, but shadowing is yet another complication.

In general, I think it's best if arguments are always independent subexpressions and can be reordered at will without having to think about variables being defined on the right but not on the left. It would mean that for a simple refactoring you'd have to also move definitions out of one subexpression and into the leftmost one that needs it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team:LG Assigned to OTP language group
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants