Warn about variable exports from subexpressions #9134

richcarl · 2024-12-02T14:23:29Z

Except for "block structures" like case, receive, etc., there is really no good reason to allow variable bindings to be exported from a subexpression. It is very rarely done in practice, and the resulting code is always confusing, having a distinct C flavour. For example:

        case file:open(File,AllOpts=[write,{encoding,utf8}]) of

instead of

        AllOpts=[write,{encoding,utf8}],
        case file:open(File,AllOpts) of

At the same time, the requirement that the compiler must handle the exporting of variable bindings from any and every subexpression is an unnecessary tax on all compiler code and development tooling, which has to trace these possible bindings correctly. Doing this is complicated and error prone. It would be good if such exports could be phased out completely.

This patch adds source information to variables exported from non-block subexpressions, such as arguments to calls, tuples, etc., and adds an unconditional warning (irrespective of the warn_export_vars compiler flag) when such exports are used. It also fixes all such instances in the OTP code (about 20 files in the entire code base - looks like it's been done by a few individual authors only).

There is never a good reason for placing variable bindings inside the arguments of an expression.

github-actions · 2024-12-02T14:24:23Z

CT Test Results

13 files 245 suites 4h 9m 11s ⏱️
3 958 tests 3 533 ✅ 425 💤 0 ❌
5 734 runs 5 111 ✅ 623 💤 0 ❌

Results for commit 3e0a184.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

lib/mnesia/src/mnesia_recover.erl

richcarl · 2024-12-07T18:37:05Z

See #9153 for a way of allowing the use of = in comprehensions, making some of the export fixups in this PR much nicer.

josevalim · 2024-12-10T09:37:52Z

Unfortunately this introduces issues to code generation:

If you have a macro or parse transform that rewrites an expression EXPR into another expression, which may either have a call or a variable, you may accidentally have variables as arguments inside functions, which will now fail. The parse transform author may need additional passes to extract it out or to wrap it.
And because Elixir relies on code generation, we would need to do the same. So this is a breaking change from Elixir's point of view. We could move this concern into the Elixir compiler, although we would probably need time.

Luckily, there is a simple solution to both problems above, which is to simply wrap each argument into a block expression or a case, since this patch seems to check only for =? But to me that means this change is inconsistent: we can still define variables in subexpression, just not by using =. So the compiler still needs to deal with all of the complexity to make this work.

That said, I'd say this should not really be the job of the compiler, it should be the job of a linter, especially because Erlang does not distinguish between statements and expressions, so expressions should be allowed anywhere as possible. At best, leave it as an opt-in/opt-out warning.

FWIW, if I could change one thing about this feature is to make the variable defined in a previous argument always visible. This code warns because A won't ever match:

foo(1, A = 2, A = 3).

But this code does not compile:

foo(1, A = 2, A).

I think I would rather prefer if A was always available for consistency. And I think it would make the Elixir compiler simpler but I am not sure if that's true for Erlang.

richcarl · 2024-12-10T19:21:41Z

@josevalim Note that you would only get a warning if there is an actual use of the variable outside the subexpression context where it is defined, as in foo(X=1), bar(X). You don't generate code like that, do you?

josevalim · 2024-12-10T19:58:53Z

I think that would be rare but it can still happen. My biggest concern is that macros and parse transforms may traverse AST to perform substitutions and transformations. Except for patterns/guards, they don't really need to worry about placement, this pull request would change that potentially pushing corner cases or harder to understand errors.

richcarl · 2024-12-11T08:29:51Z

FWIW, if I could change one thing about this feature is to make the variable defined in a previous argument always visible. This code warns because A won't ever match:
foo(1, A = 2, A = 3).
But this code does not compile:
foo(1, A = 2, A).
I think I would rather prefer if A was always available for consistency.

It's not that A is available in the second argument in the first example, it's that both argument subexpressions define and export the same new variable (neither subexpression depends on the other), and then a match assertion gets inserted to ensure that they are the same value. It is unlikely that anyone does this sort of thing in practice, but it could happen by mistake, and the compiler must ensure that the variable has a consistent value, not randomly choosing either side. If exporting from subexpressions was dropped, there would be no need for that assertion. Instinctively I don't like the idea of starting to pass on the environment from left to right inside the argument list; it feels like it will just cause more complications.

josevalim · 2024-12-11T11:06:01Z

Thank you, that's a very good interpretation. I assume if Erlang were to introduce a pin operator, it would have to track the variables from before the call? e.g. foo(1, A = 2, ^A) fail to compile if A was not defined before.

richcarl · 2024-12-11T12:35:21Z

Thank you, that's a very good interpretation. I assume if Erlang were to introduce a pin operator, it would have to track the variables from before the call? e.g. foo(1, A = 2, ^A) fail to compile if A was not defined before.

Right now, both arguments are already evaluated in the environment as it is before the call. E.g. A=0, foo(1, A = 2, A) compiles but the inner = is now a match, and the value of A is 0 everywhere. If you'd add left-to-right binding then if A is a new variable, it would become visible in the second argument, as in foo(1, A = 2, A), without any pinning operator. With pinning added, you could writeA=0, foo(1, A = 2, ^A), but the A = 2 is still a match, not a shadowing new variable, and the ^ annotation is pointless. If you'd also add that bindings would shadow in a sequence, pinning would work, yes, but shadowing is yet another complication.

In general, I think it's best if arguments are always independent subexpressions and can be reordered at will without having to think about variables being defined on the right but not on the left. It would mean that for a simple refactoring you'd have to also move definitions out of one subexpression and into the leftmost one that needs it.

richcarl added 3 commits December 1, 2024 16:38

Use a helper function to avoid code duplication

c0b3be8

Always warn about exported variables from constructs without clauses

da363aa

There is never a good reason for placing variable bindings inside the arguments of an expression.

Eliminate exports from expression arguments

40159fe

legoscia reviewed Dec 2, 2024

View reviewed changes

lib/mnesia/src/mnesia_recover.erl Outdated Show resolved Hide resolved

lib/mnesia/src/mnesia_recover.erl Outdated Show resolved Hide resolved

Fix race condition getting date and time in mnesia_recover

3e0a184

richcarl mentioned this pull request Dec 7, 2024

Make assignment (match) in comprehension work as strict generator #9153

Open

rickard-green added the team:LG Assigned to OTP language group label Dec 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Warn about variable exports from subexpressions #9134

Warn about variable exports from subexpressions #9134

richcarl commented Dec 2, 2024

github-actions bot commented Dec 2, 2024 •

edited

Loading

richcarl commented Dec 7, 2024

josevalim commented Dec 10, 2024

richcarl commented Dec 10, 2024

josevalim commented Dec 10, 2024

richcarl commented Dec 11, 2024

josevalim commented Dec 11, 2024

richcarl commented Dec 11, 2024

Warn about variable exports from subexpressions #9134

Are you sure you want to change the base?

Warn about variable exports from subexpressions #9134

Conversation

richcarl commented Dec 2, 2024

github-actions bot commented Dec 2, 2024 • edited Loading

CT Test Results

Artifacts

richcarl commented Dec 7, 2024

josevalim commented Dec 10, 2024

richcarl commented Dec 10, 2024

josevalim commented Dec 10, 2024

richcarl commented Dec 11, 2024

josevalim commented Dec 11, 2024

richcarl commented Dec 11, 2024

github-actions bot commented Dec 2, 2024 •

edited

Loading