Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate struct update syntax, such as %URI{x | ...} #13974

Open
josevalim opened this issue Nov 6, 2024 · 17 comments
Open

Deprecate struct update syntax, such as %URI{x | ...} #13974

josevalim opened this issue Nov 6, 2024 · 17 comments

Comments

@josevalim
Copy link
Member

josevalim commented Nov 6, 2024

Instead, we should prefer pattern matching. With the type system, we already get the benefits of checking if fields exist or not, in a more reliable and extensible way.

Here is an example from LiveView:

for entry <- conf.entries do
  %UploadEntry{entry | preflighted?: entry.preflighted? || entry.ref in refs}
end

However, the code above does not catch typos on entry.ref. This code, however, would:

for %UploadEntry{} = entry <- conf.entries do
  %{entry | preflighted?: entry.preflighted? || entry.ref in refs}
end

Since we attach type information to variables when they are defined.

@stevejonesbenson
Copy link

Is there a discussion about this anywhere? I don't see anything in the elixir-lang-core mailing list but I could have missed it.

I know this isn't the place to comment on the issues validity and I 100% trust the core team whatever you decide - however I would like to see any discussion on the issue to see the different points of view and reasoning if possible.

For what its worth I like the current syntax given that structs are made to feel slightly special than normal maps and I think the status quo is quite pleasant when you are returning an updated struct at the end of a function, It is immediately obvious that you are returning a struct without looking up at what the entry variable is from the example above.

Saying that, I am speaking only from the developers POV and not from the compiler which this issue seems primarily concerned with.

Sorry to add this here but I'm not sure where else to comment 🫡

@josevalim
Copy link
Member Author

It is completely fine to discuss it here. :)

For what its worth I like the current syntax given that structs are made to feel slightly special than normal maps and I think the status quo is quite pleasant when you are returning an updated struct at the end of a function, It is immediately obvious that you are returning a struct without looking up at what the entry variable is from the example above.

While I do agree this is true, it is worth noting this is pretty much a special case of structs. You don't denote in return types when you are returning an integer, updating a tuple, etc. Typically in Elixir, you specify the types of variables in patterns and guards, and not throughout your function.

Other than that, the rationale is pretty much in the issue description. All cases where you use the update syntax, you will be better served by pattern matching on it instead.

@whatyouhide
Copy link
Member

Just wanted to add here that doing this with mix format --migrate should be trivial, so "fixing" this warning will be a piece of cake 🍰

@stevejonesbenson
Copy link

That's a good point and you make it well - Thank you

@LostKobrakai
Copy link
Contributor

LostKobrakai commented Nov 12, 2024

Is there really a need to deprecate the struct update syntax? I can see the point of prefering pattern matching, but it doesn't feel like a great justification for getting rid of the struct update at the same time. Things are easily visible here in this short example, but consider 20ish lines between the for and the struct update and suddenly it's no longer so obvious that this is mapping struct to updated struct vs. struct to some random map derived from the entry. Removing the signaling of "this is an UploadEntry struct" feels like a step backwards for people reading code, even if technically it wouldn't matter.

@josevalim
Copy link
Member Author

I'd say so, because, if our official advice is to use the pattern matching syntax 99% of the cases, there is no reason to keep a pattern in the language that we wouldn't recommend.

As said above, I understand the signaling can be helpful but it is also inconsistent. Structs are the only data structure that get those and only for updates. You don't signal integers, tuples, struct access, etc. And it will always do a worse job than pattern matching.

@LostKobrakai
Copy link
Contributor

LostKobrakai commented Nov 12, 2024

Structs are the only data structure that get those and only for updates.

I can kinda see that, but also should we deprecate struct! then as well? Or at least replace existing instances of struct updates with struct! if it is decided to not deprecate struct!?

And it will always do a worse job than pattern matching.

I don't necessarily agree with that, given I don't see a pattern matching as an alternative to the struct update syntax. To me %URI{uri | …} vs. %{map | …} is more like choosing between Keyword.update! vs. Map.update! or using MapSet apis instead of manually messing with the set internals.

I guess I'd be fine with deprecating the native syntax for struct updates, but if that happens imo the proper replacement would be struct!/2 not nothing.

@josevalim
Copy link
Member Author

To me %URI{uri | …} vs. %{map | …} is more like choosing between Keyword.update! vs. Map.update! or using MapSet apis instead of manually messing with the set internals.

So a more apt comparison is int + int and float + float, which also do not different themselves, or any other polymorphic function call. Also note we don't make a syntax distinction either between map.key and struct.field.

I can kinda see that, but also should we deprecate struct! then as well? Or at least replace existing instances of struct updates with struct! if it is decided to not deprecate struct!?

struct! is also used for dynamic fields, so there is no plan to deprecate it. You could use struct!(struct, foo: 123) for updating it, although currently that is done at runtime. However, we could convert struct! to a macro and make it emit more optimal code if the keys are statically known.

@sabiwara
Copy link
Contributor

However, the code above does not catch typos on entry.ref. This code, however, would:

for %UploadEntry{} = entry <- conf.entries do

Just mentioning for the record, but this change is not strictly equivalent and can possibly accidentally filter non-matching entries (e.g. nils) instead of raising. One might want to use Enum.map/2 here to use the pattern-matching version?

@zachallaun
Copy link
Contributor

zachallaun commented Nov 13, 2024

Is there a reason that this example...

for entry <- conf.entries do
  %UploadEntry{entry | preflighted?: entry.preflighted? || entry.ref in refs}
end

cannot provide useful typing information for expressions inside the block? It seems to me that, in order to succeed, entry must be an %UploadEntry{} in the current block, so the above is equivalent to:

for entry <- conf.entries do
  %UploadEntry{} = entry
  %UploadEntry{entry | preflighted?: entry.preflighted? || entry.ref in refs}
end

If that "phantom match" could be created automatically, it would find typos in entry.ref.

I'll acknowledge that this is only one example and this code could be modified such that you get more/better errors by pattern matching where entry is assigned, but this would be something and might lessen the "sting" of keeping the struct update syntax around.

@josevalim
Copy link
Member Author

josevalim commented Nov 13, 2024

Your proposal above would work for that example but what if someone wrote this:

for entry <- conf.entries do
  preflighted? = entry.preflighted? || entry.ref in refs
  %UploadEntry{entry | preflighted?: preflighted?}
end

What do we rewrite it to? If we do this:

for entry <- conf.entries do
  preflighted? = entry.preflighted? || entry.ref in refs
  %UploadEntry{} = entry
  %UploadEntry{entry | preflighted?: preflighted?}
end

Then we still wouldn't have caught a bug if entry.ref had a typo. Perhaps we hoist it all the way up? But that would require a separate pass in the type system only to pre-assign structs updates to variables. Do we want to make compilation 2-5% slower because of this single feature?

The simplest way to reason about the type system is that types are assigned to variables when they are defined. When we have type signatures, that makes total sense and it should just work as you expect, because the signatures will provide the type. But all Elixir code written today relies on inference and doing any sort of refinement during inference, which is partially what you are suggesting, is expensive because you need to either hoist it or go back and retype all prior uses of entry. If we don't do that, then it is inconsistent, you get different typing errors depending on where you use the update syntax.

All in all, it is much simpler to say "variables get their types when they are defined" and those are the patterns we should promote.*

We also need to accept that, as we work on the type system, we will discover Elixir idioms which are just not useful if we had types. Some struct features were designed exactly because we didn't have a type system. And other features are too lax. For example, the Kernel.struct/2 function allows anyone to create a struct without enforcing keys... which is equivalent to saying that all structs today may have nullable fields, which is not a desirable long term property.

Overall, we want to keep the deprecations too a minimum, but I think we have to revisit our practices to make the best use of our new tools as we move forward.

*There is a feature we will add to the type system called occurrence tying, which allows us to refine types in if, cond, case, etc. But the difference is exactly that those structs introduce a new scope, where the type is refined, instead of hoisting or refine the typing within an existing one.

@zachallaun
Copy link
Contributor

Then we still wouldn't have caught a bug if entry.ref had a typo. Perhaps we hoist it all the way up? But that would require a separate pass in the type system only to pre-assign structs updates to variables. Do we want to make compilation 2-5% slower because of this single feature?

I was indeed considering hoisting it to the top of the scope (or just after entry = ... if it's defined that way).

I do agree that a separate pass just to pre-assign structs, resulting in a 2-5% slow down, would not be worth it. I know there's a delicate balance between inference quality and expense, so introducing things like this may not be worth it. If there were other cases, however, where inference could be improved by a separate pass, then the cost/benefit might change, but I'm not sure that those cases exist.

Overall, we want to keep the deprecations too a minimum, but I think we have to revisit our practices to make the best use of our new tools as we move forward.

Agreed 👍. To be clear, though I do like the update syntax occasionally for the various reasons that others have stated, I'm not opposed to this change.

Another thought that crossed my mind is that, in the future when typing information is exposed through some compiler API, editors will be able to re-expose this information through things like inlay hints.

@whatyouhide whatyouhide changed the title Deprecate struct update syntax, such as %URI{x | ...} Deprecate struct update syntax, such as %URI{x | ...} Nov 14, 2024
@Coffei
Copy link

Coffei commented Nov 22, 2024

One difference between the original examples is if you mistype the field in the updated struct. The struct update syntax produces a compilation error, while the pattern match only produces a warning, if I am updating a field that doesn't exist. That can worse the experience for projects not using mix compile --warnings-as-errors.

Would changing these warnings to errors be a part of the deprecation to keep the same developer experience?

@josevalim
Copy link
Member Author

I would say having them as warnings is a benefit, as it allows compilation to continue and find other warnings, rather than stopping midway. In fact, we generally do our best to emit warnings and only emit errors when we cannot produce an actual artifact.

@Coffei
Copy link

Coffei commented Nov 25, 2024

I've been thinking about it some more, and we maybe missed the point of my comment a little. It's not really important whether it's an error or a warning technically. I feel it's the difference in the strength of feedback that makes the new proposed syntax worse. A typo in a field seems like a situation where the compiler knows the code will not work. And if it knows that I feel like it's good the compilation will fail and not let me run the code. I don't really care whether we use errors or warnings for that purpose.

But using warnings without any other mechanism than --warnings-as-errors seems lacking, since there are bunch of other warnings that are not as important. So mixing warnings like unused variables, which are non threatening, with warnings that essentially say "this code will not work and will crash in runtime" seems a bit backwards. People have to choose between failing for every little violation and not having any check at all. Maybe we already have some warnings like that, idk.

So maybe this is not a point against this proposed deprecation, but rather a suggestion to have some multi-level warnings some of which could fail the compilation by default (but not in the same way as errors, as you mention above), if the compiler is sure the code is wrong. I am sure what is the right solution. But I know I myself would prefer using the old syntax for this very reason.

@josevalim
Copy link
Member Author

josevalim commented Nov 25, 2024

It is very important to not break the user flow. Phoenix compiles during requests, tests compile before running. Being able to run and debug a program, even if it says that it will error, is better than aborting the user process and leave them to "fight the compiler" (or the type checker) as the only option available. Forcing compilation to fail can be very disruptive to the developer experience. If you really want that, then set elixirc_options: [warnings_as_errors: true] in your def project. You don't have to only pass it via CLI.

@jonatanklosko
Copy link
Member

A typo in a field seems like a situation where the compiler knows the code will not work. And if it knows that I feel like it's good the compilation will fail and not let me run the code.

This assumes that the field is missing because of a typo, but that may not be the only case. Imagine that we remove a field from a struct, while working on a subset of the code. In such case it can be useful to run the app and tests, without going and fixing all other places where the field is already used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

8 participants