add deduplication of types #1004

walter9388 · 2025-02-20T16:31:48Z

Relating to #982

Took me a while to get round to this, but here we are...

I think there are three different levels of approaching this issue:

1. Remove extra None types only:

-def f(x: Optional[Union[int, None]]): pass
+def f(x: int | None): pass
-def g(x: Union[Optional[int], None]): pass
+def g(x: int | None): pass

2. Remove any duplicated scalar types at the same depth by name in Union blocks:

-def f(x: Union[Union[Union[Union[a, b], c], d], a]): pass
+def f(x: a | b | c | d): pass
-def g(x: Union[a.b | a.c, a.b, list[str], str]): pass
+def g(x: a.b | a.c | list[str] | str): pass

3. General deduplication at any depth on any block:

-def f(x: Union[list[Union[int, str]], list[Union[str, int]]]): pass
+def f(x: list[int | str]): pass

I settled on level 2 as this was still possible with a single pass and seemed more useful than just focusing on None types.
I couldn't see a way of approaching the general problem (level 3) without recursively building a tree structure and then comparing the leaf nodes. However, I am not deeply familiar with the standard Python libraries for parsing ASTs etc., so if there are simple built-in methods for problems like this I would be interested to know!
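For reference, a rough sketch of what level 3 could look like on top of the standard library `ast` module (Python 3.9+ for `ast.unparse`). This is not what the PR does. The names `normalize` and `dedupe_annotation` are hypothetical, only the bare names `Union` and `Optional` are recognised, and going through `ast` discards comments and formatting, so it illustrates the tree idea rather than a drop-in replacement for a token-based rewrite.

```python
import ast


def _slice_elts(node: ast.Subscript) -> list[ast.expr]:
    # Python 3.9+: the subscript contents are stored directly on node.slice.
    if isinstance(node.slice, ast.Tuple):
        return list(node.slice.elts)
    return [node.slice]


def _flatten(node: ast.expr) -> list[ast.expr]:
    """Flatten nested Union[...], Optional[...] and `|` into one member list."""
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitOr):
        return _flatten(node.left) + _flatten(node.right)
    if isinstance(node, ast.Subscript) and isinstance(node.value, ast.Name):
        if node.value.id == 'Union':
            return [m for elt in _slice_elts(node) for m in _flatten(elt)]
        if node.value.id == 'Optional':
            none = ast.parse('None', mode='eval').body
            return _flatten(node.slice) + [none]
    return [node]


def normalize(node: ast.expr, *, canonical: bool = False) -> str:
    """Render an annotation with unions flattened and deduplicated.

    With canonical=True the union members are sorted, so the result can be
    used as an order-insensitive comparison key.
    """
    members = _flatten(node)
    if len(members) == 1 and members[0] is node:  # not a union
        if isinstance(node, ast.Subscript):
            # Recurse into the arguments, e.g. list[Union[int, str]].
            args = ', '.join(
                normalize(elt, canonical=canonical)
                for elt in _slice_elts(node)
            )
            return f'{ast.unparse(node.value)}[{args}]'
        return ast.unparse(node)

    unique: dict[str, str] = {}  # canonical key -> rendering, first one wins
    for member in members:
        key = normalize(member, canonical=True)
        unique.setdefault(key, normalize(member, canonical=canonical))
    rendered = sorted(unique) if canonical else list(unique.values())
    return ' | '.join(rendered)


def dedupe_annotation(src: str) -> str:
    return normalize(ast.parse(src, mode='eval').body)
```

With this sketch, `dedupe_annotation('Union[list[Union[int, str]], list[Union[str, int]]]')` gives `'list[int | str]'`, and the level 1 and level 2 examples above come out as `'int | None'` and `'a | b | c | d'`.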

I used the existing scan in _fix_union to determine the delimiters at depth==1 between the types. This seemed to work well, but I definitely ran into some interesting edge cases when it came to handling comments, whitespace and multi-line annotations.
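To illustrate the idea of the depth scan, here is a simplified sketch that works on a plain string rather than on tokens. `split_top_level` and `dedupe_members` are hypothetical names, not functions from the PR. The input is the text inside the outer brackets, so depth 0 here corresponds to depth==1 in the token scan. It ignores comments and string literals, which is exactly where the real edge cases are.

```python
def split_top_level(inner: str) -> list[str]:
    """Split the contents of Union[...] on ',' and '|' outside any brackets."""
    parts: list[str] = []
    depth = 0
    start = 0
    for i, ch in enumerate(inner):
        if ch in '([{':
            depth += 1
        elif ch in ')]}':
            depth -= 1
        elif ch in ',|' and depth == 0:
            # A delimiter between two members of the outer union.
            parts.append(inner[start:i].strip())
            start = i + 1
    parts.append(inner[start:].strip())
    return [part for part in parts if part]  # drop empties (trailing comma)


def dedupe_members(inner: str) -> str:
    """Drop members whose text repeats an earlier one, keeping first-seen order."""
    members = split_top_level(inner)
    return ' | '.join(dict.fromkeys(members))
```

For example, `dedupe_members('a.b | a.c, a.b, list[str], str')` returns `'a.b | a.c | list[str] | str'`. The comma inside `dict[str, int]` is left alone because it sits at depth 1 in this sketch.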

I have managed to get this working for a variety of test cases, and I would be interested to hear your feedback.

Btw I enjoy your YouTube content! I have learnt a lot of niche things I would have struggled to pick up otherwise. So thank you for that!


asottile

just a quick first pass -- will look more closely later

tests/features/typing_pep604_test.py

pyupgrade/_token_helpers.py

walter9388

Just highlighting a few things to be aware of.

walter9388 · 2025-02-22T20:33:55Z

pyupgrade/_plugins/typing_pep604.py

 from pyupgrade._token_helpers import find_op
 from pyupgrade._token_helpers import is_close
 from pyupgrade._token_helpers import is_open


 def _fix_optional(i: int, tokens: list[Token]) -> None:
     j = find_op(tokens, i, '[')
-    k = find_closing_bracket(tokens, j)
+    k, contains_none = _find_closing_bracket_and_if_contains_none(tokens, j)


Modified the general find_closing_bracket function so that, in the same pass, it also checks whether the Optional block already contains None.
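A minimal sketch of the single-pass idea, using a plain list of strings instead of tokenize-rt tokens. The function name and the rule for what counts as "contains None" are assumptions made for illustration: only a `None` sitting directly inside the outer bracket is counted, which may differ from what `_find_closing_bracket_and_if_contains_none` does.

```python
OPENING = frozenset('([{')
CLOSING = frozenset(')]}')


def find_closing_bracket_and_none(
        tokens: list[str], i: int,
) -> tuple[int, bool]:
    """Return (index of the bracket closing tokens[i], whether None is a member).

    Assumption: only a `None` directly inside the outer bracket counts, so
    the None in Optional[list[None]] is not reported.
    """
    if tokens[i] not in OPENING:
        raise ValueError(f'expected an opening bracket at index {i}')
    depth = 0
    contains_none = False
    for j in range(i, len(tokens)):
        tok = tokens[j]
        if tok in OPENING:
            depth += 1
        elif tok in CLOSING:
            depth -= 1
            if depth == 0:
                return j, contains_none
        elif tok == 'None' and depth == 1:
            contains_none = True
    raise ValueError('unbalanced brackets')
```

With `tokens = ['Optional', '[', 'int', '|', 'None', ']']` and `i = 1` this returns `(5, True)`, so the caller knows not to append another `| None`.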

walter9388 · 2025-02-22T20:38:41Z

pyupgrade/_plugins/typing_pep604.py

+            tokens[k:k + 1] = [
+                Token('UNIMPORTANT_WS', ' '),
+                Token('CODE', '| '),
+                Token('CODE', 'None'),
+            ]


The single token containing | None is split into explicit whitespace, | and None tokens so that the deduplication and whitespace-removal functions used in _fix_union can operate on them individually. The same applies to the multi-line version a few lines below.

walter9388 · 2025-02-22T20:43:08Z

pyupgrade/_plugins/typing_pep604.py

+        to_delete += _remove_consecutive_unimportant_ws(
+            tokens, [x for x in range(j, k) if x not in to_delete],
+        )


I'm not convinced this is the best approach to removing whitespace, but I'm not sure what else to do when everything on a line is deleted except a comment. I have written a niche test for this situation: the test with id='duplicated types in multi-line nested unions or optionals'.
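To make the discussion concrete, here is one way such a helper could behave, as a sketch only. The `Tok` tuple stands in for the tokenize-rt `Token`, `consecutive_ws_to_delete` is a hypothetical name, and the rule (drop a whitespace token when the previous surviving token is also whitespace) is an assumption rather than the logic of `_remove_consecutive_unimportant_ws`.

```python
from typing import NamedTuple


class Tok(NamedTuple):
    # Stand-in for a tokenize-rt Token: a token name and its source text.
    name: str
    src: str


def consecutive_ws_to_delete(tokens: list[Tok], idxs: list[int]) -> list[int]:
    """Return the indices of whitespace tokens that follow another whitespace token.

    `idxs` lists the surviving token indices in order (already-deleted tokens
    are left out), so two whitespace tokens separated only by deleted tokens
    count as consecutive.
    """
    to_delete: list[int] = []
    prev_was_ws = False
    for i in idxs:
        is_ws = tokens[i].name == 'UNIMPORTANT_WS'
        if is_ws and prev_was_ws:
            to_delete.append(i)
        prev_was_ws = is_ws
    return to_delete
```

For instance, if a duplicated name between two whitespace tokens has already been marked for deletion, the second whitespace token is returned so that only one space survives.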

walter9388 and others added 4 commits February 20, 2025 14:53

initial effort

eca6277

clean up

dc9d5d9

spelling

1fae34a

[pre-commit.ci] auto fixes from pre-commit.com hooks

90dfcab

for more information, see https://pre-commit.ci

asottile reviewed Feb 22, 2025

walter9388 added 4 commits February 22, 2025 20:03

moved modified helper function into typing_pep604.py

1316d66

fixed bad recursive typing xfail test

2bbe9d7

moved other modified helper function into typing_pep604.py

2a064a8

fixed flake8 failures

94931ad
