-
-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add deduplication of types #1004
base: main
Are you sure you want to change the base?
Conversation
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a quick first pass -- will look more closely later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just highlighting a few things to be aware of.
from pyupgrade._token_helpers import find_op | ||
from pyupgrade._token_helpers import is_close | ||
from pyupgrade._token_helpers import is_open | ||
|
||
|
||
def _fix_optional(i: int, tokens: list[Token]) -> None: | ||
j = find_op(tokens, i, '[') | ||
k = find_closing_bracket(tokens, j) | ||
k, contains_none = _find_closing_bracket_and_if_contains_none(tokens, j) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modified the general find_closing_bracket
function to also check for whether the optional block already contains None
in the same pass.
tokens[k:k + 1] = [ | ||
Token('UNIMPORTANT_WS', ' '), | ||
Token('CODE', '| '), | ||
Token('CODE', 'None'), | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for changing the single token containing | None
to explicit whitespace, |
and None
is for the deduplication and whitespace removal functions used in _fix_union
. This also applies to the multiline version a few lines below.
to_delete += _remove_consecutive_unimportant_ws( | ||
tokens, [x for x in range(j, k) if x not in to_delete], | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not convinced this is the best approach to remove whitespace, but not sure about what to do in situations where lines are completely deleted other than comments. I have written a niche test for this situation in test id='duplicated types in multi-line nested unions or optionals'
.
Relating to #982
Took me a while to get round to this, but here we are....
I think there are three different levels of approaching this issue:
None
types only:Union
blocks:I settled on level 2 as this was still possible with a single pass and seemed more useful than just focusing on
None
types.I couldn't see a way of approaching the general problem (level 3) without recursively making a tree structure and then assessing the leaf nodes. However, I am not deeply familiar with the standard python libraries for parsing ASTs etc., so if there are simple built in methods for problems like this I would be interested to know!
I used the existing scan in
_fix_union
to determine the delimitators atdepth==1
between the types. This seemed to work well, but I definitely ran into some interesting edge cases when it came to handling comments, whitespace and multilines.I have managed to get this working for a variety of test cases, and I would be interested to hear your feedback.
Btw I enjoy your YouTube content! I have learnt a lot of niche things I would have struggled to pick up otherwise. So thank you for that!