Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How about using @property instead of __getattr__ for Token class? #122

Open
otariidae opened this issue Nov 25, 2023 · 0 comments
Open

How about using @property instead of __getattr__ for Token class? #122

otariidae opened this issue Nov 25, 2023 · 0 comments

Comments

@otariidae
Copy link

otariidae commented Nov 25, 2023

When using Janome, I expected Token class returned from Tokenizer.tokenize method has some attributes like surface, and they are str. But static analyzers like mypy cannot recognize the existence of these attributes and assume them as Any.

from janome.tokenizer import Tokenizer

t = Tokenizer()
tokens = t.tokenize("メロスは激怒した")

for token in tokens:
  print(token.surface)  # <- Any

I think it comes from Token class defining dynamic attributes using __getattr__ method here:

janome/janome/tokenizer.py

Lines 121 to 139 in 9d82248

def __getattr__(self, name) -> Any:
if name == 'surface':
return self.node.surface
elif name == 'part_of_speech':
return self.extra[0] if self.extra else self.node.part_of_speech
elif name == 'infl_type':
return self.extra[1] if self.extra else self.node.infl_type
elif name == 'infl_form':
return self.extra[2] if self.extra else self.node.infl_form
elif name == 'base_form':
return self.extra[3] if self.extra else self.node.base_form
elif name == 'reading':
return self.extra[4] if self.extra else self.node.reading
elif name == 'phonetic':
return self.extra[5] if self.extra else self.node.phonetic
elif name == 'node_type':
return self.node.node_type
else:
None

And I think Python builtin @property can do the same thing with current __getattr__ implementation and is more type friendly and more auto completion friendly. The code example is like below:

    @property
    def surface(self):
        return self.node.surface

    @property
    def part_of_speech(self):
        return self.extra[0] if self.extra else self.node.part_of_speech

Do you think that makes sense?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant