Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic strategy for subclasses #434

Open
AdrianSosic opened this issue Oct 19, 2023 · 2 comments
Open

Dynamic strategy for subclasses #434

AdrianSosic opened this issue Oct 19, 2023 · 2 comments

Comments

@AdrianSosic
Copy link
Contributor

  • cattrs version: latest unreleased version (before 23.2)
  • Python version: 3.8.6
  • Operating System: macOS

Description

(copied from #312)

Hi guys 👋 It's great that the subclass feature is finally coming in the next release. I've been hoping for it for quite a while, it allows me to get rid of some manual workarounds 👍

I do have one question, though. I'm currently playing around with it and trying to incorporate it into our codebase. In most places, it worked out-of-the-box, but I encountered problems with one of my class hierarchies. When I looked a bit deeper, I suddenly understood what is the problem: the subclass tree is constructed during the call of include_subclasses and not at the time of structuring. This differs from my manual workaround, in which the lookup was done inside the hook itself. The logic was as roughly as follows:

def structure_base(val: dict, _) -> _T:
    _type = val["type"]
    cls = next((cl for cl in get_subclasses(base) if cl.__name__ == _type), None)
    if cls is None:
        raise ValueError(f"Unknown subclass {_type}.")
    return cattrs.structure_attrs_fromdict(val, cls)

where get_subclasses builds the subclass tree similar to your code.

The fact that the classes are collected in include_subclasses causes a problem to me in that no subclasses can be added later on. Do you see any solution to this problem?

I can imagine at least two use cases where a delayed lookup would be necessary:

  1. (My situation) You have a base class Base defined in base.py and would like to activate the subclass strategy for it (probably in that same file). However, several of its subclasses are defined in different files, e.g. subclass1.py, subclass2.py and so on. In this case, it's not obvious where you would even activate the strategy.
  2. (More generally) You do not own all the subclasses yourself, i.e. some user of your package extends the subclass tree by adding their own subclass implementations.

Maybe I'm even misusing the feature, but then I'd be also happy about getting your input :D

@Tinche
Copy link
Member

Tinche commented Oct 19, 2023

Howdy,

so I gather the problem is this:

from attrs import define

from cattrs import Converter
from cattrs.strategies import include_subclasses

c = Converter()


@define
class Base:
    a: int


include_subclasses(Base, c)


@define
class Child(Base):
    b: int


print(c.structure({"a": 1, "b": 2}, Base))  # Base(a=1)

The fact include_subclasses picks the subclasses at strategy application time has to do with the ratio of complexity to usefulness, but in your case looks like you need that extra complexity anyway ;)

Here's a general cattrs trick that we can use: we'll register a structure hook factory for the Base class. This factory will do the work of applying the strategy when the Base class is structured the first time. We use a factory since applying the strategy is somewhat costly so we want to do it only once. (I added some type hints for my own benefit, you can remove them if you like.)

from typing import Callable

from attrs import define

from cattrs import Converter
from cattrs.strategies import include_subclasses

c = Converter()


@define
class Base:
    a: int


def make_include_subclasses_hooks(_: type) -> Callable:
    include_subclasses(Base, c)
    return c._structure_func.dispatch(Base)


c.register_structure_hook_factory(lambda t: t is Base, make_include_subclasses_hooks)


@define
class Child(Base):
    b: int


print(c.structure({"a": 1, "b": 2}, Base))  # Child(a=1, b=2)

Also don't be alarmed by c._structure_func, that's not going away and will be getting a public API getter soon ;)

Let me know if this solves your issue.

@AdrianSosic
Copy link
Contributor Author

Hi @Tinche,

Thanks a lot for your quick help! I think this could actually solve my problem, but the scenario is a bit more complicated. In particular, I have an intermediate abstract class and also need a union strategy.

Let me first share an extended example and then give the context and ask my questions below.
Note: In the example, a union strategy is not needed strictly speaking, but I my code it is, so I included it here as well.

Code

from abc import ABC, abstractmethod
from functools import partial

from attrs import define

from cattrs import Converter
from cattrs.strategies import configure_tagged_union, include_subclasses


@define
class Base(ABC):
    a: int

    @abstractmethod
    def foo(self):
        pass


@define
class Intermediate(Base, ABC):
    b: int


@define
class Child(Intermediate):
    c: int

    def foo(self):
        pass


c = Converter()
union_strategy = partial(configure_tagged_union, tag_name="type")


def make_include_subclasses_structure_hooks_for_parent(parent: type):
    def make_include_subclasses_structure_hooks(_: type):
        include_subclasses(parent, c, union_strategy=union_strategy)
        return c._structure_func.dispatch(parent)

    return make_include_subclasses_structure_hooks


def make_include_subclasses_unstructure_hooks_for_parent(parent: type):
    def make_include_subclasses_unstructure_hooks(_: type):
        include_subclasses(parent, c, union_strategy=union_strategy)
        return c._unstructure_func.dispatch(parent)

    return make_include_subclasses_unstructure_hooks


c.register_structure_hook_factory(
    lambda t: t is Base, make_include_subclasses_structure_hooks_for_parent(Base)
)
c.register_structure_hook_factory(
    lambda t: t is Intermediate,
    make_include_subclasses_structure_hooks_for_parent(Intermediate),
)
c.register_unstructure_hook_factory(
    lambda t: t is Base,
    make_include_subclasses_unstructure_hooks_for_parent(Base),
)
c.register_unstructure_hook_factory(
    lambda t: t is Intermediate,
    make_include_subclasses_unstructure_hooks_for_parent(Intermediate),
)

child = Child(a=1, b=1, c=1)
print(child)
d = c.unstructure(child, Intermediate)
print(d)
child2 = c.structure(d, Intermediate)
print(child2)

The code above will run just fine, but as you can see, things are getting complicated :D

Question 1

The fact that the include_subclasses strategy was executed during structuring basically means that you cannot unstructure before you structure at least one instance of the class. Since in practice you don't know which operation will happen first, I basically needed to copy the same logic into a corresponding unstructuring factory, which seems a bit inelegant. Is there a better approach?

Question 2

In my code, I not only need to (un-)structure as base class, but also as the intermediate class. The only way I saw to achieve this is register additional factories and generalize the logic so that the factories themselves can be created via factories. Again, seems quite complicated. Is this how you would do it?

Next Steps

Even with the logic above, I'm still struggling to apply the new include_sublasses logic to my entire code base. For some parts it works, for others I'm running into problems that I don't yet understand. Will investigate them further, but getting your feedback to the above points would already help =)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants