Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work in progress: Refactor #232

Draft
wants to merge 32 commits into
base: master
Choose a base branch
from
Draft

Conversation

dairiki
Copy link
Collaborator

@dairiki dairiki commented Jan 21, 2023

I've fallen down a rabbit hole and have started work on a significant refactor of the code-base.

This refactor pulls in work from PR #172 as well as PR #214.

It is a work in progress. It's not done, however, it should theoretically be in a working state.
Comments are welcome.

Summary

Issues Addressed

It fixes #230, #198, #229, #193, and #51.

It can probably address #143.

It could possibly address #146, and #56.

I haven't exhaustively checked for issues that may be addressed — there are probably more.

Here we are more careful about which caller's locals we use to
resolve forward type references.  We want the callers locals
at decoration-time — not at decorator-construction time.

Consider:
```py
frozen_dataclass = marshmallow_dataclass.dataclass(frozen=True)

def f():
    @custom_dataclass
    class A:
        b: "B"

    @custom_dataclass
    class B:
        x: int
```

The locals we want in this case are the one from where the
custom_dataclass decorator is called, not from where
marshmallow_dataclass.dataclass is called.
When class_schema is called, it doesn't need the caller's whole stack
frame.  What it really wants is a `localns` to pass to
`typing.get_type_hints` to be used to resolve type references.

Here we add the ability to pass an explicit `localns` parameter to
`class_schema`.  We also add the ability to pass an explicit
`globalns`, because ... might as well — it might come in useful.
(Since we need these only to pass to `get_type_hints`, we might
as well match `get_type_hints` API as closely as possible.)
Manually merge work in PR lovasoa#172 by @onursatici to support generic dataclasses

Refactor _field_for_schema to reduce complexity.
Handle default generic params values in _field_for_builtin_collection_type.
Class_schema does not currently support generating schemas for classes
which have generic base classes. Typing.get_type_hints doesn't
properly dereference the generic parameters.  (I think it can be done,
but I don't think it's simple.)
This fixes thread-safety issues in lazy_class_attribute.
Here we also specialize to our use case, thus cleaning up the type annotations
FIXME: may should warn instead of raising exception
Retain relaxed checking of tests for now
This allows us to cleanly eliminate the thread-local context stack.
@dairiki dairiki added this to the post-py36 milestone May 3, 2023
@mvanderlee
Copy link
Contributor

@dairiki Any progress on this? Is there anything others can do to help?

@dairiki
Copy link
Collaborator Author

dairiki commented Feb 1, 2024

@dairiki Any progress on this? Is there anything others can do to help?

@mvanderlee

As you've probably noted. I stalled on this and haven't looked at it in some time.

As I was working on this, that I kept bumping up against issues with correctly supporting the .Schema class attribute that gets added to classes decorated with the @marshmallow_dataclass.dataclass decorator.
I started working on a write-up of the issues encountered (see https://github.com/dairiki/marshmallow_dataclass/wiki).
(The main issue is that, in some cases, the schema can not be computed (because of, e.g., forward refs in the type annotations). Deferred computation is impossible to do completely correctly — again mostly because resolving the forward type annotations is very tricky, and not completely well-defined at this point.)

Since this PR was turning into a near-complete rewrite, and since the .Schema class attribute seemed impossible to implement in a full-correct manner, the conclusion I came to is that I'd really like to drop the @marshmallow_dataclass.dataclass decorator from our API altogether. I wasn't sure how well that would go over with all y'all. I was starting to work on proposing the idea seriously (i.e.. the above-linked wiki page) but never got around to proposing the change.

So, I guess, some community (and maintainer) guidance as to the appetite for:
a) a near-complete rewrite
and/or
b) dropping the custom dataclass decorator (and its .Schema attribute)
might help motivate me to finish this.

@sloria
Copy link
Collaborator

sloria commented Feb 1, 2024

FWIW, in my previous work as a user ofmarshmallow-dataclass, we never used the @marshmallow_dataclass.dataclass API. Instead we used class_schema to generate separate Schema classes.

@dataclass
class City:
    name: Optional[str]
    buildings: List[Building] = field(default_factory=list)


CitySchema = marshmallow_dataclass.class_schema(City)

For the cost of 1 extra LOC, you get clearer code that separates dataclass and Schema, and makes type checkers happy.

@dairiki
Copy link
Collaborator Author

dairiki commented Feb 1, 2024

FWIW, in my previous work as a user ofmarshmallow-dataclass, we never used the @marshmallow_dataclass.dataclass API. Instead we used class_schema to generate separate Schema classes.

@sloria Me too. And I agree.

However, based on tickets opened here and general discussion, there are quite a few who do use the magic decorator.

@mvanderlee
Copy link
Contributor

Personally I'm okay with the removal of the decorator but certainly not with the forced use of class_schema.
It'd make a lot more sense to do what pydantic did then and use inheritance with a dataclass_transform metaclass.

That being said, even pydantic supports the dataclass decorator, but with limitations. Which I think is a good compromise?

@mvanderlee
Copy link
Contributor

Regarding forward references typing._eval_type does a great job at it.
It could fail, in which case we could either let it raise an exception or wrap it in a try catch, log a warning, and don't include the failed field in the schema.

I've done something similar in the past: https://github.com/mvanderlee/starmallow/blob/master/starmallow/utils.py#L383 and pydantic does this as well.
I prefer marshmallow over pydantic, but there is some cool stuff there.

This was referenced Mar 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TypeError when creating a marshmallow schema from dataclass inherited from generic class
3 participants