[BUG] Link DBRefs do not preserve extra attributes when calling `document.model_dump()` #1041

MrEarle · 2024-10-07T16:14:28Z

Describe the bug
When dumping a model with links, the corresponding DBRef dicts don't keep extra attributes.

To Reproduce

class User(Document):
    email: str


class ToDo(Document):
    user: Link[User]


async def main():
    client = AsyncIOMotorClient(db_uri)
    await init_beanie(
        database=client.get_database("test"),
        document_models=[User, ToDo],
    )

    user = await User(email="[email protected]").create()

    user_db_ref = DBRef("user", user.id, _extra={"email": user.email})

    todo = await ToDo(user=user_db_ref).create()

    print(todo.model_dump()["user"])  # Does not print the email field

Expected behavior
A clear and concise description of what you expected to happen.

I would expect that the serialization of the DBRef contains the extra fields it contains. This helps in having an extended reference pattern.

Additional context

Playing around with beanie's implementation, I got it working by modifying the to_dict method of Link:

    def to_dict(self):
        extra = {k: v for k,v in self.ref.as_doc().items() if k not in {"$ref", "$id", "$db"}}
        return {**extra, "id": str(self.ref.id), "collection": self.ref.collection}

Edit: This would probably change how to do serialization with Pydantic. For example, in build_validation for Pydantic V2, if isinstance(v, dict) and v.keys() == {"id", "collection"} is used to detect if the dict is a DBRef. Given that keys can now contain more items, the strict equality would no longer hold.

The text was updated successfully, but these errors were encountered:

staticxterm · 2024-10-07T21:28:58Z

I would technically not characterize this as a bug as this was simply never implemented. My guess is that DBRef comes from PyMongo right?
https://github.com/mongodb/mongo-python-driver/blob/master/bson/dbref.py
The thing is, Link objects are essentially Pydantic models. And when you try to serialize attributes that are private in the model, they are simply omitted
Ref: https://docs.pydantic.dev/latest/concepts/models/#private-model-attributes
So to support your use case, some 'hacking' would be necessary, as you've demonstrated.

MrEarle · 2024-10-07T21:36:02Z

Well, Links are not pydantic models, they are classes that can be serialized by pydantic. The serialization very explicitly only encodes the inner self.ref's id and collection, while ignoring any other attribute it has.

Also, when creating a document with a link, you need to either call to_ref or build the DBRef yourself. Maybe this is not a bug, but I don't think it is desirable behavior to omit the fields that the DBRef contains.

MrEarle · 2024-10-08T12:48:04Z

To further drive de discussion, Beanie showcases Links as a way to filter a document based on the linked document field. In my example, it would be:

Todo.find(Todo.user.email == email, fetch_links=True)

But this performs a lookup before filtering, and cannot take advantages of indices, which requires expensive colscan operations. This get harder for more complex real-world examples.

A way to improve on this, is to have extra fields, like in the embedded document pattern, that can be indexed (IndexModel(user.email)). That way, we can perform the above query with an index and without the need to fetch links. We can then fetch the links after performing the query, which would not involve a colscan just to filter the Todo collection.

But the current implementation does not support this.

staticxterm · 2024-10-08T17:03:20Z

I see. That may be very useful but would probably complicate the design or overall logic.
If I'm understanding this correctly, we can have any additional fields under the $ref field, which then somehow need to stay in sync from wherever they are originally defined, is this the expectation? (I'm thinking about when this embedded "Link" data needs to be updated) If so, this seems like much work.

But I do agree that at least this possibility to add extra fields is currently missing, and your proposed code in Additional context seems like a way to go and probably wouldn't break anything if extra fields are added.

MrEarle · 2024-10-08T18:34:16Z

Well, it should be the responsibility of the developer to determine when to update the extra fields in the DBRef, as that is part of the business logic of their use cases. The advantages of embedded document pattern are to improve read performance, at the expense of write performance, which is usually rarer. This is a common (and recommended) pattern for MongoDB.

Maybe the solution is not to add this functionality to Links, but have a new type Link-ish feature that represents an embedded document.

github-actions · 2024-11-08T02:36:48Z

This issue is stale because it has been open 30 days with no activity.

github-actions bot added the Stale label Nov 8, 2024

staticxterm added feature request and removed Stale labels Nov 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Link DBRefs do not preserve extra attributes when calling `document.model_dump()` #1041

[BUG] Link DBRefs do not preserve extra attributes when calling `document.model_dump()` #1041

MrEarle commented Oct 7, 2024 •

edited

Loading

staticxterm commented Oct 7, 2024

MrEarle commented Oct 7, 2024

MrEarle commented Oct 8, 2024 •

edited

Loading

staticxterm commented Oct 8, 2024

MrEarle commented Oct 8, 2024

github-actions bot commented Nov 8, 2024

[BUG] Link DBRefs do not preserve extra attributes when calling document.model_dump() #1041

[BUG] Link DBRefs do not preserve extra attributes when calling document.model_dump() #1041

Comments

MrEarle commented Oct 7, 2024 • edited Loading

staticxterm commented Oct 7, 2024

MrEarle commented Oct 7, 2024

MrEarle commented Oct 8, 2024 • edited Loading

staticxterm commented Oct 8, 2024

MrEarle commented Oct 8, 2024

github-actions bot commented Nov 8, 2024

[BUG] Link DBRefs do not preserve extra attributes when calling `document.model_dump()` #1041

[BUG] Link DBRefs do not preserve extra attributes when calling `document.model_dump()` #1041

MrEarle commented Oct 7, 2024 •

edited

Loading

MrEarle commented Oct 8, 2024 •

edited

Loading