Allow a generator to be provided instead of a List #1030

Draft: wants to merge 9 commits into base: main

@skinkie
Contributor

skinkie commented May 7, 2024

📒 Description

When writing a very large tree, you don't want to materialise the whole tree in a list before it is written to a file. Ideally, each subtree should only be rendered just in time.
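To illustrate the just-in-time idea, here is a minimal sketch that uses only the standard library, not xsdata itself. The names `produce` and `write_children` are hypothetical. The only point is that a writer which iterates once over its children never needs the whole collection in memory.

```python
import io
from typing import Iterable, Iterator, List

events: List[str] = []  # records the order in which things happen


def produce(count: int) -> Iterator[str]:
    """Hypothetical source of child elements, built one at a time."""
    for i in range(count):
        events.append(f"build {i}")
        yield f"<item>{i}</item>"


def write_children(out: io.StringIO, children: Iterable[str]) -> None:
    """Write each child as soon as it is available (list or generator)."""
    out.write("<root>")
    for child in children:
        out.write(child)
        events.append("write")
    out.write("</root>")


buffer = io.StringIO()
write_children(buffer, produce(2))
print(buffer.getvalue())
print(events)  # building and writing alternate instead of building everything first
```

With a list, all `build` events would happen before the first `write`; with a generator they interleave, which is the behaviour this pull request is after.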

🔗 What I've Done

I have allowed a Generator to be handled as a List.
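As a rough illustration of what "handled as a List" could mean, the sketch below shows a collection check that also accepts generators. The helper `is_array_like` is hypothetical and is not the actual xsdata implementation. It only assumes that the serializer distinguishes "one value" from "many values" with an `isinstance` test.

```python
from types import GeneratorType
from typing import Any


def is_array_like(value: Any) -> bool:
    """Hypothetical check: treat lists, tuples and generators as collections."""
    return isinstance(value, (list, tuple, GeneratorType))


def numbers():
    yield 1
    yield 2


list_result = is_array_like([1, 2])
generator_result = is_array_like(numbers())
string_result = is_array_like("ab")  # strings are iterable but are single values here
print(list_result, generator_result, string_result)
```

Unlike a list, a generator has no length and can be iterated only once, so any code path that calls `len()` or indexes into the value would still need attention.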

💬 Comments

There might be more places where this needs to change. Because List[] is used everywhere, type checking fails when a generator is passed.

🛫 Checklist

import sqlite3

from xsdata.formats.dataclass.context import XmlContext
from xsdata.formats.dataclass.parsers import XmlParser
from xsdata.formats.dataclass.parsers.config import ParserConfig
from xsdata.formats.dataclass.parsers.handlers import LxmlEventHandler
from xsdata.formats.dataclass.serializers import XmlSerializer
from xsdata.formats.dataclass.serializers.config import SerializerConfig
from xsdata.models.datatype import XmlDateTime

from netex import PublicationDelivery, ParticipantRef, MultilingualString, DataObjectsRelStructure, GeneralFrame, \
    GeneralFrameMembersRelStructure, ServiceJourney

serializer_config = SerializerConfig(ignore_default_attributes=True, xml_declaration=True)
serializer_config.pretty_print = True
serializer = XmlSerializer(config=serializer_config)

context = XmlContext()
config = ParserConfig(fail_on_unknown_properties=False)
parser = XmlParser(context=context, config=config, handler=LxmlEventHandler)

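def load_generator(con, clazz, limit=None):
    # Table name: the XML element name from Meta if set, else the class name.
    table = getattr(clazz.Meta, 'name', clazz.__name__)

    cur = con.cursor()
    if limit is None:
        cur.execute(f"SELECT object FROM {table};")
    else:
        # Bind the limit as a parameter; table names cannot be bound.
        cur.execute(f"SELECT object FROM {table} LIMIT ?;", (limit,))

    # Parse one row at a time, so objects are only built when they are requested.
    for (xml,) in cur:
        yield parser.from_bytes(xml, clazz)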

with sqlite3.connect("/tmp/netex.sqlite") as con:
    publication_delivery = PublicationDelivery(
        publication_timestamp=XmlDateTime.now(),
        participant_ref=ParticipantRef(value="NDOV"),
        description=MultilingualString(value="Huge XML Serializer test"),
        data_objects=DataObjectsRelStructure(
            choice=[
                GeneralFrame(
                    members=GeneralFrameMembersRelStructure(
                        choice=load_generator(con, ServiceJourney, 10)
                    )
                )
            ]
        ),
        version="ntx:1.1",
    )

ns_map = {'': 'http://www.netex.org.uk/netex', 'gml': 'http://www.opengis.net/gml/3.2'}
with open('netex-output/huge.xml', 'w') as out:
    serializer.write(out, publication_delivery, ns_map)


codecov bot commented May 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (eab6bb7) to head (d4b3e5b).
Report is 17 commits behind head on main.

Current head d4b3e5b differs from pull request most recent head 6bfeb67

Please upload reports for the commit 6bfeb67 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff            @@
##              main     #1030   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          117       115    -2     
  Lines         9272      9265    -7     
  Branches      2194      2190    -4     
=========================================
- Hits          9272      9265    -7     


@tefra
Owner

tefra commented May 7, 2024

We need to properly support the Iterable type annotation in model fields, and of course support this for both xml/json serialization @skinkie.
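For context on what supporting the `Iterable` annotation involves, here is a small standard-library sketch of how a field's annotation can be inspected at runtime. The `Frame` model and the `describe` helper are hypothetical and are not part of xsdata. The sketch only shows that `List[...]` and `Iterable[...]` resolve to different runtime origins, so code that recognises only `list` would need an extra branch.

```python
import collections.abc
from dataclasses import dataclass, field
from typing import Iterable, List, get_args, get_origin, get_type_hints


@dataclass
class Frame:
    """Hypothetical model with one list field and one iterable field."""

    eager: List[int] = field(default_factory=list)
    lazy: Iterable[int] = field(default_factory=list)


def describe(annotation):
    """Return (origin, item types); origin is the runtime class behind the hint."""
    return get_origin(annotation), get_args(annotation)


hints = get_type_hints(Frame)
eager_origin, eager_args = describe(hints["eager"])
lazy_origin, lazy_args = describe(hints["lazy"])

print(eager_origin, eager_args)
print(lazy_origin, lazy_args)
```

The origin of `List[int]` is the concrete `list` class, while the origin of `Iterable[int]` is the abstract `collections.abc.Iterable`, which is why both the XML and the JSON code paths would have to treat it explicitly.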

@skinkie
Contributor Author

skinkie commented May 7, 2024

> We need to properly support the Iterable type annotation in model fields, and of course support this for both xml/json serialization @skinkie.

But would this be something you would support from an architecture point of view?

@tefra
Owner

tefra commented May 7, 2024

> > We need to properly support the Iterable type annotation in model fields, and of course support this for both xml/json serialization @skinkie.
>
> But would this be something you would support from an architecture point of view?

Yes

@skinkie
Contributor Author

skinkie commented May 30, 2024

@tefra How would you like to proceed? Should the tests materialise the result as a List or a Tuple?
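To show why materialising matters in tests, here is a minimal sketch with a hypothetical `Members` dataclass (not the generated NeTEx class). It assumes only standard Python semantics: a generator never compares equal to a list, and it is exhausted after one pass.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Members:
    """Hypothetical model whose field may hold a list or a generator."""

    choice: Iterable[int]


def make_lazy() -> Members:
    return Members(choice=(n * n for n in range(3)))


lazy = make_lazy()
expected = [0, 1, 4]

# Comparing the generator object itself to a list is always False.
direct_equal = lazy.choice == expected

# Materialise once, then compare the resulting list.
materialised = list(lazy.choice)

# A second pass yields nothing because the generator is exhausted.
second_pass = list(lazy.choice)

print(direct_equal, materialised, second_pass)
```

Whether the tests use `list(...)` or `tuple(...)` makes little difference for the comparison; what matters is that each generator is consumed exactly once per assertion.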

@skinkie
Contributor Author

skinkie commented May 30, 2024

Testing it with my own code results in this error. So it is clearly not done yet.

xsdata.exceptions.XmlContextError: Error on DataObjectsRelStructure::choice: Xml Elements does not support typing `typing.Iterable[typing.Union[netex.general_version_frame_structure.CompositeFrame, netex.mobility_journey_frame.MobilityJourneyFrame, netex.mobility_service_frame.MobilityServiceFrame, netex.sales_transaction_frame.SalesTransactionFrame, netex.fare_frame.FareFrame, netex.driver_schedule_frame.DriverScheduleFrame, netex.vehicle_schedule_frame.VehicleScheduleFrame, netex.service_frame.ServiceFrame, netex.timetable_frame.TimetableFrame, netex.site_frame.SiteFrame, netex.infrastructure_frame.InfrastructureFrame, netex.general_version_frame_structure.GeneralFrame, netex.resource_frame.ResourceFrame, netex.service_calendar_frame.ServiceCalendarFrame]]`


sonarcloud bot commented May 30, 2024

Quality Gate failed

Failed conditions
1 Security Hotspot

See analysis details on SonarCloud

@skinkie
Contributor Author

skinkie commented Jun 1, 2024

Parsing breaks with Iterable.

        if tokens_factory:
            value = value if collections.is_array(value) else value.split()
            return tokens_factory(
                converter.deserialize(val, types, ns_map=ns_map, format=format)
                for val in value
            )
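One plausible reason for the breakage, stated here as an assumption rather than a confirmed diagnosis: if `tokens_factory` is derived from the field's annotation, an `Iterable` annotation yields an abstract class, which cannot be called to build a collection the way `list` or `tuple` can. The function `build_tokens` below is a hypothetical stand-in for the excerpt above and uses only the standard library.

```python
import collections.abc
from typing import Callable, Iterable


def build_tokens(factory: Callable, raw: str) -> Iterable[int]:
    """Hypothetical stand-in: split the text and let the factory collect the items."""
    return factory(int(token) for token in raw.split())


as_list = build_tokens(list, "1 2 3")
as_tuple = build_tokens(tuple, "1 2 3")

try:
    build_tokens(collections.abc.Iterable, "1 2 3")
    failed = False
except TypeError:
    # The abstract base class cannot be instantiated with the parsed items.
    failed = True

print(as_list, as_tuple, failed)
```

If this is indeed the cause, the parser would need to map an `Iterable` annotation to a concrete factory such as `list` or `tuple` when building values.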


2 participants