Open
While pycodcif is the fastest tool tested in https://github.com/ltalirz/cif-parsing-benchmark, there might still be low-hanging fruit for further optimization. Only about a third of the time is spent in `parse_cif`, and:
1. More time is spent in `decode_utf8_frame`.
2. Significant time is spent in `extract_precision`.
My questions would be:
Re 1.: I don't know the details of what this function does, but if it is really about decoding UTF-8, could this be done once per file rather than once per element (e.g. `decode_utf8_typed_values` is called 1.7M times on the test set)?
Even if not, this function could probably be sped up significantly by moving it to C.
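To illustrate the idea behind question 1, here is a minimal sketch that does not use pycodcif itself. It compares decoding many small byte strings one at a time with decoding a single joined buffer once and splitting it afterwards. The function names and the sample data are hypothetical, and the real `decode_utf8_frame` may well do more than plain decoding, so this only shows the general pattern, not a drop-in replacement.

```python
import timeit

# Hypothetical stand-in for raw values as they might arrive from a parser.
raw_values = [b"1.234(5)", b"C", b"0.0512(3)", "Å".encode("utf-8")] * 50_000


def decode_per_element(values):
    """Decode each value separately (one decode call per element)."""
    return [v.decode("utf-8") for v in values]


def decode_once(values, sep=b"\x00"):
    """Join all values, decode a single time, then split again.

    Assumes the separator byte never occurs inside a value.
    """
    if not values:
        return []
    return sep.join(values).decode("utf-8").split(sep.decode("ascii"))


# Both strategies must yield the same strings.
assert decode_per_element(raw_values) == decode_once(raw_values)

if __name__ == "__main__":
    # Print timings for both strategies; results depend on the data and machine.
    for fn in (decode_per_element, decode_once):
        elapsed = timeit.timeit(lambda: fn(raw_values), number=5)
        print(f"{fn.__name__}: {elapsed:.3f} s")
```

Whether the single-decode variant actually helps in pycodcif would need to be measured on the benchmark set, since the cost may be dominated by something other than the decode call itself.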
Re 2.: How about making this optional, i.e. adding a flag that allows disabling precision extraction?
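As a rough sketch of what such a flag could look like (this is not pycodcif's API; `parse_numeric` and `with_precision` are hypothetical names, and I am assuming that precision extraction refers to the standard uncertainty written in parentheses in CIF numbers such as `1.234(5)`):

```python
import re

# Matches plain decimal numbers followed by a parenthesised standard
# uncertainty, e.g. "1.234(5)". Exponent notation is not handled here.
_NUM_WITH_SU = re.compile(r"^([+-]?\d*\.?\d+)\((\d+)\)$")


def parse_numeric(text, with_precision=True):
    """Return (value, uncertainty) for a numeric CIF-style string.

    With with_precision=False the uncertainty is not computed and
    None is returned in its place.
    """
    match = _NUM_WITH_SU.match(text)
    if match is None:
        # No parenthesised uncertainty present.
        return float(text), None
    mantissa, su_digits = match.groups()
    value = float(mantissa)
    if not with_precision:
        return value, None
    # The uncertainty digits apply to the last decimal places of the value.
    decimals = len(mantissa.split(".")[1]) if "." in mantissa else 0
    return value, float(su_digits) / 10 ** decimals
```

Callers that only need the numeric values could then pass `with_precision=False` and skip the extra work.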