read_ods errors with basic_string::substr: __pos (which is [a large number]) > this->size() (which is 1) #211

Closed
mayeulk opened this issue Jan 4, 2025 · 4 comments

@mayeulk
mayeulk commented Jan 4, 2025

With a given file edited by hand, read_ods errors:

> read_ods("test_may_make_read_ods_crash_v2.ods")
Error: basic_string::substr: __pos (which is 18446744073709551615) > this->size() (which is 1)
> read_ods("test_may_make_read_ods_crash_v3.ods")
# A tibble: 0 × 2
# ℹ 2 variables: picture_archive_url <chr>, video_url <chr>

The two files v2 and v3 look identical when opened in LibreOffice Calc, the software used to edit them:
Version: 24.2.7.2 (X86_64) / LibreOffice Community
Build ID: 420(Build:2)
CPU threads: 32; OS: Linux 6.8; UI render: default; VCL: gtk3
Locale: fr-FR (fr_FR.UTF-8); UI: fr-FR
Ubuntu package version: 4:24.2.7-0ubuntu0.24.04.1
Calc: threaded

(original file v1, not shown here, was an export from Moodle and was not buggy; editing it led to v2).

File ..._v3 was created starting with file ..._v2, deleting (empty) rows up to the moment when the error disappeared.

Main differences I can see after decompressing the .ods files (extracts below):

in settings.xml (extract)
in file crashing v2:
config:name="VisibleAreaHeight" config:type="int">3154</config:config-item>
in file OK v3:
config:name="VisibleAreaHeight" config:type="int">526</config:config-item>

meta.xml
in file crashing v2:
meta:cell-count="7"
in file OK v3:
meta:cell-count="2"

content.xml (end of file)
in file crashing v2:
<text:p>picture_archive_url</text:p></table:table-cell><table:table-cell office:value-type="string" calcext:value-type="string"><text:p>video_url</text:p></table:table-cell><table:table-cell table:number-columns-repeated="16382"/></table:table-row><table:table-row table:style-name="ro1"><table:table-cell/><table:table-cell office:value-type="string" calcext:value-type="string"><text:p><text:s/></text:p></table:table-cell><table:table-cell table:number-columns-repeated="16382"/></table:table-row><table:table-row table:style-name="ro1"><table:table-cell/><table:table-cell office:value-type="string" calcext:value-type="string"><text:p><text:s/></text:p></table:table-cell><table:table-cell table:number-columns-repeated="16382"/></table:table-row><table:table-row table:style-name="ro1"><table:table-cell/><table:table-cell office:value-type="string" calcext:value-type="string"><text:p><text:s/></text:p></table:table-cell><table:table-cell table:number-columns-repeated="16382"/></table:table-row><table:table-row table:style-name="ro1"><table:table-cell/><table:table-cell office:value-type="string" calcext:value-type="string"><text:p><text:s/></text:p></table:table-cell><table:table-cell table:number-columns-repeated="16382"/></table:table-row><table:table-row table:style-name="ro1"><table:table-cell/><table:table-cell office:value-type="string" calcext:value-type="string"><text:p><text:s/></text:p></table:table-cell><table:table-cell table:number-columns-repeated="16382"/></table:table-row><table:table-row table:style-name="ro1" table:number-rows-repeated="1048568"><table:table-cell table:number-columns-repeated="16384"/></table:table-row><table:table-row table:style-name="ro2"><table:table-cell table:number-columns-repeated="16384"/></table:table-row><table:table-row table:style-name="ro2"><table:table-cell table:number-columns-repeated="16384"/></table:table-row></table:table><table:named-expressions/></office:spreadsheet></office:body></office:document-content>

in file OK v3:
<text:p>picture_archive_url</text:p></table:table-cell><table:table-cell office:value-type="string" calcext:value-type="string"><text:p>video_url</text:p></table:table-cell><table:table-cell table:number-columns-repeated="16382"/></table:table-row><table:table-row table:style-name="ro1" table:number-rows-repeated="1048560"><table:table-cell table:number-columns-repeated="16384"/></table:table-row><table:table-row table:style-name="ro2" table:number-rows-repeated="14"><table:table-cell table:number-columns-repeated="16384"/></table:table-row><table:table-row table:style-name="ro2"><table:table-cell table:number-columns-repeated="16384"/></table:table-row></table:table><table:named-expressions/></office:spreadsheet></office:body></office:document-content>

sessionInfo()
R version 4.4.2 (2024-10-31)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0

locale:
[1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
[7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C

time zone: Europe/Paris
tzcode source: system (glibc)

attached base packages:
[1] stats graphics grDevices datasets utils methods base

other attached packages:
[1] readODS_2.3.1

loaded via a namespace (and not attached):
[1] utf8_1.2.4 cellranger_1.1.0 tzdb_0.4.0 magrittr_2.0.3
[5] bspm_0.5.7 glue_1.8.0 tibble_3.2.1 pkgconfig_2.0.3
[9] lifecycle_1.0.4 cli_3.6.3 zip_2.3.1 fansi_1.0.6
[13] vctrs_0.6.5 compiler_4.4.2 tools_4.4.2 pillar_1.9.0
[17] minty_0.0.4 rlang_1.1.4 stringi_1.8.4

@mayeulk
Author

mayeulk commented Jan 4, 2025

The two test files: v2 errors, v3 is OK.
test_may_make_read_ods_crash_v2.ods
test_may_make_read_ods_crash_v3.ods

@chainsawriot
Collaborator

@mayeulk Thank you for reporting this. A preliminary investigation suggests that the problem lies in minty (the underlying type-guessing engine).

See gesistsa/minty#37
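
For context on the message itself: 18446744073709551615 is 2^64 - 1, which is the value of `std::string::npos` on a 64-bit platform, and `substr` throws `std::out_of_range` when the start position is greater than the string's size. A size of 1 would be consistent with a cell holding a single space (the `<text:s/>` cells in v2), though that link is an assumption. The sketch below is not minty's code; `unguarded_trim_left`, `guarded_trim_left` and `error_from_unguarded` are hypothetical helpers that only show how an unchecked search result passed to `substr` can produce this kind of error, and how checking for `npos` first avoids it.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, NOT minty's code: returns the text starting at the
// first non-space character, without checking the result of the search.
std::string unguarded_trim_left(const std::string& cell) {
    // For a cell containing only spaces, the search returns std::string::npos.
    std::size_t first = cell.find_first_not_of(' ');
    // substr(pos) throws std::out_of_range when pos > cell.size().
    return cell.substr(first);
}

// Same hypothetical helper, with the search result checked first.
std::string guarded_trim_left(const std::string& cell) {
    std::size_t first = cell.find_first_not_of(' ');
    if (first == std::string::npos) {
        return std::string();  // only spaces: treat the cell as empty
    }
    return cell.substr(first);
}

// Runs the unguarded helper and returns the exception text, or an empty
// string if no exception was thrown.
std::string error_from_unguarded(const std::string& cell) {
    try {
        unguarded_trim_left(cell);
    } catch (const std::out_of_range& e) {
        return e.what();
    }
    return std::string();
}
```

Calling `error_from_unguarded(" ")` returns the exception text (its exact wording depends on the C++ standard library in use), while `guarded_trim_left(" ")` returns an empty string.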

@chainsawriot
Collaborator

@mayeulk The problem has been fixed. If you install the GitHub version of minty, it should work. I'll need to push minty to CRAN first and make readODS depend on minty v0.0.5. Then I will close this issue.

@mayeulk
Author

mayeulk commented Jan 5, 2025

Hi, thanks a lot! Thanks also for providing guidelines for using the GitHub version of minty; I really appreciate that! (Still, I used a workaround to fix the issue in the short term on my side: I'm using eddelbuettel/r2u (CRAN as Ubuntu Binaries), so I might wait for the packages to propagate.)

chainsawriot added a commit that referenced this issue Jan 12, 2025