NonMatchingChecksumError while downloading 'multi_news' or 'cnn_dailymail' dataset #5232

Open
singhniraj08 opened this issue Jan 16, 2024 · 2 comments
Labels
bug Something isn't working

Comments


singhniraj08 commented Jan 16, 2024

Short description

Getting NonMatchingChecksumError while downloading the multi_news or cnn_dailymail datasets.

Environment information

  • Operating System: Colab

  • Python version: 3.10

  • tensorflow-datasets/tfds-nightly version: tensorflow-datasets 4.9.4

  • tensorflow/tf-nightly version: tensorflow 2.15

  • Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)? Yes

Reproduction instructions

https://colab.sandbox.google.com/gist/singhniraj08/9f80bc167706b9b351b75e003dcad39c/untitled2.ipynb


Link to logs

NonMatchingChecksumError: Artifact https://drive.google.com/uc?export=download&id=1vRY2wM6rlOZrf9exGTm5pXj5ExlVwJ0C, downloaded to /root/tensorflow_datasets/downloads/ucexport_download_id_1vRY2wM6rlOZrf9exGTm5pXj5OT0RBXCg5OWBrYMJXysF1hdrkZtPhK-7JWdYi2HrYYc.tmp.c134b8c8d86c4764bad073c9d79db385/download, has wrong checksum:

  • Expected: UrlInfo(size=245.06 MiB, checksum='64ae4d2483b248c9664b50bacfab6821f8a3e93f382c7587686fa4a127f77626', filename='multi-news-original-20190725T164630Z-001.zip')
  • Got: UrlInfo(size=2.40 KiB, checksum='d86ce49a2cafe0ed25eae0c9a5ed9abf8db1e34414e3acb667e316ad221c73c5', filename='download')
    To debug, see: https://www.tensorflow.org/datasets/overview#fixing_nonmatchingchecksumerror
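
The log above reports a 2.40 KiB file where a 245.06 MiB zip was expected, which suggests that something other than the archive was saved (the log does not say what). A minimal sketch, using only the Python standard library, for inspecting a downloaded file against the expected checksum from the log. The helper names `sha256_of_file` and `inspect_download` are hypothetical and are not part of tfds.

```python
import hashlib
import zipfile
from pathlib import Path

# Expected SHA-256 copied from the "Expected" UrlInfo in the error message.
EXPECTED_SHA256 = "64ae4d2483b248c9664b50bacfab6821f8a3e93f382c7587686fa4a127f77626"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inspect_download(path, expected_sha256=EXPECTED_SHA256):
    """Summarize a downloaded file: size, zip validity, checksum match."""
    path = Path(path)
    return {
        "size_bytes": path.stat().st_size,
        "is_zip": zipfile.is_zipfile(path),
        "checksum_ok": sha256_of_file(path) == expected_sha256,
    }
```

Calling `inspect_download` on the file named in the error message should show at a glance whether it is a zip archive at all and whether its digest matches the expected one.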

Expected behavior

Dataset should download without any issues.


singhniraj08 added the bug (Something isn't working) label Jan 16, 2024

83here commented Jan 21, 2024

Hello @singhniraj08, this is a persistent problem in tfds (#3935) and there is no solution so far, although you can work around the issue by downloading the dataset manually.

Thank you,
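
A rough sketch of what the manual-download workaround could look like once the archive has been fetched by hand (for example, through a browser). It copies the file into a manual-download directory under the file name given in the "Expected" entry of the error message. Two assumptions are not confirmed anywhere in this thread: that the tfds manual directory defaults to `~/tensorflow_datasets/downloads/manual/`, and that tfds picks up a file there when its name matches the expected one. The helper `stage_manual_download` is hypothetical and uses only the Python standard library.

```python
import shutil
from pathlib import Path

# ASSUMPTION: default tfds manual-download directory (not confirmed in this thread).
DEFAULT_MANUAL_DIR = Path.home() / "tensorflow_datasets" / "downloads" / "manual"

# File name taken from the "Expected" UrlInfo in the error message.
EXPECTED_FILENAME = "multi-news-original-20190725T164630Z-001.zip"


def stage_manual_download(src, manual_dir=DEFAULT_MANUAL_DIR, filename=EXPECTED_FILENAME):
    """Copy a manually downloaded archive into manual_dir under the expected name."""
    manual_dir = Path(manual_dir)
    manual_dir.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    dest = manual_dir / filename
    shutil.copyfile(src, dest)
    return dest
```

For example, `stage_manual_download("/content/multi-news.zip")` would place the archive in the assumed directory; the source path here is only a placeholder. Whether a later dataset load then uses the staged file depends on the assumptions stated above.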

@Rahulraj0308

@singhniraj08 you can see https://www.tensorflow.org/datasets/overview#fixing_nonmatchingchecksumerror for ways to fix this error. As far as I know, though, this issue has not been solved yet.
