tfds failed to load open-x-embodiement dataset #5392

Closed
WesleyHsieh0806 opened this issue May 2, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@WesleyHsieh0806


Short description
tfds.load fails to load the dataset fractal20220817_data.

Environment information

  • Operating System: Debian GNU/Linux 11

  • Python version: 3.12.2

  • tensorboard 2.16.2

  • tensorboard-data-server 0.7.2

  • tensorflow 2.16.1

  • tensorflow-datasets 4.9.3 (I also tried 4.9.4)

  • tensorflow-metadata 1.15.0

  • tfds-nightly 4.9.3.dev202312180044 (I also tried 4.9.4.dev202405010044)

  • Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)?
    yes

Reproduction instructions

import tensorflow as tf
import tensorflow_datasets as tfds
from tqdm import tqdm

# 66 datasets excluding droid
datasets = [
    'fractal20220817_data',
    'kuka',
    'bridge',
    'taco_play',
    'jaco_play',
    'berkeley_cable_routing',
    'roboturk',
    'nyu_door_opening_surprising_effectiveness',
    'viola',
    'berkeley_autolab_ur5',
    'toto',
    'language_table',
    'columbia_cairlab_pusht_real',
    'stanford_kuka_multimodal_dataset_converted_externally_to_rlds',
    'nyu_rot_dataset_converted_externally_to_rlds',
    'stanford_hydra_dataset_converted_externally_to_rlds',
    'austin_buds_dataset_converted_externally_to_rlds',
    'nyu_franka_play_dataset_converted_externally_to_rlds',
    'maniskill_dataset_converted_externally_to_rlds',
    'furniture_bench_dataset_converted_externally_to_rlds',
    'cmu_franka_exploration_dataset_converted_externally_to_rlds',
    'ucsd_kitchen_dataset_converted_externally_to_rlds',
    'ucsd_pick_and_place_dataset_converted_externally_to_rlds',
    'austin_sailor_dataset_converted_externally_to_rlds',
    'austin_sirius_dataset_converted_externally_to_rlds',
    'bc_z',
    'usc_cloth_sim_converted_externally_to_rlds',
    'utokyo_pr2_opening_fridge_converted_externally_to_rlds',
    'utokyo_pr2_tabletop_manipulation_converted_externally_to_rlds',
    'utokyo_saytap_converted_externally_to_rlds',
    'utokyo_xarm_pick_and_place_converted_externally_to_rlds',
    'utokyo_xarm_bimanual_converted_externally_to_rlds',
    'robo_net',
    'berkeley_mvp_converted_externally_to_rlds',
    'berkeley_rpt_converted_externally_to_rlds',
    'kaist_nonprehensile_converted_externally_to_rlds',
    'stanford_mask_vit_converted_externally_to_rlds',
    'tokyo_u_lsmo_converted_externally_to_rlds',
    'dlr_sara_pour_converted_externally_to_rlds',
    'dlr_sara_grid_clamp_converted_externally_to_rlds',
    'dlr_edan_shared_control_converted_externally_to_rlds',
    'asu_table_top_converted_externally_to_rlds',
    'stanford_robocook_converted_externally_to_rlds',
    'eth_agent_affordances',
    'imperialcollege_sawyer_wrist_cam',
    'iamlab_cmu_pickup_insert_converted_externally_to_rlds',
    'qut_dexterous_manipulation',
    'uiuc_d3field',
    'utaustin_mutex',
    'berkeley_fanuc_manipulation',
    'cmu_playing_with_food',
    'cmu_play_fusion',
    'cmu_stretch',
    'berkeley_gnm_recon',
    'berkeley_gnm_cory_hall',
    'berkeley_gnm_sac_son',
    'robot_vqa',
    'conq_hose_manipulation',
    'dobbe',
    'fmb',
    'io_ai_tech',
    'mimic_play',
    'aloha_mobile',
    'robo_set',
    'tidybot',
    'vima_converted_externally_to_rlds',
]

print('Download {} datasets from Open-X-Embodiement...'.format(len(datasets)))

# Optionally replace the `datasets` list above with the filtered datasets from the Google Sheet.
DOWNLOAD_DIR = '~/Open-X-Embodiement'

print(f"Downloading {len(datasets)} datasets to {DOWNLOAD_DIR}.")
for dataset_name in tqdm(datasets):
    # print(tfds.__version__)
    _ = tfds.load(
        dataset_name, data_dir=DOWNLOAD_DIR)


Link to logs

Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/utils/py_utils.py", line 442, in try_reraise
    yield
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/load.py", line 220, in builder
    return cls(**builder_kwargs)  # pytype: disable=not-instantiable
           ^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/logging/__init__.py", line 289, in decorator
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/dataset_builder.py", line 1370, in __init__
    super().__init__(**kwargs)
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/logging/__init__.py", line 289, in decorator
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/dataset_builder.py", line 287, in __init__
    self.info.initialize_from_bucket()
    ^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/logging/__init__.py", line 169, in __call__
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/dataset_builder.py", line 482, in info
    info = self._info()
           ^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/robotics/dataset_importer_builder.py", line 82, in _info
    features = self.get_ds_builder().info.features
               ^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/robotics/dataset_importer_builder.py", line 149, in get_ds_builder
    ds_builder = tfds.builder_from_directory(ds_location)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/read_only_builder.py", line 150, in builder_from_directory
    return ReadOnlyBuilder(builder_dir=builder_dir)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/logging/__init__.py", line 289, in decorator
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/read_only_builder.py", line 66, in __init__
    info_proto = dataset_info.read_proto_from_builder_dir(builder_dir)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/dataset_info.py", line 1059, in read_proto_from_builder_dir
    return read_from_json(info_path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/dataset_info.py", line 1037, in read_from_json
    raise FileNotFoundError(f"Could not load dataset info from {path}") from e
FileNotFoundError: Could not load dataset info from gs:/gresearch/robotics/fractal20220817_data/0.1.0/dataset_info.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./download_v1.py", line 82, in <module>
    _ = tfds.load(
        ^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/logging/__init__.py", line 169, in __call__
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/load.py", line 641, in load
    dbuilder = _fetch_builder(
               ^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/load.py", line 496, in _fetch_builder
    return builder(name, data_dir=data_dir, try_gcs=try_gcs, **builder_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/logging/__init__.py", line 169, in __call__
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/load.py", line 217, in builder
    with py_utils.try_reraise(
  File "/root/miniconda3/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/utils/py_utils.py", line 444, in try_reraise
    reraise(e, *args, **kwargs)
  File "/root/miniconda3/lib/python3.12/site-packages/tensorflow_datasets/core/utils/py_utils.py", line 411, in reraise
    raise exception from e
FileNotFoundError: Failed to construct dataset "fractal20220817_data", builder_kwargs "{'data_dir': '~/Open-X-Embodiement'}": Could not load dataset info from gs:/gresearch/robotics/fractal20220817_data/0.1.0/dataset_info.json

Expected behavior
Each dataset in the list should download successfully.

WesleyHsieh0806 added the bug label May 2, 2024
@ccl-core
Collaborator

ccl-core commented May 8, 2024

Hi @WesleyHsieh0806, thank you for reporting this issue! We will take a closer look at this.

In the meantime, you should be able to load the dataset with:

ds = tfds.load("fractal20220817_data:0.1.0", data_dir="gs://gresearch/robotics")

or:

ds = tfds.load("robotics:fractal20220817_data:0.1.0")

Thanks!

@ccl-core
Collaborator

ccl-core commented May 8, 2024

Hi @WesleyHsieh0806, a clarification question: have you by any chance modified your code?
The line:

FileNotFoundError: Could not load dataset info from gs:/gresearch/robotics/fractal20220817_data/0.1.0/dataset_info.json

in your error stack seems very odd: it is not clear where the gs:/ prefix with only one slash (instead of gs://) comes from...
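
One way such a single-slash prefix can arise, offered here only as an illustration and not as a confirmed diagnosis of this issue, is when a gs:// URL is passed through generic POSIX path handling, which collapses repeated slashes. A minimal sketch using only the Python standard library:

```python
import posixpath
from pathlib import PurePosixPath

gcs_url = "gs://gresearch/robotics/fractal20220817_data/0.1.0"

# Generic POSIX path normalization treats the second "/" in "gs://"
# as a redundant separator and drops it.
normalized = posixpath.normpath(gcs_url)
via_pathlib = str(PurePosixPath(gcs_url))

print(normalized)   # gs:/gresearch/robotics/fractal20220817_data/0.1.0
print(via_pathlib)  # gs:/gresearch/robotics/fractal20220817_data/0.1.0
```

Whether TFDS or a dependency performs this kind of normalization in the reported environment was not established in this thread.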

@WesleyHsieh0806
Author

I resolved this issue by using the following command to download the data:

gsutil -m cp -r gs://gdm-robotics-open-x-embodiment/{dataset_name} ~/tensorflow_datasets/
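
The {dataset_name} placeholder in the command above has to be filled in once per dataset. A minimal sketch of how that loop could be scripted, assuming gsutil is installed and on the PATH; the helper names build_copy_command and download_all are hypothetical and not part of TFDS or gsutil:

```python
import os
import subprocess

# Bucket taken from the gsutil command quoted above.
BUCKET = "gs://gdm-robotics-open-x-embodiment"


def build_copy_command(dataset_name, dest_dir="~/tensorflow_datasets/"):
    """Return the gsutil argument list for one dataset (hypothetical helper)."""
    return [
        "gsutil", "-m", "cp", "-r",
        f"{BUCKET}/{dataset_name}",
        os.path.expanduser(dest_dir),  # expand "~" ourselves; no shell is involved
    ]


def download_all(dataset_names, dest_dir="~/tensorflow_datasets/"):
    """Copy each dataset in turn; raises CalledProcessError if gsutil fails."""
    os.makedirs(os.path.expanduser(dest_dir), exist_ok=True)
    for name in dataset_names:
        subprocess.run(build_copy_command(name, dest_dir), check=True)
```

For example, download_all(["fractal20220817_data", "kuka"]) would run the copy command once for each name. The copied directory could then presumably be opened with tfds.builder_from_directory (the call visible in the traceback above), pointed at the dataset's version subdirectory, though that step was not verified in this thread.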

@ccl-core
Collaborator

Hi @WesleyHsieh0806, thank you for the update! Great to know that you are unblocked now :)

I am closing the bug, but please feel free to reopen it if you encounter any further problems with this dataset. And please feel free to open a PR if you want to contribute to TFDS!
