[Feature Request] Add support for loading datasets from local Minari cache

## Motivation

`MinariExperienceReplay` currently requires datasets to be downloaded through the class itself, which writes an `env_metadata.json` file to the target directory. This workflow does not accommodate custom Minari datasets created by users, or datasets that are already in the local Minari cache by other means (e.g., datasets accessed through `minari.load_dataset` or created with `DataCollector`).

As a result, attempting to instantiate `MinariExperienceReplay` with `download=False` for locally available datasets leads to a `FileNotFoundError` due to missing metadata files, even though the dataset exists in the Minari cache. This limitation is frustrating for users who want to leverage their own datasets without redownloading or duplicating data, and it hinders workflows where datasets are managed independently of TorchRL.

This request proposes loading datasets directly from the local Minari cache (typically `~/.minari/datasets`) without requiring prior setup via `MinariExperienceReplay`'s download workflow, so that the class works with custom and preloaded datasets.

## Solution

Add a `load_from_local_minari` argument to the `MinariExperienceReplay` class and fully support it. When set to `True`, the class should:

- Look for the dataset in the user's local Minari cache (e.g., `~/.minari/datasets/{dataset_id}/data/main_data.hdf5`).
- Bypass any download or remote fetching logic.
- If the required files are present, load the dataset and construct any necessary metadata on-the-fly (e.g., from the Minari dataset spec, if possible).
- Raise a clear and informative `FileNotFoundError` if the dataset is not found in the expected local cache location.
- Ensure that custom datasets created by users (such as those with `DataCollector(...).create_dataset(...)`) or datasets first loaded with `minari.load_dataset` can be used seamlessly with `MinariExperienceReplay`.
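The lookup and error-reporting steps above could be sketched roughly as follows. This is a minimal illustration, not TorchRL's or Minari's actual implementation: the helper name `find_local_minari_dataset` and its `root` parameter are hypothetical, and the fallback to the `MINARI_DATASETS_PATH` environment variable is an assumption about how Minari resolves its cache directory. Only the `{dataset_id}/data/main_data.hdf5` layout is taken from the description above.

```python
import os
from pathlib import Path
from typing import Optional


def find_local_minari_dataset(dataset_id: str, root: Optional[Path] = None) -> Path:
    """Return the path to a dataset's main_data.hdf5 in the local Minari cache.

    Hypothetical helper for illustration only; it never touches the network.
    """
    if root is None:
        # Assumption: Minari honours MINARI_DATASETS_PATH and otherwise
        # falls back to ~/.minari/datasets.
        default_root = Path.home() / ".minari" / "datasets"
        root = Path(os.environ.get("MINARI_DATASETS_PATH", default_root))

    data_file = Path(root) / dataset_id / "data" / "main_data.hdf5"
    if not data_file.is_file():
        # Fail loudly instead of silently falling back to a download.
        raise FileNotFoundError(
            f"Dataset '{dataset_id}' was not found in the local Minari cache "
            f"(expected file: {data_file}). Create or download it with Minari "
            f"first, or set load_from_local_minari=False."
        )
    return data_file
```

With such a helper, the constructor could resolve the data file before any download logic runs and build the required metadata from what it finds there.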

This solution allows for greater flexibility, avoids unnecessary downloads and data duplication, and makes TorchRL compatible with the wider Minari ecosystem.

## Alternatives

- **Manual copying of files**: Users could manually copy datasets and metadata to the expected TorchRL directory, but this is error-prone and not user-friendly.
- **Automated metadata generation scripts**: Provide standalone tools for generating `env_metadata.json` based on existing Minari datasets. This adds maintenance burden and complexity for users.

## Additional context

- The new `load_from_local_minari` argument should default to `False` to preserve backward compatibility.
- If `load_from_local_minari=True` is set, `MinariExperienceReplay` will load the dataset directly from the local Minari cache (typically located at `~/.minari/datasets`). If the dataset exists in the cache, nothing is fetched from the Minari server, and no remote download or overwrite occurs. All subsequent preprocessing and loading steps then proceed as usual.
- This feature will facilitate workflows for research, benchmarking, and development using custom or proprietary datasets, and it is more in line with how Minari itself manages datasets locally.
- Example usage:

```python
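from torchrl.data.datasets import MinariExperienceReplay
from torchrl.data.replay_buffers import SamplerWithoutReplacement

# Placeholder: ID of a dataset already present in the local Minari cache.
dataset_id = "my-custom-dataset-v0"

data = MinariExperienceReplay(
    dataset_id=dataset_id,
    split_trajs=False,
    batch_size=128,
    sampler=SamplerWithoutReplacement(drop_last=True),
    prefetch=4,
    load_from_local_minari=True,  # <--- key addition (proposed argument)
)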
```


## Checklist

- [x] I have checked that there is no similar issue in the repo (**required**)

[Feature Request] Add support for loading datasets from local Minari cache #3067
