
ERROR: "Dataset does not exist at the specified path..." #3

Open
jimjones26 opened this issue May 23, 2023 · 4 comments
@jimjones26

I get the following error whenever I start the app locally, and again when I try to upload my own documents once it is running.

Failed to build chain for data source 'https://github.com/gustavz/DataChad.git' with error: A dataset does not exist at the specified path, or you do not have sufficient permissions to load or create one. Please check the dataset path and make sure that you have sufficient permissions to the path.

I am guessing this has to do with my Activeloop account, but I cannot figure out where to enable permission to create datasets in Activeloop. Am I missing something obvious?

@gustavz
Owner

gustavz commented May 24, 2023

Try the following:

  1. Does it work without providing your Activeloop credentials, i.e. using our database?
    -> If this works, the problem is related to your Activeloop account.
  2. Double-check your credentials before submitting them.
  3. Delete any datasets that DataChad created in your account; they may be broken.
  4. Retry.

Let me know if this helped.

@cnndabbler

OK, so I wanted to use Deep Lake online and had a similar initialization issue. I looked around the code and decided to add some separate (unrelated) code to initialize a Deep Lake dataset with the same name. I used DEFAULT_DATA_SOURCE = "brain22", but in the separate code I initialized the dataset as hub://"org"/brain22-1000-0.

This got me to the point where I could actually upload a directory of PDF files.

But in the process, it created a completely separate Deep Lake dataset.

Then, for each additional PDF I added, it created yet another dataset instead of adding to the previously created one.

Hope it helps

@gustavz
Owner

gustavz commented May 26, 2023

Creating one dataset per data source is intended. The app always chats with a single data source, such as a PDF or a GitHub repo, so that you can ask questions dedicated to that source.

But I see the potential use case here, so I will add an option to store everything in a single dataset.
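
To illustrate the behavior described above, here is a minimal sketch of how a per-source dataset path could be derived, and how a single shared dataset would differ. It is not DataChad's actual code: the function names, the `my-org` organization, the `all-sources` dataset name, and the sanitizing rule are all assumptions made up for illustration.

```python
import re

ORG = "my-org"  # hypothetical Activeloop organization name


def dataset_path(data_source: str, org: str = ORG) -> str:
    """Map one data source to its own dataset path (one dataset per source)."""
    # Assumed rule: replace every run of non-alphanumeric characters with a dash.
    name = re.sub(r"[^A-Za-z0-9]+", "-", data_source).strip("-")
    return f"hub://{org}/{name}"


def shared_dataset_path(org: str = ORG, name: str = "all-sources") -> str:
    """Hypothetical alternative: every source is stored in one shared dataset."""
    return f"hub://{org}/{name}"


sources = ["report-a.pdf", "report-b.pdf"]

# One dataset per source: two PDFs yield two distinct paths.
per_source = [dataset_path(s) for s in sources]

# Single-dataset option: every source resolves to the same path.
shared = {shared_dataset_path() for _ in sources}
```

Under these assumptions, each uploaded PDF resolves to a different path, which would explain why a new dataset appears for every file, while the shared variant always points at the same dataset.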

@fjsikora

fjsikora commented Jun 6, 2023

@jimjones26, in constants.py, delete ".git" from DEFAULT_DATA_SOURCE.
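
A sketch of what that edit might look like, assuming the default value in constants.py is the URL shown in the error message above. The `strip_git_suffix` helper is hypothetical and not part of DataChad; it only shows one way to normalize such a URL in code rather than by hand.

```python
# Sketch of the suggested change to constants.py.
# Assumed previous value (taken from the error message in this issue):
#   DEFAULT_DATA_SOURCE = "https://github.com/gustavz/DataChad.git"
DEFAULT_DATA_SOURCE = "https://github.com/gustavz/DataChad"


def strip_git_suffix(url: str) -> str:
    """Hypothetical helper: drop a trailing ".git" from a repository URL."""
    suffix = ".git"
    # Only the trailing suffix is removed; other URLs pass through unchanged.
    return url[: -len(suffix)] if url.endswith(suffix) else url
```

For example, `strip_git_suffix("https://github.com/gustavz/DataChad.git")` returns the same value as the edited constant, and a URL without the suffix is returned unchanged.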
