PDF File requires extra dependencies. Install with pip install --upgrade "embedchain[dataloaders]" #1274
Comments
Hello, I just wanted to add that I am getting a similar error, but with postgres. I am using Docker, and I fixed this error by adding: I am still trying to get an answer when calling the app.query() function.
I got a similar error with embedchain, even after running the 'pip install --upgrade "embedchain[dataloaders]"' command. It seems that embedchain is using langchain-community version 0.0.20, which appears to cause the problem. I downgraded the langchain-community package to version 0.0.19 (pip install langchain-community==0.0.19), and after that it started working. However, this workaround might introduce dependency problems. More here: https://stackoverflow.com/questions/77994079/no-module-named-pwd-while-use-from-langchain-community-document-loaders-impor
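For anyone who wants to check whether their environment has the version mentioned above, here is a minimal sketch using only the standard library. The helper names `installed_version` and `is_suspect` are made up for illustration, and the "suspect" version is just the one reported in this thread, not a confirmed root cause.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Version reported as problematic in this thread (an assumption, not a confirmed fact).
SUSPECT_VERSION = "0.0.20"


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def is_suspect(found: Optional[str], suspect: str = SUSPECT_VERSION) -> bool:
    """True only when the installed version equals the suspect version."""
    return found == suspect


if __name__ == "__main__":
    found = installed_version("langchain-community")
    if found is None:
        print("langchain-community is not installed in this environment")
    elif is_suspect(found):
        print(f"langchain-community {found} is installed; consider trying 0.0.19")
    else:
        print(f"langchain-community {found} is installed")
```

Run it with the same interpreter (here, the same conda environment) that runs the failing script, otherwise it may report a different installation.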
🐛 Describe the bug
(c2) C:\Users\harsh.padaliya\Desktop\custom_writeups_project>python try_rag.py
2024-02-19 15:32:19,557 - 14132 - add_config.py-add_config:30 - WARNING: min_chunk_size 0 should be greater than chunk_overlap 100, otherwise it is redundant.
Traceback (most recent call last):
  File "C:\Users\harsh.padaliya\Desktop\custom_writeups_project\try_rag.py", line 79, in <module>
    app.add(r"\data\xdtb.pdf", data_type='pdf_file')
  File "C:\Users\harsh.padaliya\AppData\Local\anaconda3\envs\c2\lib\site-packages\embedchain\embedchain.py", line 200, in add
    data_formatter = DataFormatter(data_type, config, loader, chunker)
  File "C:\Users\harsh.padaliya\AppData\Local\anaconda3\envs\c2\lib\site-packages\embedchain\data_formatter\data_formatter.py", line 34, in __init__
    self.loader = self._get_loader(data_type=data_type, config=config.loader, loader=loader)
  File "C:\Users\harsh.padaliya\AppData\Local\anaconda3\envs\c2\lib\site-packages\embedchain\data_formatter\data_formatter.py", line 90, in _get_loader
    loader_class: type = self._lazy_load(loaders[data_type])
  File "C:\Users\harsh.padaliya\AppData\Local\anaconda3\envs\c2\lib\site-packages\embedchain\data_formatter\data_formatter.py", line 40, in _lazy_load
    module = import_module(module_path)
  File "C:\Users\harsh.padaliya\AppData\Local\anaconda3\envs\c2\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "C:\Users\harsh.padaliya\AppData\Local\anaconda3\envs\c2\lib\site-packages\embedchain\loaders\pdf_file.py", line 6, in <module>
    raise ImportError(
ImportError: PDF File requires extra dependencies. Install with
pip install --upgrade "embedchain[dataloaders]"
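The traceback ends in a generic ImportError raised by the loader module, so the original import failure is not shown. A minimal sketch for surfacing the real cause is to import the suspected dependency directly and print whatever error it raises. The helper `probe_import` is hypothetical, and the module name `langchain_community.document_loaders` is taken from the Stack Overflow link in the comments; that embedchain's PDF loader imports it is an assumption.

```python
import importlib
from typing import Tuple


def probe_import(module_path: str) -> Tuple[bool, str]:
    """Try to import a module and report the underlying error instead of hiding it."""
    try:
        importlib.import_module(module_path)
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so it is caught here too.
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


if __name__ == "__main__":
    # Assumed module name, based on the Stack Overflow link in this thread.
    ok, message = probe_import("langchain_community.document_loaders")
    print(ok, message)
```

If the printed message names a missing module other than the PDF dependencies themselves, reinstalling "embedchain[dataloaders]" alone is unlikely to help, and pinning the offending package may be the more useful thing to try.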