Merge pull request #585 from h2oai/control_embedding_migration
Control embedding migration
pseudotensor authored Aug 2, 2023
2 parents 738d7ac + fe6aaef commit d333423
Showing 31 changed files with 938 additions and 404 deletions.
6 changes: 4 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -17,6 +17,7 @@ Query and summarize your documents or just chat with local private GPT LLMs usin
- **Inference Servers** support (HF TGI server, vLLM, Gradio, ExLLaMa, OpenAI)
- **OpenAI-compliant Python client API** for client-server control
- **Evaluate** performance using reward models
- **Quality** maintained with over 250 unit and integration tests taking over 4 GPU-hours

### Getting Started

@@ -128,9 +129,10 @@ GPU and CPU mode tested on variety of NVIDIA GPUs in Ubuntu 18-22, but any moder
- To run h2oGPT tests:
```bash
wget https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/resolve/main/llama-2-7b-chat.ggmlv3.q8_0.bin
-pip install requirements-parser
-pytest -s -v tests client/tests
+pip install requirements-parser pytest-instafail
+pytest --instafail -s -v tests client/tests
```
or tweak/run `tests/test4gpus.sh` to run tests in parallel.

### Help

18 changes: 17 additions & 1 deletion docs/FAQ.md
@@ -182,11 +182,27 @@ This warning can be safely ignored.
- `CUDA_VISIBLE_DEVICES`: Standard list of CUDA devices to make visible.
- `PING_GPU`: ping GPU every few minutes for full GPU memory usage by torch, useful for debugging OOMs or memory leaks
- `GET_GITHASH`: get git hash on startup for system info. Avoided normally as can fail with extra messages in output for CLI mode

- `H2OGPT_SCRATCH_PATH`: Choose base scratch folder for scratch databases and files
- `H2OGPT_BASE_PATH`: Choose base folder for all files except scratch files
These can be useful on HuggingFace spaces, where one sets secret tokens because CLI options cannot be used.

> **_NOTE:_** Scripts can accept different environment variables to control query arguments. For instance, if a Python script takes an argument like `--load_8bit=True`, the corresponding ENV variable would follow this format: `H2OGPT_LOAD_8BIT=True` (regardless of capitalization). It is important to ensure that the environment variable is assigned the exact value that would have been used for the script's query argument.
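
The mapping described in the note can be illustrated with a minimal sketch. It is not h2oGPT's actual implementation: the function name `apply_env_overrides`, the `prefix` parameter, and the literal-parsing rule are assumptions made for illustration. It shows a `H2OGPT_`-prefixed variable overriding the matching argument, with the variable name matched case-insensitively.

```python
import ast
import os


def apply_env_overrides(defaults, environ=None, prefix="H2OGPT_"):
    """Return a copy of defaults with values overridden by prefixed env vars.

    Hypothetical helper for illustration only, not part of h2oGPT.
    """
    environ = os.environ if environ is None else environ
    # Lower-case the variable names so the lookup ignores capitalization.
    lowered = {key.lower(): value for key, value in environ.items()}
    result = dict(defaults)
    for name in defaults:
        raw = lowered.get((prefix + name).lower())
        if raw is None:
            continue
        try:
            # Parse Python literals such as True, 8, or 0.5.
            result[name] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a literal is kept as a plain string.
            result[name] = raw
    return result


# Example with an explicit environment mapping instead of os.environ.
args = apply_env_overrides(
    {"load_8bit": False, "base_model": None},
    environ={"H2OGPT_LOAD_8BIT": "True", "h2ogpt_base_model": "some-model"},
)
print(args)
```

Passing `environ` explicitly keeps the sketch testable; leaving it as `None` reads from the process environment instead.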
### How to run functions in src from Python interpreter

E.g.
```python
import sys
sys.path.append('src')
from src.gpt_langchain import get_supported_types
non_image_types, image_types, video_types = get_supported_types()
print(non_image_types)
print(image_types)
for x in image_types:
    print(' - `.%s` : %s Image (optional),' % (x.lower(), x.upper()))
print(video_types)
```

### GPT4All not producing output.

Please contact GPT4All team. Even a basic test can give empty result.
69 changes: 67 additions & 2 deletions docs/README_LangChain.md
@@ -44,9 +44,72 @@ Open-source data types are supported, .msg is not supported due to GPL-3 require
- `.odt`: Open Document Text,
- `.pptx` : PowerPoint Document,
- `.ppt` : PowerPoint Document,
- `.apng` : APNG Image (optional),
- `.blp` : BLP Image (optional),
- `.bmp` : BMP Image (optional),
- `.bufr` : BUFR Image (optional),
- `.bw` : BW Image (optional),
- `.cur` : CUR Image (optional),
- `.dcx` : DCX Image (optional),
- `.dds` : DDS Image (optional),
- `.dib` : DIB Image (optional),
- `.emf` : EMF Image (optional),
- `.eps` : EPS Image (optional),
- `.fit` : FIT Image (optional),
- `.fits` : FITS Image (optional),
- `.flc` : FLC Image (optional),
- `.fli` : FLI Image (optional),
- `.fpx` : FPX Image (optional),
- `.ftc` : FTC Image (optional),
- `.ftu` : FTU Image (optional),
- `.gbr` : GBR Image (optional),
- `.gif` : GIF Image (optional),
- `.grib` : GRIB Image (optional),
- `.h5` : H5 Image (optional),
- `.hdf` : HDF Image (optional),
- `.icb` : ICB Image (optional),
- `.icns` : ICNS Image (optional),
- `.ico` : ICO Image (optional),
- `.iim` : IIM Image (optional),
- `.im` : IM Image (optional),
- `.j2c` : J2C Image (optional),
- `.j2k` : J2K Image (optional),
- `.jfif` : JFIF Image (optional),
- `.jp2` : JP2 Image (optional),
- `.jpc` : JPC Image (optional),
- `.jpe` : JPE Image (optional),
- `.jpeg` : JPEG Image (optional),
- `.jpf` : JPF Image (optional),
- `.jpg` : JPG Image (optional),
- `.jpx` : JPX Image (optional),
- `.mic` : MIC Image (optional),
- `.mpeg` : MPEG Image (optional),
- `.mpg` : MPG Image (optional),
- `.msp` : MSP Image (optional),
- `.pbm` : PBM Image (optional),
- `.pcd` : PCD Image (optional),
- `.pcx` : PCX Image (optional),
- `.pgm` : PGM Image (optional),
- `.png` : PNG Image (optional),
-- `.jpg` : JPEG Image (optional),
-- `.jpeg` : JPEG Image (optional).
- `.pnm` : PNM Image (optional),
- `.ppm` : PPM Image (optional),
- `.ps` : PS Image (optional),
- `.psd` : PSD Image (optional),
- `.pxr` : PXR Image (optional),
- `.qoi` : QOI Image (optional),
- `.ras` : RAS Image (optional),
- `.rgb` : RGB Image (optional),
- `.rgba` : RGBA Image (optional),
- `.sgi` : SGI Image (optional),
- `.tga` : TGA Image (optional),
- `.tif` : TIF Image (optional),
- `.tiff` : TIFF Image (optional),
- `.vda` : VDA Image (optional),
- `.vst` : VST Image (optional),
- `.webp` : WEBP Image (optional),
- `.wmf` : WMF Image (optional),
- `.xbm` : XBM Image (optional),
- `.xpm` : XPM Image (optional).

To support image captioning, on Ubuntu run:
```bash
@@ -326,6 +389,8 @@ For links to direct to the document and download to your local machine, the orig

* [docquery](https://github.com/impira/docquery) like PrivateGPT but uses LayoutLM.

* [KhoJ](https://github.com/khoj-ai/khoj) but also access from emacs or Obsidian.

* [ChatPDF](https://www.chatpdf.com/) but h2oGPT is open-source and private and many more data types.

* [Sharly](https://www.sharly.ai/) but h2oGPT is open-source and private and many more data types. Sharly and h2oGPT both allow sharing work through UserData shared collection.
3 changes: 2 additions & 1 deletion reqs_optional/requirements_optional_langchain.txt
@@ -20,7 +20,8 @@ chromadb==0.3.25
unstructured[local-inference]==0.7.4
#pdf2image==1.16.3
#pytesseract==0.3.10
-pillow
+pillow>=10.0.0
+posthog>=3.0.1

pdfminer.six==20221105
urllib3
8 changes: 8 additions & 0 deletions requirements.txt
@@ -66,3 +66,11 @@ text-generation==0.6.0
tiktoken==0.4.0
# optional: for OpenAI endpoint or embeddings (requires key)
openai==0.27.8

requests>=2.31.0
urllib3>=1.26.16
filelock>=3.12.2
joblib>=1.3.1
tqdm>=4.65.0
tabulate>=0.9.0
packaging>=23.1
8 changes: 5 additions & 3 deletions src/cli.py
@@ -15,7 +15,7 @@ def run_cli( # for local function:
score_model=None, load_8bit=None, load_4bit=None, load_half=None,
load_gptq=None, load_exllama=None, use_safetensors=None, revision=None,
use_gpu_id=None, tokenizer_base_model=None,
-gpu_id=None, local_files_only=None, resume_download=None, use_auth_token=None,
+gpu_id=None, n_jobs=None, local_files_only=None, resume_download=None, use_auth_token=None,
trust_remote_code=None, offload_folder=None, rope_scaling=None, max_seq_len=None, compile_model=None,
# for some evaluate args
stream_output=None, async_output=None, num_async=None,
@@ -40,11 +40,13 @@ def run_cli( # for local function:
raise_generate_gpu_exceptions=None, load_db_if_exists=None, use_llm_if_no_docs=None,
my_db_state0=None, selection_docs_state0=None, dbs=None, langchain_modes=None, langchain_mode_paths=None,
detect_user_path_changes_every_query=None,
-use_openai_embedding=None, use_openai_model=None, hf_embedding_model=None, cut_distance=None,
+use_openai_embedding=None, use_openai_model=None,
+hf_embedding_model=None, migrate_embedding_model=None,
+cut_distance=None,
answer_with_sources=None,
append_sources_to_answer=None,
add_chat_history_to_context=None,
-db_type=None, n_jobs=None, first_para=None, text_limit=None, verbose=None, cli=None, reverse_docs=None,
+db_type=None, first_para=None, text_limit=None, verbose=None, cli=None, reverse_docs=None,
use_cache=None,
auto_reduce_chunks=None, max_chunks=None, model_lock=None, force_langchain_evaluate=None,
model_state_none=None,
3 changes: 2 additions & 1 deletion src/client_test.py
@@ -49,6 +49,7 @@
from bs4 import BeautifulSoup # pip install beautifulsoup4

from enums import DocumentSubset, LangChainAction
from tests.utils import get_inf_server

debug = False

@@ -58,7 +59,7 @@
 def get_client(serialize=True):
     from gradio_client import Client

-    client = Client(os.getenv('HOST', "http://localhost:7860"), serialize=serialize)
+    client = Client(get_inf_server(), serialize=serialize)
     if debug:
         print(client.view_api(all_endpoints=True))
     return client
10 changes: 6 additions & 4 deletions src/eval.py
@@ -24,7 +24,7 @@ def run_eval( # for local function:
score_model=None, load_8bit=None, load_4bit=None, load_half=None,
load_gptq=None, load_exllama=None, use_safetensors=None, revision=None,
use_gpu_id=None, tokenizer_base_model=None,
-gpu_id=None, local_files_only=None, resume_download=None, use_auth_token=None,
+gpu_id=None, n_jobs=None, local_files_only=None, resume_download=None, use_auth_token=None,
trust_remote_code=None, offload_folder=None, rope_scaling=None, max_seq_len=None, compile_model=None,
# for evaluate args beyond what's already above, or things that are always dynamic and locally created
temperature=None,
@@ -60,11 +60,13 @@ def run_eval( # for local function:
raise_generate_gpu_exceptions=None, load_db_if_exists=None, use_llm_if_no_docs=None,
my_db_state0=None, selection_docs_state0=None, dbs=None, langchain_modes=None, langchain_mode_paths=None,
detect_user_path_changes_every_query=None,
-use_openai_embedding=None, use_openai_model=None, hf_embedding_model=None, cut_distance=None,
+use_openai_embedding=None, use_openai_model=None,
+hf_embedding_model=None, migrate_embedding_model=None,
+cut_distance=None,
answer_with_sources=None,
append_sources_to_answer=None,
add_chat_history_to_context=None,
-db_type=None, n_jobs=None, first_para=None, text_limit=None, verbose=None, cli=None, reverse_docs=None,
+db_type=None, first_para=None, text_limit=None, verbose=None, cli=None, reverse_docs=None,
use_cache=None,
auto_reduce_chunks=None, max_chunks=None,
model_lock=None, force_langchain_evaluate=None,
@@ -121,7 +123,7 @@ def run_eval( # for local function:
num_examples = len(examples)
scoring_path = 'scoring'
# if no permissions, assume may not want files, put into temp
-scoring_path = makedirs(scoring_path, tmp_ok=True)
+scoring_path = makedirs(scoring_path, tmp_ok=True, use_base=True)
if eval_as_output:
used_base_model = 'gpt35'
used_lora_weights = ''