How to train on other documentation
This AI can use any documentation, but first it needs to be prepared for similarity search.
Start by going to the /scripts/ folder.
If you open the ingest.py file there, you will see that it uses RST files from a folder to create docs.index and faiss_store.pkl.
The script currently uses OpenAI to create the vector store, so make sure your documentation is not too big. The pandas documentation cost me around $3-4.
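As a rough illustration of what "prepared for similarity search" means, here is a minimal sketch. It is not the code in ingest.py: the OpenAI embeddings and the FAISS store are swapped for a toy bag-of-words vector and a plain list, and the names `embed`, `build_index` and `search` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks):
    """Precompute one vector per chunk; this is the preparation step."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index, query, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


chunks = [
    "DataFrame.merge joins two tables on a key column",
    "read_csv loads a comma separated file into a DataFrame",
    "plot draws a line chart from a Series",
]
index = build_index(chunks)
print(search(index, "how do I load a csv file"))
```

The expensive part is building the index once; each later question only needs one embedding for the query plus a lookup, which is why the vector store is created ahead of time.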
You can usually find the documentation for most open-source projects on GitHub, in the docs/ folder.
Name the folder inputs/.
Put all your .rst files in there.
The search is recursive, so you don't need to flatten them.
If there are no .rst files, just convert whatever you find to .txt and feed it in. (Don't forget to change the extension in the script.)
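The recursive search described above can be sketched roughly as follows. This is an assumption about the behaviour, not the actual code in ingest.py; `find_docs` and its parameters are hypothetical names chosen for illustration.

```python
from pathlib import Path


def find_docs(root, formats=(".rst", ".md"), recursive=True, exclude_hidden=True, limit=None):
    """Collect documentation files under root, sorted for a stable order."""
    root = Path(root)
    candidates = root.rglob("*") if recursive else root.glob("*")
    found = []
    for path in sorted(candidates):
        if not path.is_file():
            continue
        if path.suffix.lower() not in formats:
            continue
        # Skip dotfiles and anything inside a dot-directory below root.
        if exclude_hidden and any(part.startswith(".") for part in path.relative_to(root).parts):
            continue
        found.append(path)
        if limit is not None and len(found) >= limit:
            break
    return found
```

With this sketch, `find_docs("inputs")` would walk every subfolder of inputs/, and `find_docs("inputs", formats=(".txt",))` would pick up files you converted to text instead.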
Then write your OpenAI API key inside a .env file in the scripts/ folder:
OPENAI_API_KEY=<your-api-key>
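For the curious, here is a minimal sketch of how a script could pick up a `KEY=VALUE` line like the one above. It assumes the key lives in a plain text file such as .env; `load_env_file` is a hypothetical helper, and a real script would more likely rely on a library such as python-dotenv.

```python
import os
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines, copy them into os.environ, and return the parsed pairs."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Ignore blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    os.environ.update(values)
    return values
```

After calling it, the key is available as `os.environ["OPENAI_API_KEY"]`. Keep this file out of version control so the key is not published with your code.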
python ingest.py
The script will tell you how much it will cost.
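To get a feel for the number before running anything, a back-of-the-envelope estimate can be sketched like this. Everything here is an assumption: the four-characters-per-token rule of thumb, the placeholder price passed as `usd_per_1k_tokens`, and the function names. Check the current OpenAI pricing page for the real rate; the figure printed by ingest.py is the one to trust.

```python
def estimate_cost(texts, usd_per_1k_tokens, chars_per_token=4):
    """Return (approx_tokens, approx_usd) for embedding the given texts."""
    total_chars = sum(len(t) for t in texts)
    approx_tokens = total_chars // chars_per_token
    return approx_tokens, approx_tokens / 1000 * usd_per_1k_tokens


def confirm(approx_usd, assume_yes=False, ask=input):
    """Ask before spending money unless assume_yes is set (similar in spirit to -y)."""
    if assume_yes:
        return True
    answer = ask(f"Estimated cost ${approx_usd:.2f}. Continue? [y/N] ")
    return answer.strip().lower() in ("y", "yes")


# The price below is a made-up placeholder, not a real OpenAI rate.
tokens, usd = estimate_cost(["a" * 4000, "b" * 4000], usd_per_1k_tokens=0.0004)
print(tokens, usd)
```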
Once you run the app, it will use the new context that is relevant to your documentation. Make sure you select default in the dropdown in the UI.
You can learn more about the options that ingest.py accepts by running:
python ingest.py --help
| Option | Description |
|---|---|
| --dir TEXT | List of paths to directories for index creation. E.g. --dir inputs --dir inputs2 [default: inputs] |
| --file TEXT | File paths to use (optional; overrides directory). E.g. --file inputs/1.md --file inputs/2.md |
| --recursive / --no-recursive | Whether to search subdirectories recursively [default: recursive] |
| --limit INTEGER | Maximum number of files to read |
| --formats TEXT | List of required extensions (including the leading dot). Currently supported: .rst, .md, .pdf, .docx, .csv, .epub, .html [default: .rst, .md] |
| --exclude / --no-exclude | Whether to exclude hidden files (dotfiles) [default: exclude] |
| -y, --yes | Whether to skip price confirmation |
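If you want to drive ingest.py from another Python script, the options can be assembled into a command line roughly like this. `build_ingest_command` is a hypothetical helper; it also assumes that --formats is repeated once per extension the same way the help text shows for --dir, which you should confirm with --help.

```python
import shlex


def build_ingest_command(dirs=(), formats=(), recursive=True, limit=None, yes=False):
    """Build an argv list for ingest.py from a few of the documented options."""
    cmd = ["python", "ingest.py"]
    for d in dirs:
        cmd += ["--dir", d]          # repeated once per directory
    for ext in formats:
        cmd += ["--formats", ext]    # assumption: repeated like --dir
    if not recursive:
        cmd.append("--no-recursive")
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if yes:
        cmd.append("--yes")          # skip the price confirmation
    return cmd


cmd = build_ingest_command(dirs=["inputs", "inputs2"], formats=[".md"], limit=50, yes=True)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, cwd="scripts")`. Only pass `yes=True` once you have seen the price estimate at least once.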