A chatbot to help staff identify and use correct sensitivity labels in communications. Built with Python, Gradio, Pandoc, Tesseract OCR, and OpenAI.
Recommended: Use Docker and Docker Compose for setup and running.
See the Docker documentation at https://www.docker.com/ if you are unsure how to install or use Docker.
- Docker and Docker Compose (v2+)
- OpenAI API key (add to .env)
- (For local dev: Python 3.11+ and Git)
Setup & Run (Docker, Docker Compose):
git clone https://github.com/chweekueh1/nyp-fyp-project
cd nyp-fyp-project
cp .env.dev .env # Add your OpenAI API key to .env
python setup.py --docker-build
python setup.py --docker-run
setup.py is a thin wrapper over Docker commands, so run those commands directly if you cannot run the setup script on Windows.
Note that certain paths in the source code are hard-coded.
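The wrapper pattern described above can be sketched roughly as follows. This is an illustration only, not the project's actual setup.py: the flag names come from this README, but the Compose service names ("dev", "test") and the exact Docker commands are assumptions, so check setup.py and the compose files for the real ones.

```python
import argparse

# Hypothetical mapping from setup.py flags to Docker Compose commands.
# The service names ("dev", "test") are assumptions for illustration.
COMMANDS = {
    "docker_build": ["docker", "compose", "build", "dev"],
    "docker_run": ["docker", "compose", "up", "dev"],
    "docker_test": ["docker", "compose", "run", "--rm", "test"],
}


def build_parser() -> argparse.ArgumentParser:
    """Create a parser exposing the flags listed in this README."""
    parser = argparse.ArgumentParser(description="Docker wrapper sketch")
    parser.add_argument("--docker-build", action="store_true")
    parser.add_argument("--docker-run", action="store_true")
    parser.add_argument("--docker-test", action="store_true")
    return parser


def resolve_commands(argv: list[str]) -> list[list[str]]:
    """Return the Docker commands selected by the given flags, in order."""
    args = vars(build_parser().parse_args(argv))
    return [cmd for flag, cmd in COMMANDS.items() if args[flag]]
```

Calling resolve_commands(["--docker-build"]) returns the command list without executing it, which makes it easy to print the equivalent Docker command and run it by hand.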
Uses separate containers for dev, test, prod, and docs. Requires Docker Compose for multi-container workflows and benchmarks. See Docker Compose install.
Common commands:
python setup.py --docker-build # Build dev container
python setup.py --docker-run # Run app
python setup.py --docker-test # Run tests
python setup.py --docs # Build & serve docs (http://localhost:8080)
Note that the sites are currently exposed through an nginx reverse proxy (generated by Gradio) at http://0.0.0.0:7680 -> site_url. Documentation and other Docker containers may use other ports.
To be implemented
User data is stored in ~/.nypai-chatbot/ (local) or /home/appuser/.nypai-chatbot/ (Docker).
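Both locations are the same directory name under the current user's home directory. A minimal sketch of resolving it, assuming the application derives the path from the home directory (the helper name user_data_dir is hypothetical):

```python
from pathlib import Path
from typing import Optional

APP_DIR_NAME = ".nypai-chatbot"  # Directory name taken from this README


def user_data_dir(home: Optional[Path] = None) -> Path:
    """Return the user data directory under the given (or current) home."""
    base = home if home is not None else Path.home()
    return base / APP_DIR_NAME
```

Inside the container, where the user is appuser, Path.home() would resolve to /home/appuser, which gives the Docker path listed above.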
You would need to create the following under the project root since we are currently using a volume mount:
|-- data
|---- cache
|---- memory_persistence
|---- reports
|---- vector_store
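The tree above can be created in one step. The snippet below is a small sketch; the function name create_data_dirs is hypothetical, and the directory names are taken from the listing above.

```python
from pathlib import Path

# Subdirectories of <project root>/data listed in this README
SUBDIRS = ("cache", "memory_persistence", "reports", "vector_store")


def create_data_dirs(project_root: Path) -> list[Path]:
    """Create data/<subdir> under project_root; safe to run repeatedly."""
    data_dir = project_root / "data"
    created = []
    for name in SUBDIRS:
        path = data_dir / name
        # parents=True also creates data/ itself; exist_ok avoids errors on reruns
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Run it from the project root with create_data_dirs(Path(".")) before starting the containers, so the volume mount has the expected layout.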
Build and serve docs:
python setup.py --docs
Docs available at http://127.0.0.1:8080
Technical detail: this collects the docstrings from the source code and renders them with Sphinx.
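For reference, a docstring written with reStructuredText field lists is the kind of input Sphinx renders well. The function below is a hypothetical example, not part of the project's API, and the docstring style actually used in the codebase may differ.

```python
def normalize_label(label: str) -> str:
    """Normalize a sensitivity label name for comparison.

    Hypothetical helper used only to illustrate docstring layout.

    :param label: Raw label text, possibly with surrounding whitespace.
    :returns: The label stripped of whitespace and upper-cased.
    """
    return label.strip().upper()
```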
Benchmarks for various function and API calls in the codebase can be triggered via:
python setup.py --run-benchmarks
Once complete, the results are written to <project root>/data/benchmark.md. This directory also contains a JSON file and a SQLite file recording Docker build details.
Pre-commit hooks with ruff for linting and formatting:
Note: The pre-commit flag in the setup script might not work depending on the directory you are in when you invoke the script.
If this is the case, follow these steps instead:
Create a Python virtual environment named .venv and activate it. See Python Docs for more on virtual environments.
pip install -r requirements/requirements-precommit.txt
Then you can run git commit and git push within the virtual environment, and the configured pre-commit hooks will run automatically.
You can also manually run the pre-commit hooks at any time. See here for details.
- API Key Issues: Check .env and your OpenAI API key.
- Port Conflicts: Default is 7860; Gradio will use the next available port.
- Dependencies: Pandoc, ffmpeg, hyperfine (handled by Docker).
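The first two checks can be scripted. The sketch below assumes the key is stored in .env as OPENAI_API_KEY (the name the OpenAI Python SDK reads by default; the project may use a different one) and that .env uses plain KEY=VALUE lines. The helper names are hypothetical.

```python
import socket
from pathlib import Path
from typing import Optional


def read_env_value(env_path: Path, key: str) -> Optional[str]:
    """Return the value for key in a simple KEY=VALUE .env file, if set."""
    if not env_path.is_file():
        return None
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            # Drop surrounding whitespace and quotes; empty counts as unset
            value = value.strip().strip("'\"")
            return value or None
    return None


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection
        return sock.connect_ex((host, port)) != 0
```

For example, read_env_value(Path(".env"), "OPENAI_API_KEY") returning None suggests the key is missing or empty, and port_is_free(7860) returning False suggests another process is already listening on the default port.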
MIT License. See LICENSE for details.