Terrier AI helps you extract structured data from webpages.
- Parses browser HTML to find structured content
- Uses Gemini's long context window under the hood to process HTML
- Coming Soon
- Node.js v16+
- Python 3.13+ (Important: earlier versions cause package version and OS-related issues)
- Chrome/Firefox browsers
Client Setup:
cd client
npm install
API/Server Setup:
cd server
python3 -m venv venv
- On macOS, run:
source venv/bin/activate
- On Windows, run:
cd venv/Scripts && activate && cd ../../
- Install the Python modules:
pip install -r requirements.txt
To run this project, add the following environment variables to your .env file(s), depending on which parts you use:
GEMINI_API_KEY=your_gemini_api_key
VITE_BACKEND_URL=your_backend_url_or_localhost
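A missing key is easier to diagnose at startup than in the middle of a request. The sketch below shows one way the server could validate its configuration; it is not the project's actual code. `load_settings` and `REQUIRED_VARS` are hypothetical names, and it assumes that only GEMINI_API_KEY is needed on the server side, with VITE_BACKEND_URL read by the Vite client.

```python
import os

# Assumption: the server needs only this variable; VITE_BACKEND_URL is
# assumed to be consumed by the client build, not by the Python process.
REQUIRED_VARS = ("GEMINI_API_KEY",)


def load_settings(env=os.environ):
    """Return the required settings from `env`, or raise if any are missing or blank."""
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Strip surrounding whitespace, which is easy to introduce when editing .env files.
    return {name: env[name].strip() for name in REQUIRED_VARS}
```

Calling `load_settings()` once at startup would fail fast with a readable message. Note that Python does not read a .env file on its own: the values have to reach the process environment first, for example through a dotenv loader or Docker's `--env-file` option.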
Start both services in separate terminals. We recommend running the server with Docker to avoid versioning issues:
Frontend:
cd client && npm run dev
Backend:
docker build -t <image_name> .
docker run -p 5000:5000 <image_name>
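Once the container is running, a quick check that the published port accepts connections can save some debugging on the client side. This is a minimal sketch that assumes the server is reachable on localhost:5000, as in the port mapping above; `is_port_open` is a hypothetical helper, not part of the project.

```python
import socket


def is_port_open(host="localhost", port=5000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_port_open()` returns True when something is listening on port 5000. It only confirms that the port accepts TCP connections, not that the API itself is healthy.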
Client: React, Redux, TailwindCSS
API: FastAPI, Playwright, Selenium Wire, LangChain, browser-use
AI Components: Custom agent implementations, smolagents class, pocketflow framework
Contributions are welcome!
See contributing.md for ways to get started.
Please adhere to this project's code of conduct.