This project uses the Document AI API to annotate PDF documents.
- Install Python
- Install the prerequisites:
pip install -r requirements.txt
- Install the Google Cloud SDK
- Initialize the SDK and create a new project:
gcloud init
- Enable the Document AI API:
gcloud services enable documentai.googleapis.com
- Set up application default authentication:
gcloud auth application-default login
- Clone this repo and run the sample:
python main.py -i invoice.pdf
- Check the current directory for the annotated document, named invoice_annotated.pdf.
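
The last step above can be checked from Python. This is a minimal sketch, not part of the sample itself: `annotated_path` is a hypothetical helper, and it assumes the output name is formed by inserting `_annotated` before the file extension, which the invoice.pdf example suggests but this README does not state as a general rule.

```python
from pathlib import Path


def annotated_path(input_pdf: str, out_dir: str = ".") -> Path:
    """Return where the annotated copy of input_pdf is expected to appear."""
    src = Path(input_pdf)
    # Assumption: "_annotated" goes before the extension, as in
    # invoice.pdf -> invoice_annotated.pdf. Only the file name is kept,
    # because the output is written to the current directory.
    return Path(out_dir) / f"{src.stem}_annotated{src.suffix}"


if __name__ == "__main__":
    target = annotated_path("invoice.pdf")
    status = "found" if target.is_file() else "not found"
    print(f"{target}: {status}")
```

Run it from the same directory in which you ran the sample, since that is where the annotated file is written.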
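
For a more detailed setup that uses pyenv to manage an isolated Python environment, follow these steps instead:

- Install pyenv: https://github.com/pyenv/pyenv#installation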
- Use pyenv to install the latest version of Python 3. For example, to install Python 3.10.1, run:
pyenv install 3.10.1
- Create a Python virtual environment with the installed version of Python 3 (the pyenv virtualenv command is provided by the pyenv-virtualenv plugin). For example, to create a Python 3.10.1 virtual environment called docai-annotator, run:
pyenv virtualenv 3.10.1 docai-annotator
- Clone this repo and cd to the root of the repo
- Configure pyenv to use the virtual environment created earlier whenever you are in this repo:
pyenv local docai-annotator
- Install the prerequisites:
pip install -r requirements.txt
- Install the Cloud SDK: https://cloud.google.com/sdk/docs/install
- Initialize the SDK, create a new project, and link a billing account to it:
gcloud init
- Enable the Document AI API:
gcloud services enable documentai.googleapis.com
- Set up application default authentication:
gcloud auth application-default login
- Run the sample:
python main.py -i invoice.pdf
- Check the current directory for the annotated version of the PDF, named invoice_annotated.pdf.
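
If the sample fails, it can help to confirm that the setup steps above took effect. The following is a rough preflight sketch using only the standard library; `adc_file` and `preflight` are hypothetical helpers, not part of this repo. It assumes the default gcloud configuration directory, so a custom location (for example, one set through `CLOUDSDK_CONFIG`) is not covered.

```python
import os
import shutil
import sys
from pathlib import Path

ADC_NAME = "application_default_credentials.json"


def adc_file(env=None, home=None) -> Path:
    """Return the path where Application Default Credentials are expected."""
    env = os.environ if env is None else env
    # An explicit GOOGLE_APPLICATION_CREDENTIALS path takes precedence.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return Path(explicit)
    # Otherwise look for the file written by
    # `gcloud auth application-default login` (default locations only).
    if os.name == "nt":
        return Path(env.get("APPDATA", "")) / "gcloud" / ADC_NAME
    base = Path.home() if home is None else Path(home)
    return base / ".config" / "gcloud" / ADC_NAME


def preflight() -> dict:
    """Map each setup check to True (looks fine) or False (needs attention)."""
    return {
        "Python 3 interpreter": sys.version_info.major == 3,
        "gcloud on PATH": shutil.which("gcloud") is not None,
        "application default credentials": adc_file().is_file(),
        "requirements.txt in current directory": Path("requirements.txt").is_file(),
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"[{'ok' if ok else 'missing'}] {name}")
```

Run it from the root of the repo. A "missing" entry points to the step to repeat; it does not verify that the Document AI API is enabled or that billing is linked, which still need to be checked with gcloud or in the Cloud console.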