landing-ai/ade-sample-projects

ADE Sample Projects

Sample workflows, schemas, documents, and integrations for LandingAI's Agentic Document Extraction (ADE) — a service that parses and extracts structured data from any document, with no templates required.

Try the visual playground: va.landing.ai/


What's in this repo

Each subdirectory is a standalone sample project. They are organized into four categories:

| Category | What you'll find |
| --- | --- |
| Workflows | Reusable integration patterns: quickstart, RAG, serverless, Snowflake |
| Use Cases | Domain-specific extraction demos: invoices, food labels, utility bills, certificates |
| Events | Demos and tutorials built for specific conferences and courses |
| Other | Standalone utilities, such as the SEC EDGAR pipeline |

Getting started

All ADE samples need a LandingAI API key. Get one at va.landing.ai/settings/api-key.

Set it as an environment variable or in a .env file in the project directory:

export VISION_AGENT_API_KEY=your_api_key_here
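
If a script needs to read the key itself, the two options above (environment variable or .env file) could be handled roughly as follows. This is a minimal standard-library sketch, not code from any of the sample projects; the helper name `load_api_key` is hypothetical, and projects may well use a package such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_api_key(env_file: str = ".env", name: str = "VISION_AGENT_API_KEY") -> str:
    """Return the API key from the environment, falling back to a .env file.

    Hypothetical helper: the sample projects may load the key differently.
    """
    value = os.environ.get(name)
    if value:
        return value

    path = Path(env_file)
    if path.is_file():
        for raw in path.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
            if not line or line.startswith("#") or "=" not in line:
                continue
            # Tolerate shell-style "export KEY=VALUE" lines.
            if line.startswith("export "):
                line = line[len("export "):]
            key, _, val = line.partition("=")
            if key.strip() == name:
                return val.strip().strip("'\"")

    raise RuntimeError(f"{name} is not set; export it or add it to {env_file}")
```

The environment variable takes precedence, so a key exported in the shell overrides whatever the .env file contains.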

Most projects use the landingai-ade Python library:

pip install landingai-ade

Each project has its own README with setup and usage instructions.


Workflows

The minimal starting point. Follows the ADE Quickstart Guide: parse a document, then extract structured fields using a Pydantic schema.

Handles mixed document types in a single batch. Automatically classifies each document (e.g., pay stub, bank statement, investment statement), then applies the correct extraction schema. Includes bounding-box visualizations for every extracted field.
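
The classify-then-route pattern described above can be sketched as a lookup from document type to schema. Everything here is illustrative: the keyword classifier is a toy stand-in (the description does not say how the sample classifies documents), and the field lists in `SCHEMAS` are hypothetical placeholders for real extraction schemas.

```python
from typing import Optional

# Hypothetical field lists standing in for the real per-type extraction schemas.
SCHEMAS = {
    "pay_stub": ["employee_name", "pay_period", "net_pay"],
    "bank_statement": ["account_number", "statement_period", "closing_balance"],
    "investment_statement": ["account_number", "total_value"],
}

# Toy keyword lists; a swapped-in technique, not the sample's actual classifier.
KEYWORDS = {
    "pay_stub": ("net pay", "pay period"),
    "bank_statement": ("closing balance", "deposits"),
    "investment_statement": ("portfolio", "holdings"),
}


def classify(text: str) -> str:
    """Return the document type whose keywords appear most often, or 'unknown'."""
    lowered = text.lower()
    scores = {
        doc_type: sum(keyword in lowered for keyword in keywords)
        for doc_type, keywords in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def route(text: str) -> tuple[str, Optional[list[str]]]:
    """Classify the parsed text, then pick the schema to extract with (None if unknown)."""
    doc_type = classify(text)
    return doc_type, SCHEMAS.get(doc_type)
```

Returning `None` for unknown types lets the caller decide whether to skip the document or fall back to a generic schema.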

Async batch parser that processes many documents concurrently and writes chunks with bounding-box coordinates to CSV — ready to load into a vector database like ChromaDB.
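
A rough sketch of the concurrency-plus-CSV shape described above, using only the standard library. `fake_parse` is a hypothetical stand-in for the real ADE parse call, and the column names (including the bounding-box fields) are assumptions rather than the sample's actual output format.

```python
import asyncio
import csv
import io

# Assumed CSV columns; the sample project's real column names may differ.
FIELDS = ["document", "chunk_id", "text", "page", "left", "top", "right", "bottom"]


async def fake_parse(name: str) -> list[dict]:
    """Hypothetical stand-in for an ADE parse call; returns one chunk per document."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [{
        "document": name, "chunk_id": f"{name}-0", "text": f"text of {name}",
        "page": 0, "left": 0.1, "top": 0.1, "right": 0.9, "bottom": 0.2,
    }]


async def parse_all(names: list[str], max_concurrency: int = 4) -> list[dict]:
    """Parse documents concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(name: str) -> list[dict]:
        async with semaphore:
            return await fake_parse(name)

    # gather preserves input order, so rows come back grouped by document.
    results = await asyncio.gather(*(bounded(name) for name in names))
    return [chunk for chunks in results for chunk in chunks]


def write_csv(rows: list[dict], stream) -> None:
    """Write one CSV row per chunk, header first."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

The semaphore caps the number of simultaneous requests, which is one common way to stay within an API's rate limits while still overlapping network waits.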

Asynchronous job submission for large PDFs (up to 1 GB / 1,000 pages) that exceed the standard synchronous API limits. Includes polling, progress tracking, and automatic handling of inline vs. URL-based results.
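
The polling and inline-vs-URL handling mentioned above might look like the sketch below. It is written against injected callables so it makes no claims about the real client: `get_status` and `fetch_url` are hypothetical, and the status field names (`status`, `progress`, `data`, `output_url`) are assumptions to check against the API reference.

```python
import time
from typing import Any, Callable, Optional


def wait_for_job(
    get_status: Callable[[str], dict],
    job_id: str,
    interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    on_progress: Optional[Callable[[Any], None]] = None,
) -> dict:
    """Poll get_status(job_id) until the job completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        state = status.get("status")  # assumed field name
        if state == "completed":
            return status
        if state in ("failed", "cancelled"):
            raise RuntimeError(f"job {job_id} ended with status {state!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} s")
        if on_progress is not None:
            on_progress(status.get("progress"))  # assumed field name
        sleep(interval)


def resolve_result(status: dict, fetch_url: Callable[[str], Any]) -> Any:
    """Return inline results if present; otherwise download them from the result URL."""
    if status.get("data") is not None:
        return status["data"]
    return fetch_url(status["output_url"])
```

Passing `sleep` and `fetch_url` in as parameters keeps the loop easy to test without waiting or touching the network.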

Deploys ADE as a Docker-based AWS Lambda function triggered by S3 uploads. Supports both parse-only and structured extraction modes. Includes build.sh and deploy.sh scripts.
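
For orientation, an S3-triggered Lambda handler generally starts by pulling the bucket and object key out of the event. The sketch below shows only that step; the `ADE_MODE` environment variable is a hypothetical way to switch between parse-only and extraction modes, and the actual ADE call is left as a comment rather than guessed at.

```python
import os
from urllib.parse import unquote_plus


def s3_objects(event: dict):
    """Yield (bucket, key) pairs from an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in S3 events (spaces become "+").
            yield bucket, unquote_plus(key)


def handler(event: dict, context) -> dict:
    """Lambda entry point: list the uploaded objects that should be processed."""
    # Hypothetical setting; the sample project may select the mode differently.
    mode = os.environ.get("ADE_MODE", "parse")
    jobs = [{"bucket": b, "key": k, "mode": mode} for b, k in s3_objects(event)]
    # A real handler would download each object and send it to ADE here.
    return {"statusCode": 200, "jobs": jobs}
```

Decoding the key with `unquote_plus` matters in practice: without it, files whose names contain spaces or parentheses cannot be found in the bucket.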

Full Snowflake-native pipeline: upload documents to a Snowflake stage, apply ADE to parse and extract, enable RAG with Cortex Search, query structured fields with Cortex Analyst, and surface everything through a Cortex Agent. Uses FDA medical device documents as the example.


Use Cases

Extracts 27 structured fields from food label images — product name, brand, weight, serving size, certifications (organic, non-GMO, kosher), and dietary claims. Demonstrates parse-once, extract-multiple-times for different schemas.

Extracts header fields and line items from invoices using a nested Pydantic schema with six sub-models.

Extracts account information, billing period, charges, and usage data from utility bills using a JSON schema.
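
As an illustration of what a JSON schema covering those four areas could look like, here is a small sketch built as a Python dictionary. The property names and descriptions are hypothetical and are not taken from the sample project's schema file.

```python
import json

# Hypothetical schema: field names are illustrative, not the sample's actual schema.
UTILITY_BILL_SCHEMA = {
    "type": "object",
    "properties": {
        "account": {
            "type": "object",
            "properties": {
                "account_number": {"type": "string", "description": "Utility account number"},
                "customer_name": {"type": "string", "description": "Name on the bill"},
            },
        },
        "billing_period": {
            "type": "object",
            "properties": {
                "start_date": {"type": "string", "description": "First day of the billing period"},
                "end_date": {"type": "string", "description": "Last day of the billing period"},
            },
        },
        "charges": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number", "description": "Charge amount in the bill's currency"},
                },
            },
        },
        "usage": {
            "type": "object",
            "properties": {
                "quantity": {"type": "number", "description": "Amount consumed in the period"},
                "unit": {"type": "string", "description": "Unit of measure, e.g. kWh"},
            },
        },
    },
    "required": ["account", "billing_period", "charges"],
}

# Serialize to the JSON text that would be saved to a file or sent with a request.
schema_json = json.dumps(UTILITY_BILL_SCHEMA, indent=2)
```

Field descriptions are worth writing carefully, since they are the main hint an extraction service has about what each field means.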

Extracts CME certificate fields — provider, activity title, credit hours, completion date — from continuing education certificates.


Events

Conference demo: classify and extract from mixed financial documents (pay stubs, bank statements, investment statements) with bounding-box visualizations. Includes a caching layer to speed up re-runs.
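
One plausible shape for such a caching layer is to key results on a hash of the file contents, so unchanged documents are never re-processed. This is a sketch under that assumption; the demo's actual cache may be organized differently, and `compute` is a hypothetical callable standing in for the parse or extract call.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Callable


def cached_call(document: Path, cache_dir: Path, compute: Callable[[Path], Any]) -> Any:
    """Return the cached result for this document's contents, computing it on a miss."""
    # Hash the bytes rather than the filename so renamed copies still hit the cache.
    digest = hashlib.sha256(document.read_bytes()).hexdigest()
    cache_file = cache_dir / f"{digest}.json"

    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))

    result = compute(document)  # must return something JSON-serializable
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result
```

If results also depend on the extraction schema, the schema text would need to be included in the hash as well; this sketch omits that for brevity.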

Course lesson: deploy ADE as a Lambda function, store results in S3, build a Bedrock knowledge base from parsed markdown, and create a medical document chatbot with conversation memory.

End-to-end tutorial: extract structured data from PDFs, evaluate accuracy against a golden set, and iteratively refine the extraction schema until all fields reach ≥ 95% accuracy. Demonstrates the full Parse → Build Schema → Extract → Evaluate loop using the REST APIs directly (no Python SDK required for the Build Schema step).
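
The Evaluate step of that loop can be sketched as a per-field comparison against the golden set. The exact-match-after-normalization rule below is an assumption (the tutorial may score fields differently), and the function names are hypothetical.

```python
def normalize(value) -> str:
    """Compare values case-insensitively, ignoring surrounding whitespace."""
    return "" if value is None else str(value).strip().casefold()


def field_accuracy(predictions: dict, golden: dict) -> dict:
    """Return, per field, the fraction of golden documents extracted correctly.

    Both arguments map document id -> {field name: value}.
    """
    correct: dict = {}
    total: dict = {}
    for doc_id, expected in golden.items():
        predicted = predictions.get(doc_id, {})
        for field, value in expected.items():
            total[field] = total.get(field, 0) + 1
            if normalize(predicted.get(field)) == normalize(value):
                correct[field] = correct.get(field, 0) + 1
    return {field: correct.get(field, 0) / total[field] for field in total}


def fields_below(accuracy: dict, threshold: float = 0.95) -> list:
    """List the fields whose schema descriptions still need refinement."""
    return sorted(field for field, score in accuracy.items() if score < threshold)
```

A usage example with two documents, one of which has a wrong total:

```python
golden = {"d1": {"vendor": "Acme", "total": "10.00"}, "d2": {"vendor": "Beta", "total": "7.50"}}
predictions = {"d1": {"vendor": " ACME ", "total": "10.00"}, "d2": {"vendor": "Beta", "total": "7.05"}}
accuracy = field_accuracy(predictions, golden)
print(fields_below(accuracy))
```

The loop would then revise the schema for the listed fields, re-run extraction, and evaluate again until the list is empty.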

Webinar demo: SQL setup script and schema for parsing FINRA award documents inside the Snowflake ADE native application.


Other

Standalone utility for fetching SEC EDGAR 10-K and 8-K filings by ticker symbol and converting them to PDF. Does not use ADE — useful as a document source for ADE extraction pipelines.

