Sample workflows, schemas, documents, and integrations for LandingAI's Agentic Document Extraction (ADE) — a service that parses and extracts structured data from any document, with no templates required.
Try the visual playground: va.landing.ai/
Each subdirectory is a standalone sample project. They are organized into four categories:
| Category | What you'll find |
|---|---|
| Workflows | Reusable integration patterns — quickstart, RAG, serverless, Snowflake |
| Use Cases | Domain-specific extraction demos — invoices, food labels, utility bills, certificates |
| Events | Demos and tutorials built for specific conferences and courses |
| Other | Standalone utilities, such as the SEC EDGAR pipeline |
All ADE samples need a LandingAI API key. Get one at va.landing.ai/settings/api-key.
Set it as an environment variable or in a .env file in the project directory:
```bash
export VISION_AGENT_API_KEY=your_api_key_here
```

Most projects use the `landingai-ade` Python library:

```bash
pip install landingai-ade
```

Each project has its own README with setup and usage instructions.
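A minimal sketch of reading the key in Python, checking the environment first and falling back to a `.env` file (the `load_api_key` helper is illustrative, not part of the SDK):

```python
import os
from pathlib import Path

def load_api_key(env_file: str = ".env") -> str:
    """Return the ADE API key from the environment, falling back to a .env file."""
    key = os.environ.get("VISION_AGENT_API_KEY")
    if key:
        return key
    path = Path(env_file)
    if path.exists():
        for line in path.read_text().splitlines():
            line = line.strip()
            if line.startswith("VISION_AGENT_API_KEY="):
                return line.split("=", 1)[1].strip().strip('"')
    raise RuntimeError("VISION_AGENT_API_KEY is not set")
```

Libraries such as `python-dotenv` do the same `.env` loading more robustly.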
The minimal starting point. Follows the ADE Quickstart Guide: parse a document, then extract structured fields using a Pydantic schema.
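The schema half of that flow looks roughly like this — a flat Pydantic model whose `Field` descriptions guide the extraction (field names here are illustrative, not the quickstart's actual schema):

```python
from pydantic import BaseModel, Field

class DocumentFields(BaseModel):
    """Illustrative extraction schema; ADE fills one instance per document."""
    title: str = Field(description="Document title as printed on the first page")
    issue_date: str = Field(description="Date the document was issued, ISO 8601")
    total_amount: float = Field(description="Grand total, including tax")
```

The parsed document plus this schema is then handed to the extract step, which returns a populated instance.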
Handles mixed document types in a single batch. Automatically classifies each document (e.g., pay stub, bank statement, investment statement), then applies the correct extraction schema. Includes bounding-box visualizations for every extracted field.
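The classify-then-extract routing can be sketched as a plain lookup from the predicted document type to its schema (the type labels and schemas below are illustrative):

```python
from pydantic import BaseModel

class PayStubFields(BaseModel):
    employee_name: str
    net_pay: float

class BankStatementFields(BaseModel):
    account_number: str
    ending_balance: float

# Route each classified document type to its extraction schema.
SCHEMA_BY_TYPE: dict[str, type[BaseModel]] = {
    "pay_stub": PayStubFields,
    "bank_statement": BankStatementFields,
}

def schema_for(doc_type: str) -> type[BaseModel]:
    try:
        return SCHEMA_BY_TYPE[doc_type]
    except KeyError:
        raise ValueError(f"No extraction schema registered for {doc_type!r}")
```

Adding a new document type is then a one-line registry change.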
Async batch parser that processes many documents concurrently and writes chunks with bounding-box coordinates to CSV — ready to load into a vector database like ChromaDB.
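The concurrency pattern is roughly the following: fan out parse calls under a semaphore, then flatten the chunks into one CSV. Here `parse_document` is a stand-in for the real async ADE call, and the column names are illustrative:

```python
import asyncio
import csv

async def parse_document(path: str) -> list[dict]:
    """Stand-in for an async ADE parse call; returns chunks with bounding boxes."""
    await asyncio.sleep(0)  # simulate network I/O
    return [{"source": path, "text": f"chunk from {path}", "bbox": "0,0,100,20"}]

async def parse_batch(paths: list[str], out_csv: str, limit: int = 8) -> int:
    sem = asyncio.Semaphore(limit)  # cap concurrent API calls

    async def bounded(p: str) -> list[dict]:
        async with sem:
            return await parse_document(p)

    results = await asyncio.gather(*(bounded(p) for p in paths))
    rows = [row for chunks in results for row in chunks]
    with open(out_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["source", "text", "bbox"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Each CSV row then carries enough (text plus bounding box) to be embedded and later traced back to its location on the page.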
Asynchronous job submission for large PDFs (up to 1 GB / 1,000 pages) that exceed the standard synchronous API limits. Includes polling, progress tracking, and automatic handling of inline vs. URL-based results.
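The submit-then-poll shape generalizes to something like this, with the status fetcher injected so the loop stays independent of any client library (the `"completed"`/`"failed"` status strings are illustrative):

```python
import time
from typing import Callable

def poll_job(get_status: Callable[[], dict],
             interval: float = 2.0,
             timeout: float = 600.0) -> dict:
    """Poll a long-running parse job until it finishes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = get_status()  # caller-supplied: fetches the current job record
        if job["status"] == "completed":
            return job
        if job["status"] == "failed":
            raise RuntimeError(f"Job failed: {job.get('error')}")
        time.sleep(interval)
    raise TimeoutError("Job did not finish within the timeout")
```

The returned job record is where the inline-vs-URL result handling mentioned above would branch.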
Deploys ADE as a Docker-based AWS Lambda function triggered by S3 uploads. Supports both parse-only and structured extraction modes. Includes build.sh and deploy.sh scripts.
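The S3-triggered entry point reduces to a handler like this — the event parsing matches AWS's S3 notification shape, while the actual ADE call (omitted) would go where the comment sits:

```python
import urllib.parse

def lambda_handler(event: dict, context=None) -> dict:
    """Skeleton S3-triggered Lambda handler; processing logic is omitted."""
    outputs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in event payloads (spaces become '+').
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        # A real handler would download s3://{bucket}/{key} and call ADE here.
        outputs.append({"bucket": bucket, "key": key})
    return {"processed": outputs}
```

Packaging it as a Docker image (rather than a zip) is what lets the function carry heavier dependencies.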
Full Snowflake-native pipeline: upload documents to a Snowflake stage, apply ADE to parse and extract, enable RAG with Cortex Search, query structured fields with Cortex Analyst, and surface everything through a Cortex Agent. Uses FDA medical device documents as the example.
Extracts 27 structured fields from food label images — product name, brand, weight, serving size, certifications (organic, non-GMO, kosher), and dietary claims. Demonstrates parse-once, extract-multiple-times for different schemas.
Extracts header fields and line items from invoices using a nested Pydantic schema with six sub-models.
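Nesting works by composing sub-models inside the top-level schema; a two-model sketch (the sample's actual six sub-models cover more detail, and these field names are illustrative):

```python
from pydantic import BaseModel, Field

class LineItem(BaseModel):
    description: str
    quantity: float
    unit_price: float
    amount: float

class Invoice(BaseModel):
    """Top-level schema; line_items repeats once per row of the invoice table."""
    invoice_number: str
    vendor: str
    line_items: list[LineItem] = Field(default_factory=list)
```

ADE returns the repeated table rows as a list of `LineItem` instances, so downstream code can sum or validate them directly.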
Extracts account information, billing period, charges, and usage data from utility bills using a JSON schema.
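A raw JSON schema plays the same role as a Pydantic model, just written by hand; a sketch with hypothetical property names:

```python
# Illustrative JSON schema for utility-bill extraction; property names are hypothetical.
UTILITY_BILL_SCHEMA = {
    "type": "object",
    "properties": {
        "account_number": {"type": "string", "description": "Customer account number"},
        "billing_period": {"type": "string", "description": "Start and end dates of the bill"},
        "total_charges": {"type": "number", "description": "Amount due for the period"},
        "usage_kwh": {"type": "number", "description": "Electricity usage in kWh"},
    },
    "required": ["account_number", "total_charges"],
}
```

The dict form is convenient when the schema comes from config or another service rather than Python code.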
Extracts CME certificate fields — provider, activity title, credit hours, completion date — from continuing education certificates.
Conference demo: classify and extract from mixed financial documents (pay stubs, bank statements, investment statements) with bounding-box visualizations. Includes a caching layer to speed up re-runs.
Course lesson: deploy ADE as a Lambda function, store results in S3, build a Bedrock knowledge base from parsed markdown, and create a medical document chatbot with conversation memory.
End-to-end tutorial: extract structured data from PDFs, evaluate accuracy against a golden set, and iteratively refine the extraction schema until all fields reach ≥ 95% accuracy. Demonstrates the full Parse → Build Schema → Extract → Evaluate loop using the REST APIs directly (no Python SDK required for the Build Schema step).
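The Evaluate step of that loop can be as simple as per-field exact-match accuracy against the golden set, flagging fields that miss the 95% target for another schema-refinement pass (a deliberately simple illustrative metric, not the tutorial's exact scoring):

```python
def field_accuracy(predictions: list[dict], golden: list[dict]) -> dict[str, float]:
    """Per-field exact-match accuracy of predictions against a golden set."""
    fields = golden[0].keys()
    totals = {f: 0 for f in fields}
    for pred, gold in zip(predictions, golden):
        for f in fields:
            totals[f] += int(pred.get(f) == gold[f])
    return {f: totals[f] / len(golden) for f in fields}

def fields_below_target(acc: dict[str, float], target: float = 0.95) -> list[str]:
    """Fields needing another refine-and-re-extract iteration."""
    return sorted(f for f, a in acc.items() if a < target)
```

Tightening the `Field` descriptions for the flagged fields and re-running extraction is the refinement step.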
Webinar demo: SQL setup script and schema for parsing FINRA award documents inside the Snowflake ADE native application.
Standalone utility for fetching SEC EDGAR 10-K and 8-K filings by ticker symbol and converting them to PDF. Does not use ADE — useful as a document source for ADE extraction pipelines.