12 Nov 01:50

sanjay920

ab6a936

v0.3.0 Latest

GPTParse v0.3.0

New Features

Added OCR mode for direct text extraction from PDFs and images:
- Supports PDF, PNG, JPG/JPEG files
- Fast local processing without requiring AI services
- Optional abort-on-error flag for better error handling
Enhanced CLI with four distinct processing modes:
- Vision mode (AI-powered)
- Fast mode (local processing)
- Hybrid mode (combined approach)
- OCR mode (direct text extraction)

Improvements

Added support for processing image files (PNG, JPG/JPEG) in OCR mode
Enhanced error handling and reporting
Improved documentation with comprehensive examples for all modes

Technical Details

Introduced new DoclingHandler for OCR processing
Updated the CLI to support OCR commands and options
Added abort-on-error functionality for OCR processing

Usage

# New OCR mode examples
gptparse ocr document.pdf --output_file output.md
gptparse ocr scan.png --output_file output.md
gptparse ocr document.pdf --output_file output.md --abort-on-error

For full documentation and examples, please see the README.md.
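
For batch jobs, the OCR commands shown under Usage can be assembled programmatically. The following is a minimal sketch, not part of GPTParse itself: `build_ocr_command` is a hypothetical helper that relies only on the `ocr` subcommand and the `--output_file` and `--abort-on-error` flags from the examples above, and it assumes the `gptparse` executable is on `PATH`.

```python
from pathlib import Path

# File types the v0.3.0 notes list as supported by OCR mode.
OCR_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}


def build_ocr_command(input_path, output_path, abort_on_error=False):
    """Return the argument list for a `gptparse ocr` call (hypothetical helper)."""
    suffix = Path(input_path).suffix.lower()
    if suffix not in OCR_EXTENSIONS:
        raise ValueError(f"unsupported file type for OCR mode: {suffix!r}")
    command = ["gptparse", "ocr", str(input_path), "--output_file", str(output_path)]
    if abort_on_error:
        # Flag taken from the usage examples in these release notes.
        command.append("--abort-on-error")
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)` to invoke the CLI. Checking the extension first rejects unsupported files before any process is started.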

04 Nov 22:42

sanjay920

v0.2.0

b8bff99

GPTParse v0.2.0 - Multi-Mode Processing & Extended Format Support

GPTParse now supports both PDF documents and images (PNG, JPG, JPEG), offering multiple processing modes for conversion to Markdown. Choose between local processing, AI vision models, or a combination of both for optimal results.

Major New Features:

Extended Format Support

PDF Documents: Full support for single and multi-page PDFs
Image Files: Direct processing of PNG, JPG, and JPEG files
Preserved Structure: Maintains tables, lists, and embedded images across all formats

Multiple Processing Modes

Fast Mode: Local PDF processing using pymupdf4llm - no AI required
Vision Mode: High-fidelity conversion of PDFs and images using Vision Language Models (VLMs)
Hybrid Mode: Enhanced accuracy by combining fast and vision processing for PDFs

Improved Model Support

Latest Claude 3.5 Sonnet model integration
Pinned to latest stable releases for consistent performance
Maintained support for OpenAI, Anthropic, and Google models

Quick Start:

# Install the latest version
pip install gptparse

# Process PDFs:
# Fast local processing (no AI required)
gptparse fast document.pdf --output_file output.md

# AI-powered vision processing
export OPENAI_API_KEY="your-openai-api-key"
gptparse vision document.pdf --output_file output.md

# Enhanced hybrid processing
gptparse hybrid document.pdf --output_file output.md

# Process Images:
gptparse vision screenshot.png --output_file output.md
gptparse vision photo.jpg --output_file output.md
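
The mode descriptions in this release imply which subcommand fits which input: fast and hybrid are described for PDFs, while vision covers both PDFs and images. A rough sketch of that rule follows. `modes_for` and `build_command` are hypothetical helpers, not GPTParse APIs, and the mapping reflects only what these v0.2.0 notes state.

```python
from pathlib import Path

PDF_EXTENSIONS = {".pdf"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def modes_for(path):
    """Return the v0.2.0 modes that these notes describe for a file type."""
    suffix = Path(path).suffix.lower()
    if suffix in PDF_EXTENSIONS:
        return ["fast", "vision", "hybrid"]
    if suffix in IMAGE_EXTENSIONS:
        # Only vision mode is described as handling images in v0.2.0.
        return ["vision"]
    return []


def build_command(mode, input_path, output_path):
    """Build the CLI argument list, rejecting mode/file combinations not described above."""
    if mode not in modes_for(input_path):
        raise ValueError(f"mode {mode!r} is not described for {input_path!r}")
    return ["gptparse", mode, str(input_path), "--output_file", str(output_path)]
```

For example, `build_command("vision", "screenshot.png", "output.md")` produces the same arguments as the image example above, while `build_command("fast", "screenshot.png", "output.md")` raises `ValueError`.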

Supported Models:

OpenAI: gpt-4o (default), gpt-4o-mini
Anthropic: claude-3-5-sonnet-latest (default), claude-3-opus-latest
Google: gemini-1.5-pro-002 (default), gemini-1.5-flash-002, gemini-1.5-flash-8b

Breaking Changes:

Anthropic models now use the -latest alias (for example, claude-3-5-sonnet-latest) instead of dated identifiers such as claude-3-5-sonnet-20240620
Minimum Python version requirement remains at 3.9
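
Scripts or configs that still use dated Anthropic model names may need updating after this change. The sketch below shows one way to do that, assuming the dated names from the v0.1.0 notes correspond to the aliases listed under Supported Models in this release. `migrate_model_name` is a hypothetical helper, not something GPTParse provides.

```python
# Dated identifiers from the v0.1.0 notes mapped to the "-latest"
# aliases listed in the v0.2.0 notes (assumed correspondence).
LATEST_ALIASES = {
    "claude-3-5-sonnet-20240620": "claude-3-5-sonnet-latest",
    "claude-3-opus-20240229": "claude-3-opus-latest",
}


def migrate_model_name(name):
    """Return the "-latest" alias for a known dated name, else the name unchanged."""
    return LATEST_ALIASES.get(name, name)
```

Names without a listed alias, such as claude-3-sonnet-20240229 and claude-3-haiku-20240307, pass through unchanged, since the v0.2.0 model list does not mention them.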

For full documentation, usage examples, and contribution guidelines, please refer to the README.md.

This release significantly expands GPTParse's capabilities with support for multiple file formats and processing modes, making it even more versatile for your document processing needs.

23 Oct 07:44

sanjay920

v0.1.2

3f33906

This release lets users select the latest Claude 3.5 Sonnet model. Going forward, GPTParse will be pinned to the latest Claude 3.5 Sonnet model.

19 Oct 02:41

sanjay920

v0.1.1

7a3879b

Several improvements to the post-installation user experience.

18 Oct 22:24

sanjay920

v0.1.0

4c16d31

GPTParse v0.1.0 - Initial Release

GPTParse is a powerful document parser designed for Retrieval-Augmented Generation (RAG) systems, enabling seamless conversion of PDF documents to Markdown format using advanced vision language models (VLMs).

Key Features:

Convert complex PDFs to well-structured Markdown, preserving tables, lists, and images
Support for multiple AI providers: OpenAI, Anthropic, and Google
Flexible usage as both a Python library and CLI application
Customizable processing options, including page selection and system prompts
Detailed statistics for token usage and processing times

Quick Start:

pip install gptparse
export OPENAI_API_KEY="your-openai-api-key"
gptparse vision example.pdf --output_file output.md

Supported Models:

OpenAI: gpt-4o (default), gpt-4o-mini
Anthropic: claude-3-5-sonnet-20240620 (default), claude-3-opus-20240229, claude-3-sonnet-20240229, claude-3-haiku-20240307
Google: gemini-1.5-pro-002 (default), gemini-1.5-flash-002, gemini-1.5-flash-8b

For full documentation, usage examples, and contribution guidelines, please refer to the README.md.

We're excited to introduce GPTParse and look forward to your feedback and contributions!

Releases: gptscript-ai/gptparse
