🤖 GPT PDF Processor 📚

A Python application that extracts text from PDF documents and processes it using OpenAI's GPT models to answer questions about the document content.

🔎 Overview

This application provides a simple interface to:

📄 Extract text from PDF documents
🧠 Process the text using LangChain and OpenAI's GPT models
❓ Ask questions about the document contents and receive AI-generated answers

👨‍🏫 Attribution

This code was originally created by Professor Daniel Cavalieri and adapted by Paulo Sergio dos Santos Júnior.

✅ Prerequisites

🐍 Python 3.6+
🔑 OpenAI API key

💻 Installation

Clone the repository:

git clone https://github.com/paulossjunior/OpenAIandPDF.git
cd OpenAIandPDF

Install required dependencies:

pip install -r requirements.txt

Create a .env file in the root directory with your OpenAI API key:

OPENAI_API_KEY=your_openai_api_key_here

🚀 Usage

Place your PDF file in the project directory or specify the path in the code.
Run the main program:

python program_gpt.py

The program will:
- 🔐 Load your OpenAI API key from the .env file
- 📝 Process the PDF file specified in the code (default: "edital.pdf")
- 🧩 Ask a predefined question about the document ("Qual o objetivo do Edital")
- 📊 Print the answer from GPT

⚙️ Customization

To ask different questions, modify the send_question parameter in program_gpt.py:

answer = gpt.send_question("Your question here")

To use a different PDF file, change the pdf_path variable:

pdf_path = "your_document.pdf"

📁 Project Structure

program_gpt.py: Main entry point for the application
fapes_gpt.py: Contains the GPT class with methods for processing PDFs and interacting with the OpenAI API

✨ Features

📄 PDF text extraction
🔌 Integration with OpenAI's GPT models (default: gpt-4o-mini)
🧩 Simple API for asking questions about document content
🔒 Environment variable support for secure API key storage

⚠️ Known Issues

The code has an error in the __create_chain method, where the chain assignment is missing.
The private methods __create_prompt, __chunkify_txt, and __get_vector are defined but not used in the current workflow.

Name	Name	Last commit message	Last commit date
Latest commit paulossjunior Merge pull request #1 from leds-org/main Feb 28, 2025 fcf182f · Feb 28, 2025 History 4 Commits
docs	docs	start	Feb 28, 2025
src	src	start	Feb 28, 2025
.gitignore	.gitignore	Initial commit	Feb 28, 2025
LICENSE	LICENSE	Initial commit	Feb 28, 2025
README.md	README.md	Update README.md	Feb 28, 2025
requirements.txt	requirements.txt	start	Feb 28, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🤖 GPT PDF Processor 📚

🔎 Overview

👨‍🏫 Attribution

✅ Prerequisites

💻 Installation

🚀 Usage

⚙️ Customization

📁 Project Structure

✨ Features

⚠️ Known Issues

About

Releases

Packages

Languages

License

paulossjunior/OpenAIandPDF

Folders and files

Latest commit

History

Repository files navigation

🤖 GPT PDF Processor 📚

🔎 Overview

👨‍🏫 Attribution

✅ Prerequisites

💻 Installation

🚀 Usage

⚙️ Customization

📁 Project Structure

✨ Features

⚠️ Known Issues

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages