Content Navigator

The Content Navigator lets users search for specific information across multiple uploaded documents in a variety of formats. It pairs a simple interface with search over the combined content, so relevant data can be located quickly in large document sets.

Features

Multi-Format Support: Upload and search across documents in various formats such as 'pdf', 'doc', 'docx', 'csv', 'json', 'txt', 'htm', 'html', 'ppt', and 'pptx'.
Flexible Search Options: Specify the exact information you want to search for in each document.
Traceability: View search results along with a trace of the relevant documents.
Scalability: Handle large content by integrating with an external vector database through the Pinecone API.
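
As an illustration of the multi-format support listed above, the following is a minimal sketch of how uploaded file names could be checked against the listed formats. The names `SUPPORTED_EXTENSIONS`, `is_supported`, and `split_uploads` are hypothetical and are not taken from the project's code; the app may validate uploads differently.

```python
from pathlib import Path

# Formats named in the feature list; the set itself is an illustrative assumption.
SUPPORTED_EXTENSIONS = {
    "pdf", "doc", "docx", "csv", "json",
    "txt", "htm", "html", "ppt", "pptx",
}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension is one of the listed formats."""
    # Path.suffix keeps the leading dot (e.g. ".PDF"), so normalise before comparing.
    return Path(filename).suffix.lower().lstrip(".") in SUPPORTED_EXTENSIONS


def split_uploads(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Partition file names into (accepted, rejected), preserving order."""
    accepted = [name for name in filenames if is_supported(name)]
    rejected = [name for name in filenames if not is_supported(name)]
    return accepted, rejected
```

For example, `split_uploads(["a.txt", "b.exe", "c.docx"])` keeps `a.txt` and `c.docx` and rejects `b.exe`.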

Note that by default the app tries to embed all document content in memory. If the combined content of all uploaded documents is too extensive for that, the app falls back to the Pinecone vector database.
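
The fallback described above can be pictured with a short sketch. The README does not state the size limit or how it is measured, so the character-count threshold `MAX_IN_MEMORY_CHARS` and the function `choose_vector_store` below are assumptions made purely for illustration.

```python
# Hypothetical cut-off; the actual limit used by the app is not documented here.
MAX_IN_MEMORY_CHARS = 200_000


def choose_vector_store(documents: list[str], limit: int = MAX_IN_MEMORY_CHARS) -> str:
    """Return "in_memory" or "pinecone" based on the combined content size."""
    # Sum the length of every uploaded document's text.
    total_chars = sum(len(text) for text in documents)
    return "in_memory" if total_chars <= limit else "pinecone"
```

Keeping the decision in one small function makes the threshold easy to adjust without touching the rest of the pipeline.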

Demo

The UI is shown in two screenshots so that the details remain legible:

Multiple documents uploaded
Search query input and app output

Installation

Clone the repository:

git clone https://github.com/your-username/document-searcher.git

Navigate to the project directory:
```
cd document-searcher
```
Install Poetry using pip (if not already installed):
```
pip install poetry
```
Activate the virtual environment using Poetry:
```
poetry shell
```
Install the project dependencies using Poetry:
```
poetry install
```

Configuration

Create a .env file in the root directory of the project:

Add your own OpenAI API key as:
```
OPENAI_API_KEY = 'your-key-here'
```
To handle large document content, the app uses the Pinecone vector database. Add your Pinecone API key as well:
```
PINECONE_API_KEY = 'your_pinecone_api_key'
```
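
Before starting the app, it can help to confirm that both keys are present. The sketch below parses text in the `KEY = 'value'` form shown above using only the standard library and reports any missing keys. The helpers `parse_env` and `missing_keys` are hypothetical, and treating both keys as required is an assumption; the app itself may load the file differently (for example through a dotenv library).

```python
# Keys named in this section; requiring both is an assumption of this sketch.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY = 'value' lines, ignoring blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Remove one layer of matching single or double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """List the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

A typical use would be `missing_keys(parse_env(open(".env").read()))`, which returns an empty list when both keys are set.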

Usage

After installing the dependencies, you can run the Streamlit app in the root directory by executing the following command:
```
streamlit run app.py
```
Follow the prompts and upload your documents in the supported formats.
Specify the information you want to search for.
The app searches for the requested information and displays the results along with a trace of the relevant documents.
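
To make the idea of results with a document trace concrete, here is a toy sketch that ranks documents against a query and returns the source file name with each score. It uses plain word counts and cosine similarity as a stand-in for real embeddings; the app itself relies on OpenAI and, for large content, Pinecone, and its actual retrieval code is not shown in this README. All function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, documents: dict[str, str], top_k: int = 2) -> list[tuple[str, float]]:
    """Return up to top_k (source name, score) pairs with a non-zero score, best first."""
    query_vec = embed(query)
    scored = [(name, cosine(query_vec, embed(text))) for name, text in documents.items()]
    hits = [(name, score) for name, score in scored if score > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)[:top_k]
```

Returning the source name alongside each score is what lets a caller show which document a result came from.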

License

This project is licensed under the MIT License - see the LICENSE file for details.

Repository: ShahMitul-GenAI/ContentNavigator
