Tathya is a comprehensive fact-checking system designed to verify claims by autonomously gathering and analyzing evidence from multiple sources. The name "Tathya" (तथ्य) comes from Sanskrit, meaning "truth" or "reality" - perfectly embodying the system's purpose of discovering factual accuracy through a rigorous, agent-driven process. It uses a sophisticated agent powered by LLMs and LangChain to dynamically select tools, conduct research, and synthesize findings, ultimately delivering a verdict with a confidence score and detailed explanation.
- 🤖 Agentic Workflow: Employs an AI agent to manage the entire fact-checking process, from claim analysis to final synthesis.
- 🛠️ Dynamic Tool Selection: The agent intelligently chooses the best tools (Search Engines, Wikidata, News APIs, Web Scrapers) based on the claim and intermediate findings.
- 🔍 Multi-source Evidence Collection: Gathers information from diverse sources like Tavily, Google Search (via Gemini), DuckDuckGo, Wikidata, and NewsAPI.
- 🧩 Claim Decomposition: Automatically breaks down complex claims into simpler, verifiable sub-questions using LLMs.
- 📊 Confidence Scoring: Provides a numerical confidence score (0.0-1.0) alongside the final verdict (TRUE, FALSE, PARTIALLY TRUE/MIXTURE, UNCERTAIN).
- 📝 Detailed Explanation: Offers a comprehensive summary explaining the agent's reasoning, citing the evidence gathered.
- 🔗 Source Attribution: Transparently lists all sources consulted and the tools used to access them.
- 🖥️ Modern Dark Mode Interface: Clean, user-friendly Streamlit interface with dark mode support.
- 🪜 Multi-step Verification Process: Shows the user the agent's step-by-step reasoning and evidence gathering process.
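As a concrete illustration of the claim-decomposition feature, here is a minimal sketch. The prompt wording and the generic `llm` callable are assumptions for illustration, not the project's actual internals:

```python
# Minimal sketch of LLM-driven claim decomposition. The prompt text and the
# `llm` callable are illustrative assumptions; the real system wires this
# through its own tool and model configuration.
DECOMPOSE_PROMPT = (
    "Break the following claim into the smallest set of independently "
    "verifiable sub-questions, one per line.\n\nClaim: {claim}"
)

def decompose_claim(claim: str, llm) -> list[str]:
    """`llm` is any callable mapping a prompt string to a completion string."""
    reply = llm(DECOMPOSE_PROMPT.format(claim=claim))
    return [line.lstrip("-* ").strip() for line in reply.splitlines() if line.strip()]

# For example, "Did the James Webb Space Telescope launch before 2022?"
# might decompose into:
#   1. When did the James Webb Space Telescope launch?
#   2. Is that date before January 1, 2022?
```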
Tathya leverages an agentic architecture, orchestrated using principles often found in frameworks like LangGraph. Instead of a fixed pipeline, a central Fact-Checking Agent dynamically plans and executes tasks using a suite of available tools:
- Core Agent: An LLM-based agent responsible for:
  - Understanding the claim.
  - Planning the verification strategy.
  - Selecting and invoking appropriate tools.
  - Analyzing tool outputs (evidence).
  - Synthesizing findings into a final verdict and explanation.
- Tool Suite: Functions the agent can call upon:
  - `claim_decomposition_tool`: Breaks down complex claims.
  - `tavily_search`, `gemini_google_search_tool`, `duckduckgo_search`: General web search tools.
  - `news_search`: Queries NewsAPI for recent articles.
  - `wikidata_entity_search`: Retrieves structured data from Wikidata.
  - `scrape_webpages_tool`: Extracts content from specific URLs identified during search.
  - (Other potential tools)
- State Manager: Maintains the context of the investigation, including the original claim, gathered evidence, agent's thoughts, and past actions.
- REST API: Exposes the agent's fact-checking capabilities.
- Streamlit UI: Provides the user interface for interaction and result presentation.
The diagram below represents a high-level overview of the components the agent interacts with, rather than a strict linear pipeline.
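For orientation, a "tool" in this architecture is essentially a documented function the agent can choose to invoke. Below is a minimal sketch using LangChain's `@tool` decorator for two of the tools named above; the stub bodies and the registration details are illustrative assumptions, not the project's actual implementations:

```python
from langchain_core.tools import tool

@tool
def wikidata_entity_search(query: str) -> str:
    """Retrieve structured facts about an entity from Wikidata."""
    # Stub for illustration; a real implementation would call the Wikidata API.
    return f"(stub) Wikidata results for: {query}"

@tool
def news_search(query: str) -> str:
    """Query NewsAPI for recent articles matching the query."""
    # Stub for illustration; a real implementation would call NewsAPI.
    return f"(stub) News results for: {query}"

# The agent receives the tool list and chooses among them at each step,
# guided by each tool's docstring.
TOOLS = [wikidata_entity_search, news_search]
```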
The fact-checking process is driven by the agent's autonomous reasoning:
- User Input: A user submits a factual claim via the Streamlit UI.
- API Request: The frontend sends the claim to the backend API, initiating the agent.
- Phase 1: Initial Analysis & First Search:
  - The agent analyzes the claim. If complex, it uses the `claim_decomposition_tool` to break it down.
  - It plans and executes an initial broad search using a tool like `tavily_search` or `gemini_google_search_tool`.
  - The agent evaluates the initial results for relevance and credibility.
- Phase 2: Deep Investigation:
  - Based on the initial findings, the agent plans its next step.
  - It iteratively selects and uses tools (`duckduckgo_search`, `news_search`, `wikidata_entity_search`, `scrape_webpages_tool`, etc.) to gather more specific evidence, analyze contradictions, or explore different angles.
  - After each tool call, the agent analyzes the new evidence and refines its plan. This continues until sufficient evidence (typically from at least 3 distinct sources) is gathered.
- Phase 3: Final Synthesis:
  - Once the agent determines it has enough high-quality evidence, it concludes the investigation.
  - It synthesizes all gathered information, determines the final verdict (TRUE, FALSE, etc.), calculates a confidence score, and writes a detailed explanation justifying the conclusion, referencing key evidence.
- Presentation: The final verdict, confidence score, explanation, step-by-step agent trace (intermediate thoughts and actions), and list of sources are presented to the user in the Streamlit interface.
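Stripped of framework specifics, the three phases reduce to a plan-act-observe loop over shared state. Here is a rough sketch, with the planner and synthesizer abstracted as callables; all names below are illustrative, not the project's API:

```python
from dataclasses import dataclass, field

@dataclass
class InvestigationState:
    """Context kept by the state manager: claim, evidence, and the agent trace."""
    claim: str
    evidence: list = field(default_factory=list)
    steps: list = field(default_factory=list)

def run_fact_check(claim, tools, plan_next_step, synthesize, max_steps=10):
    state = InvestigationState(claim=claim)
    for _ in range(max_steps):
        decision = plan_next_step(state)       # LLM picks a tool or finishes
        if decision["action"] == "finish":
            break
        observation = tools[decision["action"]](decision["input"])
        state.evidence.append(observation)     # Phases 1-2: gather and refine
        state.steps.append({**decision, "observation": observation})
    return synthesize(state)                   # Phase 3: verdict, confidence, explanation
```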
- Python 3.8+
- Required API keys stored securely (e.g., in a `.env` file):
  - OpenAI API key (or Azure OpenAI endpoint details)
  - Google AI (Gemini) API key
  - Tavily API key
  - NewsAPI key
- Clone the repository:

  ```bash
  git clone https://github.com/Kaos599/tathya-fact-checking-system.git
  cd tathya-fact-checking-system
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables by creating a `.env` file in the root directory. Ensure you have the necessary keys for the tools you intend the agent to use (a sample sketch follows these steps).

- Start the backend API server:

  ```bash
  # Navigate to the API directory if your structure requires it
  # cd fact_check_system/api
  uvicorn fact_check_system.api.main:app --reload --host 0.0.0.0 --port 8000

  # Or, if using Flask/another framework, adjust the command accordingly
  # python fact_check_system/api/main.py
  ```

  The API will typically be available at `http://127.0.0.1:8000`. Check the console output.

- Start the Streamlit frontend in a separate terminal:

  ```bash
  streamlit run app.py
  ```

  The app will usually be available at `http://localhost:8501`.
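The `.env` file from the setup step above might look like the sketch below. The variable names are assumptions for illustration; use whatever names the code actually reads:

```env
OPENAI_API_KEY=your-openai-key      # or Azure OpenAI endpoint details
GOOGLE_API_KEY=your-gemini-key
TAVILY_API_KEY=your-tavily-key
NEWS_API_KEY=your-newsapi-key
```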
Challenge the agent with various claims:
- "Does India have the largest population as of mid-2024?"
- "Is the boiling point of water always 100 degrees Celsius?"
- "Did the James Webb Space Telescope launch before 2022?"
- "Elon Musk is the CEO of Neuralink."
- "Which team won the last FIFA World Cup?"
The system provides a REST API endpoint to trigger the fact-checking agent:
```http
POST /check
Content-Type: application/json

{
  "claim": "Your claim text here",
  "language": "en"   // Optional; defaults may apply
}
```
Example Response:
```json
{
  "claim": "Your claim text here",
  "verdict": "PARTIALLY TRUE/MIXTURE",   // Or TRUE, FALSE, UNCERTAIN
  "confidence_score": 0.75,
  "explanation": "Detailed explanation generated by the agent, summarizing the evidence and reasoning...",
  "intermediate_steps": [   // Optional: may include the agent's thought process
    { "thought": "Initial thought...", "action": "ToolX", "input": "...", "observation": "..." }
    // ... more steps
  ],
  "sources": [
    {
      "url": "https://example.com/source1",
      "title": "Source Title 1",
      "snippet": "Relevant excerpt from source 1...",
      "tool_used": "tavily_search"
    },
    {
      "url": "https://newssite.com/article",
      "title": "Recent News Article",
      "snippet": "Latest developments...",
      "tool_used": "news_search"
    }
    // ... other sources
  ]
}
```
(Note: The exact response structure might vary based on implementation details, especially regarding intermediate steps.)
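A minimal client call against a locally running server might look like this. It is a sketch only, assuming the default host and port from the installation steps:

```python
import requests

resp = requests.post(
    "http://127.0.0.1:8000/check",
    json={"claim": "Did the James Webb Space Telescope launch before 2022?"},
    timeout=120,  # agentic runs can take a while
)
resp.raise_for_status()
result = resp.json()
print(result["verdict"], result["confidence_score"])
print(result["explanation"])
```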
Contributions are welcome! If you have suggestions, bug reports, or want to add new tools or features, please feel free to:
- Open an issue to discuss the change.
- Fork the repository.
- Create a new branch (`git checkout -b feature/YourFeature`).
- Make your changes.
- Commit your changes (`git commit -m 'Add some feature'`).
- Push to the branch (`git push origin feature/YourFeature`).
- Open a Pull Request.
This project is licensed under the MIT License; see the `LICENSE` file for details.