Building a Scalable Data Warehouse for AI-Driven Traffic Analytics

Project Overview

Welcome to our AI startup's ambitious project aimed at transforming the way traffic data is analyzed and utilized for smart city initiatives. This README will guide you through the project structure, business need, data source, and instructions for setting up the data warehouse tech stack using Airflow, dbt, PostgreSQL, and Redash.

Read the Blog for details

[Project overview diagram]

Project Structure

  • /dags: Airflow DAG scripts for orchestrating data loading and transformation (a minimal sketch of such a DAG follows this list).
  • /dwh_dbt: dbt project folder containing models for data transformation.
  • /notebooks: Jupyter notebook files for raw CSV data processing and loading.
  • /data: Raw CSV files used in the project.
  • /images: Images used in the README.
  • /redash: Redash project folder for visualization and reporting.
  • /test: Contains test scripts.
  • .gitignore: Git configuration to exclude unnecessary files.
  • docker-compose.yaml: Docker Compose configuration for fully dockerized deployment.
  • requirements.txt: File containing the list of project dependencies.
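
To illustrate how the pieces in /dags and /dwh_dbt fit together, here is a minimal Airflow DAG sketch. The DAG id, task names, paths, and the dbt invocation are illustrative assumptions, not the repository's actual code.

    # dags/pneuma_pipeline.py -- hypothetical sketch; adjust the DAG id, paths,
    # and commands to match the real project.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.python import PythonOperator


    def load_raw_csvs():
        """Placeholder: read the raw pNEUMA CSVs from /data and load them into PostgreSQL."""
        ...


    with DAG(
        dag_id="pneuma_warehouse",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        load_raw = PythonOperator(task_id="load_raw_csvs", python_callable=load_raw_csvs)

        # Run the dbt project in /dwh_dbt to build staging and mart models.
        run_dbt = BashOperator(
            task_id="run_dbt_models",
            bash_command="dbt run --project-dir /opt/airflow/dwh_dbt",
        )

        load_raw >> run_dbt

The project's DAGs orchestrate this same load-then-transform flow, with dbt handling the SQL transformations on top of PostgreSQL.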

Business Need

Our AI startup works with businesses to deploy sensors and collect diverse data for critical intelligence. The city traffic department has tasked us with building a scalable data warehouse for vehicle trajectory data collected by swarm UAVs, with the goal of improving traffic flow and supporting a number of undisclosed projects.

Data Source

We utilize the pNEUMA dataset, a large-scale collection of naturalistic vehicle trajectories recorded in Athens, Greece. The dataset was collected in a unique experiment using a swarm of drones and provides valuable insight into urban traffic patterns.

Data Source: pNEUMA Dataset
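
For orientation, the sketch below shows one way to unpivot a pNEUMA CSV into a long table with one row per trajectory point, a convenient shape for loading into PostgreSQL. The column names and the six-value repeating group follow the published pNEUMA file layout and are assumptions here, not a description of this repository's parsing code.

    # Hypothetical parsing sketch -- column names and the 6-value repeating
    # group follow the published pNEUMA CSV layout; adjust to the actual files.
    import pandas as pd

    POINT_COLS = ["lat", "lon", "speed", "lon_acc", "lat_acc", "time"]


    def parse_pneuma(path: str) -> pd.DataFrame:
        """Unpivot one pNEUMA CSV into one row per (vehicle, trajectory point)."""
        rows = []
        with open(path) as f:
            next(f)  # skip the header line
            for line in f:
                fields = [v.strip() for v in line.strip().strip(";").split(";")]
                track_id, vehicle_type, traveled_d, avg_speed = fields[:4]
                points = fields[4:]  # repeating groups of 6 values per point
                for i in range(0, len(points) - len(POINT_COLS) + 1, len(POINT_COLS)):
                    record = dict(zip(POINT_COLS, points[i:i + len(POINT_COLS)]))
                    record.update(track_id=track_id, vehicle_type=vehicle_type,
                                  traveled_d=traveled_d, avg_speed=avg_speed)
                    rows.append(record)
        return pd.DataFrame(rows)

A frame in this shape can then be bulk-loaded into PostgreSQL (for example with DataFrame.to_sql) for dbt to pick up.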

References for Understanding Data Generation:

Visualization and Interaction Tools:

Getting Started

Setting Up Locally

  1. Clone the Repository:

    git clone https://github.com/your-username/Data-Warehouse.git
    cd Data-Warehouse
    
  2. Install Dependencies:

    pip install -r requirements.txt
    
  3. Run Airflow Services:

    docker-compose up --build
    
  4. Access Airflow UI:

    Open the Airflow web UI in your browser (with the default Docker Compose setup it is typically served at http://localhost:8080), then enable and trigger the DAGs. A quick smoke test for the loaded data is sketched after these steps.

  5. Customize DAGs and dbt Models:

    • Adjust Airflow DAGs in the dags folder.
    • Modify dbt models in the dwh_dbt folder based on your specific requirements and data transformations.
  6. Stop Airflow Services:

    docker-compose down
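
Once the services are up and the DAG has run, a quick smoke test from Python can confirm that rows landed in PostgreSQL. The connection string, database, and table name below are assumptions; adjust them to whatever docker-compose.yaml and the dbt models actually create.

    # Hypothetical smoke test -- credentials, database, and table name are
    # assumptions; change them to match docker-compose.yaml and the dbt models.
    import sqlalchemy as sa

    engine = sa.create_engine("postgresql+psycopg2://airflow:airflow@localhost:5432/dwh")

    with engine.connect() as conn:
        count = conn.execute(sa.text("SELECT COUNT(*) FROM raw_trajectories")).scalar()
        print(f"raw_trajectories rows: {count}")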
    

Conclusion

This project aligns with our mission to revolutionize traffic analytics, providing cities with actionable insights for smart urban planning. We invite you to explore, contribute, and leverage our data warehouse solution to make impactful decisions for a more connected and efficient future.

Read my blog for an in-depth explanation.
