Skip to content

aadityarock2000/beat_the_crowd

Repository files navigation

Coverage Status

SmoothFly

Project Type:

Covid-19 led to an increase in delays, cancellations, and other disruptions to air travel which can add stress and uncertainty to traveling. The Bureau of Transportation Statistics (BTS) holds a wealth of information on flight history in the US, but the data is largely inaccessible and lacks interpretability. The aim of this project is to develop a user-friendly tool that will increase the accessibility of flight data to allow users to gain insights into future travel plans with their own analysis and to provide interpretable analysis of key factors that a user may consider when booking including fares, delays, and denied boarding.

In addition to helping individual travelers, this project aims to contribute to the overall understanding of airline industry trends. By analyzing large sets of flight data, this tool will be able to identify patterns and correlations that can inform industry-wide decisions related to pricing, scheduling, and customer service. This information could be used to optimize airline operations, improve customer experiences, and enhance the overall efficiency of air travel.

To achieve these goals, this project will use a variety of data science techniques, including data cleaning, data visualization, and website scraping. The resulting tool will be designed to be user-friendly and accessible, with a simple interface that allows users to interact with the tool in two different ways. If they want to obtain a set of flight data to perform their own analysis then they will input their desired specifications and obtain a subset of data from BTS that has been collected and filtered to match their desired qualifications. Alternatively, if they want to understand how flight data can inform their future travel plans, they can select one of three possible variables of intrest: fares, denied boarding, or delays. From there they will be able to view a dashboard with visualizations focused on their selected factor.

By providing valuable insights into airline industry trends and helping individuals make informed travel decisions, this project aims to contribute to a more efficient, effective, and enjoyable travel experience for all.

Questions of Interest:

  • How can the accessability of data from BTS be improved so that users can easily access data from more than one airline or origin city?
  • What are the average flight prices for different routes, and how do they vary between short-haul and long-haul flights?
  • What is the likelihood of flight delays for different airlines and travel routes, and how does this vary by time of year?
  • What type of delay is most likely for a given flight route or carrier?
  • What is the likelihood of being denied boarding by a given carrier?
  • In the case of a denied boarding, what is the likelihood the denial is voluntary and what sort of compensation can be expected?

Project Output Goal:

The goal of this project will be to produce a tool that can be used to help a user optimize their travel plans to fit their needs. This tool will increase the accessibility of flight data from the BTS to allow users to gain insights into future travel plans with their own analysis by providing the ability to obtain collected data from more than one origin city and carrier. Additionally, it will provide interpretable analysis of key factors that a user may consider when booking air travel including fares, denied boardings, and delays to help users make informed decisions when booking air travel.

Data Sources:

Setting up the project:

Prerequisites: The project needs the user to have Microsoft SQL server in their machine where they intend to host/run the website.

Steps:

  1. Clone the repository
  2. In the root folder, run the following commands to set up the project.
    Windows:
    py -m pip install --upgrade build
    py -m build
    Unix/macOS:
    python3 -m pip install --upgrade build
    python3 -m build
  3. In the root folder, run the command conda env create -f environment.yaml to install all the required dependencies
  4. To create the SQL Database, look at the pipeline_example.ipynb file in the examples doc to create the database and connect to your machine. 4.1. DISCLAIMER: This is a huge download, with a requirement of more than 20GB of downloads, so we have included a file for testing named data.bacpac inside the data/pipeline_data 4.2. Import this into your MS SQL database. You can refer this link
  5. In the sql_parsing.py file in SmoothFly/website_utils path, edit the connect_sql_server() method, by adding in your server name.
  6. Rerun using streamlit run app.py to look at the updated website, with the functionality to access the pipeline.

The Demo Link

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •