Skip to content

Website Traffic & Conversion Prediction Analysis: A regression analysis project analyzing user behavior and conversion rates for a heat pump installation company based in the United Kingdom, featuring EDA, machine learning models for churn prediction & regression analysis, producing actionable insights to optimize the conversion funnel.

Notifications You must be signed in to change notification settings

munas-git/website-traffic-analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 

Repository files navigation

Heat Pump Installation Website Analytics and Conversion Prediction Model

Overview

This project focuses on analysing website traffic patterns, user behavior, and conversion rates for a heat pump installation service. It includes exploratory data analysis (EDA), data preprocessing, and the development of a machine-learning model to predict conversion rates. The goal is to identify key factors influencing customer engagement and optimise the conversion funnel.

Table of Contents

Introduction

The project aims to provide insights into how website visitors interact with a heat pump installation service's online platform. By analysing various metrics, such as website visits over time, time taken to book a design consultation, and conversion rates, the project seeks to identify opportunities for improving customer engagement and increasing the likelihood of conversion.

Data Sources

The project utilises the following datasets:

  • funnel_data.csv: Contains data related to user interactions and the various stages of the conversion funnel, including first search, design consultation payments, and completion.
  • installer_locations.csv: Contains location data of installers, which can be used to calculate distances and analyse regional trends.
  • property_estimates.csv: Contains property estimates that can be linked to user data for further analysis.

Libraries Used

The following Python libraries were used in this project:

  • Data Wrangling:
    • pandas: For data manipulation and analysis.
    • calendar: For date-related functionalities.
  • Data Preprocessing:
    • LabelEncoder: For encoding categorical variables.
  • Geospatial Calculations:
    • math: For calculating distances between geographical coordinates.
  • Visualisation:
    • seaborn: For creating statistical visualisations.
    • matplotlib: For creating plots and charts.
    • IPython.display: For displaying images.
  • Machine Learning:
    • scikit-learn: For model development, evaluation, and preprocessing.
    • imblearn: For handling imbalanced datasets using techniques like SMOTE.
    • xgboost: For training gradient boosting models.
  • Model Evaluation:
    • sklearn.metrics: For evaluating model performance using metrics like accuracy, confusion matrix, and classification report.

Data Wrangling

The data wrangling process involved the following steps:

  1. Loading Data: Reading the datasets (funnel_data.csv, installer_locations.csv, property_estimates.csv) using pandas.
  2. Data Type Conversion: Converting date columns to datetime format using pd.to_datetime.
  3. Handling Missing Values: Addressing missing data in relevant columns.
  4. Feature Engineering: Creating new features such as time deltas between first visit and design consultation payment.

Exploratory Data Analysis (EDA)

The EDA process involved the following steps:

  1. Understanding Data: Examining the structure, data types, and missing values in the datasets.
  2. Visualising Website Visits Over Time: Creating time series plots to identify patterns and trends in website traffic.
  3. Analysing Time Deltas: Calculating and visualising the time taken from the first visit to design consultation payment.
  4. Determining Conversion Rates: Calculating the percentage of website visitors who booked a design consultation.

Machine Learning Model Development

The machine learning model development process involved the following steps:

  1. Feature Selection: Selecting relevant features for the model.
  2. Data Preprocessing: Encoding categorical variables using LabelEncoder and handling imbalanced datasets using SMOTE.
  3. Model Training: Training various machine learning models, including Support Vector Machines (SVM), XGBoost, and Decision Trees.
  4. Pipeline Creation: Implementing a machine learning pipeline using scikit-learn and imblearn to streamline the process.
  5. Cross-Validation: Employing cross-validation techniques to ensure model robustness.

Model Evaluation

The model evaluation process involved the following steps:

  1. Performance Metrics: Evaluating model performance using metrics such as accuracy, confusion matrix, and classification report.
  2. Comparative Analysis: Comparing the performance of different models to identify the most effective one.
  3. Visualisation: Visualising the results using confusion matrices and other relevant plots.

Results

Key findings from the analysis include:

  • Website Traffic Patterns: Peak first visits occurred at the beginning of December 2023, followed by a progressive decrease.
  • Time to Book Consultation: The average time from the first visit to design consultation payment is approximately 17.71 days.
  • Conversion Rate: Approximately 0.98% of website visitors booked a design consultation.

Challenges Addressed

The project addressed the following challenges:

  • Analysing website traffic patterns over time.
  • Determining the time taken for users to pay for design consultations after their first visit.
  • Calculating the percentage of website visitors who booked a design consultation.
  • Understanding how conversion rates vary based on eligibility for the Boiler Upgrade Scheme grant.

Recommendations

Based on the analysis, the following recommendations are made:

  • Track website visits before installation deposit: Monitor the number of website visits before clients make an installation deposit.
  • Track proposal downloads: If proposal downloads are an option, track these to understand client engagement.
  • Further investigation: Investigate the reasons for the decline in conversion rates after the initial peak in website visits.

Further Investigation

Further areas for investigation include:

  • Impact of Boiler Upgrade Scheme grant eligibility on conversion rates.
  • Reasons for the decline in conversion rates after the initial peak in website visits.
  • Analysis of installer locations and their impact on service delivery.

Contact

For any questions or further information, please contact:

[Einetein]

[[email protected]]

[https://www.linkedin.com/in/einstein-ebereonwu/]

About

Website Traffic & Conversion Prediction Analysis: A regression analysis project analyzing user behavior and conversion rates for a heat pump installation company based in the United Kingdom, featuring EDA, machine learning models for churn prediction & regression analysis, producing actionable insights to optimize the conversion funnel.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published