This is the repository for the LinkedIn Learning course End-to-End Data Engineering Project. The full course is available from LinkedIn Learning.
The world of data engineering is ever-changing, with new tools and technologies emerging on a regular basis. Building an effective analytics platform can be a daunting task, especially if you’re not familiar with all the tools available. How do you turn scattered, complex data into a model that drives insights and decision-making? In this course, Thalia Barrera teaches data professionals how to implement an end-to-end data engineering project using open tools from the modern data stack. She touches on best practices such as data modeling, testing, documentation and version control and shows you how to efficiently extract, load, and transform data into a unified, analytics-ready format. Thalia shows you how to confidently select and use tools through practical examples—taking you through the construction of a robust data pipeline for a fictional ecommerce company—and how to implement best practices in data engineering.
This repository has two branches: main holds the initial state of the project, and finished holds the final state. You can use the branch pop-up menu in GitHub to switch to a specific branch and look at the course at that stage, or you can add /tree/BRANCH_NAME to the repository URL (for example, /tree/finished) to go directly to that branch.
You will be working in the main branch throughout the course. At any time, you can check out the finished branch to see what the finished project looks like.
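Switching between the two branches from the command line can be sketched as follows. This is a minimal example, not part of the course material: switch_branch is a hypothetical helper name, and it assumes you run it from inside your clone and that the clone's remote actually contains the finished branch (GitHub's fork dialog can be set to copy only the default branch, in which case finished would be missing from your fork).

```shell
# switch_branch is a hypothetical helper, not part of the course repository.
# It checks out the given branch and prints the name of the branch now in use.
switch_branch() {
  git checkout --quiet "$1" && git rev-parse --abbrev-ref HEAD
}

# Usage, from inside your clone:
#   switch_branch finished   # inspect the final state of the project
#   switch_branch main       # return to the branch you work in
```

Commit or stash any uncommitted work on main first, since git may refuse to switch branches when local changes would be overwritten.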
Ensure you have Python 3 installed. If not, you can download and install it from Python's official website.
- Fork the Repository:
- Click the "Fork" button on the top right corner of this repository.
- Clone the repository:
git clone https://github.com/YOUR_USERNAME/end-to-end-data-engineering-project-4413618.git
- Note: Replace YOUR_USERNAME with your GitHub username
- Navigate to the directory:
cd end-to-end-data-engineering-project-4413618
- Set Up a Virtual Environment:
- For Mac:
python3 -m venv venv
source venv/bin/activate
- For Windows:
python -m venv venv
.\venv\Scripts\activate
- Install Dependencies:
pip install -e ".[dev]"
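If a later command cannot find the installed packages, a common cause is that the virtual environment is not active in the current terminal. The check below is a minimal sketch under a few assumptions: check_venv is a hypothetical helper name, not part of the course repository, and the snippet assumes a Unix-like shell (Mac or Linux) with python3 on the PATH. It relies on the fact that, inside a venv, Python's sys.prefix differs from sys.base_prefix.

```shell
# check_venv is a hypothetical helper, not part of the course repository.
# Prints "active" inside an activated venv and "inactive" otherwise.
check_venv() {
  python3 -c 'import sys; print("active" if sys.prefix != sys.base_prefix else "inactive")'
}

check_venv
```

If it prints inactive, run the activation command from the Set Up a Virtual Environment step again in the same terminal.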
Thalia Barrera
Check out my other courses on LinkedIn Learning.