- The Rise of the Data Engineer
- Data Engineer Resources
- Popular Questions about Data Engineering Career Path
- A Beginner’s Guide to Data Engineering — Part I
- Lambda Architecture: How to Build a Big Data Pipeline
- Designing Data-Intensive Applications
- Clean Code
- Introduction to Machine Learning with Python
- Relational Database
- PostgreSQL
- Normalized/Denormalized Data Tables/Schemas
- Scheduler/Automation of Data Pipelines
- Apache Airflow
- Cloud Database
- S3
- EC2/RDS PostgreSQL
- Querying Big Data
- Apache Spark
- Snowflake/Redshift
- Threading
- Various Tools
- Docker Containers
- Kafka
- Kubernetes
- Review of Regression Models and Cost Functions, with Pros and Cons of Each Model
- Review of Classification Models and Cost Functions, with Pros and Cons of Each Model
- Dimensionality Reduction for Images, Videos, and Large Texts (TensorFlow introduction for images and videos)
- Clustering
- Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
- Clustering using Mixture Models