benjaminbang987/data_eng_sci_studies

data_eng_sci_studies

Ben's blog and write-ups of tasks, projects, and lessons learned

Resources:

Blogs:

Books:

Topics (Data Eng):

  1. Relational Database
    1. PostgreSQL
    2. Normalized/Denormalized Data Tables/Schemas
  2. Scheduler/Automation of Data Pipelines
    1. Apache Airflow
  3. Cloud Database
    1. S3
    2. EC2/RDS PostgreSQL
  4. Querying Big Data
    1. Apache Spark
    2. Snowflake/Redshift
  5. Threading
    1. An Intro to Threading in Python
  6. Various Tools
    1. Docker Containers
    2. Kafka
    3. Kubernetes

Topics (Data Sci):

  1. Review of Regression Models and Cost Functions, with Pros and Cons of Each Model
  2. Review of Classification Models and Cost Functions, with Pros and Cons of Each Model
  3. Dimensionality Reduction for Images, Videos, and Large Texts (TensorFlow introduction for images and videos)
  4. Clustering
    1. Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
    2. Clustering using Mixture Models
