l0g1c-80m8/data-mining-assignments

Assignments for DSCI 553: Foundations and Applications of Data Mining

Repo to contain the assignments for DSCI 553: Foundations and Applications of Data Mining course at USC.

Instructor: Professor Wei-Min Shen (Spring 2023)

Follow these instructions to run the script locally and on Vocareum.

For additional details, see the README of each individual homework.

Summary

| Assignment | Topic | Implementation | Concepts | Dataset |
| --- | --- | --- | --- | --- |
| Homework 0 | Setting up development environment | Python, Scala | Map-Reduce | None |
| Homework 1 | Data Exploration on Yelp Dataset | Python | Map-Reduce | Test, Full |
| Homework 2 | Frequent Item-set Mining | Python | SON Algorithm, Apriori Algorithm, Frequent Item-sets | Simulated, Real-world |
| Homework 3 | Locality Sensitive Hashing (LSH), Collaborative Filtering, Recommendation Systems | Python | Min-Hashing, Locality Sensitive Hashing, Pearson Similarity, Model-based Recommendation System | Training and Validation |
| Homework 4 | Community Detection | Python | Girvan-Newman Algorithm, Label Propagation Algorithm | Graph Data |
| Homework 5 | Processing Data Streams | Python | Bloom Filter, Flajolet-Martin Algorithm, Reservoir Sampling | Seed dataset for stream + Stream Generator |
| Homework 6 | Clustering | Python | Bradley-Fayyad-Reina (BFR) Algorithm | Synthetic dataset |
| Competition Project | Recommendation System | Python | Recommendation Systems | Same as homework 3 |

Reference