Data mining on the American Mental Health Dataset (SAMHDA)

Data Mining Lab - Mental Health Dataset

Data Mining Practical Course

Getting Started

Prerequisites

For lab machines

wget link/to/anaconda.sh
bash anaconda.sh

See the lab wiki (week 1) for how to connect to a Jupyter notebook remotely.

For Google Cloud

sudo apt-get install python-pip
sudo apt-get install unzip

You may also install zsh and oh-my-zsh for auto-completion.

Setting up

Install dependencies

pip install -r requirements.txt

If you plan to use xgboost, install it manually. Installation can be tricky on macOS: you may need to compile it from source after changing the makefile exports to gcc-7 and g++-7.
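Because xgboost is installed separately, a notebook can check for it before importing. The following is a minimal sketch, not code from this repository: the helper name `pick_model_backend` and the `"fallback"` label are hypothetical.

```python
import importlib.util

# True only if the optional xgboost package can be found in this environment.
HAS_XGBOOST = importlib.util.find_spec("xgboost") is not None


def pick_model_backend():
    """Return the name of the backend to use (hypothetical helper)."""
    return "xgboost" if HAS_XGBOOST else "fallback"
```

Code that needs xgboost can then branch on `HAS_XGBOOST` instead of failing with an ImportError halfway through a run.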

Download the CSV datasets and concatenate them.

bash utils/download.sh
bash utils/concat.sh 
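For reference, the concatenation step can be approximated in Python. This is a minimal sketch and not the contents of utils/concat.sh: the function name `concat_csvs` is hypothetical, and it assumes every input CSV starts with the same header row.

```python
import csv


def concat_csvs(paths, out_path):
    """Concatenate CSV files that share a header; return the number of data rows written."""
    rows_written = 0
    header = None
    with open(out_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for path in paths:
            with open(path, newline="") as in_file:
                reader = csv.reader(in_file)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    # Write the header once, taken from the first non-empty file.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

It could be called with a list of the downloaded files and an output path of your choice. The header check is there so that files from different surveys are not silently merged.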

To get the current preprocessed dataset (for Google Cloud), run the following inside the folder for the most recent week:

python get_newsplit.py

Running

To keep a job running on Google Cloud after you log out:

jupyter nbconvert --to python some.ipynb
nohup time python some.py > result.out 2>&1 &
tail -f result.out

You can check on the process with htop.
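The same idea can be sketched from Python for cases where a job is launched by a script instead of from the shell. This is an illustrative sketch under stated assumptions, not part of the repository: `run_detached` is a hypothetical helper, and `start_new_session` detaches the child only on POSIX systems.

```python
import subprocess


def run_detached(cmd, log_path):
    """Start cmd in its own session, sending stdout and stderr to log_path."""
    log_file = open(log_path, "w")
    proc = subprocess.Popen(
        cmd,
        stdout=log_file,
        stderr=subprocess.STDOUT,
        stdin=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: detach from the terminal's session
    )
    log_file.close()  # the child keeps its own copy of the file descriptor
    return proc
```

For example, `run_detached(["python", "some.py"], "result.out")` starts the converted notebook and returns immediately. The returned object's `pid` attribute can be used to find the process in htop.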

Schedule

Week 0: Dataset Preparation
Week 1-5: Descriptive Mining I-V
Week 6-9: Predictive Mining I-IV
Week 10-11: Final Presentation

Dataset

  • SAMHDA - Treatment Episode Data Set: Admissions (TEDS-A)
  • DASIS - National Mental Health Services Survey (N-MHSS)
  • DASIS - National Survey of Substance Abuse Treatment Services (N-SSATS)
