- Input data for the sample code from chapter 2 onward
- The data is modeled on the Kaggle competition Prudential Life Insurance Assessment. It was generated artificially to simulate insurance underwriting data. Because the generation process was simple, its structure is simpler than that of real-world data.
- The training and test data total 10,000 rows
Column name | Notes |
---|---|
age | |
gender | |
height | |
weight | |
product | product type |
amount | insurance premium |
date | application date |
medical_info_a1/a2/a3 | medical information - continuous variable |
medical_info_b1/b2/b3 | medical information - continuous and categorical variables |
medical_info_c1/c2 | medical information - continuous and categorical variables |
medical_keyword_1-10 | medical information - binary variable |
target | target values (binary) |
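
As a quick sanity check on the schema above, the following is a minimal sketch that compares a CSV header against the column list in the table. It assumes that the shorthand `a1/a2/a3` and `1-10` expands to names such as `medical_info_a1` and `medical_keyword_10`, and that `target` may be absent from test data; both are assumptions, and `missing_columns` is a hypothetical helper, not part of the sample code.

```python
import csv
import io

# Column names expanded from the table above. The expansion of the
# shorthand ("a1/a2/a3", "1-10") and the ordering are assumptions.
EXPECTED_COLUMNS = (
    ["age", "gender", "height", "weight", "product", "amount", "date"]
    + [f"medical_info_a{i}" for i in (1, 2, 3)]
    + [f"medical_info_b{i}" for i in (1, 2, 3)]
    + [f"medical_info_c{i}" for i in (1, 2)]
    + [f"medical_keyword_{i}" for i in range(1, 11)]
    + ["target"]
)


def missing_columns(csv_file, require_target=True):
    """Return the expected column names that are absent from the CSV header."""
    # Only the first row (the header) is read; an empty file yields [].
    header = next(csv.reader(csv_file), [])
    expected = [c for c in EXPECTED_COLUMNS if require_target or c != "target"]
    return [c for c in expected if c not in header]


# Demo on an in-memory header that lacks most of the columns.
sample = io.StringIO("age,gender,height,weight\n")
print(missing_columns(sample)[:3])
```

For a real file, the same helper could be called as `missing_columns(open(path, newline=""))`, where `path` is wherever the training CSV was saved; pass `require_target=False` if the test file does not carry the `target` column.
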
- Download the data from the Kaggle competition Titanic: Machine Learning from Disaster and save it as ch01-titanic/train.csv and ch01-titanic/test.csv
- Data for explanations on how to combine different tables
- Data for explanations on how to process time series data
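
The Titanic files listed above have to be downloaded manually, so it can help to confirm they ended up in the expected place. The sketch below checks for `ch01-titanic/train.csv` and `ch01-titanic/test.csv` relative to a base directory; the helper name `missing_titanic_files` and the choice of base directory are assumptions for illustration.

```python
from pathlib import Path

# Relative paths taken from the list above.
TITANIC_FILES = ("ch01-titanic/train.csv", "ch01-titanic/test.csv")


def missing_titanic_files(base_dir="."):
    """Return the Titanic files that are not present under base_dir."""
    base = Path(base_dir)
    return [p for p in TITANIC_FILES if not (base / p).is_file()]


# An empty list means both files are in place.
print(missing_titanic_files("."))
```
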