Skip to content

QuickEst repository: Quick Estimation of Quality of Results

License

Notifications You must be signed in to change notification settings

cornell-zhang/quickest

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 

Repository files navigation

QuickEst

Publication


If you use QuickEst in your research, please cite our preliminary work published in FCCM'18.

  @article{dai-hls-qor-fccm2018,
    title   = "{Fast and Accurate Estimation of Quality of Results in 
                High-Level Synthesis with Machine Learning}",
    author  = {Steve Dai and Yuan Zhou and Hang Zhang and Ecenur Ustun and 
               Evangeline F.Y. Young and Zhiru Zhang},
    journal = {Int'l Symp. on Field-Programmable Custom Computing Machines
               (FCCM)},
    month   = {May},
    year    = {2018},
  }

Usage

QuickEst is organized into directories for different estimation tasks.

HLS

hls directory currently supports resource estimation for HLS. The python files for these features are in [path to top directory]/hls.

Input data format

The data file is CSV file. The columns are separated by ",". The rows are separated by "\n".
The following formats should also be satisfied:
    1) The first 2 columns should be design index and device index respectively.
    2) The columns from 3 to <end_col-target_col> should be features.
    3) The columns from <end_col-target_col> to the end should be the targets.
    4) If there are k targets, the first k features should be the corresponding HLS result of the k targets.

Data preprocessing

python preprocess.py [-h] [--data_dir DATA_DIR] [-c TARGET_COL] [--test_seed TEST_SEED] [--cluster_k CLUSTER_K] [--split_by SPLIT_BY] [--test_ids ID1 ID2 ID3 ...]

optional arguments:
  -h, --help            show this help message and exit
  --data_dir DATA_DIR   Directory or file of the input data. String. Default:
                        ./data/data.csv
  -c TARGET_COL, --target_col TARGET_COL
                        The number of target columns.
			The first 2 columns are design index and device index
                        respectively. Integer. Default: 4 
  --split_by SPLIT_BY   The strategy to split the data. 
                        random - Splitting the data by the id randomly. The ratio
                            of the testing data is given by <ratio>
                        design_random - Splitting the data by the design id randomly. 
                            The ratio of the testing designs is given by <ratio>
                        design_select - Splitting the data by the design id. 
                            The testing design id is given by <test_ids>.
                        design_sort - Splitting the data by the design id. The design 
                            ids are clustered, sorted by the target values and the splitted.
                            The number of cluster groups are controlled by <cluster_k>.
                        Default: design_sort
  --test_ids ID1 ID2 ID3 ... 
                        The test index/device id list. Integer.
                        Default: [0] 
  --test_seed TEST_SEED
                        The seed used for selecting the test id. Integer.
                        Default: 0
  --cluster_k CLUSTER_K
                        How many clusters will be grouped when patitioning the
                        training and testing dataset. Integer. Default: 8

Model training

python train.py [-h] [--data_dir DATA_DIR] [--params_dir PARAMS_DIR] [--params_save_dir PARAMS_SAVE_DIR] [--models_dir MODELS_DIR] [--models_save_dir MODELS_SAVEDIR] [-d] [--validation_ratio VALIDATION_RATIO] [-m MODEL_TRAIN] [-s MODEL_FSEL] [-a MODEL_ASSEMBLE]

optional arguments:
  -h, --help            show this help message and exit
  --data_dir DATA_DIR   Directory or file of the training dataset. String.
                        Default: ./data/data_train.pkl
  --params_dir PARAMS_DIR
                        Directory or file to load the parameters.
                        String. Default: ./saves/train/params.pkl
  --params_save_dir PARAMS_SAVE_DIR
                        Directory or file to save the parameters.
                        String. Default: ./saves/train/params_save.pkl
  --models_load_dir MODELS_LOAD_DIR
                        Directory or file to load the model. String.
                        Default: ./saves/train/models.pkl
  --models_save_dir MODELS_SAVE_DIR
                        Directory or file to save the trained model. String.
                        Default: ./saves/train/models_save.pkl
  -d, --disable_param_tuning 
                        Whether to disable parameters tuning or not. Boolean. 
          		Default: false
  --validation_ratio VALIDATION_RATIO
                        The ratio of the training data to do validation.
                        Float. Default: 0.25
  -m MODEL_TRAIN, --model_train MODEL_TRAIN
                        The model to be trained. Empty means not training
                        models. Value from "", "xgb"(default), "lasso"
  -s MODEL_FSEL, --model_fsel MODEL_FSEL
                        The model used to select features. Empty means not
                        selecting features. Value from "", "xgb",
                        "lasso"(default)
  -a MODEL_ASSEMBLE, --model_assemble MODEL_ASSEMBLE
                        Strategy used to assemble the trained models
                        (automatically train them if they are not existed).
                        Empty means not training models. Value from ""(default),
                        "xgb+lasso+equal_weights", "xgb+lasso+learn_weights"

Model testing

python test.py [-h] [--data_dir DATA_DIR] [--models_save_dir MODELS_SAVE_DIR] [--save_result_dir SAVE_RESULT_DIR]

optional arguments:
  -h, --help            show this help message and exit
  --data_dir DATA_DIR   Directory or file of the testing dataset. String.
                        Default: ./data/data_test.pkl
  --models_save_dir MODELS_SAVE_DIR
                        Directory or file of the pre-trained models. String.
                        Default: ./train/models.pkl
  --save_result_dir SAVE_RESULT_DIR
                        Directory to save the result. Input folder or file
                        name. String. Default: ./saves/test/

Model analysis

usage: analyze.py [-h] [--train_data_dir TRAIN_DATA_DIR] [--test_data_dir TEST_DATA_DIR] [--model_save_dir MODEL_SAVE_DIR] [--param_save_dir PARAM_SAVE_DIR] [--result_dir RESULT_DIR] [--save_result_dir SAVE_RESULT_DIR] [-f FUNC]

optional arguments:
  -h, --help            show this help message and exit
  --train_data_dir TRAIN_DATA_DIR
                        File of the training dataset. String. Default:
                        ./data/data_train.pkl
  --test_data_dir TEST_DATA_DIR
                        File of the testing dataset. String. Default:
                        ./data/data_test.pkl
  --model_save_dir MODEL_SAVE_DIR
                        File of the pre-trained models. String. Default:
                        ./save/train/models_save.pkl
  --param_save_dir PARAM_SAVE_DIR
                        File of the pre-tuned params. String. Default:
                        ./save/train/params_save.pkl
  --result_dir RESULT_DIR
                        File of the testing results. String. Default:
                        ./save/test/results.pkl
  --save_result_dir SAVE_RESULT_DIR
                        Directory to save the analyzing results. String.
                        Default: ./save/analysis/
  -f FUNC, --func FUNC  Select the analysis function. Value from "fi" or
                        "feature_importance", "schls" or "score_hls", "sc" or
                        "score" (default), "lc" or "learning_curve", "re" or
                        "result_error", "red" or "result_error_design", "rt"
                        or "result_truth", "rtd" or "result_truth_design",
                        "rp" or "result_predict", "rpd" or
                        "result_predict_design".
                        
function explaination:
fi - calculate feature importance
sc - calculate the scores of model results
schls - calculate the scores of HLS results
lc - plot the learning curve
re - show the result errors
red - show the result errors grouped by the design ids
rt - show the result ground truth of the testing data
rtd - show the result truth of the testing data grouped by the design ids
rp - show the result prediction of the testing data
rpd - show the result prediction grouped by the design ids

Additional Acknowledgements

  • Qiang You: For contributing a new set of training, testing, and analysis features to the estimation flow, improving the usability of the scripts, and performing preliminary study on extending the flow to LegUp.
  • Dan Batan: For his early effort in extending the flow to LegUp.

About

QuickEst repository: Quick Estimation of Quality of Results

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages