# Basic Training
## config
You can load your configurations in any of the following, equivalent, ways:
* cmd
* config files
* yaml

### cmd
You may want to change configurations on the command line with ``--xx=yy``, where ``xx`` is the name of a parameter and ``yy`` is the corresponding value. For example:

```bash
python run_textbox.py --model=BART --model_path=facebook/bart-base --epochs=1
```

It's suitable for **a few temporary** modifications, such as:
* ``model``
* ``model_path``
* ``dataset``
* ``epochs``
* ...

### config files

You can also modify configurations through local config files:
```bash
python run_textbox.py ... --config_files <config-file-one> <config-file-two>
```

Every config file is an additional yaml file like:

```yaml
efficient_methods: ['prompt-tuning']
```
It's suitable for **a large number of** modifications or **long-term** modifications (see the sketch after this list), such as:
* ``efficient_methods``
* ``efficient_kwargs``
* ...
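
A minimal sketch of this workflow, assuming a hypothetical config file named ``my_experiment.yaml`` (the parameter values are only illustrative):

```bash
# Collect long-term settings in a hypothetical config file.
cat > my_experiment.yaml << 'EOF'
epochs: 10
efficient_methods: ['prompt-tuning']
EOF

# Pass the config file alongside the usual command-line options.
python run_textbox.py --model=BART --model_path=facebook/bart-base \
    --config_files my_experiment.yaml
```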

### yaml

The original configurations are in the yaml files. You can check the default values there, but modifying these files is not recommended except for **permanent** modification of a dataset. These files are in the path ``textbox/properties``:
* ``overall.yaml``
* ``dataset/*.yaml``
* ``model/*.yaml``


## trainer

You can choose an optimizer and scheduler through `optimizer=<optimizer-name>` and `scheduler=<scheduler-name>`. We provide a wrapper around the **pytorch optimizer**, which means parameters like `epsilon` or `warmup_steps` can be specified with the keyword dictionaries `optimizer_kwargs={'epsilon': ... }` and `scheduler_kwargs={'warmup_steps': ... }`. See [pytorch optimizer](https://pytorch.org/docs/stable/optim.html#algorithms) and scheduler for a complete tutorial. <!-- TODO -->
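
As a rough sketch (the optimizer and scheduler names, and the exact keyword arguments they accept, are only illustrative and depend on the optimizer you choose):

```bash
python run_textbox.py ... --optimizer=adamw --optimizer_kwargs=\{\'epsilon\':1e-6\} \
    --scheduler=linear --scheduler_kwargs=\{\'warmup_steps\':100\}
```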

Validation frequency is introduced to validate the model **every specified number of batch steps or epochs**. Specify `valid_strategy` (either `'step'` or `'epoch'`) and `valid_steps=<int>` to adjust the pace. In particular, the traditional train-validate paradigm is the special case with `valid_strategy=epoch` and `valid_steps=1`.
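
For example, to validate every 500 batch steps (the step count here is arbitrary):

```bash
python run_textbox.py ... --valid_strategy=step --valid_steps=500
```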

`max_save=<int>` indicates **the maximal number of saved files** (checkpoints and the corpus generated during evaluation): `-1` saves every file, `0` saves no files, `1` saves only the file with the best score, and `n` saves the best and the last $n-1$ files.
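
For instance, the following keeps only the best checkpoint plus the most recent one:

```bash
python run_textbox.py ... --max_save=2
```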

The score of the current checkpoint is calculated according to ``metrics_for_best_model``, and the evaluation metrics to report are specified with ``metrics`` ([full list](evaluation.md)). **Early stopping** can be configured with `stopping_steps=<int>`, based on the score of every checkpoint.

```bash
python run_textbox.py ... --stopping_steps=8 \
    --metrics_for_best_model=\[\'rouge-1\',\'rouge-w\'\] \
    --metrics=\[\'rouge\'\]
```

You can resume from a **previous checkpoint** through ``model_path=<checkpoint_path>``. If you want to restore **all trainer parameters**, such as the optimizer and start_epoch, set ``resume_training=True``; otherwise, only the **model and tokenizer** will be loaded. The script below resumes training from the checkpoint at ``saved/BART-samsum-2022-Dec-18_20-57-47/checkpoint_best``:

```bash
python run_textbox.py --model_path=saved/BART-samsum-2022-Dec-18_20-57-47/checkpoint_best \
    --resume_training=True
```

Other commonly used parameters include `epochs=<int>` and `max_steps=<int>` (the maximum number of training epochs and batch steps, respectively; if `max_steps` is set, `epochs` will be ignored), `learning_rate=<float>`, `train_batch_size=<int>`, `weight_decay=<bool>`, and `grad_clip=<bool>`.
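
Putting a few of these together (the values are only illustrative):

```bash
python run_textbox.py --model=BART --model_path=facebook/bart-base \
    --epochs=5 --learning_rate=3e-5 --train_batch_size=16
```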

### Partial Experiment

You can run part of the experiment with `do_train`, `do_valid`, and `do_test`. You can test your pipeline and debug with `quick_test=<amount-of-data-to-load>` to load just a few examples.
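
For example, a quick debugging run might load just a few examples (the count is arbitrary):

```bash
python run_textbox.py --model=BART --model_path=facebook/bart-base --quick_test=16
```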

The following script loads a trained model from a local path and conducts generation and evaluation without training and validation:
```bash
python run_textbox.py --model_path=saved/BART-samsum-2022-Dec-18_20-57-47/checkpoint_best \
    --do_train=False --do_valid=False
```

## wandb

If you are running your code in a Jupyter environment, you may want to log in by simply setting an environment variable (note that your key may be stored in plain text):

```python
%env WANDB_API_KEY=<your-key>
```
Beyond logging in, the W&B behaviour of a run is controlled with the `wandb` parameter.

If you are debugging your model, you may want to **disable W&B** with `--wandb=disabled`, in which case **none of the metrics** will be recorded. You can also disable **synchronization only** with `--wandb=offline` and enable it again with `--wandb=online` to upload runs to the cloud. The parameter can also be configured in a yaml file:

```yaml
wandb: online
```

The local files can be uploaded by executing `wandb sync` in the command line.
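
A typical offline workflow might look like this (the run directory under ``wandb/`` is only a placeholder; check the actual name printed by your run):

```bash
# Train while keeping W&B logs local only.
python run_textbox.py --model=BART --model_path=facebook/bart-base --wandb=offline

# Later, upload the locally recorded run to the cloud.
wandb sync wandb/offline-run-<timestamp>
```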

After configuration, you can silence wandb prompts by setting the environment variable `export WANDB_SILENT=true`. For more information, see the [documentation](https://docs.wandb.ai).