
Commit 0b1f316

kavanasecw-tan authored and committed
Add 'Considerations for Fine-Tuning Training'
1 parent 044aa1f commit 0b1f316

File tree

1 file changed: +9 -0 lines changed


docs/guide/training-techniques/fine_tuning.md

Lines changed: 9 additions & 0 deletions
@@ -74,3 +74,12 @@ model:
```

See [Dataset Statistics](../configuration/data.md#dataset-statistics) for more details on configuring dataset statistics.
## Considerations for Fine-Tuning Training
There are a number of considerations and changes you may want to make to the training setup and hyperparameters when fine-tuning, rather than training from scratch. This is an active area of research within the field and the `NequIP` user base.
Key differences from training from scratch are:
- **Decrease the learning rate**: It is typically best to fine-tune a pre-trained model with a learning rate lower than the optimal one for from-scratch training (see the first sketch after this list).
- **Update energy shifts**: As discussed above, you will likely want to update the model's atomic energy shifts to match the settings (and thus the absolute energies) of your data, to ensure smooth fine-tuning (see the second sketch after this list).
- **Fixed model hyperparameters**: When fine-tuning, the architecture of the pre-trained model (number of layers, _l_-max, radial cutoff, etc.; e.g. models provided on [nequip.net](https://www.nequip.net/)) cannot be modified. When comparing the performance of fine-tuning and from-scratch training, it is advised to use the same model hyperparameters for an appropriate comparison.
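
The following is a minimal sketch of the first point, assuming a Hydra-style `NequIP` config where the optimizer is configured under `training_module`; the key paths and values here are illustrative assumptions, so check the configuration docs for your `NequIP` version:

```yaml
# Illustrative sketch only: key paths may differ between NequIP versions.
training_module:
  optimizer:
    _target_: torch.optim.Adam
    lr: 1.0e-4  # lowered relative to the from-scratch learning rate
```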
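
For the second point, a sketch of overriding the per-type energy shifts in the model block with values appropriate to the fine-tuning data; the `per_type_energy_shifts` key and the numbers are placeholders, and the correct values should come from your own dataset statistics (see the Dataset Statistics link above):

```yaml
# Illustrative sketch only: replace the placeholder values with shifts
# computed from YOUR dataset (e.g. isolated-atom reference energies at
# your level of theory), and confirm the key name for your version.
model:
  per_type_energy_shifts:
    C: -1029.8  # placeholder per-atom energy shift for carbon (eV)
    H: -13.6    # placeholder per-atom energy shift for hydrogen (eV)
```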
