OMAT-DFT labels for the WBM test set. #1022
Comments
Hi @rdguha66, the energy/force/stress labels for the WBM test set in Table 1 of the manuscript were calculated with OMat24 settings. These datapoints were generated as described in the manuscript and have "prototype structure labels" that are also found in the WBM dataset used in Matbench-discovery. The WBM dataset used in Matbench-discovery was computed with MP settings. We did not compute that dataset; it is from this work. Hope this helps!
Hey @lbluque, thanks for the reply! That makes sense. My question was primarily about the energy/force/stress labels for the WBM test set calculated with the OMat24 settings. Are these labeled structures (taken from the WBM test set) also part of the OMat24 dataset? If so, is there a flag we can use to filter them out? The reason I ask is that Figure 3 of the arXiv draft shows significant deviations between the MP DFT set and the OMat DFT set. Therefore, if a model is trained on the OMat data, an apples-to-apples comparison for WBM would evaluate the formation energies for the test set against the OMat labels, not the MP labels available through Matbench-Discovery.
Hi @rdguha66, the naming can be a bit confusing! The structures in the WBM test set (let's call it OMat-WBM-test) whose results are shown in Table 1 of the manuscript are not part of the original WBM dataset. They are part of OMat24: we generated OMat24 structures starting from relaxed structures in the Alexandria dataset. The OMat-WBM-test set is a split of the full OMat24 dataset containing every structure that either was generated from an Alexandria structure with a matching prototype in the original WBM dataset or itself has a matching prototype in the original WBM dataset. The OMat-WBM-test set is not publicly available; only the train split and a 1M validation split are. Your last point is correct! Evaluating an OMat24-trained model with WBM labels is not apples to apples. This is why we only tested models fine-tuned on MPTrj and/or sAlex in the Matbench-discovery benchmark.
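For readers who want to build a comparable held-out split themselves, the rule described above can be sketched roughly as follows. This is only an illustration of the stated logic, not the code used for OMat24: the `Record` class, its field names, the `split_by_wbm_prototype` helper, and the prototype label strings are all hypothetical, and how prototype labels are actually computed is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata for one generated structure."""
    structure_id: str
    parent_prototype: str  # prototype label of the starting (parent) structure
    prototype: str         # prototype label of the generated structure itself


def split_by_wbm_prototype(records, wbm_prototypes):
    """Return (held_out, remainder).

    A record is held out if its parent's prototype label OR its own
    prototype label appears among the WBM prototype labels.
    """
    wbm = set(wbm_prototypes)
    held_out, remainder = [], []
    for rec in records:
        if rec.parent_prototype in wbm or rec.prototype in wbm:
            held_out.append(rec)
        else:
            remainder.append(rec)
    return held_out, remainder


# Made-up labels, for illustration only.
wbm_labels = {"proto_A", "proto_B"}
records = [
    Record("s1", parent_prototype="proto_A", prototype="proto_X"),  # parent matches
    Record("s2", parent_prototype="proto_Y", prototype="proto_B"),  # structure matches
    Record("s3", parent_prototype="proto_Y", prototype="proto_Z"),  # no match
]
held_out, remainder = split_by_wbm_prototype(records, wbm_labels)
print([r.structure_id for r in held_out])
print([r.structure_id for r in remainder])
```

Matching on either label errs on the side of holding out more structures, which keeps anything with a WBM prototype out of the remaining data.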
What would you like to report?
Hey FAIR team,
I have a quick question.
In the OMat24 arXiv draft, the authors mention: "Note the calculations in OMat24 differ from those found in the Materials Project PBE and PBE+U calculations. Care must be taken when mixing calculations for analysis or training models. Although the difference in settings is small (the pseudopotential in version 5.4 and the choice of pseudopotential for Yb and W), predictions of total and formation energies differ. To illustrate this, we compare calculated energies and formation energies for MP settings and OMat24 settings using calculations in the WBM dataset".
Are the DFT labels for the WBM test set calculated using the OMat24 settings publicly available?