-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Refactoring to include outputs parsing #80
Open
gbrunin
wants to merge
16
commits into
aiidateam:main
Choose a base branch
from
gbrunin:develop
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
16 commits
Select commit
Hold shift + click to select a range
e95d85f
✨ First draft of `PwParser`
mbercx 17dfd54
Modified the structure of the code to split between inputs and output…
gbrunin 9283644
Refactored structure with inputs/outputs.
gbrunin 3135fc4
Abstraction of base outputs, added ase and pymatgen files where the c…
gbrunin 2ffcbc7
Added pycharm stuff in gitignore.
gbrunin bbc3e9e
Added general parse method to output files.
gbrunin d73b0f0
Updated imports in tests.
gbrunin 8ccf333
Added pymatgen and ase checks in respective files.
gbrunin 3e58f13
Refactoring with Outputs objects relying on Parsers. This allows flex…
gbrunin f68ac1a
👌 `PwOutput`: Small changes to code
mbercx 5a66f07
✨ Add basic `stdout` parsing functionality
mbercx a72eb3f
✨🧪 Add missing `qes-1.0.xsd` schema and XML parsing tests
mbercx de4f8c0
🔧 Update dependencies; fix `pre-commit` & tests
mbercx c31f570
📚 [WIP] Add some basic documentation and design notes
mbercx 90b3608
Use annotations from future for cleaner type hint. Moved qe_time_to_s…
gbrunin c737b43
👌 Remove the `executable` input from the output classes
mbercx File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Minimal makefile for Sphinx documentation | ||
# | ||
|
||
# You can set these variables from the command line, and also | ||
# from the environment for the first two. | ||
SPHINXOPTS ?= | ||
SPHINXBUILD ?= sphinx-build | ||
SOURCEDIR = . | ||
BUILDDIR = _build | ||
|
||
# Put it first so that "make" without argument is like "make help". | ||
help: | ||
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | ||
|
||
.PHONY: help Makefile | ||
|
||
# Catch-all target: route all unknown targets to Sphinx using the new | ||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). | ||
%: Makefile | ||
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
# -*- coding: utf-8 -*- | ||
# Configuration file for the Sphinx documentation builder. | ||
# | ||
# For the full list of built-in configuration values, see the documentation: | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html | ||
|
||
# -- Project information ----------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information | ||
|
||
project = 'qe-tools' | ||
copyright = '2024, Marnik Bercx' | ||
author = 'Marnik Bercx' | ||
|
||
# -- General configuration --------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration | ||
|
||
extensions = [ | ||
'myst_parser', | ||
# "autodoc2", | ||
# "sphinx.ext.intersphinx", | ||
# "sphinx.ext.viewcode", | ||
# "sphinx.ext.autodoc", | ||
# "sphinx.ext.autosummary", | ||
'sphinx_design', | ||
'sphinx_copybutton', | ||
] | ||
|
||
templates_path = ['_templates'] | ||
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store'] | ||
|
||
|
||
# -- Options for HTML output ------------------------------------------------- | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output | ||
|
||
html_theme = 'sphinx_book_theme' | ||
html_static_path = ['_static'] |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
# Quick notes | ||
|
||
A place to write down quick notes on the design decisions made for the `qe-tools` package. | ||
|
||
## Main goal | ||
|
||
The main goal of this package should be to develop **easy to use** tools to deal with Quantum ESPRESSO that **just work**. | ||
|
||
* **Easy to use**: Tools should be easy to find, intuitive and require as few steps as possible to get the desired outcome. | ||
* **Just work**: Tools should be robust and clever, working as the user intends with a variety of input options without having to understand their functionality in too much detail. | ||
|
||
## One input per code | ||
|
||
Typically, each executable in the Quantum ESPRESSO suite will have a single input file. | ||
Technical exceptions are restart files such as the charge density, but those are full (sometimes binary system-specific) files that don't require a Python object to represent. | ||
The input class should allow for two use cases: | ||
|
||
1. Generate a Quantum ESPRESSO input file from various Python types. | ||
2. Parse an existing input file into various Python types. | ||
|
||
## One output object for each calculation | ||
|
||
Although QE can provide the output of a calculation distributed over various files, it would be useful to gather all of these into a single "output" object from which the user can access all data they are interested in. | ||
|
||
Question: would it instead not be useful to have one object that has both inputs and outputs? Reasons could be: | ||
|
||
1. The user might want to just load both input/output from the directory in one fell swoop, since they might want to work with the output of the calculation differently depending on the input. | ||
2. Some parsing functionality might be easier to implement if the inputs are known. I think it may even be necessary for some outputs to know e.g. what the number of k-points are. Some of the inputs are also in the XML output though... | ||
|
||
## One output file, one parser | ||
|
||
Even though the user might want to obtain all outputs from a single object, it still is sensible to have a separate parser tool for each output file. | ||
The output class can then rely on these nicely separated parsers to combine all outputs into one based on preference. | ||
|
||
## ASE/`pymatgen`/AiiDA/... support | ||
|
||
Most users will want to provide e.g. the input structure or output data in the flavour of their choosing. | ||
We should provide tools for converting: | ||
|
||
1. The flavored Python types (`Structure`, `Atoms`, ...) into the Quantum ESPRESSO input file. | ||
2. The Quantum ESPRESSO raw parsed output into the flavour's Python type. | ||
|
||
--- | ||
|
||
Notes from Guillaume: | ||
|
||
I have changed the structure of the code to better reflect the distinction between input and output parsers. | ||
|
||
For the outputs, I created objects such as PwOutput, that inherit from an abstract BaseOutput, that for now can be instantiated with a from_dir method (a from_files method could/should be added as well). | ||
|
||
A user with a job that ran in a given directory could get the outputs easily using this classmethod. In these from_dir methods, specific XML and/or standard output Parsers would be used to get the results. | ||
|
||
Each Parser would parse a single file, and the logic of parsing and extracting the outputs from the different codes would have to be implemented in each from_dir method. | ||
|
||
For instance, a NebOutput.from_dir would parse the standard output of the global computation and probably the standard outputs and/or XML files for each image. The extracted outputs would be stored as a simple dictionary and these objects would not rely on any external package. | ||
|
||
Then, in qe_tools.extractors, ASE and pymatgen objects could be constructed (e.g., ase.Atoms/pymatgen.Structure, band structures,...), allowing each to be optional dependencies. | ||
|
||
This is only the base logic of the new structure and many things remain to be implemented. | ||
|
||
The idea now is to see what breaks with this in `aiida-quantumespresso` and how the parsing could be moved from there to here. Then, more will be added depending on the needs. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
# Welcome to `qe-tools`'s documentation! | ||
|
||
## 💾 Installation | ||
|
||
To install the package from the [PyPI](https://pypi.org/), simply use `pip`: | ||
|
||
``` | ||
pip install qe-tools | ||
``` | ||
|
||
## 💡 Tutorials | ||
|
||
```{toctree} | ||
tutorials/getting_started.md | ||
tutorials/inputs_desired_usage.md | ||
tutorials/outputs_desired_usage.md | ||
``` | ||
|
||
## 🤔 Design Documents | ||
|
||
```{toctree} | ||
design/quicknotes.md | ||
``` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
@ECHO OFF | ||
|
||
pushd %~dp0 | ||
|
||
REM Command file for Sphinx documentation | ||
|
||
if "%SPHINXBUILD%" == "" ( | ||
set SPHINXBUILD=sphinx-build | ||
) | ||
set SOURCEDIR=. | ||
set BUILDDIR=_build | ||
|
||
%SPHINXBUILD% >NUL 2>NUL | ||
if errorlevel 9009 ( | ||
echo. | ||
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx | ||
echo.installed, then set the SPHINXBUILD environment variable to point | ||
echo.to the full path of the 'sphinx-build' executable. Alternatively you | ||
echo.may add the Sphinx directory to PATH. | ||
echo. | ||
echo.If you don't have Sphinx installed, grab it from | ||
echo.https://www.sphinx-doc.org/ | ||
exit /b 1 | ||
) | ||
|
||
if "%1" == "" goto help | ||
|
||
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% | ||
goto end | ||
|
||
:help | ||
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% | ||
|
||
:end | ||
popd |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
# Getting started | ||
|
||
## Parsing `pw.x` outputs | ||
|
||
Say we have just run a `pw.x` calculation in the `qe_dir` directory: | ||
|
||
``` | ||
from qe_tools.outputs.pw import PwOutput | ||
|
||
pw_out = PwOutput.from_dir("qe_dir") | ||
pw_out.outputs | ||
``` | ||
|
||
## Parsing a single output file | ||
|
||
If you only want to parse the `stdout` of the `pw.x` calculation, you can load the parser class directly: | ||
|
||
``` | ||
from qe_tools.outputs.parsers.pw import PwStdoutParser | ||
pw_out = PwStdoutParser.from_file('qe_dir/pw.out') | ||
``` | ||
|
||
## Parsing an already existing input file | ||
|
||
Currently the input class `PwInputFile` only supports parsing an already existing input file: | ||
|
||
``` | ||
from qe_tools.inputs.pw import PwInputFile | ||
from pathlib import Path | ||
|
||
pw_input = PwInputFile(Path('qe_dir/pw.in').read_text()) | ||
``` | ||
|
||
This will also only really parse the structure and k-points. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
# Inputs: Desired usage | ||
|
||
## Producing a `pw.x` input file | ||
|
||
A user might want to generate a Quantum ESPRESSO input file from other Python types, e.g.: | ||
|
||
``` | ||
PwInput( | ||
structure= | ||
parameters= | ||
kpoints= | ||
magmom= | ||
... | ||
) | ||
``` | ||
|
||
It will definitely be challenging to define a complete API that doesn't get overly big or convoluted. | ||
|
||
Another approach for constructing a class that defines the inputs of a Quantum ESPRESSO calculation can be found in: | ||
|
||
[https://github.com/elinscott/qe_input_prototype](https://github.com/elinscott/qe_input_prototype) | ||
|
||
This defines the inputs files as `pydantic` models, and allows you to complete the namelists with tab-completion, also offers validation features based on the Quantum ESPRESSO files that define the inputs. | ||
However, at first glance it requires a lot of Quantum ESPRESSO knowledge to populate the inputs. | ||
We could use these classes under the hood, having a wrapper class that knows how to convert commonly used formats from ASE, `pymatgen`, etc into the corresponding namelists. | ||
|
||
One challenge here will always be that Quantum ESPRESSO expects properties etc to be defined in terms of the _kinds_, whereas most other structure classes have a site-based approach. | ||
This in combination with the 3-character limit for the kind names means that we'll typically have to use threshold to combine sites into one kind even if they don't have exactly the same value for all properties. | ||
Otherwise we would create too many kinds for | ||
|
||
## Parsing an existing `pw.x` input file | ||
|
||
It would also be nice to be able to directly parse from a file, e.g.: | ||
|
||
``` | ||
PwInput.from_file('path/to/file') | ||
``` | ||
|
||
These features are already present in the package, but seem to be a bit limited. | ||
Moreover, I would not call the class `PwInputFile`, but rather `PwInput`, to be in line with `PwOutput`, _and_ the fact that it doesn't just represent the file. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
# Outputs: Desired usage | ||
|
||
## One class to rule both input/output | ||
|
||
If you want to parse all the inputs and outputs from a `pw.x` run in the `pw_run` directory: | ||
|
||
``` | ||
from qe_tools.parsers import PwParser | ||
|
||
parser = PwParser.from_dir('pw_run') | ||
``` | ||
|
||
Then you can obtain the outputs as | ||
|
||
``` | ||
parser.outputs['structure'] | ||
``` | ||
|
||
similarly, the inputs can then be obtained from the `inputs` attribute: | ||
|
||
``` | ||
parser.inputs['structure'] | ||
``` | ||
|
||
(Maybe this should not be a "parser", but a "calculation". E.g. `PwCalc` that has `inputs` and `outputs`.) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all these necessary ? I think it would be best to restrict to the strict minimum. I would say that for parsing or generating input even numpy should not be needed (let alone scipy ?). I'd be in favor of removing them but I don't know where/why they are used. For the packaging dependency, it has itself no dependency so I think it's probably fine to keep it even though it's probably possible to get around without it.. For xmlschema, I would also say it seems fine as xmlschema has just one dependency itself, elementpath, which itself has no dependency. Any thoughts on numpy/scipy removal ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just looked, scipy is used only to get the norm of a few vectors in one file. I think this is definitely something that should then probably be removed from the dependencies (also from the fact that if numpy is kept as a required dependency, the norm can also be obtained by numpy, but for numpy, my comment below holds too :)).
As for numpy, it's used in a few more places, but 1/ for very simple maths, vector operations and 2/ for some conversion functions. For the latter point, maybe the "lowest-level" qe tools could be installed without it (the "client" code, be it aiida, pymatgen or else) has to follow the exact API that the lowest level functions require. And numpy could be an optional dependency for "activating" helper functions if the user wants it. Any thoughts ?