parse_output_error: fix parser for QE stdout errors #751

Conversation
The previous code returned a wrong `line_number_end`, so all QE stdout errors were discarded. Moreover, the code converted the lines into a `set` to remove repetitions, but this led to a random ordering of the error messages in the final `logs.error` list, causing difficulties with pytest's `data_regression`. Now the repeated messages are discarded explicitly, so the ordering of the error messages in the QE stdout is preserved. Finally, a test is added to ensure errors are correctly recorded in `logs.error`.

Fixes #750
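(As a minimal illustration of the order-preserving deduplication described above; this is a sketch only, not the actual diff, and the real parser operates on the lines of a QE error block.)

    # Sketch: collect messages while preserving their order of appearance in the
    # stdout and discarding exact repetitions, instead of converting to a ``set``.
    error_lines = [
        ' Error in routine scale_sym_ops (12):',
        ' incompatible FFT grid',
        ' Error in routine scale_sym_ops (12):',  # repeated message
    ]

    errors = []
    for line in error_lines:
        message = line.strip()
        if message not in errors:
            errors.append(message)

    print(errors)
    # ['Error in routine scale_sym_ops (12):', 'incompatible FFT grid']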
Thanks @qiaojunfeng! You're right that the previous code didn't properly assign the `line_number_end`, which meant that typically `line_number_end < line_number_start` and hence the logs would be empty. Definitely good to fix this. :)

Reviewing this PR has led me down a rabbit hole of trying to remove this line-based logic in favor of using regular expressions with `re`. However, since this also requires changes in the parser of `neb.x`, that is clearly beyond the scope of this PR. ^^

I've left some questions for now, as well as a comment on the tests you've added.
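(For reference, a rough sketch of what such a regex-based extraction could look like; this is purely illustrative and not part of this PR, and the `%%%`-delimited block format is assumed from typical QE stdout.)

    import re

    # Match everything between two runs of '%' characters, as QE prints them
    # around its error messages (illustrative sketch, not the parser code).
    ERROR_BLOCK = re.compile(r'%{5,}(?P<body>.*?)%{5,}', re.DOTALL)

    def extract_error_blocks(stdout):
        """Return the contents of all %%%-delimited blocks, in order of appearance."""
        return [match.group('body').strip() for match in ERROR_BLOCK.finditer(stdout)]

Each returned block would then still need to be split into individual messages, which is where the ordering issue fixed by this PR comes in.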
- ERROR_OUTPUT_STDOUT_INCOMPLETE
- ' Error in routine scale_sym_ops (12):'
- ' incompatible FFT grid'
- ' stopping ...'
Do we also want to add unique lines between the error messages (i.e. between the `%%%%%%` lines) to the logs?
for marker, message in message_map['warning'].items():
    if marker in line:
        if message is None:
            message = line
        if message not in logs.warning:
            logs.warning.append(message)
Since this part of the code is only run in case there is a `%%%%%%%` line: aren't these only preserved for critical errors? I.e. can you think of a case where we would want to only log a warning?
@pytest.fixture
def generate_output(filepath_tests):
This fixture seems to only provide the file content for a single test, since the path inside the function is hardcoded. Maybe we want to have several tests here, e.g. one that tests the default functioning of the base stdout parser with output file:

    out_file = fixtures_path / 'parse_raw' / 'base' / 'default.out'

and one that tests a failed calculation, e.g. with output file:

    out_file = fixtures_path / 'parse_raw' / 'base' / 'failed_fft_error.out'

(I think it's fine not to have separate directories for the output files here, as is the case for other parsing tests, since the base parsing only tests the stdout file anyways.)

Maybe it's better to have this fixture return a function that returns the stdout content for a test with a name that corresponds to the filename, e.g.:

    def test_parse_output_base_default(generate_output, data_regression):
        """Test ``parse_output_base`` on the stdout of a ``open_grid.x`` calculation."""
        from aiida_quantumespresso.parsers.parse_raw.base import parse_output_base

        filecontent = generate_output('default')
        parsed_data, logs = parse_output_base(filecontent, codename='OPEN_GRID')

        data_regression.check({'parsed_data': dict(parsed_data), 'logs': dict(logs)})

That way we can reuse it for multiple tests.
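(For illustration, such a factory-style fixture could look roughly like this; a sketch only, where the directory layout and `filepath_tests` being a `pathlib.Path` are assumptions on my part:)

    import pytest

    @pytest.fixture
    def generate_output(filepath_tests):
        """Return a factory that loads the stdout content for a named test case."""

        def _generate_output(name='default'):
            # Hypothetical layout, following the paths suggested above:
            # <filepath_tests>/parsers/fixtures/parse_raw/base/<name>.out
            filepath = filepath_tests / 'parsers' / 'fixtures' / 'parse_raw' / 'base' / f'{name}.out'
            with filepath.open('r') as handle:
                return handle.read()

        return _generate_output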
def test_parse_output_base_default(generate_output, data_regression):
Maybe we can have two tests here: one that checks the `parse_output_base` method for a successful calculation (i.e. also returning the wall time), which I would call `default`, and one that checks the method for a failed calculation (e.g. `failed_fft_error`; also see my other comment).
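(A sketch of what the second test could look like, reusing the factory fixture sketched above; the `failed_fft_error` fixture file and the `PWSCF` codename are placeholders:)

    def test_parse_output_base_failed(generate_output, data_regression):
        """Test ``parse_output_base`` on the stdout of a calculation that failed with an FFT error."""
        from aiida_quantumespresso.parsers.parse_raw.base import parse_output_base

        # 'failed_fft_error' is a hypothetical fixture filename, see the comment above.
        filecontent = generate_output('failed_fft_error')
        parsed_data, logs = parse_output_base(filecontent, codename='PWSCF')

        # The regression data should now also contain the error messages in ``logs.error``.
        data_regression.check({'parsed_data': dict(parsed_data), 'logs': dict(logs)})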
Hi

+-> WARNING at 2021-12-04 14:20:42.160632+00:00
 | output parser returned exit code<350>: The parser raised an unexpected exception.
+-> ERROR at 2021-12-04 14:20:42.157234+00:00
 | The parser raised an unexpected exception.
+-> CRITICAL at 2021-12-04 14:20:42.152651+00:00
 | Traceback (most recent call last):
 |   File "/home/jupyter-admin/.local/lib/python3.9/site-packages/aiida_quantumespresso/parsers/pw.py", line 325, in parse_stdout
 |     parsed_data, logs = parse_stdout(stdout, parameters, parser_options, parsed_xml)
 |   File "/home/jupyter-admin/.local/lib/python3.9/site-packages/aiida_quantumespresso/parsers/parse_raw/pw.py", line 396, in parse_stdout
 |     trajectory_data['atomic_species_name'] = [data_lines[i + 1 + j].split()[1] for j in range(nat)]
 |   File "/home/jupyter-admin/.local/lib/python3.9/site-packages/aiida_quantumespresso/parsers/parse_raw/pw.py", line 396, in <listcomp>
 |     trajectory_data['atomic_species_name'] = [data_lines[i + 1 + j].split()[1] for j in range(nat)]
 | IndexError: list index out of range
@obgeneralao I doubt it. This seems to be a different problem in the parser, where it assumes a certain format in the stdout. Could you share the stdout file as an attachment to this PR? Then we can try to determine the problem and fix it. You can obtain the output using the CLI.
Hi, the output file is attached.
Thanks @obgeneralao. There is indeed a problem with your output file, but it is not due to the bug addressed by this PR. Instead, your output file is corrupted because you somehow ran a serial version of QE in parallel. You can see that the header is printed multiple times at the beginning of your file: the output of the various processes, which are all writing to the same file, is interleaved, leading to a corrupt file, and so the parser crashes. You should make sure that you either compile a parallel version of QE, i.e. with MPI support, or run this executable in serial.
Hi, Thank you very much. I got it fixed by recompiling it with OpenMP.
@qiaojunfeng is it ok if I close this PR, since it's been superseded by the refactoring work done in #756? I think the test you added here is very similar to the test that is added in #714? If not, could you migrate it to that PR? Thanks!
Sure, always welcome a refactor 😁

The test here is only for the base parser.
Great, thanks @qiaojunfeng! I think I'll adapt your file to do some testing on the base parsing indeed. Will close this PR. PS: If you have a chance to work on #714, that'd be great, then we can also get that merged :)