
parse_output_error: fix parser for QE stdout errors #751

Closed


@qiaojunfeng (Collaborator):

The previous code returned a wrong `line_number_end`, thus all the QE
stdout errors were discarded. Moreover, the code converted lines into a
`set` to remove repetitions, but this led to random ordering of error
messages in the final `logs.error` list, causing difficulties in pytest
data_regression. Now the repeated messages are explicitly discarded so
the error message ordering in QE stdout is still preserved. Finally, a
test is added to ensure errors are correctly recorded in `logs.error`.

Fix #750

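For illustration, the order-preserving de-duplication described above amounts to something like this (a minimal sketch with made-up names, not the actual parser code):

    def unique_in_order(lines):
        """Drop repeated messages while preserving their first-seen order."""
        seen = set()
        ordered = []
        for line in lines:
            if line not in seen:
                seen.add(line)
                ordered.append(line)
        return ordered

whereas converting the lines to a `set`, as the old code did, returns the messages in arbitrary order.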
@mbercx (Member) left a comment:

Thanks @qiaojunfeng! You're right that the previous code didn't properly assign `line_number_end`, which meant that typically `line_number_end < line_number_start` and hence the logs would be empty. Definitely good to fix this. :)

Reviewing this PR has led me down a rabbit hole of trying to remove this line-based logic in favor of using regular expressions with `re`. However, since this also requires changes in the parser of `neb.x`, that is clearly beyond the scope of this PR. ^^
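For illustration, such a regex-based approach could look roughly like this (a hypothetical sketch, assuming the runs of `%` characters that QE prints around its error messages; this is not code from this PR):

    import re

    # Capture the text between two delimiter lines made up of '%' characters,
    # which QE prints around its error messages.
    ERROR_BLOCK = re.compile(r'%{10,}\n(.*?)\n\s*%{10,}', re.DOTALL)

    def extract_error_blocks(stdout):
        return [match.group(1).strip() for match in ERROR_BLOCK.finditer(stdout)]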

I've left some questions for now, as well as a comment on the tests you've added.

- ERROR_OUTPUT_STDOUT_INCOMPLETE
- ' Error in routine scale_sym_ops (12):'
- ' incompatible FFT grid'
- ' stopping ...'
@mbercx:

Do we also want to add unique lines between the error messages (i.e. between the %%%%%% lines) to the logs?


for marker, message in message_map['warning'].items():
    if marker in line:
        if message is None:
            message = line
-       logs.warning.append(message)
+       if message not in logs.warning:
+           logs.warning.append(message)
@mbercx:

Since this part of the code is only run in case there is a %%%%%%% line: aren't these only preserved for critical errors? I.e. can you think of a case where we would want to only log a warning?



@pytest.fixture
def generate_output(filepath_tests):
@mbercx:

This fixture seems to only provide the file content for a single test, since the path inside the function is hardcoded. Maybe we want to have several tests here, e.g. one test for the default functioning of the base stdout parser, with output file:

out_file = fixtures_path / 'parse_raw' / 'base' / 'default.out'

and one that tests a failed calculation, e.g. with output file:

out_file = fixtures_path / 'parse_raw' / 'base' / 'failed_fft_error.out'

(I think it's fine not to have separate directories for the output files here, as is the case for other parsing tests, since the base parsing only tests the stdout file anyways)

Maybe it's better to have this fixture return a function that returns the stdout content for a test with a name that corresponds to the filename, e.g.:

def test_parse_output_base_default(generate_output, data_regression):
    """Test ``parse_output_base`` on the stdout of a ``open_grid.x`` calculation."""
    from aiida_quantumespresso.parsers.parse_raw.base import parse_output_base

    filecontent = generate_output('default')
    parsed_data, logs = parse_output_base(filecontent, codename='OPEN_GRID')

    data_regression.check({'parsed_data': dict(parsed_data), 'logs': dict(logs)})

That way we can reuse it for multiple tests.
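A fixture along those lines could look like this (a rough sketch; the exact fixture directory is an assumption based on the `fixtures_path` snippets above):

    import pytest


    @pytest.fixture
    def generate_output(filepath_tests):
        """Return a function that reads the stdout fixture file with a given name (sketch)."""

        def _generate_output(name):
            # Hypothetical location, following the snippets above.
            filepath = filepath_tests / 'parse_raw' / 'base' / f'{name}.out'
            with filepath.open() as handle:
                return handle.read()

        return _generate_output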

return filecontent


def test_parse_output_base_default(generate_output, data_regression):
@mbercx:

Maybe we can have two tests here: one that checks the `parse_output_base` method for a successful calculation (i.e. also returning the wall time), which I would call `default`, and one that checks the method for a failed calculation (e.g. `failed_fft_error`, also see my other comment).
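Such a failure test could be sketched as follows, mirroring the `default` example above (the fixture name and `codename` value are assumptions):

    def test_parse_output_base_failed(generate_output, data_regression):
        """Test ``parse_output_base`` on a stdout containing a QE error block (sketch)."""
        from aiida_quantumespresso.parsers.parse_raw.base import parse_output_base

        filecontent = generate_output('failed_fft_error')
        parsed_data, logs = parse_output_base(filecontent, codename='OPEN_GRID')

        data_regression.check({'parsed_data': dict(parsed_data), 'logs': dict(logs)})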

@obgeneralao:

Hi
Is this issue related to the error I got as shown below?

+-> WARNING at 2021-12-04 14:20:42.160632+00:00
 | output parser returned exit code<350>: The parser raised an unexpected exception.
+-> ERROR at 2021-12-04 14:20:42.157234+00:00
 | The parser raised an unexpected exception.
+-> CRITICAL at 2021-12-04 14:20:42.152651+00:00
 | Traceback (most recent call last):
 |   File "/home/jupyter-admin/.local/lib/python3.9/site-packages/aiida_quantumespresso/parsers/pw.py", line 325, in parse_stdout
 |     parsed_data, logs = parse_stdout(stdout, parameters, parser_options, parsed_xml)
 |   File "/home/jupyter-admin/.local/lib/python3.9/site-packages/aiida_quantumespresso/parsers/parse_raw/pw.py", line 396, in parse_stdout
 |     trajectory_data['atomic_species_name'] = [data_lines[i + 1 + j].split()[1] for j in range(nat)]
 |   File "/home/jupyter-admin/.local/lib/python3.9/site-packages/aiida_quantumespresso/parsers/parse_raw/pw.py", line 396, in <listcomp>
 |     trajectory_data['atomic_species_name'] = [data_lines[i + 1 + j].split()[1] for j in range(nat)]
 | IndexError: list index out of range

@sphuber (Contributor) commented Dec 6, 2021:

@obgeneralao I doubt it. This seems to be a different problem in the parser, where it assumes a certain format in the stdout. Could you share the stdout file as an attachment to this PR? Then we can try to determine the problem and fix it. You can obtain the output using the CLI with `verdi calcjob outputcat <PK> > stdout.txt`, where you replace `<PK>` with the pk of the failed calcjob.

@obgeneralao:

Hi,

The `verdi calcjob outputcat <PK>` output is attached:
stdout.txt

@sphuber commented Dec 7, 2021:

Thanks @obgeneralao. There is indeed a problem with your output file, but it is not due to the bug of this PR. Instead, your output file is corrupted because you somehow ran a serial version of QE in parallel. You can see that at the beginning of your file the header is printed multiple times: the output of the various processes, which are all writing to the same file, is interleaved, leading to a corrupt file, and so the parser crashes. You should make sure that you either compile a parallel version of QE, i.e. with MPI support, or run this executable with `metadata.options.withmpi = False` in the inputs of the `PwCalculation`.
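For reference, that option would be set roughly as follows (a minimal sketch using the standard aiida-core process builder; adapt the rest of the inputs to your setup):

    from aiida.plugins import CalculationFactory

    PwCalculation = CalculationFactory('quantumespresso.pw')
    builder = PwCalculation.get_builder()

    # Launch the executable without mpirun, since it was compiled without MPI support.
    builder.metadata.options.withmpi = False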

@obgeneralao:

Hi,

Thank you very much. I got it fixed by recompiling it with OpenMP.

@mbercx (Member) commented Jan 28, 2022:

@qiaojunfeng is it ok if I close this PR, since it's been superseded by the refactoring work done in #756? I think the test you added here is very similar to the test that is added in #714? If not, could you migrate it to that PR? Thanks!

@qiaojunfeng (Collaborator, Author):

> @qiaojunfeng is it ok if I close this PR, since it's been superseded by the refactoring work done in #756?

Sure, always welcome a refactor 😁

> I think the test you added here is very similar to the test that is added in #714? If not, could you migrate it to that PR? Thanks!

The test here is only for the `parse_output_error` function, although I was a bit lazy and reused an `open_grid.x` stdout with some errors to test it 😅. Since `parse_output_error` is used everywhere, I think it's better to test it in #756?

@mbercx commented Jan 28, 2022:

Great, thanks @qiaojunfeng! I think I'll adapt your file to do some testing on the base parsing indeed. Will close this PR.

PS: If you have a chance to work on #714, that'd be great, then we can also get that merged :)

@mbercx closed this on Jan 28, 2022.
Successfully merging this pull request may close: parse_output_base: Move detection of 'JOB DONE' to the end