
parse_output_error: fix parser for QE stdout errors #751

Closed


@qiaojunfeng (Collaborator):

The previous code returned a wrong `line_number_end`, thus all the QE
stdout errors were discarded. Moreover, the code converted lines into a
`set` to remove repetitions, but this led to random ordering of error
messages in the final `logs.error` list, causing difficulties in pytest
data_regression. Now the repeated messages are explicitly discarded so
the error message ordering in QE stdout is still preserved. Finally, a
test is added to ensure errors are correctly recorded in `logs.error`.

Fix #750

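For illustration, the order-preserving de-duplication described above amounts to something like this (a minimal sketch with made-up names, not the actual parser code):

    def unique_in_order(lines):
        """Drop repeated messages while preserving their first-seen order."""
        seen = set()
        ordered = []
        for line in lines:
            if line not in seen:
                seen.add(line)
                ordered.append(line)
        return ordered

whereas converting the lines to a `set`, as the old code did, returns the messages in arbitrary order.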
@mbercx (Member) left a comment:

Thanks @qiaojunfeng! You're right that the previous code didn't properly assign `line_number_end`, which meant that typically `line_number_end < line_number_start` and hence the logs would be empty. Definitely good to fix this. :)

Reviewing this PR has led me down a rabbit hole of trying to remove this line-based logic in favor of using regular expressions with `re`. However, since this also requires changes in the parser of `neb.x`, that is clearly beyond the scope of this PR. ^^
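For illustration, such a regex-based approach could look roughly like this (a hypothetical sketch, assuming the runs of `%` characters that QE prints around its error messages; this is not code from this PR):

    import re

    # Capture the text between two delimiter lines made up of '%' characters,
    # which QE prints around its error messages.
    ERROR_BLOCK = re.compile(r'%{10,}\n(.*?)\n\s*%{10,}', re.DOTALL)

    def extract_error_blocks(stdout):
        return [match.group(1).strip() for match in ERROR_BLOCK.finditer(stdout)]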

I've left some questions for now, as well as a comment on the tests you've added.

- ERROR_OUTPUT_STDOUT_INCOMPLETE
- ' Error in routine scale_sym_ops (12):'
- ' incompatible FFT grid'
- ' stopping ...'
@mbercx:

Do we also want to add unique lines between the error messages (i.e. between the %%%%%% lines) to the logs?


for marker, message in message_map['warning'].items():
    if marker in line:
        if message is None:
            message = line
-       logs.warning.append(message)
+       if message not in logs.warning:
+           logs.warning.append(message)
@mbercx:

Since this part of the code is only run in case there is a %%%%%%% line: aren't these only preserved for critical errors? I.e. can you think of a case where we would want to only log a warning?



@pytest.fixture
def generate_output(filepath_tests):
@mbercx:

This fixture seems to only provide the file content for a single test, since the path inside the function is hardcoded. Maybe we want to have several tests here, e.g. one test for the default functioning of the base stdout parser, with output file:

out_file = fixtures_path / 'parse_raw' / 'base' / 'default.out'

and one that tests a failed calculation, e.g. with output file:

out_file = fixtures_path / 'parse_raw' / 'base' / 'failed_fft_error.out'

(I think it's fine not to have separate directories for the output files here, as is the case for other parsing tests, since the base parsing only tests the stdout file anyways)

Maybe it's better to have this fixture return a function that returns the stdout content for a test with a name that corresponds to the filename, e.g.:

def test_parse_output_base_default(generate_output, data_regression):
    """Test ``parse_output_base`` on the stdout of a ``open_grid.x`` calculation."""
    from aiida_quantumespresso.parsers.parse_raw.base import parse_output_base

    filecontent = generate_output('default')
    parsed_data, logs = parse_output_base(filecontent, codename='OPEN_GRID')

    data_regression.check({'parsed_data': dict(parsed_data), 'logs': dict(logs)})

That way we can reuse it for multiple tests.
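A fixture along those lines could look like this (a rough sketch; the exact fixture directory is an assumption based on the `fixtures_path` snippets above):

    import pytest


    @pytest.fixture
    def generate_output(filepath_tests):
        """Return a function that reads the stdout fixture file with a given name (sketch)."""

        def _generate_output(name):
            # Hypothetical location, following the snippets above.
            filepath = filepath_tests / 'parse_raw' / 'base' / f'{name}.out'
            with filepath.open() as handle:
                return handle.read()

        return _generate_output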

return filecontent


def test_parse_output_base_default(generate_output, data_regression):
@mbercx:

Maybe we can have two tests here: one that checks the `parse_output_base` method for a successful calculation (i.e. also returning the wall time), which I would call `default`, and one that checks the method for a failed calculation (e.g. `failed_fft_error`, also see my other comment).
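Such a failure test could be sketched as follows, mirroring the `default` example above (the fixture name and `codename` value are assumptions):

    def test_parse_output_base_failed(generate_output, data_regression):
        """Test ``parse_output_base`` on a stdout containing a QE error block (sketch)."""
        from aiida_quantumespresso.parsers.parse_raw.base import parse_output_base

        filecontent = generate_output('failed_fft_error')
        parsed_data, logs = parse_output_base(filecontent, codename='OPEN_GRID')

        data_regression.check({'parsed_data': dict(parsed_data), 'logs': dict(logs)})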

@obgeneralao:

Hi
Is this issue related to the error I got as shown below?

+-> WARNING at 2021-12-04 14:20:42.160632+00:00
 | output parser returned exit code<350>: The parser raised an unexpected exception.
+-> ERROR at 2021-12-04 14:20:42.157234+00:00
 | The parser raised an unexpected exception.
+-> CRITICAL at 2021-12-04 14:20:42.152651+00:00
 | Traceback (most recent call last):
 |   File "/home/jupyter-admin/.local/lib/python3.9/site-packages/aiida_quantumespresso/parsers/pw.py", line 325, in parse_stdout
 |     parsed_data, logs = parse_stdout(stdout, parameters, parser_options, parsed_xml)
 |   File "/home/jupyter-admin/.local/lib/python3.9/site-packages/aiida_quantumespresso/parsers/parse_raw/pw.py", line 396, in parse_stdout
 |     trajectory_data['atomic_species_name'] = [data_lines[i + 1 + j].split()[1] for j in range(nat)]
 |   File "/home/jupyter-admin/.local/lib/python3.9/site-packages/aiida_quantumespresso/parsers/parse_raw/pw.py", line 396, in <listcomp>
 |     trajectory_data['atomic_species_name'] = [data_lines[i + 1 + j].split()[1] for j in range(nat)]
 | IndexError: list index out of range

@sphuber (Contributor) commented Dec 6, 2021:

@obgeneralao I doubt it. This seems to be a different problem in the parser, where it assumes a certain format in the stdout. Could you share the stdout file as an attachment to this PR? Then we can try to determine the problem and fix it. You can obtain the output using the CLI with `verdi calcjob outputcat <PK> > stdout.txt`, where you replace `<PK>` with the pk of the failed calcjob.

@obgeneralao:

Hi,

The `verdi calcjob outputcat <PK>` output is attached:
stdout.txt

@sphuber commented Dec 7, 2021:

Thanks @obgeneralao. There is indeed a problem with your output file, but it is not due to the bug of this PR. Instead, your output file is corrupted because you somehow ran a serial version of QE in parallel. You can see that at the beginning of your file the header is printed multiple times: the output of the various processes, which are all writing to the same file, is interleaved, leading to a corrupt file, and so the parser crashes. You should make sure that you either compile a parallel version of QE, i.e. with MPI support, or run this executable with `metadata.options.withmpi = False` in the inputs of the `PwCalculation`.
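For reference, that option would be set roughly as follows (a minimal sketch using the standard aiida-core process builder; adapt the rest of the inputs to your setup):

    from aiida.plugins import CalculationFactory

    PwCalculation = CalculationFactory('quantumespresso.pw')
    builder = PwCalculation.get_builder()

    # Launch the executable without mpirun, since it was compiled without MPI support.
    builder.metadata.options.withmpi = False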

@obgeneralao:

Hi,

Thank you very much. I got it fixed by recompiling it with OpenMP.

@mbercx (Member) commented Jan 28, 2022:

@qiaojunfeng is it ok if I close this PR, since it's been superseded by the refactoring work done in #756? I think the test you added here is very similar to the test that is added in #714? If not, could you migrate it to that PR? Thanks!

@qiaojunfeng (Collaborator, Author):

> @qiaojunfeng is it ok if I close this PR, since it's been superseded by the refactoring work done in #756?

Sure, always welcome a refactor 😁

> I think the test you added here is very similar to the test that is added in #714? If not, could you migrate it to that PR? Thanks!

The test here is only for the `parse_output_error` function, although I was a bit lazy and reused an `open_grid.x` stdout with some errors to test it 😅. Since `parse_output_error` is used everywhere, I think it's better to test it in #756?

@mbercx commented Jan 28, 2022:

Great, thanks @qiaojunfeng! I think I'll adapt your file to do some testing on the base parsing indeed. Will close this PR.

PS: If you have a chance to work on #714, that'd be great, then we can also get that merged :)

@mbercx closed this on Jan 28, 2022.
Successfully merging this pull request may close: parse_output_base: Move detection of 'JOB DONE' to the end