add grace calculator, clarify gpu/cpu fixes #408 #426

Merged
merged 13 commits into from
Apr 7, 2025

Conversation

alinelena
Member

No description provided.

@alinelena alinelena self-assigned this Feb 14, 2025
Member

@ElliottKasoar ElliottKasoar left a comment

Looks good so far.

Which bits are the gpu/cpu fixes?

@ElliottKasoar ElliottKasoar linked an issue Feb 14, 2025 that may be closed by this pull request
@ElliottKasoar ElliottKasoar added the enhancement New/improved feature or request label Feb 14, 2025
@ElliottKasoar
Member

ElliottKasoar commented Feb 14, 2025

Also to do:

  • Update README (dependencies, features)
  • Update docs (getting_started)

@alinelena
Member Author

> Looks good so far.
>
> Which bits are the gpu/cpu fixes?

exactly... I did not see an obvious way to select use of cpu/gpu via python

@ElliottKasoar
Member

> > Looks good so far.
> > Which bits are the gpu/cpu fixes?
>
> exactly... I did not see an obvious way to select use of cpu/gpu via python

I suppose similar to dpa3?

@alinelena
Member Author

> > > Looks good so far.
> > > Which bits are the gpu/cpu fixes?
> >
> > exactly... I did not see an obvious way to select use of cpu/gpu via python
>
> I suppose similar to dpa3?

I suspect it is tensorflow-specific: https://www.tensorflow.org/guide/gpu
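
For reference, one common way to steer a TensorFlow-based calculator onto the CPU from Python is to hide the CUDA devices before TensorFlow initialises them. The sketch below is only an illustration of that idea: `hide_gpus_from_tensorflow` is a hypothetical helper name, it is not something this PR adds, and it only covers CUDA devices.

```python
import os


def hide_gpus_from_tensorflow():
    """Hide CUDA devices so TensorFlow falls back to the CPU.

    Hypothetical helper (not part of janus-core). It must be called
    before TensorFlow initialises its devices, ideally before the import.
    """
    # "-1" is not a valid device index, so no CUDA GPU is visible.
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


hide_gpus_from_tensorflow()

# In-process alternative described in the TensorFlow GPU guide, to be
# run after `import tensorflow as tf` and before any op is executed:
#     tf.config.set_visible_devices([], "GPU")
```

The environment variable has to be set early, which is part of why exposing this as a simple per-calculator option is less obvious than for other backends.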

@ElliottKasoar
Member

A shorter way to trigger the test failures (with grace and mace installed) is to run:

pytest -vvv tests/test_single_point.py::test_extras tests/test_neb_cli.py::test_neb --pdb

which results in:

Traceback (most recent call last):
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/main.py", line 283, in wrap_session
    session.exitstatus = doit(config, session) or 0
                         ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/main.py", line 337, in _main
    config.hook.pytest_runtestloop(session=session)
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 122, in _multicall
    teardown.throw(exception)  # type: ignore[union-attr]
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/logging.py", line 803, in pytest_runtestloop
    return (yield)  # Run all the tests.
            ^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 122, in _multicall
    teardown.throw(exception)  # type: ignore[union-attr]
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/terminal.py", line 673, in pytest_runtestloop
    result = yield
             ^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/main.py", line 362, in pytest_runtestloop
    item.config.hook.pytest_runtest_protocol(item=item, nextitem=nextitem)
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 122, in _multicall
    teardown.throw(exception)  # type: ignore[union-attr]
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/warnings.py", line 112, in pytest_runtest_protocol
    return (yield)
            ^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 122, in _multicall
    teardown.throw(exception)  # type: ignore[union-attr]
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/assertion/__init__.py", line 176, in pytest_runtest_protocol
    return (yield)
            ^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 122, in _multicall
    teardown.throw(exception)  # type: ignore[union-attr]
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/unittest.py", line 429, in pytest_runtest_protocol
    res = yield
          ^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 122, in _multicall
    teardown.throw(exception)  # type: ignore[union-attr]
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/faulthandler.py", line 88, in pytest_runtest_protocol
    return (yield)
            ^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/runner.py", line 113, in pytest_runtest_protocol
    runtestprotocol(item, nextitem=nextitem)
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/runner.py", line 132, in runtestprotocol
    reports.append(call_and_report(item, "call", log))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/runner.py", line 248, in call_and_report
    ihook.pytest_exception_interact(node=item, call=call, report=report)
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/debugging.py", line 286, in pytest_exception_interact
    out, err = capman.read_global_capture()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 801, in read_global_capture
    return self._global_capturing.readouterr()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 705, in readouterr
    err = self.err.snap() if self.err else ""
          ^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 589, in snap
    self.tmpfile.seek(0)
ValueError: I/O operation on closed file.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/main.py", line 303, in wrap_session
    config.notify_exception(excinfo, config.option)
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/config/__init__.py", line 1173, in notify_exception
    res = self.hook.pytest_internalerror(excrepr=excrepr, excinfo=excinfo)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 911, in pytest_internalerror
    self.stop_global_capturing()
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 776, in stop_global_capturing
    self._global_capturing.pop_outerr_to_orig()
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 657, in pop_outerr_to_orig
    out, err = self.readouterr()
               ^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 705, in readouterr
    err = self.err.snap() if self.err else ""
          ^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 589, in snap
    self.tmpfile.seek(0)
ValueError: I/O operation on closed file.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/bin/pytest", line 10, in <module>
    sys.exit(console_main())
             ^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/config/__init__.py", line 201, in console_main
    code = main()
           ^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/config/__init__.py", line 175, in main
    ret: ExitCode | int = config.hook.pytest_cmdline_main(config=config)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_hooks.py", line 513, in __call__
    return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_manager.py", line 120, in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 139, in _multicall
    raise exception.with_traceback(exception.__traceback__)
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/pluggy/_callers.py", line 103, in _multicall
    res = hook_impl.function(*args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/main.py", line 330, in pytest_cmdline_main
    return wrap_session(config, _main)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/main.py", line 325, in wrap_session
    config._ensure_unconfigure()
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/config/__init__.py", line 1127, in _ensure_unconfigure
    fin()
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 776, in stop_global_capturing
    self._global_capturing.pop_outerr_to_orig()
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 657, in pop_outerr_to_orig
    out, err = self.readouterr()
               ^^^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 705, in readouterr
    err = self.err.snap() if self.err else ""
          ^^^^^^^^^^^^^^^
  File "/Users/elliottkasoar/Documents/PSDI/janus-core/.venv/lib/python3.12/site-packages/_pytest/capture.py", line 589, in snap
    self.tmpfile.seek(0)
ValueError: I/O operation on closed file.

@ElliottKasoar
Member

ElliottKasoar commented Apr 1, 2025

The exact cause of this remains unclear, given that it depends on the order/combination of tests, but it looks like some form of known pytest error involving tensorflow (which calls sys.stdout.close() and sys.stderr.close()) and/or the typer/click runner.

Most of the Python solutions seemed to involve clearing/disabling logging at various points, but we already do that to some extent, and there didn't seem to be a solution that resolves this for us.

What I've gone with for now is a wrapper blocking stderr/stdout being closed, which seems to work:

sys.stderr.close = lambda *args: None
sys.stdout.close = lambda *args: None
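
A scoped variant of the same idea, for comparison: the sketch below makes `close()` a no-op only for the duration of a `with` block and restores the original method afterwards. `keep_streams_open` and `_ignore_close` are hypothetical names, and this is not the code merged in the PR.

```python
import sys
from contextlib import ExitStack, contextmanager
from unittest import mock


def _ignore_close(*args):
    """Stand-in for close() that leaves the stream open."""
    return None


@contextmanager
def keep_streams_open(*streams):
    """Temporarily make close() a no-op on the given streams.

    Hypothetical helper: with no arguments it targets the current
    sys.stdout and sys.stderr.
    """
    targets = streams or (sys.stdout, sys.stderr)
    with ExitStack() as stack:
        for stream in targets:
            # patch.object restores the original close() on exit.
            stack.enter_context(
                mock.patch.object(stream, "close", _ignore_close)
            )
        yield
```

Wrapping only the code that imports or calls tensorflow would limit the patch to where it is needed, at the cost of having to know where that is.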

If we're not happy with that, the main other alternatives are:

  • Always run with --capture=no, but I'm not a fan of potential failures based on the options we pass to pytest
  • Don't use the typer/click runner, and use something like subprocess.run instead, which from initial testing also seems to fix this, but would mean changing a huge number of tests
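
To illustrate the second alternative, a test could invoke the command line in a child process, so that anything closing stdout/stderr there cannot affect pytest's own capture. This is a minimal sketch: `run_cli` is a hypothetical helper, and a `python -c` call stands in for the real command.

```python
import subprocess
import sys


def run_cli(args, timeout=60):
    """Run a command in a child process and return the CompletedProcess.

    Hypothetical test helper; output is captured as text so that tests
    can assert on stdout/stderr and the return code.
    """
    return subprocess.run(
        args,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )


# Stand-in command; a real test would pass the CLI arguments instead.
result = run_cli([sys.executable, "-c", "print('ok')"])
assert result.returncode == 0
assert result.stdout.strip() == "ok"
```

The trade-off is the one noted above: every test that currently uses the typer/click runner would need to be rewritten around such a helper.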

@ElliottKasoar
Member

Looks like there are a few compatibility issues relating to one of tensorflow's dependencies and Windows: tensorflow/io#2087.

Should be fixable, but if it's too painful, we could also not explicitly support it.

@alinelena
Member Author

Given tensorflow is not the most used in the wild, I would not waste the time... just make it clear that we have not tested it on Windows and do not plan to, since we like our sanity.

Member

@ElliottKasoar ElliottKasoar left a comment

Needs another approval but I'm happy, assuming the tests all pass.

Collaborator

@harveydevereux harveydevereux left a comment

All seems to be in order

@ElliottKasoar ElliottKasoar merged commit f83394d into stfc:main Apr 7, 2025
13 checks passed
Labels
enhancement New/improved feature or request
Development

Successfully merging this pull request may close these issues.

add support for grace
3 participants