Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#163: Belos: Provide Tpetra version of LSQR examples #168

Merged
merged 26 commits into from
Sep 15, 2023

Conversation

tlamonthezie
Copy link
Collaborator

@tlamonthezie tlamonthezie commented Sep 7, 2023

Fixes #163

PR to Trilinos: trilinos#12300

@tlamonthezie tlamonthezie changed the title 163 belos add lsqr examples #163: Belos: Provide Tpetra version of LSQR examples Sep 7, 2023
@tlamonthezie tlamonthezie self-assigned this Sep 7, 2023
@tlamonthezie tlamonthezie added NGA-internal NGA workers will take care of these EpetraMPI T2 pkg: Belos labels Sep 7, 2023
@tlamonthezie tlamonthezie changed the title #163: Belos: Provide Tpetra version of LSQR examples #163: Belos: Provide Tpetra version of LSQR examples Sep 7, 2023
antoinemeyer5 pushed a commit that referenced this pull request Sep 8, 2023
* #128: set up GPU CI pipelines

* #128: specify cuda dir

* #128: temporarily run GPU on every push; stop MPI builds

* #128: specify correct dockerfile

* #128: provide correct build/source dirs

* #128: rework MPI in build script

* #128: rework MPI in other gpu build script

* #128: CI build script try another configuration and fix invalid path

* #128: fix missing letter in path for a gpu build

* #128: add newlines to end of files

* #128: add spack find -p to find cuda root

* #128: only run one pipeline; add cuda paths

* #128: add kokkos variables

* #128: add Tpetra_INST_SERIAL:BOOL=ON

* #128: add CUDA root flag

* #128: use correct kokkos architecture

* #128: enable cusolver and cusparse

* #128: emulate local build

* #128: try with different docker image

* #128: update cuda path

* Try adding debug flag for Buildx

* Tpetra: Disable cudaMemcpyAsync for Intercept.cpp

* #128: lower -j

* #128: use defaul kokkos architecture

* #128: use fewer processes for GPU testing

* 128: re-enable Kokkos_ARCH_AMPERE86

* #128: add cuda sample build to CI to validate CUDA

* #128: run cuda test on NGA host

* #128: update jobs dependency on CI for cuda

* #128: add CUD sample run

* #128: remove command not existing in CI

* #128: change cuda path

* #128: try to display information about driver

* #128: change bad command in CI

* #128: fix command

* #128: try install nvidia util in Docker container

* #128: remove commands

* #128: fix cuda path

* #128: fix dockerfile

* #128: add different cuda test images

* Run container in separate step

* Remove not needed code

* Apply changes to Epetra=OFF

* Try to build and run docker within same step

* #128: remove unused old CI files

* #128: check both gpu pipelines

* #128: Tpetra_INST_SERIAL=ON

* #128: fix workflow name

* #128: rework with cuda 11.4 dockerfile

* #168: try to simplify CI sheel script

* #128: try simplify shell script

* #128: remove librairies path for blas and lapack to check if resolved

* #128: try remove Lapack and blas lib paths from cmake call

* #128: try again changing path dynamically

* #128: fix another path

* #128: fix blas path

* #128: apply working conffiguration to other build scripts

* #128: restore triggering workflows on PR

* #128: disable GPU build job for PR having `EpetraMPI T1` label

* #128: enable GPU build only with EpetraMPI T2 and EpetraMPI T3 labels

* #128: upload test log

* #128: fix typo

* #128: fix artifacts

* #128: add junit report for tests

* #128: add junit reporting in CI and set

* #128: fix artifact name

* #128: fix artifacts missing

* #128 fix extra slach char in path

* #128: fix artifacts path

* #128: fix path in gitbub action

* #128: try mounting artifacts folder into the host runner

* #128: use same logic for gpu or non-gpu pipelines

* 128: Finalize pipelines (GPU on push, MPI cancellations)

* 128: remove label requirements

* Revert "Tpetra: Disable cudaMemcpyAsync for Intercept.cpp"

This reverts commit 5db2d5d.

* #128: test intercept reversion

* Revert "Revert "Tpetra: Disable cudaMemcpyAsync for Intercept.cpp""

This reverts commit de87a22.

* #128: fix underscore

* #128: run GPU pipeline on merge to fy23 develop

---------

Co-authored-by: Thomas Dutheillet-Lamonthézie <[email protected]>
Co-authored-by: Jacob Domagala <[email protected]>
cwschilly added a commit that referenced this pull request Sep 8, 2023
* #128: specify cuda dir

* #128: temporarily run GPU on every push; stop MPI builds

* #128: specify correct dockerfile

* #128: provide correct build/source dirs

* #128: rework MPI in build script

* #128: rework MPI in other gpu build script

* #128: CI build script try another configuration and fix invalid path

* #128: fix missing letter in path for a gpu build

* #128: add newlines to end of files

* #128: add spack find -p to find cuda root

* #128: only run one pipeline; add cuda paths

* #128: add kokkos variables

* #128: add Tpetra_INST_SERIAL:BOOL=ON

* #128: add CUDA root flag

* #128: use correct kokkos architecture

* #128: enable cusolver and cusparse

* #128: emulate local build

* #128: try with different docker image

* #128: update cuda path

* Try adding debug flag for Buildx

* Tpetra: Disable cudaMemcpyAsync for Intercept.cpp

* #128: lower -j

* #128: use defaul kokkos architecture

* #128: use fewer processes for GPU testing

* 128: re-enable Kokkos_ARCH_AMPERE86

* #128: add cuda sample build to CI to validate CUDA

* #128: run cuda test on NGA host

* #128: update jobs dependency on CI for cuda

* #128: add CUD sample run

* #128: remove command not existing in CI

* #128: change cuda path

* #128: try to display information about driver

* #128: change bad command in CI

* #128: fix command

* #128: try install nvidia util in Docker container

* #128: remove commands

* #128: fix cuda path

* #128: fix dockerfile

* #128: add different cuda test images

* Run container in separate step

* Remove not needed code

* Apply changes to Epetra=OFF

* Try to build and run docker within same step

* #128: remove unused old CI files

* #128: check both gpu pipelines

* #128: Tpetra_INST_SERIAL=ON

* #128: fix workflow name

* #128: rework with cuda 11.4 dockerfile

* #168: try to simplify CI sheel script

* #128: try simplify shell script

* #128: remove librairies path for blas and lapack to check if resolved

* #128: try remove Lapack and blas lib paths from cmake call

* #128: try again changing path dynamically

* #128: fix another path

* #128: fix blas path

* #128: apply working conffiguration to other build scripts

* #128: restore triggering workflows on PR

* #128: disable GPU build job for PR having `EpetraMPI T1` label

* #128: enable GPU build only with EpetraMPI T2 and EpetraMPI T3 labels

* #128: upload test log

* #128: fix typo

* #128: fix artifacts

* #128: add junit report for tests

* #128: add junit reporting in CI and set

* #128: fix artifact name

* #128: fix artifacts missing

* #128 fix extra slach char in path

* #128: fix artifacts path

* #128: fix path in gitbub action

* #128: try mounting artifacts folder into the host runner

* #128: use same logic for gpu or non-gpu pipelines

* 128: Finalize pipelines (GPU on push, MPI cancellations)

* 128: remove label requirements

* Revert "Tpetra: Disable cudaMemcpyAsync for Intercept.cpp"

This reverts commit 5db2d5d.

* #128: test intercept reversion

* Revert "Revert "Tpetra: Disable cudaMemcpyAsync for Intercept.cpp""

This reverts commit de87a22.

* #128: fix underscore

* #128: run GPU pipeline on merge to fy23 develop

---------

Co-authored-by: Thomas Dutheillet-Lamonthézie <[email protected]>
Co-authored-by: Jacob Domagala <[email protected]>
@tlamonthezie
Copy link
Collaborator Author

Hello @stmcgovern

The new PrecLSQRTpetraExFile example would require to add ifpack2 as dependency for Belos but it would add a circular dependency. It is due to the fact that in Ifpack2 depnds on Belos (only for some tests and examples as I can see)

I think there are 2 possibilities if we want to have a new Tpetra version of this example:

  • Option A: Add PrecLSQRTpetraExFile to Ifpack2 examples
  • Option B: Add PrecLSQRTpetraExFile to Belos examples + move Belos/Ifpack2 tests and examples from Ifpack2 to Belos package + change dependencies (make belos depend on Ifpack2 instead of the inverse, like it is the case with Ifpack)

I think the option A is easier to apply but the option B might be more logic even if we have to modify 2 packages at once.
@stmcgovern could you please give me some guidance on that or ask Belos and Ifpack2 package owners ?
Or is there maybe a place where to add tests and examples external to thse 2 packages to prevent the circular dependency ?

As I am writing this comment I am working on making the example working by ignoring the dependency problem but we will have to know how to make it work at build stage to not have dependency issue.

@tlamonthezie
Copy link
Collaborator Author

tlamonthezie commented Sep 11, 2023

Currently the example does not complete before test timeout of 1500 seconds in the CI if we include the example as a test.
But it WORKS on local machine in one second.
So I guess the problem is not the code but might be the parameters or someting linked to the CI machine... I suggest to ignore that timeout.

cwschilly
cwschilly previously approved these changes Sep 14, 2023
Copy link
Collaborator

@cwschilly cwschilly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, after rebase should be good to go

packages/belos/tpetra/example/CMakeLists.txt Show resolved Hide resolved
Copy link
Collaborator

@stmcgovern stmcgovern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Preconditioned version "PrecLQSRTpetraExFile.cpp" is provided but is not built (commented out in CMakeLists.txt) due to the circular dependency issue under consideration by the package developers.

@stmcgovern stmcgovern closed this Sep 15, 2023
@stmcgovern stmcgovern reopened this Sep 15, 2023
@stmcgovern stmcgovern merged commit 096231a into NGA-FY23-develop Sep 15, 2023
0 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Belos: Provide Tpetra version of LSQR examples
4 participants