gmsh: patch for read_parallel and write_parallel with CUDA backend #405

Open
cwsmith opened this issue Dec 5, 2022 · 0 comments

@tristan0x

We were merging the snl repo into the scorec fork, and some code in Omega_h_gmsh.cpp caught my attention. It appears that device arrays are being indexed from host code in a few places. For example, vert_globals_w in this block:

Write<GO> vert_globals_w(nnodes);
for (LO local_index = 0; local_index < nnodes; ++local_index) {
  const auto global_index =
      static_cast<GO>(node_tags[static_cast<std::size_t>(local_index)]);
  node_number_map[global_index] = local_index;
  vert_globals_w[local_index] = global_index;
}
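
A common way to avoid writing to a device array from a host loop is to fill a host-side buffer first and move it to the device in one transfer. The sketch below illustrates that pattern only; it is not the code on the cws/gmshFix branch. `DeviceArray` and `build_vert_globals` are hypothetical stand-ins, and the "device" storage is simulated with a `std::vector` so the example runs without CUDA. In Omega_h itself the host buffer would presumably be a `HostWrite<GO>`, converted to a device `Write<GO>` once the loop has finished.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <vector>

using LO = std::int32_t;  // local ordinal (assumed width)
using GO = std::int64_t;  // global ordinal (assumed width)

// Hypothetical stand-in for a device array: host code cannot index it,
// it can only be created from, or copied back to, a host buffer.
class DeviceArray {
 public:
  explicit DeviceArray(const std::vector<GO>& host) : data_(host) {}
  std::size_t size() const { return data_.size(); }
  // Bulk copy back to the host.
  std::vector<GO> to_host() const { return data_; }

 private:
  std::vector<GO> data_;  // would be device memory under a CUDA backend
};

// Fill the global IDs and the tag-to-local map in host memory,
// then hand the finished buffer to the device in a single copy.
DeviceArray build_vert_globals(const std::vector<std::size_t>& node_tags,
                               std::map<GO, LO>& node_number_map) {
  // The node count must be representable as a local ordinal.
  assert(node_tags.size() <=
         static_cast<std::size_t>(std::numeric_limits<LO>::max()));
  const LO nnodes = static_cast<LO>(node_tags.size());
  std::vector<GO> vert_globals_h(node_tags.size());
  for (LO local_index = 0; local_index < nnodes; ++local_index) {
    const auto i = static_cast<std::size_t>(local_index);
    const auto global_index = static_cast<GO>(node_tags[i]);
    node_number_map[global_index] = local_index;
    vert_globals_h[i] = global_index;  // host loop writes host memory only
  }
  return DeviceArray(vert_globals_h);  // one host-to-device transfer
}
```

The loop body is unchanged from the block above; only the destination of the element-wise writes differs.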

The branch main...SCOREC:omega_h:cws/gmshFix has a partial fix, but it fails in the serial vs. parallel mesh comparison test at this check:

OMEGA_H_CHECK(light_compare_meshes(mesh, pmesh) == OMEGA_H_SAME);

Details on the build and failure are below.

Any help would be appreciated.


GCC version: 7.4.0
CUDA version: 11.4
MPICH version: 3.3.1

Gmsh version: 4.11.0 (tag=gmsh_4_11_0) from https://gitlab.onelab.info/gmsh/gmsh.git
Gmsh cmake build command:

d=buildGmsh
cmake -S gmsh -B $d \
  -DCMAKE_INSTALL_PREFIX=$d/install \
  -DENABLE_BUILD_DYNAMIC=on \
  -DBUILD_SHARED_LIBS=ON 
cmake --build $d --target install -j8

Omega_h cmake build command:

d=buildOmegah
cmake -S omega_h -B $d \
  -DGmsh_INCLUDE_DIRS=$PWD/buildGmsh/install/include \
  -DGmsh_LIBRARIES=$PWD/buildGmsh/install/lib64/libgmsh.so \
  -DGmsh_VERSION_STRING=4.11.0 \
  -DCMAKE_INSTALL_PREFIX=$d/install \
  -DBUILD_TESTING=on \
  -DOmega_h_USE_CUDA=on \
  -DOmega_h_CUDA_ARCH=75 \
  -DOmega_h_USE_MPI=on \
  -DOmega_h_USE_Gmsh=on \
  -DBUILD_SHARED_LIBS=ON
cmake --build $d --target install -j8

GDB stack trace

Note: I'm not sure why, but when I run gmsh --version on the command line, the output Error : Unknown string option 'General.WebBrowser' appears. The same output appears when using the API (see below). I didn't find anything referencing it on the gitlab issues page. It appears to be non-fatal, as I was able to load the gmsh file, create the Omega_h mesh, and write VTK files (before the failing mesh comparison).

gdb ./src/unit_io
─── Output/messages ───
Error   : Unknown string option 'General.WebBrowser'

Thread 1 "unit_io" received signal SIGSEGV, Segmentation fault.
0x00007ffff3790220 in __memcmp_sse4_1 () from /lib64/libc.so.6
─── Source ───
─── Stack ───
[0] from 0x00007ffff3790220 in __memcmp_sse4_1
[1] from 0x00000000004201a7 in std::__equal<true>::equal<long>(long const*, long const*, long const*)+80 at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:814
[2] from 0x000000000041ee73 in std::__equal_aux<long const*, long const*>(long const*, long const*, long const*)+47 at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:831
[3] from 0x000000000041de87 in std::equal<long const*, long const*>(long const*, long const*, long const*)+79 at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:1051
[4] from 0x000000000041a0fd in light_compare_meshes(Omega_h::Mesh&, Omega_h::Mesh&)+978 at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1271
[5] from 0x000000000041ae46 in test_gmsh_parallel(Omega_h::Library*)+675 at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1358
[6] from 0x000000000041cb0d in main(int, char**)+317 at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1485
───
>>> where
#0  0x00007ffff3790220 in __memcmp_sse4_1 () from /lib64/libc.so.6
#1  0x00000000004201a7 in std::__equal<true>::equal<long> (__first1=0x7fffc4410a00, __last1=0x7fffc4413708, __first2=0x7fffc4424200) at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:814
#2  0x000000000041ee73 in std::__equal_aux<long const*, long const*> (__first1=0x7fffc4410a00, __last1=0x7fffc4413708, __first2=0x7fffc4424200) at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:831
#3  0x000000000041de87 in std::equal<long const*, long const*> (__first1=0x7fffc4410a00, __last1=0x7fffc4413708, __first2=0x7fffc4424200) at /opt/scorec/spack/dev/install/linux-rhel7-x86_64/gcc-6.5.0/gcc-7.4.0-c5aaloybb5jqiolgakbf5sxir4axkly4/include/c++/7.4.0/bits/stl_algobase.h:1051
#4  0x000000000041a0fd in light_compare_meshes (a=..., b=...) at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1271
#5  0x000000000041ae46 in test_gmsh_parallel (lib=0x7fffffffac30) at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1358
#6  0x000000000041cb0d in main (argc=1, argv=0x7fffffffade8) at /space/cwsmith/omegahGmsh/omega_h/src/unit_io.cpp:1485
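
The trace shows the three-pointer form of std::equal running over raw long const* ranges inside light_compare_meshes. Two readings seem plausible, and the trace alone does not say which applies: the pointers refer to device memory that the host cannot dereference, or the second range is shorter than the first (the three-pointer form reads as many elements from the second range as the first range holds). A minimal sketch that guards against both is below. `DeviceArray` and `equal_on_host` are hypothetical names, and the device storage is simulated with a `std::vector` so the example runs without CUDA; in Omega_h the host copies would presumably come from `HostRead<GO>`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using GO = std::int64_t;  // global ordinal (assumed width)

// Hypothetical stand-in for a device-resident array.
class DeviceArray {
 public:
  explicit DeviceArray(std::vector<GO> host) : data_(std::move(host)) {}
  std::size_t size() const { return data_.size(); }
  // Bulk copy to the host; host code never touches the device storage.
  std::vector<GO> to_host() const { return data_; }

 private:
  std::vector<GO> data_;  // would be device memory under a CUDA backend
};

// Compare two device arrays by first copying both to host memory.
bool equal_on_host(const DeviceArray& a, const DeviceArray& b) {
  // Three-iterator std::equal assumes the second range is at least as
  // long as the first, so reject a length mismatch up front.
  if (a.size() != b.size()) return false;
  const std::vector<GO> a_h = a.to_host();
  const std::vector<GO> b_h = b.to_host();
  assert(a_h.size() == b_h.size());
  return std::equal(a_h.begin(), a_h.end(), b_h.begin());
}
```

Checking the sizes before calling std::equal also turns a length mismatch into a clean "not equal" result rather than an out-of-bounds read.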