MPI hanging and no program termination #10

Open
V-Rang opened this issue Mar 18, 2024 · 0 comments
Labels
bug Something isn't working

V-Rang commented Mar 18, 2024

commit: 1c390ab

In PDEProblems.py, a __del__(self) destructor destroys the three solver objects and six matrices when a PDEVariationalProblem instance goes out of scope. The three solvers are destroyed with:

self.solver.destroy()
self.solver_fwd_inc.destroy()
self.solver_adj_inc.destroy()

With these calls in place, running examples\sfsi_toy_gaussian.py on multiple processes, e.g. mpirun -n 2 python3 sfsi_toy_gaussian.py, causes the program to hang after all expected computations have completed. Pressing Ctrl+C after launching the two-process command above produces the following stack traces:

Stack trace:
28      0x55fb83907ba5 _start + 37
27      0x7f3282d72e40 __libc_start_main + 128
26      0x7f3282d72d90 /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f3282d72d90]
25      0x55fb83907cad Py_BytesMain + 45
24      0x55fb839312d3 Py_RunMain + 371
23      0x55fb8393faaf Py_FinalizeEx + 95
22      0x55fb83940738 python3(+0x265738) [0x55fb83940738]
21      0x55fb8383864e python3(+0x15d64e) [0x55fb8383864e]
20      0x7f327e629256 /usr/local/lib/python3.10/dist-packages/petsc4py/lib/linux-gnu-real64-32/PETSc.cpython-310-x86_64-linux-gnu.so(+0x123256) [0x7f327e629256]
19      0x55fb8382e50b python3(+0x15350b) [0x55fb8382e50b]
18      0x7f327e638873 /usr/local/lib/python3.10/dist-packages/petsc4py/lib/linux-gnu-real64-32/PETSc.cpython-310-x86_64-linux-gnu.so(+0x132873) [0x7f327e638873]
17      0x7f327d2dfaf2 PetscGarbageCleanup + 322
16      0x7f327d2df4fa GarbageKeyAllReduceIntersect_Private + 186
15      0x7f327a3071ea PMPI_Allreduce + 2026
14      0x7f327a5e2b9f /usr/local/lib/libmpi.so.12(+0x32db9f) [0x7f327a5e2b9f]
13      0x7f327a5e175e /usr/local/lib/libmpi.so.12(+0x32c75e) [0x7f327a5e175e]
12      0x7f327a5e0f7d /usr/local/lib/libmpi.so.12(+0x32bf7d) [0x7f327a5e0f7d]
11      0x7f327a5e0757 /usr/local/lib/libmpi.so.12(+0x32b757) [0x7f327a5e0757]
10      0x7f327a5e063f /usr/local/lib/libmpi.so.12(+0x32b63f) [0x7f327a5e063f]
9       0x7f327a54acb6 /usr/local/lib/libmpi.so.12(+0x295cb6) [0x7f327a54acb6]
8       0x7f327a60161e /usr/local/lib/libmpi.so.12(+0x34c61e) [0x7f327a60161e]
7       0x7f327a600ab3 /usr/local/lib/libmpi.so.12(+0x34bab3) [0x7f327a600ab3]
6       0x7f327a5eff6f /usr/local/lib/libmpi.so.12(+0x33af6f) [0x7f327a5eff6f]
5       0x7f327a66989b /usr/local/lib/libmpi.so.12(+0x3b489b) [0x7f327a66989b]
4       0x7f327a665880 /usr/local/lib/libmpi.so.12(+0x3b0880) [0x7f327a665880]
3       0x7f327a663c9b /usr/local/lib/libmpi.so.12(+0x3aec9b) [0x7f327a663c9b]
2       0x7f327c5e7659 /usr/local/lib/libmpi.so.12(+0x2332659) [0x7f327c5e7659]
1       0x7f327c5ba5f5 /usr/local/lib/libmpi.so.12(+0x23055f5) [0x7f327c5ba5f5]
0       0x7f3282d8b520 /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f3282d8b520]
2024-03-18 22:38:19.458 (  84.549s) [main            ]                       :0     FATL| Signal: SIGINT
Stack trace:
[truncated]
123     0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
122     0x561a1e4c570c _PyFunction_Vectorcall + 124
121     0x561a1e4b38a2 _PyEval_EvalFrameDefault + 24914
120     0x561a1e4d34e1 python3(+0x16e4e1) [0x561a1e4d34e1]
119     0x561a1e4ade0d _PyEval_EvalFrameDefault + 1725
118     0x561a1e4d34e1 python3(+0x16e4e1) [0x561a1e4d34e1]
117     0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
116     0x561a1e4c570c _PyFunction_Vectorcall + 124
115     0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
114     0x561a1e4c570c _PyFunction_Vectorcall + 124
113     0x561a1e4b38a2 _PyEval_EvalFrameDefault + 24914
112     0x561a1e4d34e1 python3(+0x16e4e1) [0x561a1e4d34e1]
111     0x561a1e4af0d1 _PyEval_EvalFrameDefault + 6529
110     0x561a1e4d34e1 python3(+0x16e4e1) [0x561a1e4d34e1]
109     0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
108     0x561a1e4c570c _PyFunction_Vectorcall + 124
107     0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
106     0x561a1e4c570c _PyFunction_Vectorcall + 124
105     0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
104     0x561a1e4c570c _PyFunction_Vectorcall + 124
103     0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
102     0x561a1e4c570c _PyFunction_Vectorcall + 124
101     0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
100     0x561a1e4c570c _PyFunction_Vectorcall + 124
99      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
98      0x561a1e4c570c _PyFunction_Vectorcall + 124
97      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
96      0x561a1e4c570c _PyFunction_Vectorcall + 124
95      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
94      0x561a1e4c570c _PyFunction_Vectorcall + 124
93      0x561a1e4af0d1 _PyEval_EvalFrameDefault + 6529
92      0x561a1e4d34e1 python3(+0x16e4e1) [0x561a1e4d34e1]
91      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
90      0x561a1e4c570c _PyFunction_Vectorcall + 124
89      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
88      0x561a1e4c570c _PyFunction_Vectorcall + 124
87      0x561a1e4af0d1 _PyEval_EvalFrameDefault + 6529
86      0x561a1e4d34e1 python3(+0x16e4e1) [0x561a1e4d34e1]
85      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
84      0x561a1e4c570c _PyFunction_Vectorcall + 124
83      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
82      0x561a1e4c570c _PyFunction_Vectorcall + 124
81      0x561a1e4b38a2 _PyEval_EvalFrameDefault + 24914
80      0x561a1e4d34e1 python3(+0x16e4e1) [0x561a1e4d34e1]
79      0x561a1e4af0d1 _PyEval_EvalFrameDefault + 6529
78      0x561a1e4d34e1 python3(+0x16e4e1) [0x561a1e4d34e1]
77      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
76      0x561a1e4c570c _PyFunction_Vectorcall + 124
75      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
74      0x561a1e4c570c _PyFunction_Vectorcall + 124
73      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
72      0x561a1e4c570c _PyFunction_Vectorcall + 124
71      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
70      0x561a1e4c570c _PyFunction_Vectorcall + 124
69      0x561a1e4ade0d _PyEval_EvalFrameDefault + 1725
68      0x561a1e4c570c _PyFunction_Vectorcall + 124
67      0x561a1e4b02c1 _PyEval_EvalFrameDefault + 11121
66      0x561a1e4d362e python3(+0x16e62e) [0x561a1e4d362e]
65      0x561a1e4b3c66 _PyEval_EvalFrameDefault + 25878
64      0x561a1e4bb58c _PyObject_MakeTpCall + 508
63      0x561a1e4cf744 python3(+0x16a744) [0x561a1e4cf744]
62      0x561a1e4ba784 _PyObject_FastCallDictTstate + 196
61      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
60      0x561a1e4c570c _PyFunction_Vectorcall + 124
59      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
58      0x561a1e4c570c _PyFunction_Vectorcall + 124
57      0x561a1e4b41f1 _PyEval_EvalFrameDefault + 27297
56      0x561a1e4bb5eb _PyObject_MakeTpCall + 603
55      0x561a1e432683 python3(+0xcd683) [0x561a1e432683]
54      0x561a1e4d362e python3(+0x16e62e) [0x561a1e4d362e]
53      0x561a1e4b4908 _PyEval_EvalFrameDefault + 29112
52      0x561a1e4bb5eb _PyObject_MakeTpCall + 603
51      0x561a1e4c4e0e python3(+0x15fe0e) [0x561a1e4c4e0e]
50      0x7f635ab4e300 /usr/local/lib/python3.10/dist-packages/matplotlib/ft2font.cpython-310-x86_64-linux-gnu.so(+0x1a300) [0x7f635ab4e300]
49      0x7f635abc6efe /usr/local/lib/python3.10/dist-packages/matplotlib/ft2font.cpython-310-x86_64-linux-gnu.so(+0x92efe) [0x7f635abc6efe]
48      0x7f635ab56dc1 /usr/local/lib/python3.10/dist-packages/matplotlib/ft2font.cpython-310-x86_64-linux-gnu.so(+0x22dc1) [0x7f635ab56dc1]
47      0x7f635ababd75 /usr/local/lib/python3.10/dist-packages/matplotlib/ft2font.cpython-310-x86_64-linux-gnu.so(+0x77d75) [0x7f635ababd75]
46      0x7f635ab56b20 /usr/local/lib/python3.10/dist-packages/matplotlib/ft2font.cpython-310-x86_64-linux-gnu.so(+0x22b20) [0x7f635ab56b20]
45      0x7f635ab659e7 /usr/local/lib/python3.10/dist-packages/matplotlib/ft2font.cpython-310-x86_64-linux-gnu.so(+0x319e7) [0x7f635ab659e7]
44      0x7f635ab5a2f3 /usr/local/lib/python3.10/dist-packages/matplotlib/ft2font.cpython-310-x86_64-linux-gnu.so(+0x262f3) [0x7f635ab5a2f3]
43      0x7f635ab4fd73 /usr/local/lib/python3.10/dist-packages/matplotlib/ft2font.cpython-310-x86_64-linux-gnu.so(+0x1bd73) [0x7f635ab4fd73]
42      0x7f635ab49e31 /usr/local/lib/python3.10/dist-packages/matplotlib/ft2font.cpython-310-x86_64-linux-gnu.so(+0x15e31) [0x7f635ab49e31]
41      0x561a1e5f70eb _PyObject_CallMethod_SizeT + 203
40      0x561a1e4c9159 python3(+0x164159) [0x561a1e4c9159]
39      0x561a1e4c5969 python3(+0x160969) [0x561a1e4c5969]
38      0x561a1e5ac5fa python3(+0x2475fa) [0x561a1e5ac5fa]
37      0x561a1e5c8ea6 python3(+0x263ea6) [0x561a1e5c8ea6]
36      0x561a1e5c9637 python3(+0x264637) [0x561a1e5c9637]
35      0x561a1e529cbd PyMemoryView_FromBuffer + 429
34      0x561a1e4f9421 python3(+0x194421) [0x561a1e4f9421]
33      0x561a1e59afa0 python3(+0x235fa0) [0x561a1e59afa0]
32      0x561a1e4950aa python3(+0x1300aa) [0x561a1e4950aa]
31      0x561a1e5ef984 python3(+0x28a984) [0x561a1e5ef984]
30      0x561a1e50f973 python3(+0x1aa973) [0x561a1e50f973]
29      0x561a1e4adf52 _PyEval_EvalFrameDefault + 2050
28      0x561a1e4e5dde python3(+0x180dde) [0x561a1e4e5dde]
27      0x7f6364e29a38 /usr/local/lib/python3.10/dist-packages/petsc4py/lib/linux-gnu-real64-32/PETSc.cpython-310-x86_64-linux-gnu.so(+0x133a38) [0x7f6364e29a38]
26      0x7f63641539b6 KSPDestroy + 262
25      0x7f636425fec5 PCDestroy + 53
24      0x7f636425fe26 PCReset + 22
23      0x7f636416f014 /usr/local/petsc/linux-gnu-real64-32/lib/libpetsc.so.3.20(+0xd67014) [0x7f636416f014]
22      0x7f6363e9abe0 MatDestroy + 64
21      0x7f6363d4d51e /usr/local/petsc/linux-gnu-real64-32/lib/libpetsc.so.3.20(+0x94551e) [0x7f6363d4d51e]
20      0x7f6364385317 dmumps_c + 2375
19      0x7f6364386b44 dmumps_f77_ + 4548
18      0x7f63643ef4d2 dmumps_ + 146
17      0x7f635f42f969 pmpi_comm_dup_ + 41
16      0x7f6360b59507 MPI_Comm_dup + 215
15      0x7f6360df4e6a /usr/local/lib/libmpi.so.12(+0x34fe6a) [0x7f6360df4e6a]
14      0x7f6360dff024 /usr/local/lib/libmpi.so.12(+0x35a024) [0x7f6360dff024]
13      0x7f6360dfec64 /usr/local/lib/libmpi.so.12(+0x359c64) [0x7f6360dfec64]
12      0x7f6360e09331 /usr/local/lib/libmpi.so.12(+0x364331) [0x7f6360e09331]
11      0x7f6360dd0757 /usr/local/lib/libmpi.so.12(+0x32b757) [0x7f6360dd0757]
10      0x7f6360dd05e7 /usr/local/lib/libmpi.so.12(+0x32b5e7) [0x7f6360dd05e7]
9       0x7f6360d3b80e /usr/local/lib/libmpi.so.12(+0x29680e) [0x7f6360d3b80e]
8       0x7f6360df161e /usr/local/lib/libmpi.so.12(+0x34c61e) [0x7f6360df161e]
7       0x7f6360df0ab3 /usr/local/lib/libmpi.so.12(+0x34bab3) [0x7f6360df0ab3]
6       0x7f6360ddff6f /usr/local/lib/libmpi.so.12(+0x33af6f) [0x7f6360ddff6f]
5       0x7f6360e5989b /usr/local/lib/libmpi.so.12(+0x3b489b) [0x7f6360e5989b]
4       0x7f6360e55d2a /usr/local/lib/libmpi.so.12(+0x3b0d2a) [0x7f6360e55d2a]
3       0x7f6360e53170 /usr/local/lib/libmpi.so.12(+0x3ae170) [0x7f6360e53170]
2       0x7f6360f2e18b /usr/local/lib/libmpi.so.12(+0x48918b) [0x7f6360f2e18b]
1       0x7f6360f2e036 /usr/local/lib/libmpi.so.12(+0x489036) [0x7f6360f2e036]
0       0x7f6369578520 /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f6369578520]
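
One reading of the two traces, not confirmed here: the processes appear to be blocked in different collective calls. The first trace sits in PetscGarbageCleanup -> PMPI_Allreduce under Py_FinalizeEx, while the second sits in KSPDestroy -> MatDestroy -> dmumps_c -> MPI_Comm_dup, reached from inside a matplotlib call. Destroying a parallel PETSc solver is a collective operation, and Python gives no guarantee that __del__ runs at the same point on every rank, so a possible workaround is to tear the solvers down explicitly, at the same place in the program on all ranks. The sketch below only illustrates that pattern: SolverOwner and FakeSolver are hypothetical names that stand in for PDEVariationalProblem and the real solver objects, and the only assumption about a solver is that it has a destroy() method.

```python
class FakeSolver:
    """Hypothetical stand-in for a solver; it only records that destroy() ran."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def destroy(self):
        self.log.append(self.name)


class SolverOwner:
    """Hypothetical owner of three solvers with explicit, ordered teardown."""

    def __init__(self, solver, solver_fwd_inc, solver_adj_inc):
        self.solver = solver
        self.solver_fwd_inc = solver_fwd_inc
        self.solver_adj_inc = solver_adj_inc
        self._destroyed = False

    def destroy(self):
        """Destroy the solvers in a fixed order; repeated calls do nothing."""
        if self._destroyed:
            return
        for s in (self.solver, self.solver_fwd_inc, self.solver_adj_inc):
            s.destroy()
        self._destroyed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Teardown happens here, at a known point in the program,
        # rather than whenever the garbage collector calls __del__.
        self.destroy()
        return False


teardown_log = []
with SolverOwner(
    FakeSolver("solver", teardown_log),
    FakeSolver("solver_fwd_inc", teardown_log),
    FakeSolver("solver_adj_inc", teardown_log),
) as problem:
    pass  # forward and adjoint solves would go here

print(teardown_log)
```

Because every rank leaves the with block at the same statement, the destroy calls are issued in the same order everywhere, which is what a collective operation needs. Whether this removes the hang in sfsi_toy_gaussian.py has not been verified.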
@uvilla uvilla added the bug Something isn't working label Mar 19, 2024