ipc: TSAN/esp clang-17 + some transport_test and ~1 unit_test: strange output/hang/pause ("PLEASE submit a bug report"). #90

Open
ygoldfeld opened this issue Mar 12, 2024 · 0 comments
Labels
bug Something isn't working from-akamai-pre-open Issue origin is Akamai, before opening source test Unit and functional tests; demo/example programs

Comments

@ygoldfeld
Contributor

Filed by @ygoldfeld pre-open-source:

Environments:

  • my (ygoldfel) local clang-17 env (with LLVM libc++ replacing GNU libstdc++; probably irrelevant).
  • GitHub Flow-IPC pipeline.

Observed:

  • my env: never so far.
  • GitHub pipeline: yes, but only with clang-17 (clang-15 and clang-16 have been fine), and only in:
      • transport_test in exercise mode, SHM-jemalloc sub-mode (not SHM-classic, not heap, and not scripted mode);
      • one specific unit_test: Jemalloc_shm_pool_collection_test.Multithread_load. All others are fine. Shm_session_data_test.Multithread_object_database was disabled at the time due to a test bug and, based on @kinokrt's comments, might trigger it too. UPDATE: that test is fixed and no longer disabled; whether it triggers this is unknown.

The problem presents as follows. The console prints:

==76700==WARNING: Can't read from symbolizer at fd 8
==76700==WARNING: Can't write to symbolizer at fd 44
LLVM ERROR: Sections with relocations should have an address of 0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /usr/bin/llvm-symbolizer-17 --demangle --inlines --default-arch=x86_64
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  libLLVM-17.so.1    0x00007f4efaacc406 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 54
1  libLLVM-17.so.1    0x00007f4efaaca5b0 llvm::sys::RunSignalHandlers() + 80
2  libLLVM-17.so.1    0x00007f4efaacca9b
3  libc.so.6          0x00007f4ef9642520
4  libc.so.6          0x00007f4ef96969fc pthread_kill + 300
5  libc.so.6          0x00007f4ef9642476 raise + 22
6  libc.so.6          0x00007f4ef96287f3 abort + 211
7  libLLVM-17.so.1    0x00007f4efaa2eb15 llvm::report_fatal_error(llvm::Twine const&, bool) + 437
8  libLLVM-17.so.1    0x00007f4efaa2e956
9  libLLVM-17.so.1    0x00007f4efc1b3002
10 libLLVM-17.so.1    0x00007f4efc3855a8 llvm::DWARFContext::create(llvm::object::ObjectFile const&, llvm::DWARFContext::ProcessDebugRelocations, llvm::LoadedObjectInfo const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::function<void (llvm::Error)>, std::function<void (llvm::Error)>) + 4328
11 libLLVM-17.so.1    0x00007f4efc517fcf llvm::symbolize::LLVMSymbolizer::getOrCreateModuleInfo(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) + 2479
12 libLLVM-17.so.1    0x00007f4efc5147aa llvm::Expected<llvm::DIGlobal> llvm::symbolize::LLVMSymbolizer::symbolizeDataCommon<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, llvm::object::SectionedAddress) + 58
13 libLLVM-17.so.1    0x00007f4efc514769 llvm::symbolize::LLVMSymbolizer::symbolizeData(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, llvm::object::SectionedAddress) + 9
14 llvm-symbolizer-17 0x0000557435c89893
15 llvm-symbolizer-17 0x0000557435c884ef
16 llvm-symbolizer-17 0x0000557435c87860
17 libc.so.6          0x00007f4ef9629d90
18 libc.so.6          0x00007f4ef9629e40 __libc_start_main + 128
19 llvm-symbolizer-17 0x0000557435c85905
==76700==WARNING: Can't read from symbolizer at fd 8
==76700==WARNING: Can't write to symbolizer at fd 15
LLVM ERROR: Sections with relocations should have an address of 0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /usr/bin/llvm-symbolizer-17 --demangle --inlines --default-arch=x86_64
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  libLLVM-17.so.1    0x00007f7cdf4cc406 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 54
1  libLLVM-17.so.1    0x00007f7cdf4ca5b0 llvm::sys::RunSignalHandlers() + 80
2  libLLVM-17.so.1    0x00007f7cdf4cca9b
3  libc.so.6          0x00007f7cde042520
4  libc.so.6          0x00007f7cde0969fc pthread_kill + 300
5  libc.so.6          0x00007f7cde042476 raise + 22
6  libc.so.6          0x00007f7cde0287f3 abort + 211
7  libLLVM-17.so.1    0x00007f7cdf42eb15 llvm::report_fatal_error(llvm::Twine const&, bool) + 437
8  libLLVM-17.so.1    0x00007f7cdf42e956
9  libLLVM-17.so.1    0x00007f7ce0bb3002
10 libLLVM-17.so.1    0x00007f7ce0d855a8 llvm::DWARFContext::create(llvm::object::ObjectFile const&, llvm::DWARFContext::ProcessDebugRelocations, llvm::LoadedObjectInfo const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::function<void (llvm::Error)>, std::function<void (llvm::Error)>) + 4328
11 libLLVM-17.so.1    0x00007f7ce0f17fcf llvm::symbolize::LLVMSymbolizer::getOrCreateModuleInfo(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) + 2479
12 libLLVM-17.so.1    0x00007f7ce0f147aa llvm::Expected<llvm::DIGlobal> llvm::symbolize::LLVMSymbolizer::symbolizeDataCommon<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, llvm::object::SectionedAddress) + 58
13 libLLVM-17.so.1    0x00007f7ce0f14769 llvm::symbolize::LLVMSymbolizer::symbolizeData(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, llvm::object::SectionedAddress) + 9
14 llvm-symbolizer-17 0x00005586f49f8893
15 llvm-symbolizer-17 0x00005586f49f74ef
16 llvm-symbolizer-17 0x00005586f49f6860
17 libc.so.6          0x00007f7cde029d90
18 libc.so.6          0x00007f7cde029e40 __libc_start_main + 128
19 llvm-symbolizer-17 0x00005586f49f4905
==76700==WARNING: Can't read from symbolizer at fd 8
==76700==WARNING: Can't write to symbolizer at fd 15
LLVM ERROR: Sections with relocations should have an address of 0
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /usr/bin/llvm-symbolizer-17 --demangle --inlines --default-arch=x86_64
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  libLLVM-17.so.1    0x00007f12a6ccc406 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 54
1  libLLVM-17.so.1    0x00007f12a6cca5b0 llvm::sys::RunSignalHandlers() + 80
2  libLLVM-17.so.1    0x00007f12a6ccca9b
3  libc.so.6          0x00007f12a5842520
4  libc.so.6          0x00007f12a58969fc pthread_kill + 300
5  libc.so.6          0x00007f12a5842476 raise + 22
6  libc.so.6          0x00007f12a58287f3 abort + 211
7  libLLVM-17.so.1    0x00007f12a6c2eb15 llvm::report_fatal_error(llvm::Twine const&, bool) + 437
8  libLLVM-17.so.1    0x00007f12a6c2e956
9  libLLVM-17.so.1    0x00007f12a83b3002
10 libLLVM-17.so.1    0x00007f12a85855a8 llvm::DWARFContext::create(llvm::object::ObjectFile const&, llvm::DWARFContext::ProcessDebugRelocations, llvm::LoadedObjectInfo const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, std::function<void (llvm::Error)>, std::function<void (llvm::Error)>) + 4328
11 libLLVM-17.so.1    0x00007f12a8717fcf llvm::symbolize::LLVMSymbolizer::getOrCreateModuleInfo(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) + 2479
12 libLLVM-17.so.1    0x00007f12a87147aa llvm::Expected<llvm::DIGlobal> llvm::symbolize::LLVMSymbolizer::symbolizeDataCommon<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, llvm::object::SectionedAddress) + 58
13 libLLVM-17.so.1    0x00007f12a8714769 llvm::symbolize::LLVMSymbolizer::symbolizeData(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, llvm::object::SectionedAddress) + 9
14 llvm-symbolizer-17 0x00005614bacd4893
15 llvm-symbolizer-17 0x00005614bacd34ef
16 llvm-symbolizer-17 0x00005614bacd2860
17 libc.so.6          0x00007f12a5829d90
18 libc.so.6          0x00007f12a5829e40 __libc_start_main + 128
19 llvm-symbolizer-17 0x00005614bacd0905
==76700==WARNING: Can't read from symbolizer at fd 8
2023-12-12 15:28:16.230671546 +0000 [info]: T7f36dc2ff640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(170): Destroying thread cache id 9
2023-12-12 15:28:16.249109513 +0000 [info]: T7f36dc2ff640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(179): Destroyed thread cache id 9
2023-12-12 15:28:16.284143943 +0000 [info]: T7f36cf7fa640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(170): Destroying thread cache id 2
2023-12-12 15:28:16.300969510 +0000 [info]: T7f36cf7fa640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(179): Destroyed thread cache id 2
2023-12-12 15:28:16.331467917 +0000 [info]: T7f36d07fb640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(170): Destroying thread cache id 3
2023-12-12 15:28:16.344776151 +0000 [info]: T7f36d07fb640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(179): Destroyed thread cache id 3
2023-12-12 15:28:16.631431376 +0000 [info]: T7f36d27fd640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(170): Destroying thread cache id 6
2023-12-12 15:28:16.632357574 +0000 [info]: T7f36d27fd640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(179): Destroyed thread cache id 6
2023-12-12 15:28:16.700983790 +0000 [info]: T7f36dedff640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(170): Destroying thread cache id 8
2023-12-12 15:28:16.702557775 +0000 [info]: T7f36dedff640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(179): Destroyed thread cache id 8
2023-12-12 15:28:16.722868851 +0000 [info]: T7f36d47ff640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(170): Destroying thread cache id 7
2023-12-12 15:28:16.724069064 +0000 [info]: T7f36d47ff640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(179): Destroyed thread cache id 7
2023-12-12 15:28:16.760789707 +0000 [info]: T7f36d37fe640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(170): Destroying thread cache id 4
2023-12-12 15:28:16.763157102 +0000 [info]: T7f36d37fe640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(179): Destroyed thread cache id 4
2023-12-12 15:28:16.772291545 +0000 [info]: T7f36d17fc640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(170): Destroying thread cache id 5
2023-12-12 15:28:16.774043434 +0000 [info]: T7f36d17fc640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(179): Destroyed thread cache id 5
2023-12-12 15:28:16.791601069 +0000 [info]: T7f36cd7f8640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(170): Destroying thread cache id 0
2023-12-12 15:28:16.793051823 +0000 [info]: T7f36cd7f8640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(179): Destroyed thread cache id 0
2023-12-12 15:28:16.815875854 +0000 [info]: T7f36ce7f9640: TEST-SHM: memory_manager.cpp:destroy_thread_cache(170): Destroying thread cache id 1
==76700==WARNING: Can't write to symbolizer at fd 15
==76700==WARNING: Failed to use and restart external symbolizer!
==================
WARNING: ThreadSanitizer: data race (pid=76700)
  Write of size 8 at 0x7f36e08e5140 by thread T289:
    #0 <null> <null> (libipc_unit_test.exec+0x76454d) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #1 <null> <null> (libipc_unit_test.exec+0x732a9b) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #2 <null> <null> (libipc_unit_test.exec+0x770bf4) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #3 <null> <null> (libipc_unit_test.exec+0x797533) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #4 <null> <null> (libipc_unit_test.exec+0x798fdd) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #5 <null> <null> (libipc_unit_test.exec+0x745dbb) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #6 <null> <null> (libipc_unit_test.exec+0x740872) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #7 <null> <null> (libipc_unit_test.exec+0x72cd58) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #8 <null> <null> (libipc_unit_test.exec+0x4fc012) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #9 <null> <null> (libipc_unit_test.exec+0x50ea6a) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #10 <null> <null> (libipc_unit_test.exec+0x50e7f3) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #11 <null> <null> (libipc_unit_test.exec+0x412ac9) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #12 <null> <null> (libipc_unit_test.exec+0x412f20) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #13 <null> <null> (libipc_unit_test.exec+0x4fe235) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #14 <null> <null> (libc.so.6+0x45d9e) (BuildId: a43bfc8428df6623cd498c9c0caeb91aec9be4f9)

  Previous read of size 8 at 0x7f36e08e5140 by thread T290 (mutexes: write M0, write M1):
    #0 <null> <null> (libipc_unit_test.exec+0x769d24) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #1 <null> <null> (libipc_unit_test.exec+0x76473e) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #2 <null> <null> (libipc_unit_test.exec+0x764564) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #3 <null> <null> (libipc_unit_test.exec+0x732a9b) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #4 <null> <null> (libipc_unit_test.exec+0x770bf4) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #5 <null> <null> (libipc_unit_test.exec+0x797275) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #6 <null> <null> (libipc_unit_test.exec+0x796720) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #7 <null> <null> (libipc_unit_test.exec+0x79a5b6) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #8 <null> <null> (libc.so.6+0x91690) (BuildId: a43bfc8428df6623cd498c9c0caeb91aec9be4f9)

  Mutex M0 (0x7f36e08032a8) created at:
    #0 <null> <null> (libipc_unit_test.exec+0xc19b0) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #1 <null> <null> (libipc_unit_test.exec+0x773e98) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #2 <null> <null> (libipc_unit_test.exec+0x762fdd) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #3 <null> <null> (libipc_unit_test.exec+0x738cdd) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #4 <null> <null> (libipc_unit_test.exec+0x720a43) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #5 <null> <null> (libipc_unit_test.exec+0x72d9b9) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #6 <null> <null> (libipc_unit_test.exec+0x73099a) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #7 <null> <null> (libipc_unit_test.exec+0x72d38b) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #8 <null> <null> (libc.so.6+0x29eba) (BuildId: a43bfc8428df6623cd498c9c0caeb91aec9be4f9)

  Mutex M1 (0x561bce9e3ca8) created at:
    #0 <null> <null> (libipc_unit_test.exec+0xc19b0) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #1 <null> <null> (libipc_unit_test.exec+0x773e98) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #2 <null> <null> (libipc_unit_test.exec+0x7740bd) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #3 <null> <null> (libipc_unit_test.exec+0x768c09) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #4 <null> <null> (libipc_unit_test.exec+0x72d90c) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #5 <null> <null> (libipc_unit_test.exec+0x73099a) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #6 <null> <null> (libipc_unit_test.exec+0x72d38b) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #7 <null> <null> (libc.so.6+0x29eba) (BuildId: a43bfc8428df6623cd498c9c0caeb91aec9be4f9)

  Thread T289 (tid=77197, running) created by main thread at:
    #0 <null> <null> (libipc_unit_test.exec+0xbffcb) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #1 <null> <null> (libstdc++.so.6+0xe6388) (BuildId: 2db998bd67acbfb235c464c0275d4070061695fb)
    #2 <null> <null> (libipc_unit_test.exec+0x644b76) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #3 <null> <null> (libipc_unit_test.exec+0x61c32d) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #4 <null> <null> (libipc_unit_test.exec+0x61dc86) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #5 <null> <null> (libipc_unit_test.exec+0x61ef64) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #6 <null> <null> (libipc_unit_test.exec+0x637499) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #7 <null> <null> (libipc_unit_test.exec+0x646017) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #8 <null> <null> (libipc_unit_test.exec+0x636c51) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #9 <null> <null> (libipc_unit_test.exec+0x48b154) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #10 <null> <null> (libc.so.6+0x29d8f) (BuildId: a43bfc8428df6623cd498c9c0caeb91aec9be4f9)

  Thread T290 (tid=77198, finished) created by main thread at:
    #0 <null> <null> (libipc_unit_test.exec+0xbffcb) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #1 <null> <null> (libstdc++.so.6+0xe6388) (BuildId: 2db998bd67acbfb235c464c0275d4070061695fb)
    #2 <null> <null> (libipc_unit_test.exec+0x644b76) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #3 <null> <null> (libipc_unit_test.exec+0x61c32d) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #4 <null> <null> (libipc_unit_test.exec+0x61dc86) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #5 <null> <null> (libipc_unit_test.exec+0x61ef64) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #6 <null> <null> (libipc_unit_test.exec+0x637499) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #7 <null> <null> (libipc_unit_test.exec+0x646017) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #8 <null> <null> (libipc_unit_test.exec+0x636c51) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #9 <null> <null> (libipc_unit_test.exec+0x48b154) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76)
    #10 <null> <null> (libc.so.6+0x29d8f) (BuildId: a43bfc8428df6623cd498c9c0caeb91aec9be4f9)

SUMMARY: ThreadSanitizer: data race (/home/runner/work/ipc/ipc/install/RelWithDebInfo/bin/libipc_unit_test.exec+0x76454d) (BuildId: 7992d1dbec863670d21d6bb09687761223088e76) 
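
Although every frame above is printed as `<null> <null>`, each one still carries a module name and an offset, so the report can in principle be symbolized offline after the run. The following is a minimal sketch of that idea, not something that has been run against this pipeline. The helper names `offsets_by_module` and `symbolizer_argv` are hypothetical, and the choice of symbolizer binary is an assumption (a different `llvm-symbolizer` version than 17 may be needed if version 17 also aborts on the module in question).

```python
import re
from collections import defaultdict

# Matches "(module+0xOFFSET)" as printed in unsymbolized sanitizer frames.
# "(BuildId: ...)" groups do not match because they contain a space and no "+0x".
FRAME_RE = re.compile(r"\((?P<module>[^\s()]+)\+0x(?P<offset>[0-9a-fA-F]+)\)")


def offsets_by_module(report_text):
    """Map each module name to its distinct frame offsets, in order of appearance."""
    grouped = defaultdict(list)
    for match in FRAME_RE.finditer(report_text):
        module = match.group("module")
        offset = int(match.group("offset"), 16)
        if offset not in grouped[module]:
            grouped[module].append(offset)
    return dict(grouped)


def symbolizer_argv(module_path, offsets, exe="llvm-symbolizer"):
    """Build an llvm-symbolizer command line for one module (it is not executed here)."""
    return [exe, f"--obj={module_path}", "--demangle", "--inlines",
            *(hex(offset) for offset in offsets)]
```

Feeding the saved report text to `offsets_by_module` and passing each entry to `symbolizer_argv` yields one command per module, which can then be run by hand on a machine that has the same binaries (the BuildId values in the report help confirm that they match).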

What happens next depends. I've seen:

  • unit_test: It may or may not pause for a while, but it prints many TSAN warnings in a row (around 20), all with nonsensical stack-trace lines like those above. That is abnormal and makes no logical sense at that point in the test; it is also not observed locally by me, nor by anyone I know of with clang-15/16. The test suite does eventually complete and even passes, including the triggering test. However, the warnings still count as warnings, the exit code is no longer 0, the warnings appear impossible to suppress, and TSAN's output can hardly be trusted after this point. Bottom line: there is no choice but to omit this test.
  • transport_test (with the above-listed options): I'll just paste:
# First the reason in detail: This run semi-reliably (50%+) fails at this point in the server binary:
#   2023-12-20 11:36:11.322479842 +0000 [info]: Tguy: ex_srv.hpp:send_req_b(1428): App_session [0x7b3800008180]:
#     Chan B[0]: Filling/send()ing payload (description = [reuse out-message + SHM-handle to modified (unless
#     SHM-jemalloc) existing STL data]; alt-payload? = [0]; reusing msg? = [1]; reusing SHM payload? = [1]).
#   LLVM ERROR: Sections with relocations should have an address of 0
#   PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
#   Stack dump:
#   0.  Program arguments: /usr/bin/llvm-symbolizer-17 --demangle --inlines --default-arch=x86_64
#   Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
#   ...
#   ==77990==WARNING: Can't read from symbolizer at fd 599
#   2023-12-20 11:36:31.592293322 +0000 [info]: Tguy: ex_srv.hpp:send_req_b(1547): App_session [0x7b3800008180]: Chan B[0]: Filling done.  Now to send.
# Sometimes the exact point is different, depending on timing; but in any case it is always the above
# TSAN/LLVM error, at which point the thread gets stuck for a long time (10+ seconds); but eventually gets
# unstuck; however transport_test happens to be testing a feature in a certain way so that a giant blocking
# operation in this thread delays certain processing, causes an internal timeout, and the test exits/fails.
# Sure, we could make some changes to the test for that to not happen, but that's beside the point: TSAN
# at run-time is trying to do something and fails terribly; I have no wish to try to work around that situation;
# literally it says "PLEASE submit a bug report [to clang devs]."
#
# TODO: Revisit; figure out how to not trigger this; re-enable.  For the record, I (ygoldfel) cannot reproduce
# in a local clang-17, albeit with libc++ (LLVM STL) instead of libstdc++ (GNU STL).  I've also tried to
# reduce optimization to -O1, as well as with and without LTO, and with and without -fno-omit-frame-pointer;
# same result.
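
One experiment not listed among the things tried above would be to keep TSAN from launching `llvm-symbolizer-17` at all, by adding `symbolize=0` to `TSAN_OPTIONS` (`symbolize` is a common sanitizer runtime flag). The sketch below only shows how such a run could be set up; whether it avoids the 10+ second stall here is untested. The helper names are hypothetical. Note also that name-based TSAN suppressions depend on symbolized frames, as far as I know, so this setting may itself change which reports appear.

```python
import os
import subprocess


def with_tsan_options(env, **flags):
    """Return a copy of env whose TSAN_OPTIONS has the given flags appended."""
    merged = dict(env)
    extra = " ".join(f"{name}={value}" for name, value in flags.items())
    existing = merged.get("TSAN_OPTIONS", "").strip()
    merged["TSAN_OPTIONS"] = f"{existing} {extra}".strip()
    return merged


def run_without_symbolizer(argv):
    """Run a TSAN-instrumented test binary with run-time symbolization turned off."""
    env = with_tsan_options(os.environ, symbolize=0)
    return subprocess.run(argv, env=env)
```

For example, `run_without_symbolizer(["./libipc_unit_test.exec"])` would run the unit tests with whatever `TSAN_OPTIONS` the environment already sets, plus `symbolize=0`. Reports would then contain only module-plus-offset frames, to be resolved after the fact.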

We should:

  • Actually submit a bug report, as the error message itself asks, ideally with a minimal repro case, which might also help us find a work-around.
  • Look into it ourselves. Reading the LLVM source (where TSAN lives) might help; it has helped me with other items before.
  • Contact the LLVM devs; they ask people to do so in various contexts.

Do note that I've tried a few things (see the TODO paragraph at the end of the pasted comment above); but that's different from an investigation into the nitty-gritty of it. What's with this relocation thing? We should find out.
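
As a possible starting point for a minimal repro: the stack dump shows the abort happening under `LLVMSymbolizer::symbolizeData`, inside `DWARFContext::create`, and the "Program arguments" line gives the exact symbolizer invocation. So it may be possible to trigger the same abort without TSAN by sending a `DATA` request for a suspect module straight to the symbolizer's stdin. The sketch below assumes the request format `DATA "<module>" 0x<offset>`, which is my understanding of what the sanitizer runtime sends. Which module actually trips the error is not known from the logs, so `module_path` is a guess to iterate over (the test executable and the shared libraries it loads). The function names are hypothetical.

```python
import shutil
import subprocess

# Same arguments as in the "Program arguments" line of the stack dump.
SYMBOLIZER_ARGS = ["--demangle", "--inlines", "--default-arch=x86_64"]


def data_request(module_path, offset):
    """Format one DATA request line for llvm-symbolizer's stdin."""
    return f'DATA "{module_path}" 0x{offset:x}\n'


def try_symbolizer(module_path, offset=0, exe="llvm-symbolizer-17"):
    """Send one DATA request; return (returncode, stderr), or None if exe is not on PATH."""
    path = shutil.which(exe)
    if path is None:
        return None
    proc = subprocess.run(
        [path, *SYMBOLIZER_ARGS],
        input=data_request(module_path, offset),
        capture_output=True,
        text=True,
        timeout=120,
    )
    return proc.returncode, proc.stderr
```

If some module makes `try_symbolizer` return a nonzero code with "Sections with relocations should have an address of 0" in stderr, that module plus this one-line request would be a far smaller repro to attach to an LLVM bug report than the full test suite.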

The priority is medium. We have TSAN coverage for the problematic tests (which are themselves only a subset), just not with that particular compiler, clang-17 (and locally that works too).

Last point! TSAN is officially in beta. This is not said in the ASAN, UBSAN, or even MSAN docs. So some level of problems is to be expected. THAT said, even though TSAN can be quite a pain in the butt with things like this, it is worth remembering that it has found tricky real problems and is an extremely valuable tool. It is worth fighting for, so to speak.

@ygoldfeld ygoldfeld added bug Something isn't working from-akamai-pre-open Issue origin is Akamai, before opening source test Unit and functional tests; demo/example programs labels Mar 12, 2024