Description
Describe the bug
I am a newcomer to Julia and plan to do some deep learning research using Lux.jl and LuxCUDA.jl.
First, I installed the NVIDIA driver (v560.94), the CUDA driver (v12.6), and the CUDA runtime (v12.1.66) on the system. Then I installed the CUDA.jl package in Julia.
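For reference, the installation step and a check of what CUDA.jl detected can be sketched as follows. This is a minimal sketch that assumes CUDA.jl is being added to the active environment; `CUDA.versioninfo()` prints the package's own report of the driver, runtime, and devices it found, and the output will differ from machine to machine.

```julia
using Pkg
Pkg.add("CUDA")        # install CUDA.jl into the active environment

using CUDA
CUDA.versioninfo()     # print the driver, runtime, and device details CUDA.jl detected
```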
Before moving on to LuxCUDA.jl, I ran the package's test suite with Pkg.test("CUDA"), as described in the CUDA.jl documentation. A long error log was printed in the REPL, and it included a message asking me to submit this bug report.
The platform is Windows 11 23H2 (22631.4391) with Julia 1.11.0 (2024-10-07). The CPU is an AMD Ryzen 9 7950X 16-Core Processor and the GPU is an NVIDIA GeForce RTX 4060 Ti (16 GiB). The error output is as follows:
| | ---------------- GPU ---------------- | ---------------- CPU ---------------- |
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB) | GC (s) | GC % | Alloc (MB) | RSS (MB) |
core/initialization (2) | 4.33 | 1.32 | 30.4 | 0.00 | N/A | 0.00 | 0.0 | 97.62 | 1073.35 |
gpuarrays/random (2) | 35.95 | 0.00 | 0.0 | 0.03 | N/A | 0.31 | 0.9 | 2332.12 | 1437.32 |
gpuarrays/vectors (2) | 0.34 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 27.46 | 1437.32 |
gpuarrays/base (8) | 56.37 | 0.01 | 0.0 | 8.90 | N/A | 1.06 | 1.9 | 5428.06 | 1494.33 |
gpuarrays/reductions/== isequal (7) | 62.81 | 0.01 | 0.0 | 1.07 | N/A | 1.15 | 1.8 | 5987.16 | 1557.92 |
gpuarrays/constructors (2) | 53.65 | 0.01 | 0.0 | 0.65 | N/A | 0.44 | 0.8 | 3086.47 | 1686.26 |
gpuarrays/reductions/reduce (4) | 121.36 | 0.01 | 0.0 | 1.21 | N/A | 2.16 | 1.8 | 11221.25 | 1772.12 |
gpuarrays/math/intrinsics (4) | 2.61 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 118.23 | 1806.80 |
gpuarrays/statistics (7) | 80.63 | 0.00 | 0.0 | 1.51 | N/A | 1.10 | 1.4 | 5853.54 | 2546.65 |
gpuarrays/reductions/mapreducedim! (5) | 146.05 | 0.01 | 0.0 | 1.54 | N/A | 1.96 | 1.3 | 9308.33 | 1976.00 |
gpuarrays/uniformscaling (5) | 9.34 | 0.00 | 0.0 | 0.01 | N/A | 0.06 | 0.6 | 477.51 | 2173.38 |
gpuarrays/reductions/sum prod (3) | 168.37 | 0.02 | 0.0 | 3.24 | N/A | 2.36 | 1.4 | 12055.64 | 2933.79 |
gpuarrays/reductions/any all count (3) | 11.14 | 0.00 | 0.0 | 0.00 | N/A | 0.12 | 1.1 | 976.57 | 3139.08 |
gpuarrays/interface (3) | 2.55 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 177.97 | 3165.24 |
gpuarrays/reductions/mapreduce (8) | 127.43 | 0.01 | 0.0 | 1.81 | N/A | 2.15 | 1.7 | 11542.56 | 2237.68 |
gpuarrays/reductions/mapreducedim!_large (7) | 45.69 | 0.00 | 0.0 | 818.34 | N/A | 0.94 | 2.1 | 4559.68 | 2986.98 |
gpuarrays/indexing find (8) | 17.45 | 0.00 | 0.0 | 0.13 | N/A | 0.22 | 1.3 | 1616.33 | 2526.84 |
gpuarrays/linalg/mul!/matrix-matrix (4) | 95.12 | 0.01 | 0.0 | 0.12 | N/A | 1.33 | 1.4 | 7807.45 | 2791.06 |
gpuarrays/indexing multidimensional (3) | 48.30 | 0.00 | 0.0 | 2.07 | N/A | 0.54 | 1.1 | 4017.65 | 3667.15 |
gpuarrays/math/power (8) | 34.86 | 0.00 | 0.0 | 0.01 | N/A | 0.61 | 1.7 | 4260.19 | 2794.88 |
gpuarrays/linalg/mul!/vector-matrix (7) | 49.25 | 0.00 | 0.0 | 0.02 | N/A | 0.58 | 1.2 | 4306.55 | 3280.41 |
gpuarrays/broadcasting (6) | 242.02 | 0.01 | 0.0 | 2.00 | N/A | 2.82 | 1.2 | 14503.63 | 2744.07 |
gpuarrays/indexing scalar (8) | 10.98 | 0.00 | 0.0 | 0.01 | N/A | 0.04 | 0.4 | 738.64 | 2974.64 |
gpuarrays/linalg/norm (2) | 160.79 | 0.01 | 0.0 | 0.02 | N/A | 2.73 | 1.7 | 12389.18 | 4814.56 |
From worker 2: WARNING: Method definition var"#3764#kernel"(Any) in module Main at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\core\execution.jl:360 overwritten at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\core\execution.jl:368.
core/execution (2) | 42.03 | 0.00 | 0.0 | 0.02 | N/A | 0.33 | 0.8 | 2268.80 | 5234.02 |
gpuarrays/reductions/reducedim! (3) | 67.76 | 0.00 | 0.0 | 1.03 | N/A | 0.79 | 1.2 | 3682.17 | 4235.64 |
gpuarrays/reductions/minimum maximum extrema (5) | 174.75 | 0.01 | 0.0 | 2.19 | N/A | 3.02 | 1.7 | 13054.22 | 4498.32 |
core/cudadrv (5) | 7.84 | 0.00 | 0.0 | 0.00 | N/A | 0.05 | 0.6 | 455.91 | 4578.10 |
libraries/cusparse (6) | 124.91 | 0.03 | 0.0 | 12.58 | N/A | 1.74 | 1.4 | 7966.58 | 3373.87 |
gpuarrays/linalg (4) | 150.82 | 0.01 | 0.0 | 26.35 | N/A | 2.27 | 1.5 | 9424.42 | 3881.22 |
base/array (3) | 76.95 | 0.10 | 0.1 | 1271.01 | N/A | 0.87 | 1.1 | 5957.10 | 5846.94 |
From worker 3:
From worker 3: Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
From worker 3: Exception: EXCEPTION_BREAKPOINT at 0x7ffedc5503ec -- _ZNK4llvm15NVPTXAsmPrinter24getPTXFundamentalTypeStrB5cxx11EPNS_4TypeEb at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: in expression starting at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\core\device\intrinsics\atomics.jl:5
From worker 3: _ZNK4llvm15NVPTXAsmPrinter24getPTXFundamentalTypeStrB5cxx11EPNS_4TypeEb at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: _ZN4llvm15NVPTXAsmPrinter21emitFunctionParamListEPKNS_8FunctionERNS_11raw_ostreamE at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: _ZN4llvm15NVPTXAsmPrinter22emitFunctionEntryLabelEv at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: _ZN4llvm10AsmPrinter18emitFunctionHeaderEv at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: _ZN4llvm10AsmPrinter16emitFunctionBodyEv at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: _ZN4llvm15NVPTXAsmPrinter20runOnMachineFunctionERNS_15MachineFunctionE at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: _ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: _ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: _ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: _ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: _ZL21LLVMTargetMachineEmitP23LLVMOpaqueTargetMachineP16LLVMOpaqueModuleRN4llvm17raw_pwrite_streamE19LLVMCodeGenFileTypePPc at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: LLVMTargetMachineEmitToMemoryBuffer at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: LLVMTargetMachineEmitToMemoryBuffer at C:\Users\admin\.julia\packages\LLVM\wMjUU\lib\16\libLLVM.jl:11138
From worker 3: emit at C:\Users\admin\.julia\packages\LLVM\wMjUU\src\targetmachine.jl:118
From worker 3: mcgen at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\mcgen.jl:75
From worker 3: mcgen at C:\Users\admin\.julia\packages\CUDA\2kjXI\src\compiler\compilation.jl:127
From worker 3: unknown function (ip: 00000293d5f00007)
From worker 3: macro expansion at C:\Users\admin\.julia\packages\TimerOutputs\NRdsv\src\TimerOutput.jl:253 [inlined]
From worker 3: macro expansion at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:403 [inlined]
From worker 3: macro expansion at C:\Users\admin\.julia\packages\TimerOutputs\NRdsv\src\TimerOutput.jl:253 [inlined]
From worker 3: macro expansion at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:400 [inlined]
From worker 3: #emit_asm#209 at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\utils.jl:108
From worker 3: emit_asm at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\utils.jl:106 [inlined]
From worker 3: #codegen#184 at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:120
From worker 3: codegen at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:82 [inlined]
From worker 3: #compile#183 at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:79
From worker 3: compile at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:74 [inlined]
From worker 3: #1145 at C:\Users\admin\.julia\packages\CUDA\2kjXI\src\compiler\compilation.jl:250 [inlined]
From worker 3: #JuliaContext#182 at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:34
From worker 3: unknown function (ip: 000002939b0a5aec)
From worker 3: JuliaContext at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\driver.jl:25
From worker 3: compile at C:\Users\admin\.julia\packages\CUDA\2kjXI\src\compiler\compilation.jl:249
From worker 3: actual_compilation at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\execution.jl:237
From worker 3: unknown function (ip: 000002939b0a0c49)
From worker 3: cached_compilation at C:\Users\admin\.julia\packages\GPUCompiler\2CW9L\src\execution.jl:151
From worker 3: macro expansion at C:\Users\admin\.julia\packages\CUDA\2kjXI\src\compiler\execution.jl:380 [inlined]
From worker 3: macro expansion at .\lock.jl:273 [inlined]
From worker 3: #cufunction#1169 at C:\Users\admin\.julia\packages\CUDA\2kjXI\src\compiler\execution.jl:375
From worker 3: cufunction at C:\Users\admin\.julia\packages\CUDA\2kjXI\src\compiler\execution.jl:372
From worker 3: unknown function (ip: 00000294259948c9)
From worker 3: jl_apply at C:/workdir/src\julia.h:2157 [inlined]
From worker 3: do_call at C:/workdir/src\interpreter.c:126
From worker 3: eval_value at C:/workdir/src\interpreter.c:223
From worker 3: eval_body at C:/workdir/src\interpreter.c:562
From worker 3: eval_body at C:/workdir/src\interpreter.c:539
From worker 3: eval_body at C:/workdir/src\interpreter.c:539
From worker 3: eval_body at C:/workdir/src\interpreter.c:539
From worker 3: eval_body at C:/workdir/src\interpreter.c:539
From worker 3: eval_body at C:/workdir/src\interpreter.c:539
From worker 3: eval_body at C:/workdir/src\interpreter.c:539
From worker 3: jl_interpret_toplevel_thunk at C:/workdir/src\interpreter.c:821
From worker 3: jl_toplevel_eval_flex at C:/workdir/src\toplevel.c:943
From worker 3: jl_toplevel_eval_flex at C:/workdir/src\toplevel.c:886
From worker 3: ijl_toplevel_eval at C:/workdir/src\toplevel.c:952 [inlined]
From worker 3: ijl_toplevel_eval_in at C:/workdir/src\toplevel.c:994
From worker 3: eval at .\boot.jl:430 [inlined]
From worker 3: include_string at .\loading.jl:2628
From worker 3: _include at .\loading.jl:2688
From worker 3: include at .\sysimg.jl:38 [inlined]
From worker 3: #11 at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\runtests.jl:87 [inlined]
From worker 3: macro expansion at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:63 [inlined]
From worker 3: macro expansion at C:\workdir\usr\share\julia\stdlib\v1.11\Test\src\Test.jl:1700 [inlined]
From worker 3: macro expansion at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:63 [inlined]
From worker 3: macro expansion at C:\Users\admin\.julia\packages\CUDA\2kjXI\src\utilities.jl:35 [inlined]
From worker 3: macro expansion at C:\Users\admin\.julia\packages\CUDA\2kjXI\src\memory.jl:829 [inlined]
From worker 3: top-level scope at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:62
From worker 3: jl_toplevel_eval_flex at C:/workdir/src\toplevel.c:934
From worker 3: ijl_toplevel_eval at C:/workdir/src\toplevel.c:952 [inlined]
From worker 3: ijl_toplevel_eval_in at C:/workdir/src\toplevel.c:994
From worker 3: eval at .\boot.jl:430 [inlined]
From worker 3: runtests at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:74
From worker 3: jl_apply at C:/workdir/src\julia.h:2157 [inlined]
From worker 3: jl_f__call_latest at C:/workdir/src\builtins.c:875
From worker 3: jl_apply at C:/workdir/src\julia.h:2157 [inlined]
From worker 3: do_apply at C:/workdir/src\builtins.c:831
From worker 3: #invokelatest#2 at .\essentials.jl:1054
From worker 3: jl_apply at C:/workdir/src\julia.h:2157 [inlined]
From worker 3: do_apply at C:/workdir/src\builtins.c:831
From worker 3: invokelatest at .\essentials.jl:1051
From worker 3: jl_apply at C:/workdir/src\julia.h:2157 [inlined]
From worker 3: do_apply at C:/workdir/src\builtins.c:831
From worker 3: #110 at C:\workdir\usr\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:287
From worker 3: run_work_thunk at C:\workdir\usr\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:70
From worker 3: #109 at C:\workdir\usr\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:287
From worker 3: unknown function (ip: 00000293ad7b518b)
From worker 3: jl_apply at C:/workdir/src\julia.h:2157 [inlined]
From worker 3: start_task at C:/workdir/src\task.c:1202
From worker 3: Allocations: 512212331 (Pool: 512175810; Big: 36521); GC: 187
core/device/intrinsics/atomics (3) | failed at 2024-11-04T16:29:46.881
Worker 3 terminated.
Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
[1] (::Base.var"#wait_locked#832")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
@ Base .\stream.jl:970
[2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
@ Base .\stream.jl:978
[3] unsafe_read
@ .\io.jl:891 [inlined]
[4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
@ Base .\io.jl:890
[5] read!
@ .\io.jl:895 [inlined]
[6] deserialize_hdr_raw
@ C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Distributed\src\messages.jl:167 [inlined]
[7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
@ Distributed C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:172
[8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
@ Distributed C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:133
[9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
@ Distributed C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:121
libraries/cusolver/dense (8) | 159.34 | 0.08 | 0.1 | 262.80 | N/A | 3.01 | 1.9 | 12823.25 | 4214.03 |
libraries/cusparse/generic (5) | 72.26 | 0.05 | 0.1 | 5.69 | N/A | 0.85 | 1.2 | 5195.97 | 4935.46 |
libraries/cublas (7) | failed at 2024-11-04T16:30:14.729
libraries/cusparse/conversions (8) | 18.16 | 0.01 | 0.0 | 1.69 | N/A | 0.27 | 1.5 | 1812.94 | 4320.41 |
core/device/intrinsics (10) | 34.20 | 0.00 | 0.0 | 0.00 | N/A | 0.24 | 0.7 | 1800.91 | 1322.61 |
core/device/intrinsics/cooperative_groups (5) | 51.16 | 0.00 | 0.0 | 19.36 | N/A | 0.29 | 0.6 | 1974.93 | 6827.81 |
base/sorting (6) | 95.61 | 0.01 | 0.0 | 668.44 | N/A | 3.63 | 3.8 | 13333.46 | 6428.61 |
base/texture (8) | 35.47 | 0.00 | 0.0 | 0.09 | N/A | 0.47 | 1.3 | 2854.34 | 4607.99 |
core/device/intrinsics/wmma (4) | 94.02 | 0.01 | 0.0 | 0.63 | N/A | 0.73 | 0.8 | 4484.64 | 5136.11 |
libraries/cusparse/interfaces (2) | 169.39 | 0.14 | 0.1 | 41.73 | N/A | 2.12 | 1.3 | 10002.96 | 5989.71 |
core/device/array (8) | 4.28 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 275.82 | 4645.19 |
core/codegen (2) | 4.58 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 151.96 | 6127.87 |
core/device/intrinsics/memory (4) | 8.64 | 0.00 | 0.0 | 0.02 | N/A | 0.00 | 0.0 | 428.73 | 5360.70 |
libraries/cusolver/dense_generic (8) | 14.03 | 0.00 | 0.0 | 0.24 | N/A | 0.09 | 0.6 | 872.69 | 4945.31 |
core/device/intrinsics/output (4) | 12.67 | 0.00 | 0.0 | 0.00 | N/A | 0.06 | 0.5 | 771.77 | 5610.56 |
libraries/cusolver/sparse (10) | 25.98 | 0.00 | 0.0 | 0.22 | N/A | 0.52 | 2.0 | 2275.71 | 1562.21 |
base/random (6) | 29.95 | 0.00 | 0.0 | 256.59 | N/A | 0.20 | 0.7 | 1752.58 | 6428.61 |
core/pointer (6) | 0.30 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 9.08 | 6428.61 |
libraries/cusparse/bmm (5) | 33.06 | 0.01 | 0.0 | 0.90 | N/A | 0.75 | 2.3 | 4377.31 | 7180.67 |
core/device/ldg (10) | 7.83 | 0.00 | 0.0 | 0.00 | N/A | 0.06 | 0.8 | 550.05 | 1664.46 |
core/nvml (5) | 0.86 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 55.99 | 7180.67 |
core/device/random (8) | 19.42 | 0.00 | 0.0 | 0.17 | N/A | 0.06 | 0.3 | 875.75 | 5237.04 |
libraries/cusolver/multigpu (4) | 19.63 | 0.00 | 0.0 | 545.60 | N/A | 0.17 | 0.9 | 1403.80 | 5680.43 |
base/broadcast (6) | 13.71 | 0.06 | 0.4 | 0.00 | N/A | 0.06 | 0.5 | 913.70 | 6428.61 |
base/iterator (6) | 2.70 | 0.00 | 0.0 | 1.93 | N/A | 0.00 | 0.0 | 218.71 | 6428.61 |
core/device/intrinsics/math (2) | 39.40 | 0.00 | 0.0 | 0.00 | N/A | 0.33 | 0.8 | 2177.75 | 7337.01 |
core/utils (2) | 0.89 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 59.30 | 7337.01 |
base/threading (6) | 2.00 | 0.00 | 0.1 | 10.94 | N/A | 0.00 | 0.0 | 148.84 | 6428.61 |
libraries/cusparse/device (2) | 0.13 | 0.00 | 0.1 | 0.01 | N/A | 0.00 | 0.0 | 4.52 | 7337.01 |
libraries/staticarrays (6) | 1.17 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 193.19 | 6428.61 |
libraries/cusolver/sparse_factorizations (8) | 15.84 | 0.00 | 0.0 | 3.73 | N/A | 0.24 | 1.5 | 1931.48 | 5406.59 |
core/pool (6) | 3.80 | 0.00 | 0.0 | 0.00 | N/A | 0.88 | 23.2 | 371.56 | 6428.61 |
core/apiutils (6) | 0.15 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 1.12 | 6428.61 |
libraries/cufft (9) | 128.90 | 0.01 | 0.0 | 197.64 | N/A | 1.73 | 1.3 | 7504.18 | 1827.29 |
libraries/cusparse/reduce (9) | 0.01 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 0.29 | 1841.33 |
base/examples (6) | 7.07 | 5.70 | 80.6 | 0.00 | N/A | 0.07 | 1.0 | 1333.85 | 6428.61 |
libraries/curand (6) | 0.06 | 0.00 | 0.0 | 0.00 | N/A | 0.00 | 0.0 | 3.54 | 6428.61 |
libraries/cusparse/broadcast (2) | 29.55 | 0.00 | 0.0 | 0.05 | N/A | 0.31 | 1.0 | 2271.44 | 7624.40 |
libraries/cusparse/linalg (5) | 50.31 | 0.01 | 0.0 | 682.93 | N/A | 2.67 | 5.3 | 14034.14 | 12997.38 |
base/linalg (8) | 32.31 | 0.00 | 0.0 | 1547.52 | N/A | 3.34 | 10.3 | 6091.42 | 7156.66 |
base/kernelabstractions (9) | 50.58 | 0.00 | 0.0 | 71.04 | N/A | 0.96 | 1.9 | 4178.91 | 2461.08 |
base/exceptions (10) | 191.91 | 0.28 | 0.1 | 0.00 | N/A | 0.00 | 0.0 | 12.02 | 1665.53 |
core/profile (4) | 286.90 | 0.00 | 0.0 | 0.00 | N/A | 1.58 | 0.5 | 8662.59 | 5822.62 |
Testing finished in 13 minutes, 21 seconds, 949 milliseconds
core/device/intrinsics/atomics: Error During Test at none:1
Got exception outside of a @test
ProcessExitedException(3)
Worker 7 failed running test libraries/cublas:
Some tests did not pass: 3228 passed, 0 failed, 1 errored, 0 broken.
libraries/cublas: Error During Test at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\libraries\cublas.jl:1794
Got exception outside of a @test
LLVM error: Cannot select: 0x102f5c58760: v8bf16 = X86ISD::VFPROUND 0x102f5c4e5f0, C:\Users\admin\.julia\packages\BFloat16s\u3WQc\src\bfloat16.jl:158 @[ broadcast.jl:673 @[ broadcast.jl:646 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ]
0x102f5c4e5f0: v8f64,ch = load<(load (s512) from %ir.uglygep212, align 8, !tbaa !194, !alias.scope !196, !noalias !81)> 0x102cbe96150, 0x102f5c523a0, undef:i64, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c523a0: i64 = add 0x102f5c58610, Constant:i64<-192>, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c58610: i64 = add 0x102f5c4da90, 0x102f5c58680, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c4da90: i64,ch = CopyFromReg 0x102cbe96150, Register:i64 %53, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c51d80: i64 = Register %53
0x102f5c58680: i64 = shl 0x102f5c58530, Constant:i8<3>, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c58530: i64,ch = CopyFromReg 0x102cbe96150, Register:i64 %54, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c4e580: i64 = Register %54
0x102f5c52480: i8 = Constant<3>
0x102f5c58ed0: i64 = Constant<-192>
0x102f5c59100: i64 = undef
In function: julia_materialize_244170
Stacktrace:
[1] handle_error(reason::Cstring)
@ LLVM C:\Users\admin\.julia\packages\LLVM\wMjUU\src\core\context.jl:194
[2] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\libraries\cublas.jl:1806 [inlined]
[3] macro expansion
@ C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Test\src\Test.jl:1700 [inlined]
[4] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\libraries\cublas.jl:1795 [inlined]
[5] macro expansion
@ C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Test\src\Test.jl:1700 [inlined]
[6] top-level scope
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\libraries\cublas.jl:669
[7] include
@ .\sysimg.jl:38 [inlined]
[8] #11
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\runtests.jl:87 [inlined]
[9] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:63 [inlined]
[10] macro expansion
@ C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Test\src\Test.jl:1700 [inlined]
[11] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:63 [inlined]
[12] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\src\utilities.jl:35 [inlined]
[13] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\src\memory.jl:829 [inlined]
[14] top-level scope
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:62
[15] eval
@ .\boot.jl:430 [inlined]
[16] runtests(f::Function, name::String, time_source::Symbol)
@ Main C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:74
[17] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::@Kwargs{})
@ Base .\essentials.jl:1054
[18] invokelatest(::Any, ::Any, ::Vararg{Any})
@ Base .\essentials.jl:1051
[19] (::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}})()
@ Distributed C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:287
[20] run_work_thunk(thunk::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
@ Distributed C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:70
[21] (::Distributed.var"#109#111"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
@ Distributed C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:287
Test Summary: | Pass Error Broken Total Time
Overall | 25094 2 12 25108
core/initialization | 34 34
gpuarrays/random | 64 64
gpuarrays/vectors | 10 10
gpuarrays/base | 96 96
gpuarrays/reductions/== isequal | 312 312
gpuarrays/constructors | 966 966
gpuarrays/reductions/reduce | 264 264
gpuarrays/math/intrinsics | 12 12
gpuarrays/statistics | 84 84
gpuarrays/reductions/mapreducedim! | 312 312
gpuarrays/uniformscaling | 56 56
gpuarrays/reductions/sum prod | 862 862
gpuarrays/reductions/any all count | 101 101
gpuarrays/interface | 7 7
gpuarrays/reductions/mapreduce | 396 396
gpuarrays/reductions/mapreducedim!_large | 50 50
gpuarrays/indexing find | 45 45
gpuarrays/linalg/mul!/matrix-matrix | 432 432
gpuarrays/indexing multidimensional | 101 101
gpuarrays/math/power | 72 72
gpuarrays/linalg/mul!/vector-matrix | 168 168
gpuarrays/broadcasting | 364 364
gpuarrays/indexing scalar | 477 477
gpuarrays/linalg/norm | 696 696
core/execution | 86 86
gpuarrays/reductions/reducedim! | 192 192
gpuarrays/reductions/minimum maximum extrema | 666 666
core/cudadrv | 157 3 160
libraries/cusparse | 871 871
gpuarrays/linalg | 443 443
base/array | 399 399
core/device/intrinsics/atomics | 1 1
libraries/cusolver/dense | 3948 3948
libraries/cusparse/generic | 1300 1300
libraries/cublas | 3228 1 3229
libraries/cusparse/conversions | 136 136
core/device/intrinsics | 38 38
core/device/intrinsics/cooperative_groups | 515 515
base/sorting | 276 276
base/texture | 38 4 42
core/device/intrinsics/wmma | 446 446
libraries/cusparse/interfaces | 2136 2136
core/device/array | 20 20
core/codegen | 17 17
core/device/intrinsics/memory | 16 16
libraries/cusolver/dense_generic | 108 108
core/device/intrinsics/output | 41 41
libraries/cusolver/sparse | 112 112
base/random | 236 236
core/pointer | 35 35
libraries/cusparse/bmm | 40 40
core/device/ldg | 41 41
core/nvml | 27 1 28
core/device/random | 156 156
libraries/cusolver/multigpu | 30 30
base/broadcast | 32 32
base/iterator | 45 45
core/device/intrinsics/math | 112 112
core/utils | 52 52
base/threading | 0
libraries/cusparse/device | 10 10
libraries/staticarrays | 1 1
libraries/cusolver/sparse_factorizations | 36 36
core/pool | 10 10
core/apiutils | 6 6
libraries/cufft | 368 368
libraries/cusparse/reduce | 0
base/examples | 5 5
libraries/curand | 1 1
libraries/cusparse/broadcast | 65 65
libraries/cusparse/linalg | 94 94
base/linalg | 39 39
base/kernelabstractions | 2441 4 2445
base/exceptions | 21 21
core/profile | 21 21
FAILURE
Error in testset core/device/intrinsics/atomics:
Error During Test at none:1
Got exception outside of a @test
ProcessExitedException(3)
Error in testset libraries/cublas:
Error During Test at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\libraries\cublas.jl:1794
Got exception outside of a @test
LLVM error: Cannot select: 0x102f5c58760: v8bf16 = X86ISD::VFPROUND 0x102f5c4e5f0, C:\Users\admin\.julia\packages\BFloat16s\u3WQc\src\bfloat16.jl:158 @[ broadcast.jl:673 @[ broadcast.jl:646 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ]
0x102f5c4e5f0: v8f64,ch = load<(load (s512) from %ir.uglygep212, align 8, !tbaa !194, !alias.scope !196, !noalias !81)> 0x102cbe96150, 0x102f5c523a0, undef:i64, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c523a0: i64 = add 0x102f5c58610, Constant:i64<-192>, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c58610: i64 = add 0x102f5c4da90, 0x102f5c58680, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c4da90: i64,ch = CopyFromReg 0x102cbe96150, Register:i64 %53, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c51d80: i64 = Register %53
0x102f5c58680: i64 = shl 0x102f5c58530, Constant:i8<3>, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c58530: i64,ch = CopyFromReg 0x102cbe96150, Register:i64 %54, essentials.jl:916 @[ array.jl:919 @[ multidimensional.jl:702 @[ broadcast.jl:639 @[ broadcast.jl:670 @[ broadcast.jl:645 @[ broadcast.jl:605 @[ broadcast.jl:968 @[ simdloop.jl:77 @[ broadcast.jl:967 @[ broadcast.jl:920 @[ broadcast.jl:892 @[ broadcast.jl:867 ] ] ] ] ] ] ] ] ] ] ] ]
0x102f5c4e580: i64 = Register %54
0x102f5c52480: i8 = Constant<3>
0x102f5c58ed0: i64 = Constant<-192>
0x102f5c59100: i64 = undef
In function: julia_materialize_244170
Stacktrace:
[1] handle_error(reason::Cstring)
@ LLVM C:\Users\admin\.julia\packages\LLVM\wMjUU\src\core\context.jl:194
[2] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\libraries\cublas.jl:1806 [inlined]
[3] macro expansion
@ C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Test\src\Test.jl:1700 [inlined]
[4] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\libraries\cublas.jl:1795 [inlined]
[5] macro expansion
@ C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Test\src\Test.jl:1700 [inlined]
[6] top-level scope
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\libraries\cublas.jl:669
[7] include
@ .\sysimg.jl:38 [inlined]
[8] #11
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\runtests.jl:87 [inlined]
[9] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:63 [inlined]
[10] macro expansion
@ C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Test\src\Test.jl:1700 [inlined]
[11] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:63 [inlined]
[12] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\src\utilities.jl:35 [inlined]
[13] macro expansion
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\src\memory.jl:829 [inlined]
[14] top-level scope
@ C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:62
[15] eval
@ .\boot.jl:430 [inlined]
[16] runtests(f::Function, name::String, time_source::Symbol)
@ Main C:\Users\admin\.julia\packages\CUDA\2kjXI\test\setup.jl:74
[17] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::@Kwargs{})
@ Base .\essentials.jl:1054
[18] invokelatest(::Any, ::Any, ::Vararg{Any})
@ Base .\essentials.jl:1051
[19] (::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}})()
@ Distributed C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:287
[20] run_work_thunk(thunk::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
@ Distributed C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:70
[21] (::Distributed.var"#109#111"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
@ Distributed C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:287
ERROR: LoadError: Test run finished with errors
in expression starting at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\runtests.jl:501
ERROR: Package CUDA errored during testing
Stacktrace:
[1] pkgerror(msg::String)
@ Pkg.Types C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Pkg\src\Types.jl:68
[2] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, julia_args::Cmd, test_args::Cmd, test_fn::Nothing, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool)
@ Pkg.Operations C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Pkg\src\Operations.jl:2102
[3] test
@ C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Pkg\src\Operations.jl:1987 [inlined]
[4] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, test_fn::Nothing, julia_args::Cmd, test_args::Cmd, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool, kwargs::@Kwargs{io::IOContext{IO}})
@ Pkg.API C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Pkg\src\API.jl:475
[5] test(pkgs::Vector{Pkg.Types.PackageSpec}; io::IOContext{IO}, kwargs::@Kwargs{})
@ Pkg.API C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Pkg\src\API.jl:159
[6] test(pkgs::Vector{Pkg.Types.PackageSpec})
@ Pkg.API C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Pkg\src\API.jl:148
[7] test
@ C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Pkg\src\API.jl:147 [inlined]
[8] test(pkg::String)
@ Pkg.API C:\Users\admin\AppData\Local\julias\julia-1.11\share\julia\stdlib\v1.11\Pkg\src\API.jl:146
[9] top-level scope
@ REPL[5]:1
To reproduce
The Minimal Working Example (MWE) for this bug:
using Pkg
using CUDA
Pkg.test("CUDA")
Manifest.toml
...
[[deps.CUDA]]
deps = ["AbstractFFTs", "Adapt", "BFloat16s", "CEnum", "CUDA_Driver_jll", "CUDA_Runtime_Discovery", "CUDA_Runtime_jll", "Crayons", "DataFrames", "ExprTools", "GPUArrays", "GPUCompiler", "KernelAbstractions", "LLVM", "LLVMLoopInfo", "LazyArtifacts", "Libdl", "LinearAlgebra", "Logging", "NVTX", "Preferences", "PrettyTables", "Printf", "Random", "Random123", "RandomNumbers", "Reexport", "Requires", "SparseArrays", "StaticArrays", "Statistics", "demumble_jll"]
git-tree-sha1 = "e0725a467822697171af4dae15cec10b4fc19053"
uuid = "052768ef-5323-5732-b1bb-66c8b64840ba"
version = "5.5.2"
weakdeps = ["ChainRulesCore", "EnzymeCore", "SpecialFunctions"]
[deps.CUDA.extensions]
ChainRulesCoreExt = "ChainRulesCore"
EnzymeCoreExt = "EnzymeCore"
SpecialFunctionsExt = "SpecialFunctions"
[[deps.CUDA_Driver_jll]]
deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
git-tree-sha1 = "ccd1e54610c222fadfd4737dac66bff786f63656"
uuid = "4ee394cb-3365-5eb0-8335-949819d2adfc"
version = "0.10.3+0"
[[deps.CUDA_Runtime_Discovery]]
deps = ["Libdl"]
git-tree-sha1 = "33576c7c1b2500f8e7e6baa082e04563203b3a45"
uuid = "1af6417a-86b4-443c-805f-a4643ffb695f"
version = "0.3.5"
[[deps.CUDA_Runtime_jll]]
deps = ["Artifacts", "CUDA_Driver_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"]
git-tree-sha1 = "e43727b237b2879a34391eeb81887699a26f8f2f"
uuid = "76a88914-d11a-5bdc-97e0-2f5a05c973a2"
version = "0.15.3+0"
[[deps.CUDNN_jll]]
deps = ["Artifacts", "CUDA_Runtime_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"]
git-tree-sha1 = "9851af16a2f357a793daa0f13634c82bc7e40419"
uuid = "62b44479-cb7b-5706-934f-f13b2eb2e645"
version = "9.4.0+0"
...
[[deps.GPUArrays]]
deps = ["Adapt", "GPUArraysCore", "LLVM", "LinearAlgebra", "Printf", "Random", "Reexport", "Serialization", "Statistics"]
git-tree-sha1 = "62ee71528cca49be797076a76bdc654a170a523e"
uuid = "0c68f7d7-f131-5f86-a1c3-88cf8149b2d7"
version = "10.3.1"
...
[[deps.GPUCompiler]]
deps = ["ExprTools", "InteractiveUtils", "LLVM", "Libdl", "Logging", "PrecompileTools", "Preferences", "Scratch", "Serialization", "TOML", "TimerOutputs", "UUIDs"]
git-tree-sha1 = "1d6f290a5eb1201cd63574fbc4440c788d5cb38f"
uuid = "61eb1bfa-7361-4325-ad38-22787b887f55"
version = "0.27.8"
...
[[deps.LLVM]]
deps = ["CEnum", "LLVMExtra_jll", "Libdl", "Preferences", "Printf", "Unicode"]
git-tree-sha1 = "d422dfd9707bec6617335dc2ea3c5172a87d5908"
uuid = "929cbde3-209d-540e-8aea-75f648917ca0"
version = "9.1.3"
weakdeps = ["BFloat16s"]
[deps.LLVM.extensions]
BFloat16sExt = "BFloat16s"
...
[[deps.cuDNN]]
deps = ["CEnum", "CUDA", "CUDA_Runtime_Discovery", "CUDNN_jll"]
git-tree-sha1 = "4b3ac62501ca73263eaa0d034c772f13c647fba6"
uuid = "02a925ec-e4fe-4b08-9a7e-0d78e3d38ccd"
version = "1.4.0"
...
Expected behavior
The test suite is expected to pass without errors.
Version info
Details on Julia:
versioninfo()
Julia Version 1.11.0
Commit 501a4f25c2 (2024-10-07 11:40 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 32 × AMD Ryzen 9 7950X 16-Core Processor
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, znver4)
Threads: 1 default, 0 interactive, 1 GC (on 32 virtual cores)
Environment:
JULIA_PKG_SERVER = https://mirrors.cernet.edu.cn/julia/
Details on CUDA:
CUDA.versioninfo()
CUDA runtime 12.1, artifact installation
CUDA driver 12.6
NVIDIA driver 560.94.0
CUDA libraries:
- CUBLAS: 12.1.3
- CURAND: 10.3.2
- CUFFT: 11.0.2
- CUSOLVER: 11.4.5
- CUSPARSE: 12.1.0
- CUPTI: 2023.1.1 (API 18.0.0)
- NVML: 12.0.0+560.94
Julia packages:
- CUDA: 5.5.2
- CUDA_Driver_jll: 0.10.3+0
- CUDA_Runtime_jll: 0.15.3+0
Toolchain:
- Julia: 1.11.0
- LLVM: 16.0.6
Preferences:
- CUDA_Runtime_jll.version: 12.1
1 device:
0: NVIDIA GeForce RTX 4060 Ti (sm_89, 14.959 GiB / 15.996 GiB available)
Additional context
This bug report was submitted as instructed by the error output shown in the Julia REPL:
...
From worker 3:
From worker 3: Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
From worker 3: Exception: EXCEPTION_BREAKPOINT at 0x7ffedc5503ec -- _ZNK4llvm15NVPTXAsmPrinter24getPTXFundamentalTypeStrB5cxx11EPNS_4TypeEb at C:\Users\admin\AppData\Local\julias\julia-1.11\bin\libLLVM-16jl.dll (unknown line)
From worker 3: in expression starting at C:\Users\admin\.julia\packages\CUDA\2kjXI\test\core\device\intrinsics\atomics.jl:5
...
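Since the crash on worker 3 is reported while running test\core\device\intrinsics\atomics.jl, a narrower reproduction than the full suite may be possible. The sketch below is untested on the affected machine. It assumes that CUDA.jl's test runner treats positional test arguments as test-name filters, using the same names as the first column of the test table (for example core/initialization). The helpers `failing_test_args` and `run_failing_test` are hypothetical names introduced here for illustration; they are not part of CUDA.jl or Pkg.

```julia
using Pkg

# Test name derived from the path in the crash log: relative to CUDA's
# test/ directory, with forward slashes and without the ".jl" suffix.
const FAILING_TEST = "core/device/intrinsics/atomics"

# Build the arguments forwarded to CUDA's test/runtests.jl.
# Assumption: a positional argument selects the tests whose name starts with it.
failing_test_args(name::AbstractString = FAILING_TEST) = [String(name)]

# Run only the selected test file. Call this manually from the REPL;
# it needs the same GPU setup as the full Pkg.test("CUDA") run.
run_failing_test() = Pkg.test("CUDA"; test_args = failing_test_args())
```

Calling `run_failing_test()` should then exercise only the atomics tests, which might shorten the log considerably if the same EXCEPTION_BREAKPOINT appears.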