
strymread.state_space() causes a TensorFlow-related error #50

Open
rahulbhadani opened this issue May 17, 2023 · 1 comment

@rahulbhadani
Member

System: Ubuntu 20.04
Tensorflow Version: 2.12.0
Strym version: 0.4.22

Calling the state_space() function of a strymread object created from the data file 2021-03-08-22-35-14_2T3MWRFVXLW056972_CAN_Messages.csv causes the following error:

2023-05-17 16:09:09.997727: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-17 16:09:10.731797: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-05-17 16:09:11.468970: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-05-17 16:09:11.497757: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-05-17 16:09:11.498114: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-05-17 16:09:11.499155: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-05-17 16:09:11.499497: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-05-17 16:09:11.499775: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-05-17 16:09:12.309774: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-05-17 16:09:12.310122: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-05-17 16:09:12.310369: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-05-17 16:09:12.310594: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1927 MB memory:  -> device: 0, name: NVIDIA GeForce GTX 1650 Ti with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5
2023-05-17 16:09:14.080877: I tensorflow/compiler/xla/service/service.cc:169] XLA service 0x7f7e10041040 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-05-17 16:09:14.080905: I tensorflow/compiler/xla/service/service.cc:177]   StreamExecutor device (0): NVIDIA GeForce GTX 1650 Ti with Max-Q Design, Compute Capability 7.5
2023-05-17 16:09:14.103792: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2023-05-17 16:09:14.308471: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:417] Loaded runtime CuDNN library: 8.3.2 but source was compiled with: 8.6.0.  CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2023-05-17 16:09:14.340032: E tensorflow/compiler/xla/status_macros.cc:57] INTERNAL: RET_CHECK failure (tensorflow/compiler/xla/service/gpu/gpu_compiler.cc:618) dnn != nullptr 
*** Begin stack trace ***
        tsl::CurrentStackTrace[abi:cxx11]()

        xla::status_macros::MakeErrorStream::Impl::GetStatus()
        xla::gpu::GpuCompiler::OptimizeHloModule(xla::HloModule*, stream_executor::StreamExecutor*, stream_executor::DeviceMemoryAllocator*, xla::gpu::GpuTargetConfig const&, xla::AutotuneResults const*)
        xla::gpu::GpuCompiler::RunHloPasses(std::unique_ptr<xla::HloModule, std::default_delete<xla::HloModule> >, stream_executor::StreamExecutor*, xla::Compiler::CompileOptions const&)
        xla::Service::BuildExecutable(xla::HloModuleProto const&, std::unique_ptr<xla::HloModuleConfig, std::default_delete<xla::HloModuleConfig> >, xla::Backend*, stream_executor::StreamExecutor*, xla::Compiler::CompileOptions const&, bool)
        xla::LocalService::CompileExecutables(xla::XlaComputation const&, absl::lts_20220623::Span<xla::Shape const* const>, xla::ExecutableBuildOptions const&)
        xla::LocalClient::Compile(xla::XlaComputation const&, absl::lts_20220623::Span<xla::Shape const* const>, xla::ExecutableBuildOptions const&)
        tensorflow::XlaDeviceCompilerClient::BuildExecutable(tensorflow::XlaCompiler::Options const&, tensorflow::XlaCompilationResult const&)
        tensorflow::DeviceCompiler<xla::LocalExecutable, xla::LocalClient>::CompileStrict(tensorflow::DeviceCompilationClusterSignature const&, tensorflow::XlaCompiler::CompileOptions const&, tensorflow::XlaCompiler::Options const&, std::vector<tensorflow::XlaArgument, std::allocator<tensorflow::XlaArgument> > const&, tensorflow::NameAttrList const&, tensorflow::DeviceCompilationCache<xla::LocalExecutable>::Value, tensorflow::DeviceCompiler<xla::LocalExecutable, xla::LocalClient>::CompileScope, tensorflow::OpKernelContext*, tensorflow::DeviceCompilationProfiler*, tsl::mutex*)
        tensorflow::DeviceCompiler<xla::LocalExecutable, xla::LocalClient>::CompileImpl(tensorflow::XlaCompiler::CompileOptions const&, tensorflow::XlaCompiler::Options const&, tensorflow::NameAttrList const&, std::vector<tensorflow::XlaArgument, std::allocator<tensorflow::XlaArgument> > const&, tensorflow::DeviceCompiler<xla::LocalExecutable, xla::LocalClient>::CompileScope, tensorflow::DeviceCompileMode, tensorflow::OpKernelContext*, tensorflow::DeviceCompilationProfiler*, tensorflow::XlaCompilationResult const**, xla::LocalExecutable**)

        tensorflow::XlaLocalLaunchBase::ComputeAsync(tensorflow::OpKernelContext*, std::function<void ()>)
        tensorflow::BaseGPUDevice::ComputeAsync(tensorflow::AsyncOpKernel*, tensorflow::OpKernelContext*, std::function<void ()>)


        Eigen::ThreadPoolTempl<tsl::thread::EigenEnvironment>::WorkerLoop(int)
        std::_Function_handler<void (), tsl::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&)


        clone
*** End stack trace ***

2023-05-17 16:09:14.340256: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:362 : INTERNAL: RET_CHECK failure (tensorflow/compiler/xla/service/gpu/gpu_compiler.cc:618) dnn != nullptr 
2023-05-17 16:09:14.340277: I tensorflow/core/common_runtime/executor.cc:1197] [/job:localhost/replica:0/task:0/device:GPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INTERNAL: RET_CHECK failure (tensorflow/compiler/xla/service/gpu/gpu_compiler.cc:618) dnn != nullptr 
         [[{{node StatefulPartitionedCall_12}}]]
[... the CuDNN version-mismatch error (E cuda_dnn.cc:417), the RET_CHECK failure, and the identical stack trace repeat 13 more times with only the timestamps changing; repetitions omitted ...]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/strym/strymread.py", line 1984, in state_space
    cdiff = strymread.differentiate(c, method="AE")
  File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/strym/strymread.py", line 2464, in differentiate
    model.fit( time, message, epochs=epochs, verbose=verbose)
  File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 52, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: Graph execution error:

Detected at node 'StatefulPartitionedCall_12' defined at (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/strym/strymread.py", line 1984, in state_space
      cdiff = strymread.differentiate(c, method="AE")
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/strym/strymread.py", line 2464, in differentiate
      model.fit( time, message, epochs=epochs, verbose=verbose)
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/engine/training.py", line 1685, in fit
      tmp_logs = self.train_function(iterator)
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/engine/training.py", line 1284, in train_function
      return step_function(self, iterator)
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/engine/training.py", line 1268, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/engine/training.py", line 1249, in run_step
      outputs = model.train_step(data)
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/engine/training.py", line 1054, in train_step
      self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 543, in minimize
      self.apply_gradients(grads_and_vars)
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 1174, in apply_gradients
      return super().apply_gradients(grads_and_vars, name=name)
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 650, in apply_gradients
      iteration = self._internal_apply_gradients(grads_and_vars)
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 1200, in _internal_apply_gradients
      return tf.__internal__.distribute.interim.maybe_merge_call(
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 1250, in _distributed_apply_gradients_fn
      distribution.extended.update(
    File "/home/refulgent/anaconda3/envs/LEC/lib/python3.10/site-packages/keras/optimizers/optimizer.py", line 1245, in apply_grad_to_update_var
      return self._update_step_xla(grad, var, id(self._var_key(var)))
Node: 'StatefulPartitionedCall_12'
RET_CHECK failure (tensorflow/compiler/xla/service/gpu/gpu_compiler.cc:618) dnn != nullptr 
         [[{{node StatefulPartitionedCall_12}}]] [Op:__inference_train_function_1637]
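
For reference, a minimal sketch of the call sequence that triggers this. The DBC file path below is hypothetical and the constructor signature follows the strym documentation; adjust both to your setup:

```python
from strym import strymread

# Hypothetical DBC path; the drive file is the one named in this report
r = strymread(csvfile="2021-03-08-22-35-14_2T3MWRFVXLW056972_CAN_Messages.csv",
              dbcfile="toyota_rav4_2019.dbc")

# state_space() internally calls strymread.differentiate(..., method="AE"),
# which builds and fits a Keras model; the fit step hits the XLA/CuDNN error
state = r.state_space()
```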

A possible cause is the autoencoder-based interpolation used internally: state_space() calls strymread.differentiate(c, method="AE"), which builds and fits a Keras model on the GPU. Note also that the underlying XLA error is a CuDNN version mismatch (runtime 8.3.2 vs. compiled against 8.6.0), so the CUDA/CuDNN environment may be a factor. This needs further investigation.
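
As an unverified workaround sketch: since the underlying failure is the CuDNN mismatch on the GPU path, hiding the GPU from TensorFlow before anything is imported should bypass the XLA GPU compiler entirely (at the cost of training the autoencoder on the CPU):

```python
# Unverified workaround: hide all GPUs so TensorFlow falls back to the CPU
# and the XLA/CuDNN version check is never exercised. This must run before
# tensorflow (or strym, which imports it) is first imported.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

from strym import strymread
```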

rahulbhadani self-assigned this May 17, 2023
rahulbhadani added the bug label May 17, 2023
@rahulbhadani
Member Author

Verified that the issue does not occur with TensorFlow version 2.4.1.
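
Pinning TensorFlow (e.g. pip install "tensorflow==2.4.1" in a fresh environment) is therefore a possible stopgap, though note that 2.4.1 supports Python 3.6–3.8 only, so the Python 3.10 environment shown in the traceback would need to be downgraded as well.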
