cublas
Here are 81 public repositories matching this topic...
Safe Rust wrapper around the CUDA toolkit.
Updated Jun 6, 2024 - Rust
Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.
Updated Nov 7, 2023 - Cuda
Hooks CUDA-related dynamic libraries using automated code generation tools.
Updated Dec 12, 2023 - C
Code for benchmarking GPU performance based on cublasSgemm and cublasHgemm.
Updated May 20, 2022 - Cuda
Algorithms implemented in CUDA, plus resources about GPGPU.
Updated Jan 18, 2022 - Cuda
Real-time GPU Beamformer for DSA110 written in C/CUDA
Updated May 21, 2019 - Jupyter Notebook
Basel morphable face model mesh and texture generator using GPU.
Updated Sep 14, 2020 - C
Bandicoot: C++ library for GPU linear algebra & scientific computing - https://coot.sourceforge.io
Updated Jul 19, 2023
Deep learning library using the GPU (CUDA/cuBLAS).
Updated Sep 18, 2021 - Elixir
This repository targets performance optimization of the OpenCL gemm function. It compares the sgemm performance of several BLAS libraries (clBLAS, clBLAST, MIOpenGemm, Intel MKL on CPU, and cuBLAS on CUDA) across different matrix sizes, vendors' hardware, and operating systems. It works out of the box: x86_64 binaries are provided for MSVC, MinGW, and Linux (CentOS).
Updated Mar 28, 2019 - C
Uses Tensor Cores to compute back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instructions.
Updated Nov 3, 2023 - Cuda