OneDiff: An out-of-the-box acceleration library for diffusion models. (Python; updated May 22, 2024)
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
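A minimal sketch of the CUDA programming model described above: a kernel (here a hypothetical `vecAdd`) is launched across many GPU threads, each handling one array element. Unified memory (`cudaMallocManaged`) is used only to keep the sketch short; real applications often manage host/device copies explicitly.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical example kernel: each GPU thread computes one output element.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) c[i] = a[i] + b[i];                  // guard against overrun
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    // Unified memory: accessible from both CPU and GPU.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch enough 256-thread blocks to cover all n elements.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();  // wait for the GPU to finish

    printf("c[0] = %f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

Compiled with `nvcc`, the grid/block launch configuration is what lets the same scalar kernel body run across millions of elements in parallel.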
A Differentiable THB-spline module implemented in JAX and PyTorch
Containers for machine learning
ppl.cv is a high-performance image processing library from openPPL, supporting x86 and CUDA platforms.
A modular ZK (zero-knowledge) backend accelerated by GPUs.
Open Voice OS Status Page
The open-source serverless GPU container runtime.
A high-throughput and memory-efficient inference and serving engine for LLMs
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
A Vulkan renderer.
A retargetable MLIR-based machine learning compiler and runtime toolkit.
Recreating PyTorch from scratch (C/C++, CUDA and Python, with GPU support and automatic differentiation!)
PyTorch domain library for recommendation systems.
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
A fast, scalable, high-performance gradient boosting on decision trees library, used for ranking, classification, regression, and other machine learning tasks in Python, R, Java, and C++. Supports computation on CPU and GPU.
RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
Created by NVIDIA; released June 23, 2007.