Rust + CUDA = Fast and simple inference library from scratch
Requirements: a Linux machine with CUDA 12.x, cuBLAS, and Rust installed.
You need a GPU with at least the sm_80 compute capability (Ampere or newer). (This is hardcoded for now.)
WIP
Our first goal is to support bfloat16 Llama 3.2 1B inference.
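For context on the bfloat16 target: bf16 is simply the top 16 bits of an IEEE-754 f32 (same 8-bit exponent, truncated mantissa), which is why it is a popular inference format. A minimal standalone sketch of the conversion, not code from this library:

```rust
// Sketch only: bf16 keeps the f32 sign and 8-bit exponent, but just
// 7 explicit mantissa bits, so converting is a 16-bit shift.
fn f32_to_bf16_bits(x: f32) -> u16 {
    // Truncating conversion; real libraries usually round-to-nearest-even.
    (x.to_bits() >> 16) as u16
}

fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}

fn main() {
    let x = 3.14159_f32;
    let y = bf16_bits_to_f32(f32_to_bf16_bits(x));
    // The round trip is lossy but close: bf16 has ~2-3 decimal digits
    // of precision while keeping the full f32 dynamic range.
    println!("{x} -> {y}");
    assert!((x - y).abs() < 0.01);
}
```

This truncating variant drops the low mantissa bits outright; production kernels typically round to nearest even instead, which halves the worst-case error.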