fan-niu

hank fan-niu

Pinned

vllm-project/vllm Public

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 38.7k stars · 5.8k forks

NVIDIA/TensorRT-LLM Public

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ · 9.5k stars · 1.1k forks

mlc-ai/mlc-llm Public

Universal LLM Deployment Engine with ML Compilation

Python · 20k stars · 1.7k forks