@pallas-inference

Pallas Inference Server

Pallas is an LLM inference server.

Popular repositories

  1. vllm (Public)

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python

  2. triton-inference-server (Public)

    Forked from triton-inference-server/server

    The Triton Inference Server provides an optimized cloud and edge inferencing solution.

    Python

  3. unilm-yoco (Public)

    Forked from microsoft/unilm

    Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

    Python

Repositories

Showing 3 of 3 repositories
