NERSC Perlmutter Triton + TensorRT-LLM Demo

Setup

To copy the model files and download the container image, run the following on a Perlmutter login node:

./deploy.sh setup

Once setup is complete, start an interactive Slurm job, replacing <account> with your project account:

salloc -N 1 -C gpu -G 4 --gpu-bind=closest -t 01:00:00 -q interactive -A <account>

Inside the Slurm job, run:

./deploy.sh run

Open notebook.ipynb to connect to the LLM container and test it.
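
Before opening the notebook, it can help to confirm that the Triton server inside the container is accepting requests. The following is a minimal sketch, not part of this repository: it polls Triton's standard readiness endpoint (`/v2/health/ready`) using only the Python standard library. The address `http://localhost:8000` is an assumption based on Triton's default HTTP port; deploy.sh or env.cfg may expose a different host or port, and `health_url` and `triton_ready` are hypothetical helper names.

```python
import urllib.error
import urllib.request

# Assumption: Triton's HTTP endpoint is reachable here. Port 8000 is Triton's
# default HTTP port; the values used by deploy.sh / env.cfg may differ.
BASE_URL = "http://localhost:8000"


def health_url(base_url: str) -> str:
    """Build the readiness URL from a server base URL."""
    return base_url.rstrip("/") + "/v2/health/ready"


def triton_ready(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Return True if the server answers the readiness probe with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response: treat as not ready.
        return False


if __name__ == "__main__":
    print("ready" if triton_ready() else "not ready")
```

Run this from the same node as the Slurm job (or adjust `BASE_URL` to the compute node's hostname). If it reports "not ready", the model may still be loading, or the port assumption may not match your configuration.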
