Running Persistent LLMs for an Entire Campus

A technical talk on operating always-on, self-hosted LLM inference for Arizona State University: the software stack we run, why our HPC playbook did not fit, the realities of non-NVIDIA hardware, and the fairshare admission layer we are building to replace per-key rate limiting.

Presented at CarCC 2026 by Johnathan Lee — Sr. HPC System Architect, Arizona State University.

View the talk

Live slides: https://thediymaker.github.io/slides-carcc2026/

Press F for fullscreen and P for presenter view.

What it covers

Why self-host inference — fixed-cost capacity, data locality, and a single OpenAI-compatible API for the whole campus.
The production stack — Kubernetes (k3s + Rancher), vLLM on Gaudi2, LiteLLM, HAProxy, CloudNativePG, and an in-house account/usage portal.
Why batch HPC did not fit — a persistent multi-tenant service breaks every assumption Slurm makes.
The Gaudi2 reality — we were handed non-NVIDIA hardware and made it work; the stack no longer cares what is underneath it.
Real-time observability — live node grid, per-model inference telemetry, and the Rancher control plane across ~200 nodes.
Where LiteLLM stops and Obleth begins — token-measured, weighted fairshare admission for a saturated, self-hosted cluster.
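
As a rough illustration of what "token-measured, weighted fairshare admission" can mean, here is a minimal Python sketch. It is not Obleth's implementation, and it makes several assumptions: each tenant has a fixed weight, usage is counted in tokens after a request completes, and the next request admitted comes from whichever waiting tenant has the lowest tokens-per-weight ratio. The names `Tenant`, `FairshareAdmitter`, `submit`, `admit_next`, and `record_usage` are hypothetical and exist only for this example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    weight: float              # hypothetical share weight; larger means a bigger share
    tokens_used: float = 0.0   # tokens consumed so far (prompt + completion)

    @property
    def normalized_usage(self) -> float:
        # Usage scaled by weight: the tenant furthest below its share has the lowest value.
        return self.tokens_used / self.weight


class FairshareAdmitter:
    """Illustrative admission gate: one FIFO queue per tenant, lowest normalized usage first."""

    def __init__(self, tenants):
        self.tenants = {t.name: t for t in tenants}
        self.pending = {t.name: deque() for t in tenants}

    def submit(self, tenant_name, request_id):
        # Queue a request; nothing is admitted until admit_next() is called.
        self.pending[tenant_name].append(request_id)

    def admit_next(self):
        # Only tenants with waiting requests compete for the free slot.
        waiting = [self.tenants[n] for n, q in self.pending.items() if q]
        if not waiting:
            return None
        # Ties are broken by name so the order is deterministic.
        chosen = min(waiting, key=lambda t: (t.normalized_usage, t.name))
        return chosen.name, self.pending[chosen.name].popleft()

    def record_usage(self, tenant_name, tokens):
        # Charge the tenant for tokens actually measured on the completed request.
        self.tenants[tenant_name].tokens_used += tokens
```

With a weight of 3 for one tenant and 1 for another, and equal-sized requests from both, the gate admits roughly three requests from the first tenant for every one from the second while both have work queued. A per-key rate limit, by contrast, caps each key independently and does not look at what other tenants are consuming.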

Links

Obleth fairshare gateway: https://obleth.com
More from the presenter: https://github.com/thediymaker
