llama-server

Here are 39 public repositories matching this topic...

lordmathis / llamactl

Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard.

self-hosted mlx openai-api llm llamacpp llama-cpp vllm llm-inference localllm localllama llama-server llm-router mlx-lm

Updated Jun 8, 2026
Go

ArkaneFans / servllama

1-Click LLM Server on Your Phone, no Termux needed!

android chatbot web-ui llama flutter llm llama-cpp ollama llama-server llama-cpp-ui

Updated Jun 7, 2026
Dart

hwpoison / llamacpp-terminal-chat

A lightweight terminal chat interface for the llama.cpp server, written in C++, with many features and Windows/Linux support.

chat roleplay llama teminal-application llamacpp mistral-7b llama-server

Updated Mar 31, 2026
C++

yatesdr / go-llm-proxy

Lightweight proxy for LLM

golang self-hosted homelab codex openai-api llm openai-proxy llama-cpp vllm llm-proxy llama-server claude-code responses-api qwen-code claude-code-cli

Updated Apr 17, 2026
Go

cuolm / pi-sbx-llamacpp

Run Pi coding agent isolated in a Docker Sandbox microVM with a local llama-server as the inference backend

sbx docker-sandbox microvm ai-agent llama-cpp local-llm gguf localllama llama-server pi-agent pi-coding-agent

Updated Jun 11, 2026

lynxai-team / goinfer

Local LLM proxy, DevOps friendly

inference inference-server inference-api openai-api llm openaiapi llamacpp llama-cpp local-llm localllm local-ai llm-proxy llama-api llama-server llm-router language-model-api local-lm local-llm-integration

Updated Apr 28, 2026
Go

thilomichael / llama-buddy

CLI wrapper for llama.cpp providing an ollama-like experience

python cli huggingface llm llama-cpp local-llm gguf llama-server

Updated Apr 27, 2026
Python

biliops / MyLLaMA

Performance tuning for large-model inference services

docker debian kml bisheng podman kunpeng llama-server qwen3

Updated Jun 10, 2026
Jinja

pkeffect / llama-swap-sync

A robust, production-ready Python toolkit to automate the synchronization between a directory of .gguf model files and a llama-swap config.yaml

python llama-cpp gguf llama-server llama-swap

Updated Nov 15, 2025
Python

mallard1983 / openclaw-kvcache-proxy

FastAPI proxy that strips volatile fields from OpenClaw requests to dramatically improve llama-server KV cache hit rates (~22× faster prompt eval)

proxy fastapi kv-cache llm prompt-caching llama-cpp local-llm llama-server openclaw amd-vulkan

Updated Feb 23, 2026
Python

Llama-Recipe-Manager / llama-recipe-manager

One place to store and manage all your recipes for Llama Server

desktop-app rust ai svelte tauri tauri-app llama-cpp local-llm svelte5 llama-server

Updated Jun 8, 2026
Svelte

Parth3930 / pern

Offline-First Local AI Desktop & Mobile Agent.

rust typescript offline-first discord-bot desktop-assistant local-ai llama-server react-19 tauri-v2 whatsapp-agent mobile-assistant

Updated Jun 13, 2026
Rust

nlkli / lachat

minimal CLI client for llama-server

chat cli cli-app llama gpt llm chatgpt llamacpp llama-server

Updated May 17, 2026
Rust

byang37 / llama-runner

A lightweight desktop GUI for llama-server: multi-model routing, per-model presets, live I/O recording. Built with Go. Supports Windows, macOS, and Linux.

llm llama-cpp local-ai gguf llama-server llama-gui

Updated Mar 16, 2026
HTML

NoSkillGuy / gemma-on-mac-mlx-vs-llama.cpp

Benchmark Gemma 4 E2B on Apple Silicon: MLX (mlx-lm) vs GGUF (llama-server), with TTFT, tokens/sec, and memory.

python macos benchmark machine-learning metal inference gemma mlx apple-silicon llama-cpp gguf llama-server mlx-lm gemma-4

Updated Apr 6, 2026
Python

Root1V / axonium-sdk

A production-grade Python SDK for llama-server that streamlines authentication, token rotation, observability, and PII masking, helping AI architects ship secure, traceable LLM systems with enterprise-ready guardrails.

sdk ai openai llama observability governance pii llm generative-ai langchain llama-cpp langfuse llama-server langgraph ai-architecture

Updated Feb 28, 2026
Python

alasgarovs / llamaorch

LlamaOrch is a simple Bash-based CLI orchestrator for the llama.cpp server.

orchestration llm llamacpp llm-tools local-ai llama-server llamacpp-server

Updated Apr 14, 2026
Shell

WajihZaman / local-ai-rag-assistant

Enterprise-grade local RAG API assistant backend running on traditional Azure CPU infrastructure. Serves a quantized Llama 3.2 GGUF model via llama-server with offline ChromaDB ingestion and stateful MySQL transaction logging.

mysql sqlalchemy python3 software-architecture azure-deployment backend-api rag audit-logging fastapi vector-database cpu-inference llamacpp local-ai gguf llama-server retrieval-augmented-generation-rag chromadb-vector-search performance-telemetry

Updated May 31, 2026
Python

space-kitty-o / gemi

Claude-Code-style CLI for your own local LLM fleet. Multi-agent delegation, MCP, hooks, autopilot, 100+ tools (file/shell/web/security/free APIs) — all running on your GPU, no cloud calls.

Updated Apr 30, 2026
Python

witong42 / llamapad

Terminal UI for launching llama-server/llama.cpp with auto-discovered local GGUF models

go macos tui terminal-ui bubbletea llama-cpp gguf llama-server

Updated Apr 8, 2026
Go
