prompt-caching

Here are 147 public repositories matching this topic...

esengine / DeepSeek-Reasonix

DeepSeek-native AI coding agent for your terminal. Engineered around prefix-cache stability — leave it running.

agent cli typescript terminal tui developer-tools ink r1 tool-use agent-framework ai-agent llm prompt-caching deepseek ai-coding coding-agent

Updated Jun 14, 2026
Go

AdieLaine / multi-agent-reasoning

The Multi-Agent Reasoning framework creates an interactive chatbot where AI agents collaborate via structured reasoning and Swarm Integration for optimal answers. Simulating a team that discusses, debates, and refines responses, it enables complex problem-solving and precise results. Now with Prompt Caching to reduce latency and costs.

python chatbot multi-agent openai swarm agent-based-modeling reasoning o1 prompt-caching

Updated Jan 23, 2025
Python

Siddhant-K-code / distill

Context intelligence layer for LLM agents: persistent memory with write-time dedup, sensitivity tagging, conflict detection, and hierarchical decay. ~12ms. No LLM calls. MIT.

go golang compression openai developer-tools deterministic deduplication ai-agents pinecone rag vector-database qdrant llm prompt-caching anthropic context-optimization

Updated May 9, 2026
Go

liushuangls / go-anthropic

Anthropic Claude API wrapper for Go

go golang ai vision streaming-api claude tool-use llm prompt-caching anthropic claude-ai claude-api function-calling

Updated Jun 12, 2026
Go

cablate / claude-code-research

Independent research on Claude Code internals, Claude Agent SDK, and related tooling.

research mcp reverse-engineering prompt-caching system-prompt claude-code token-optimization claude-agent-sdk

Updated Mar 31, 2026
HTML

flightlesstux / prompt-caching

Automatic prompt caching for Claude Code. Cuts token costs by up to 90% on repeated file reads, bug fix sessions, and long coding conversations - zero config.

typescript mcp developer-tools claude llm cost-reduction prompt-caching anthropic claude-code token-optimization

Updated Jun 12, 2026
TypeScript

OnlyTerp / prompt-cache-skills

Drop-in prompt-caching fixes for the LLM agent harness you use. Point your AI coding agent at this repo and it ships the patches.

opencode openai cline prompt-engineering aider prompt-caching anthropic llm-agents ai-skills roo-code claude-code agents-md

Updated May 28, 2026
Python

montevive / autocache

🚀 Autocache: an intelligent Anthropic API cache proxy. Automatically injects cache-control fields into Claude API requests to reduce costs by up to 90% and latency by up to 85%. Works as a transparent drop-in replacement for popular AI platforms like n8n, Flowise, Make.com, LangChain, and LlamaIndex, with no code changes required.

agent ai proxy cache claude n8n prompt-caching flowise agentic-ai

Updated Feb 2, 2026
Go

GPTSafe / PromptGuard

Build production-ready apps for GPT using Node.js & TypeScript

prompt openai gpt gpt-2 gpt-3 prompt-engineering chatgpt prompt-attack prompt-injection prompt-caching prompt-hardening

Updated May 8, 2023
TypeScript

alxsuv / pino

Anthropic API reverse proxy with prompt-cache injection and request body transforms

nodejs reverse-proxy developer-tools claude cost-optimization zero-dependencies prompt-caching anthropic llm-ops claude-code

Updated May 21, 2026
JavaScript

avasol / galadriel-public

A self-hosted Claude agent that actually remembers — verbatim memory palace at zero retrieval cost, Discord + web UI, and 90%-cheaper repeat calls via prompt caching.

agent cache discord-bot self-hosted memory-management claude ai-agent prompt-engineering prompt-caching anthropic mempalace

Updated Jun 14, 2026
Python

agynio / claude-map-reduce-memory

Global, unlimited persistent memory for Claude Code agents. Context-activated hints injected automatically via hooks using scatter-gather map-reduce.

cli map-reduce ai-agents claude llm prompt-caching agent-memory claude-code cmr-memory

Updated Apr 15, 2026
TypeScript

kevinhermawan / swift-llm-chat-anthropic

Interact with Anthropic and Anthropic-compatible chat completion APIs in a simple and elegant way. Supports vision, prompt caching, and more.

swift vision claude prompt-caching anthropic claude-3-opus claude-3-haiku claude-3-5-sonnet

Updated Nov 3, 2024
Swift

ankitvirdi4 / awesome-llm-cost

Tools, libraries, papers, and patterns for reducing the cost of running large language models in production.

awesome gemini openai awesome-list quantization finops cost-engineering llm prompt-caching anthropic llm-observability llm-cost llm-routing llm-caching ai-cost

Updated Jun 5, 2026

Kernel-Dirichlet / CoTARAG

Agentic-AI framework without the headaches

sql multi-modal indexing-engine rag llm chain-of-thought prompt-caching prompt-engineering-for-programmers hallucination-detection agentic-framework agentic-workflow hallucination-mitigation agentic-rag agentic-ai

Updated Jan 19, 2026
Python

allytag / claude_code

Run Claude Code with OpenRouter models while keeping tools, file editing, bash, MCP, VS Code support, smart caching, model switching, and safe rollback updates.

nodejs macos zsh mcp vscode-extension cost-optimization apple-silicon local-proxy prompt-caching openrouter claude-code ai-coding-agent coding-agent model-router anthropic-claude-cli safe-updates

Updated May 19, 2026
JavaScript

tigerless-labs / cost-xray

See what Claude Code and Codex actually send to the API — and what each part costs.

proxy observability prompt-caching cost-tracking claude-code codex-cli token-counting

Updated Jun 11, 2026
Python

ParthJadhav / image-read-cache

Agent Skill that caches LLM image descriptions as XMP metadata inside image files, reducing token usage by ~92% on repeated reads. Works with 30+ compatible agents.

opencode image-processing cursor image-cache xmp-metadata ai-agents llm prompt-caching agent-skills claude-code token-optimization

Updated Mar 31, 2026
Python

pleasedodisturb / awesome-llm-token-optimization

A curated list of strategies, tools, papers, and resources for reducing LLM token costs and improving efficiency in production.

Updated Jun 14, 2026

atelier-ws / atelier

Open-source context runtime and governance layer for coding agents. MCP server + Python SDK (Anthropic, LangChain, OpenAI Agents, Google ADK) with shared procedures, failure rescue, loop detection, cost tracking, and cross-vendor model routing across Claude Code, Codex, Copilot, opencode, and any MCP host.

Updated Jun 12, 2026
Python
