The simplest Python client for free access to top-tier AI models via public endpoint
freellm is a lightweight, easy-to-use Python package that gives you instant access to powerful models like GPT-4.1 Nano, DeepSeek, Gemini Flash Lite, and Claude 3 Haiku — completely free, no API key, no registration required.
It works by communicating directly with the public web interface, delivering high-quality responses with perfect formatting and minimal setup.
- Zero setup — no accounts, no keys
- Simple
.ask("your message")interface - Four powerful models:
gpt(default),deepseek,google,claude - Optional conversation memory (sends full history when
limitis enabled) - Per-conversation message limit with automatic reset
- Streaming support (token-by-token output)
- Perfect handling of newlines and spacing (no stuck words or visible
\n) - Clean and intuitive CLI
- Minimal dependencies (only
requests)
pip install freellmRequires Python 3.8+
from freellm import FreeLLM
# One-shot query (GPT-4.1 Nano by default)
print(FreeLLM().ask("Tell me a joke"))
# Use DeepSeek
print(FreeLLM(model="deepseek").ask("Explain quantum computing in simple terms"))
# With memory + limit
bot = FreeLLM(model="claude", limit=20)
bot.ask("My name is Alice")
print(bot.ask("What is my name?"))freellm # GPT-4.1 Nano, no memory
freellm --model deepseek # Use DeepSeek
freellm --model google # Gemini 2.0 Flash Lite
freellm --model claude # Claude 3 Haiku
freellm --limit 15 # Enable memory (up to 15 user messages)
freellm --stream # Token-by-token streaming
freellm --model deepseek --limit 20 --stream # All features combined
freellm "Hello, who are you?" # One-shot message# Persistent chat with Claude
bot = FreeLLM(model="claude", limit=10)
bot.ask("Explain how neural networks work")
bot.ask("Now give a real-world analogy")
bot.ask("Make it even simpler for a child")# Quick stateless queries with different models
questions = ["Capital of Japan?", "Best way to learn Python?", "Write a haiku about rain"]
models = ["gpt", "deepseek", "google"]
for q, m in zip(questions, models):
print(f"[{m.upper()}]: {FreeLLM(model=m).ask(q)}\n")freellm --helpusage: freellm [-h] [--model {gpt,deepseek,google,claude}] [--limit LIMIT] [--stream] [message]
FreeLLM - Free access to DeepSeek, Gemini, Claude & GPT
positional arguments:
message Send a single message and exit
options:
-h, --help show this help message
--model {gpt,deepseek,google,claude}
Model: gpt (default), deepseek, google, claude
--limit LIMIT Enable memory: max user messages before conversation reset
--stream Show response token-by-token (streaming)
The underlying service is a free public endpoint and does not officially store conversation state.
When you set --limit or limit=N, FreeLLM sends the full conversation history with every request — this provides the best possible context retention.
Memory works reliably for short-to-medium conversations (up to ~20–30 messages depending on length) and may vary slightly with server load.
IMApurbo
GitHub: @IMApurbo
MIT License
Enjoy frontier-level AI models for free — no barriers, no costs! 🚀
Made with ❤️ by IMApurbo