Route to OpenAI, Anthropic, Google, Groq, Together, and DeepSeek through a single endpoint. OpenAI-compatible. BYO keys. Caching and retries built in.
```bash
# Same API, any model — just change the model name
$ curl -X POST agent-llm.167.148.41.86.nip.io/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role":"user","content":"Hello!"}],
    "provider_key": "sk-..."
  }'
```

```json
{
  "choices": [{ "message": { "content": "Hello! How can I help?" } }],
  "provider": "openai",
  "model": "gpt-4o-mini",
  "usage": { "total_tokens": 25 },
  "cached": false
}
```
One integration, every provider. Switch models without changing code.
Drop-in replacement at /v1/chat/completions. Change one URL in your existing code and any provider works instantly.
Use your own provider API keys. Pass them per request or store them securely. We never log your keys permanently.
Identical requests with temperature=0 are cached for 5 minutes. Save money and reduce latency on repeated calls.
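A cache like this is typically keyed on the exact request body. The sketch below is illustrative only (the service's actual keying and storage are not documented here): identical temperature=0 payloads hash to the same key, and entries expire after 5 minutes.

```python
import hashlib
import json
import time

# Illustrative sketch of deterministic response caching. Identical
# temperature=0 payloads map to one cache key; entries expire after 5 minutes.
CACHE_TTL_SECONDS = 300
_cache = {}

def cache_key(payload: dict) -> str:
    # Canonical JSON (sorted keys) so semantically identical requests match.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def cached_response(payload: dict):
    # Only deterministic requests (temperature=0) are cacheable.
    if payload.get("temperature") != 0:
        return None
    entry = _cache.get(cache_key(payload))
    if entry and time.time() - entry["at"] < CACHE_TTL_SECONDS:
        return entry["response"]
    return None

def store_response(payload: dict, response: dict):
    if payload.get("temperature") == 0:
        _cache[cache_key(payload)] = {"at": time.time(), "response": response}
```

Because the key is a hash of the canonical JSON, reordering fields in your request still hits the same cache entry, while any change to the messages or model misses it.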
Failed requests are retried once with a 1-second delay. Handles transient errors without retry logic in your code.
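The stated policy (one retry after a 1-second delay, then give up) can be sketched as a small wrapper. This is illustrative, not the gateway's actual code:

```python
import time

def with_single_retry(call, delay_seconds=1.0):
    """Retry a failed call exactly once after a fixed delay,
    then re-raise if it fails again (sketch of the stated policy)."""
    try:
        return call()
    except Exception:
        time.sleep(delay_seconds)
        return call()
```

A transient failure on the first attempt succeeds on the retry; a second failure propagates to the caller.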
Track token usage per provider. The /health endpoint shows total requests, tokens, and per-provider breakdowns.
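If you poll /health for usage stats, a small helper can turn the response into a readable summary. The field names used here (total_requests, total_tokens, providers) are assumptions based on the description above, not a documented schema:

```python
def summarize_health(health: dict) -> str:
    """Format a usage summary from a /health response.
    Field names are assumed, not a documented schema."""
    lines = [
        f"requests: {health['total_requests']}",
        f"tokens:   {health['total_tokens']}",
    ]
    # Per-provider breakdown, sorted for stable output.
    for provider, stats in sorted(health.get("providers", {}).items()):
        lines.append(f"  {provider}: {stats['tokens']} tokens")
    return "\n".join(lines)
```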
OpenAI GPT-4o, Claude, Gemini, Groq Llama, Together AI Mixtral, DeepSeek. All through one API, one auth pattern.
25+ models across 6 providers. All accessible through one endpoint.
| Provider | Models | Key Format |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo, o1, o1-mini | sk-... |
| Anthropic | claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5 | sk-ant-... |
| Google | gemini-2.0-flash, gemini-2.0-pro, gemini-1.5-pro | AIza... |
| Groq | llama-3.3-70b, llama-3.1-8b, mixtral-8x7b, gemma2-9b | gsk_... |
| Together AI | meta-llama/Llama-3.3-70B, mistralai/Mixtral-8x22B | tok_... |
| DeepSeek | deepseek-chat, deepseek-reasoner | sk-... |
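The table implies the gateway can infer the provider from the model name alone. One way such dispatch can work is prefix matching; the sketch below mirrors the table and is purely hypothetical, not the service's actual routing code:

```python
# Hypothetical prefix-based routing: infer the provider (and expected
# key format) from the model name, mirroring the table above.
ROUTES = [
    ("gpt-", "openai", "sk-"),
    ("o1", "openai", "sk-"),
    ("claude-", "anthropic", "sk-ant-"),
    ("gemini-", "google", "AIza"),
    ("llama-", "groq", "gsk_"),
    ("mixtral-", "groq", "gsk_"),
    ("gemma", "groq", "gsk_"),
    ("meta-llama/", "together", "tok_"),
    ("mistralai/", "together", "tok_"),
    ("deepseek-", "deepseek", "sk-"),
]

def route(model: str):
    """Return (provider, key_prefix) for a model name, or raise."""
    for prefix, provider, key_prefix in ROUTES:
        if model.startswith(prefix):
            return provider, key_prefix
    raise ValueError(f"unknown model: {model}")
```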
Simple REST API with OpenAI compatibility
Switch models by changing one string. Same code, any provider.
```bash
# Use any model — just change "model"
curl -X POST http://agent-llm.167.148.41.86.nip.io/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role":"user","content":"Explain DeFi in one paragraph"}],
    "provider_key": "sk-ant-..."
  }'

# OpenAI-compatible endpoint (drop-in replacement)
curl -X POST http://agent-llm.167.148.41.86.nip.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Provider-Key: sk-..." \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'

# List all available models
curl http://agent-llm.167.148.41.86.nip.io/api/models
```
```python
import requests

API = "http://agent-llm.167.148.41.86.nip.io"

# Chat with any model
resp = requests.post(f"{API}/api/chat", json={
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Explain AI agents"}],
    "provider_key": "sk-..."
})
data = resp.json()
print(data["choices"][0]["message"]["content"])
print(f"Tokens: {data['usage']['total_tokens']}")

# Switch to Claude — same code, different model
resp = requests.post(f"{API}/api/chat", json={
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Explain AI agents"}],
    "provider_key": "sk-ant-..."
})

# Compare outputs, same API
print(resp.json()["choices"][0]["message"]["content"])
```
```javascript
const API = "http://agent-llm.167.148.41.86.nip.io";

// Works as OpenAI drop-in replacement
const res = await fetch(`${API}/v1/chat/completions`, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "X-Provider-Key": "sk-..."
  },
  body: JSON.stringify({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello" }]
  })
});
const { choices, usage } = await res.json();
console.log(choices[0].message.content);
console.log(`Tokens: ${usage.total_tokens}`);

// Switch to Groq for fast inference
const fast = await fetch(`${API}/api/chat`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama-3.3-70b-versatile",
    messages: [{ role: "user", content: "What is DeFi?" }],
    provider_key: "gsk_..."
  })
}).then(r => r.json());
```
You pay the provider for tokens. We charge per API call.
35+ APIs built for AI agents. All with free tiers. All accepting USDC.