OpenAI's fastest, cheapest frontier model — great default.
No signup required — try it as a guest. 30,000 free tokens every day once you sign up.
GPT-4o mini is OpenAI's small-but-capable model: roughly 25× cheaper per token than full GPT-4o, with quality that's plenty for everyday tasks. It's the default pick when you want OpenAI quality at a fraction of the cost.
Dramatically cheaper per message than GPT-4o
Still handles vision (at reduced fidelity)
Sub-second first token in most cases
128k context window
A typical message uses around 200 to 450 tokens. At about 400 tokens each, the 30k free daily tokens cover around 75 messages a day on the free plan.
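The arithmetic behind that estimate can be checked with a small sketch. The helper name `messages_per_day` is illustrative and not part of any Faceb.ai API, and the 400-token figure is an assumed point inside the quoted 200 to 450 range.

```python
DAILY_FREE_TOKENS = 30_000  # free daily allowance quoted above


def messages_per_day(daily_tokens: int, tokens_per_message: int) -> int:
    """Return how many whole messages fit in a daily token budget."""
    if tokens_per_message <= 0:
        raise ValueError("tokens_per_message must be positive")
    return daily_tokens // tokens_per_message


if __name__ == "__main__":
    # Low end, assumed typical value, and high end of the quoted range.
    for cost in (200, 400, 450):
        count = messages_per_day(DAILY_FREE_TOKENS, cost)
        print(f"{cost} tokens/message -> {count} messages/day")
```

Short messages near 200 tokens stretch the allowance to about 150 a day, while long ones near 450 bring it down to about 66.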
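OpenAI-compatible. The same Faceb.ai tokens cover chat and API. The endpoint is a drop-in replacement for OpenAI's when you use the OpenAI SDK: change the base URL and the API key, and keep the rest of your code.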
curl https://api.faceb.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-faceb-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
from openai import OpenAI

# Point the official OpenAI SDK at the Faceb.ai endpoint.
client = OpenAI(
    base_url="https://api.faceb.ai/v1",
    api_key="sk-faceb-YOUR_KEY",
)

stream = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

# Print each text delta as it arrives; skip chunks that carry no choices.
for chunk in stream:
    if not chunk.choices:
        continue
    print(chunk.choices[0].delta.content or "", end="", flush=True)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.faceb.ai/v1",
  apiKey: "sk-faceb-YOUR_KEY",
});

const stream = await client.chat.completions.create({
  model: "openai/gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
Also works for image generation (image-output model slugs return image_url content parts) and web search (add "web_search": true to the payload). Same endpoint, same wallet.
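As a rough illustration of those two features, the sketch below builds a chat payload with the `web_search` flag and pulls image URLs out of a reply. The top-level placement of `web_search` and the `{"type": "image_url", "image_url": {"url": ...}}` part shape are assumptions modeled on the OpenAI content-part format, and both helper names are hypothetical, so check the Faceb.ai API reference before relying on them.

```python
import json


def build_payload(prompt: str, model: str = "openai/gpt-4o-mini",
                  web_search: bool = False) -> dict:
    """Build a chat completions request body, optionally enabling web search."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if web_search:
        # Assumption: the flag sits at the top level of the JSON payload.
        payload["web_search"] = True
    return payload


def extract_image_urls(content) -> list[str]:
    """Collect URLs from image_url content parts; plain-string content has none."""
    if not isinstance(content, list):
        return []
    urls = []
    for part in content:
        if isinstance(part, dict) and part.get("type") == "image_url":
            image = part.get("image_url")
            # Assumption: the URL lives under image_url.url, as in OpenAI's format.
            if isinstance(image, dict) and image.get("url"):
                urls.append(image["url"])
    return urls


if __name__ == "__main__":
    print(json.dumps(build_payload("What changed in Python this year?",
                                   web_search=True), indent=2))
```

With the OpenAI Python SDK, a non-standard field like this would typically be passed through the `extra_body` argument of `client.chat.completions.create(...)` rather than as a named parameter.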
GPT-4o mini is OpenAI's smaller, faster, much cheaper sibling to full GPT-4o — released in July 2024 to replace GPT-3.5-turbo. It keeps multimodal support.
Use mini for summarisation, day-to-day Q&A, drafting emails, and quick code explanations. Reach for full GPT-4o when the task needs deep reasoning or careful code architecture.
You get 30k tokens free every day. A GPT-4o mini message costs about 200 to 450 tokens, so that covers roughly 65 to 150 free messages a day, depending on message length.
For coding, GPT-4o mini handles boilerplate and small edits well. For architecture-level refactors, Claude 3.5 Sonnet or full GPT-4o are better picks.
It accepts image input at reduced fidelity compared to full GPT-4o, which is still plenty for screenshots and diagrams.
The context window is 128,000 tokens, same as full GPT-4o.
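A rough pre-flight check against that limit might look like the sketch below. It assumes about 4 characters per token, a common rule of thumb for English text rather than a real tokenizer count, and the function names and the reply reserve are illustrative.

```python
CONTEXT_WINDOW = 128_000  # tokens, from the figure above
CHARS_PER_TOKEN = 4       # assumption: rough average for English text


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(prompt: str, reserve_for_reply: int = 4_000) -> bool:
    """Check that the prompt plus a reserved reply budget fits the window."""
    return estimate_tokens(prompt) + reserve_for_reply <= CONTEXT_WINDOW


if __name__ == "__main__":
    document = "word " * 50_000
    print(estimate_tokens(document), fits_in_context(document))
```

An estimate like this is only good for an early warning; for exact counts, a real tokenizer is the safer choice.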
GPT-4o mini and Claude Haiku are both small and fast. Claude Haiku has a larger context window (200k vs 128k); GPT-4o mini has better multimodal support. Try both and pick.
GPT-4o mini is available through the API: any API key from /account/api/ can call it. Model slug: openai/gpt-4o-mini.
GPT-4o mini is cheaper because it's a much smaller model: fewer parameters mean cheaper inference. The trade-off is slightly weaker reasoning, but the cost difference is 20-30×.
Its knowledge cutoff is October 2023 in practice, the same as full GPT-4o, with some refreshes via OpenAI's training pipeline.
GPT-4o mini inherits OpenAI's moderation. Pair it with a clear system prompt and a PII-stripping pass if you're logging transcripts.
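A PII-stripping pass before logging can start as simply as the sketch below. The regex patterns are deliberately simplified assumptions that catch common email and phone formats only; a production system would need a broader, tested rule set or a dedicated library. `redact` is a hypothetical helper name.

```python
import re

# Simplified patterns: these are illustrative, not exhaustive.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


if __name__ == "__main__":
    line = "Reach me at jane.doe@example.com or +1 (555) 010-2030."
    print(redact(line))  # prints: Reach me at [EMAIL] or [PHONE].
```

Running the redaction on each transcript line before it is written keeps raw contact details out of the log store entirely.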
GPT-4o mini stays available here as long as OpenAI serves it. If they deprecate it (they announced one-year notice policies), we'll surface a replacement in the picker before it goes dark.
Your Faceb.ai tokens work for every model — switch per message, no extra subscriptions.
OpenAI's flagship multimodal model: text, vision, and code.
Chat with GPT-4o →
Anthropic's fastest, cheapest model, with near-instant replies.
Chat with Claude 3 Haiku →
Google's fast multimodal model with a 1M-token context window.
Chat with Gemini 2.0 Flash →
One subscription covers every frontier model, and you can switch between them per message. No extra API keys, no extra bills.