Google's fast multimodal model with a 1M-token context window.
No signup required — try it as a guest. 30,000 free tokens every day once you sign up.
Gemini 2.0 Flash is Google DeepMind's fast multimodal model, notable for its 1M-token context window (you can paste a whole novel and still have room). Native vision, audio input, and tool use.
1,000,000-token context window (among the largest available)
Native multimodal — text, images, audio-in
Very fast streaming
Strong at grounded tasks when given enough context
A few hundred tokens per typical chat turn — well within every plan's daily floor.
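As a rough worked example of that claim, the sketch below divides the 30,000 free daily tokens quoted above by an assumed turn size. The 300-token figure is only an illustrative reading of "a few hundred", and `turns_per_day` is a hypothetical helper, not part of any Faceb.ai SDK.

```python
def turns_per_day(daily_tokens: int, tokens_per_turn: int) -> int:
    """Whole chat turns that fit in a daily token allowance."""
    if tokens_per_turn <= 0:
        raise ValueError("tokens_per_turn must be positive")
    return daily_tokens // tokens_per_turn


FREE_DAILY_TOKENS = 30_000     # free daily allowance quoted on this page
ASSUMED_TOKENS_PER_TURN = 300  # assumption: "a few hundred" taken as 300

print(turns_per_day(FREE_DAILY_TOKENS, ASSUMED_TOKENS_PER_TURN))  # prints 100
```

Longer prompts, such as a pasted document, use far more tokens per turn, so the real number of turns depends on what you send.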
The API is OpenAI-compatible, so the official OpenAI SDKs work as a drop-in once you change the base URL and API key. The same Faceb.ai tokens cover chat and API.
# curl: stream a chat completion
curl https://api.faceb.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-faceb-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.0-flash-exp",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
# Python: stream a chat completion with the OpenAI SDK
from openai import OpenAI

# Point the SDK at the Faceb.ai endpoint instead of api.openai.com
client = OpenAI(
    base_url="https://api.faceb.ai/v1",
    api_key="sk-faceb-YOUR_KEY",
)

stream = client.chat.completions.create(
    model="google/gemini-2.0-flash-exp",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

# Print each text fragment as it arrives
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
// JavaScript (ES module, so top-level await works): stream a chat completion
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.faceb.ai/v1",
  apiKey: "sk-faceb-YOUR_KEY",
});

const stream = await client.chat.completions.create({
  model: "google/gemini-2.0-flash-exp",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true,
});

// Write each text fragment to stdout as it arrives
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
Also works for image generation (image-output model slugs return image_url content parts) and web search (add "web_search": true to the payload). Same endpoint, same wallet.
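A minimal sketch of the web search option, using only the Python standard library. It assumes the "web_search" flag sits at the top level of the JSON body, which is how the sentence above reads; `build_search_request` is a hypothetical helper, and the request is built but not sent.

```python
import json
import urllib.request

API_URL = "https://api.faceb.ai/v1/chat/completions"


def build_search_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat request with web search switched on."""
    payload = {
        "model": "google/gemini-2.0-flash-exp",
        "messages": [{"role": "user", "content": prompt}],
        # Assumption: the flag is a top-level field of the request body.
        "web_search": True,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_search_request("What changed in Python 3.13?", "sk-faceb-YOUR_KEY")
print(json.loads(req.data)["web_search"])
# To actually send it: urllib.request.urlopen(req).read()
```

With the OpenAI Python SDK, the same non-standard field can be passed through the `extra_body` argument of `client.chat.completions.create`, for example `extra_body={"web_search": True}`.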
Gemini 2.0 Flash is Google DeepMind's fast multimodal model — released Dec 2024. Notable for its 1M-token context window and native audio/image input.
It is best at anything that benefits from huge context: reading PDFs, entire codebases, long transcripts. It is also solid at visual reasoning.
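To get a feel for what fits in a 1M-token window, here is a rough sketch. It assumes about 4 characters per token, a common rule of thumb for English text that the real tokenizer will not match exactly. The helper names and the 8,000-token reply reserve are illustrative choices, not Faceb.ai settings.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # context size quoted on this page
ASSUMED_CHARS_PER_TOKEN = 4        # assumption: rough rule of thumb for English


def estimate_tokens(text: str) -> int:
    """Very rough token estimate; the real tokenizer will differ."""
    return -(-len(text) // ASSUMED_CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(text: str, reserve_for_reply: int = 8_000) -> bool:
    """True if the estimated prompt size plus a reply reserve fits the window."""
    return estimate_tokens(text) + reserve_for_reply <= CONTEXT_WINDOW_TOKENS


sample = "x" * 2_000_000  # stand-in for a 2-million-character document
print(estimate_tokens(sample), fits_in_context(sample))
```

Treat the result as a pre-flight check only; the token counts reported by the API are the authoritative numbers.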
No Google subscription is needed. Faceb.ai gives you Gemini 2.0 Flash alongside every other model on one plan.
Every account gets a daily token floor that covers a healthy amount of Gemini 2.0 Flash chat for free. Heavy users top up or grab a Pro plan.
Against GPT-4o, Flash wins on context (1M vs 128k tokens) and latency. GPT-4o edges ahead on deep reasoning and creative writing. Gemini is often the cheaper pick.
Gemini 2.0 Flash accepts audio: native audio input is one of its headline features. Drop in an audio file and it transcribes and reasons in one pass.
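Over the API, one plausible way to send audio is the OpenAI chat format's `input_audio` content part. Whether the Faceb.ai endpoint accepts this part for this model is an assumption here, not something this page confirms, and `audio_message` is a hypothetical helper. The sketch only builds the message.

```python
import base64


def audio_message(audio_bytes: bytes, question: str, fmt: str = "wav") -> dict:
    """Build a user message carrying a text question plus base64 audio."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            # Assumption: the endpoint accepts OpenAI-style input_audio parts.
            {"type": "input_audio", "input_audio": {"data": encoded, "format": fmt}},
        ],
    }


# Placeholder bytes; in practice read them from a real file,
# e.g. open("meeting.wav", "rb").read()
msg = audio_message(b"placeholder-audio-bytes", "Summarise this recording.")
print(msg["content"][1]["type"])
```

The resulting dictionary would go in the `messages` list of a chat completion request, in place of a plain text message.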
Gemini Pro is in our picker too. Pro is larger and smarter but slower and pricier than Flash.
API access is available. Model slug: google/gemini-2.0-flash-exp. The OpenAI-compatible endpoint is https://api.faceb.ai/v1.
For code tasks, Gemini 2.0 Flash is improving fast but is still a half-step behind Claude 3.5 Sonnet and GPT-4o. With its huge context window, though, it is great at explaining code you paste in.
The knowledge cutoff varies by Gemini version; for 2.0 Flash it is roughly June 2024. Google refreshes its models periodically.
Your chats are not used for training. We route through paid Google providers with an explicit no-training flag set on each request, and your chat history lives only on your account.
New Google models are added automatically: our catalog auto-fetches from the upstream aggregator, so they appear in the picker on release day.
Your Faceb.ai tokens work for every model — switch per message, no extra subscriptions.
OpenAI's flagship multimodal model for text, vision, and code.
Chat with GPT-4o →
Anthropic's best balance of quality and cost, and a coder favourite.
Chat with Claude 3.5 Sonnet →
Meta's best open-weight model, which runs on community hosts.
Chat with Llama 3.3 70B →
One subscription covers every frontier model, and you can switch between them per message. No extra API keys, no extra bills.