Meta's best open-weight model — runs on community hosts.
No signup required — 10,000 session credits for guests. 50,000 free credits on account creation.
Llama 3.3 70B is Meta's flagship open-weight model — the weights are public, so it's served by a dozen community hosts who compete on price and speed. Quality rivals GPT-4o mini and Claude Haiku at a fraction of the cost.
Open weights — self-host if you want
Very cheap per message (a few hundred credits)
Solid general-purpose chat quality
128k context
A few hundred credits per message. Good for running long chats without watching the meter.
Llama 3.3 70B Instruct is Meta's flagship open-weight chat model — released December 2024. Weights are public, so it's served by multiple community hosts who compete on price and speed.
Not quite — it's smaller. For day-to-day drafting, explaining, and Q&A, most users can't tell the difference, and it costs 10-20× less per message.
Yes — the weights are public on HuggingFace. Using it through Faceb.ai just saves you the GPU cost. A single 70B model needs ~160GB of VRAM to run unquantized.
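The ~160GB figure can be sanity-checked with a back-of-envelope: 70B parameters at 2 bytes each is about 130 GiB for the weights alone, before KV cache, activations, and framework overhead. A quick sketch (weights only; the quantized rows assume int8/int4 at 1 and 0.5 bytes per parameter):

```python
def weight_vram_gb(params: float, bytes_per_param: float) -> float:
    """GiB needed to hold the model weights alone at a given precision."""
    return params * bytes_per_param / 1024**3

# ~70B parameters; ignores KV cache, activations, and runtime overhead
for precision, bpp in [("fp16/bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{precision}: ~{weight_vram_gb(70e9, bpp):.0f} GiB")
```

Quantizing to int4 brings the weights under ~35 GiB, which is why community hosts can serve 70B models on far less hardware than the unquantized figure suggests.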
128,000 tokens — matches GPT-4o and GPT-4o mini.
The 70B instruct model is text-only. Meta has a separate Llama 3.2 Vision series for multimodal — the picker has both.
3.3 70B matches or beats 3.1 405B on most benchmarks at a fraction of the compute cost. If the picker shows 3.1 405B, pick 3.3 70B instead unless you need the slightly broader world knowledge.
Free-tier community hosts exist and sometimes return $0 cost; paid tiers are very cheap (a few hundred credits per message).
Decent — handles small scripts and typical edits well. For serious code work, Claude 3.5 Sonnet or DeepSeek V3 are better picks.
Yes. Model slug: meta-llama/llama-3.3-70b-instruct. Point your OpenAI-compatible SDK at the API base https://api.faceb.ai/v1.
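A minimal sketch of hitting the endpoint with plain stdlib HTTP, for anyone not using an SDK. The base URL and model slug are from the answer above; the FACEB_API_KEY environment variable name is an assumption for illustration.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.faceb.ai/v1"
MODEL = "meta-llama/llama-3.3-70b-instruct"

def build_request(prompt: str) -> urllib.request.Request:
    """Build a chat-completions POST request for the Llama 3.3 70B slug."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('FACEB_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Uncomment to send for real (needs a valid key):
# with urllib.request.urlopen(build_request("Hello!")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the endpoint is OpenAI-compatible, the same payload shape works through the official `openai` package by setting `base_url` and `api_key` on the client.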
The knowledge cutoff is roughly December 2023, with some later refreshes.
No — we route through third-party hosts, not Meta directly. None of our hosts should be training on API traffic; if a specific upstream reserves that right, their terms are linked in the model details.
As soon as Meta ships it, our upstream aggregator adds it and it shows up in the picker. Set-and-forget.
Your Faceb.ai credits work for every model — switch per message, no extra subscriptions.
The open-weight model punching way above its price tag.
Chat with DeepSeek V3 →
OpenAI's fastest, cheapest frontier model — great default.
Chat with GPT-4o mini →
Google's fast multimodal model with a 1M-token context window.
Chat with Gemini 2.0 Flash →
One subscription covers every frontier model — switch between them per message. No extra API keys, no extra bills.