Chat with Llama 3.3 70B

Meta's best open-weight model — runs on community hosts.

No signup required — try it as a guest. 30,000 free tokens every day once you sign up.

Provider: Meta
Model slug: meta-llama/llama-3.3-70b-instruct
Typical cost: A few hundred tokens per message. Good for running long chats without watching t…
Availability: On Faceb.ai · chat + API

About Llama 3.3 70B

Llama 3.3 70B is Meta's flagship open-weight model — the weights are public, so it's served by a dozen community hosts who compete on price and speed. Quality rivals GPT-4o mini and Claude Haiku at a fraction of the cost.

What it's good at

1. Open weights: self-host if you want
2. Very cheap per message (a few hundred tokens)
3. Solid general-purpose chat quality
4. 128k context

Pricing on Faceb.ai

A few hundred tokens per message. Good for running long chats without watching the meter.

Use Llama 3.3 70B from the API

OpenAI-compatible. Same Faceb.ai tokens cover chat and API. Drop-in replacement for the OpenAI SDK.

cURL:

curl https://api.faceb.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-faceb-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'

Python:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.faceb.ai/v1",
    api_key="sk-faceb-YOUR_KEY",
)

stream = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

JavaScript (Node.js):

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.faceb.ai/v1",
  apiKey: "sk-faceb-YOUR_KEY",
});

const stream = await client.chat.completions.create({
  model: "meta-llama/llama-3.3-70b-instruct",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

Also works for image generation (image-output model slugs return image_url content parts) and web search (add "web_search": true to the payload). Same endpoint, same wallet.
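As a rough illustration of the two extras mentioned above, the sketch below builds a request body with the "web_search" flag and pulls URLs out of image_url content parts. The helper names are hypothetical, and the exact shape of an image part (an OpenAI-style {"type": "image_url", "image_url": {"url": ...}} entry) is an assumption, so check the API docs before relying on it.

```python
import json

MODEL = "meta-llama/llama-3.3-70b-instruct"


def build_payload(prompt: str, web_search: bool = False) -> dict:
    """Build a chat completions request body (hypothetical helper)."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    if web_search:
        # Flag named on this page: add "web_search": true to the payload.
        payload["web_search"] = True
    return payload


def image_urls(message: dict) -> list[str]:
    """Collect URLs from image_url content parts of a response message.

    Assumes OpenAI-style content parts; plain string content yields [].
    """
    content = message.get("content")
    if not isinstance(content, list):
        return []
    urls = []
    for part in content:
        if isinstance(part, dict) and part.get("type") == "image_url":
            image = part.get("image_url")
            url = image.get("url") if isinstance(image, dict) else image
            if url:
                urls.append(url)
    return urls


# Show the JSON body that would be POSTed to /v1/chat/completions.
print(json.dumps(build_payload("What changed in Python 3.13?", web_search=True), indent=2))
```

With the OpenAI Python SDK, a non-standard field like this would normally be passed through the extra_body argument of client.chat.completions.create, for example extra_body={"web_search": True}.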

Full API docs → · Get an API key →

Frequently asked — Llama 3.3 70B

What is Llama 3.3 70B?

Llama 3.3 70B Instruct is Meta's flagship open-weight chat model — released December 2024. Weights are public, so it's served by multiple community hosts who compete on price and speed.

Is Llama 3.3 as good as GPT-4o?

Not quite — it's smaller. For day-to-day drafting, explaining, and Q&A, most users can't tell the difference, and it costs 10-20× less per message.

Can I self-host Llama 3.3?

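Yes. The weights are public on Hugging Face. At 16-bit precision the 70B weights alone take about 140GB, so plan on roughly 160GB of VRAM to run the model unquantized once the KV cache and runtime overhead are included. Using it through Faceb.ai just saves you the GPU cost.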
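The memory figure can be sanity-checked with a back-of-the-envelope calculation. This sketch counts the weights only (parameters × bytes per parameter); the function name is illustrative, and the gap between the weights and the ~160GB quoted above is presumably KV cache and runtime overhead, which depend on your serving setup.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return params_billions * bits_per_param / 8


# 70B parameters at common precisions. Weights only: no KV cache, no activations.
for label, bits in [("16-bit (unquantized)", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(70, bits):.0f} GB for weights")
```

Quantizing to 8 or 4 bits cuts the weight footprint to roughly a half or a quarter, usually at some cost in output quality.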

What's Llama 3.3's context window?

128,000 tokens — matches GPT-4o and GPT-4o mini.
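For very long chats, one simple way to stay inside the window is to drop the oldest messages first. The sketch below is only an illustration: the helper names are hypothetical, and the 4-characters-per-token estimate is a crude assumption, not the model's real tokenizer.

```python
CONTEXT_WINDOW = 128_000  # tokens, as stated above


def estimate_tokens(text: str) -> int:
    """Crude estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def trim_history(messages: list[dict], budget: int = CONTEXT_WINDOW,
                 reserve: int = 4_000) -> list[dict]:
    """Keep the most recent messages that fit in budget minus a reply reserve."""
    limit = budget - reserve
    kept, used = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(msg["content"])
        if used + cost > limit:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order
```

The reserve leaves room for the model's reply; 4,000 tokens is an arbitrary placeholder you would tune for your own use.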

Does Llama 3.3 support images?

The 70B instruct model is text-only. Meta has a separate Llama 3.2 Vision series for multimodal — the picker has both.

How does Llama 3.3 compare to Llama 3.1 405B?

Meta reports that 3.3 70B performs on par with 3.1 405B on many benchmarks at a fraction of the compute cost. If the picker shows 3.1 405B, pick 3.3 70B instead unless you need the larger model's slightly broader world knowledge.

Is Llama 3.3 free on Faceb.ai?

You get 30,000 free tokens every day once you sign up. A Llama 3.3 message typically costs only a few hundred tokens, so the free allowance covers dozens of messages a day.
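To see what that allowance means in practice, here is a quick estimate. The per-message costs tried below are assumptions standing in for "a few hundred tokens"; actual usage depends on prompt and reply length.

```python
DAILY_FREE_TOKENS = 30_000  # free daily allowance stated on this page


def messages_per_day(tokens_per_message: int,
                     daily_tokens: int = DAILY_FREE_TOKENS) -> int:
    """Whole messages that fit in the daily allowance."""
    if tokens_per_message <= 0:
        raise ValueError("tokens_per_message must be positive")
    return daily_tokens // tokens_per_message


# Assumed per-message costs, for illustration only.
for cost in (200, 300, 500):
    print(f"{cost} tokens/message -> {messages_per_day(cost)} messages/day")
```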

Is it any good at coding?

Decent. It handles small scripts and typical edits well. For serious code work, Claude 3.5 Sonnet or DeepSeek V3 is a better pick.

Can I call Llama from the API?

Yes. Use the model slug meta-llama/llama-3.3-70b-instruct and point any OpenAI-compatible SDK at the base URL https://api.faceb.ai/v1.

What's Llama 3.3's knowledge cutoff?

Roughly December 2023, with some refreshes.

Does Meta train on my prompts here?

No — we route through third-party hosts, not Meta directly. None of our hosts should be training on API traffic; if a specific upstream reserves that right, their terms are linked in the model details.

What about Llama 4?

As soon as Meta ships it, our upstream aggregator adds it and it shows up in the picker. Set-and-forget.

Or try a different model

Your Faceb.ai tokens work for every model — switch per message, no extra subscriptions.

Ready to chat?

One subscription covers every frontier model — switch between them per message. No extra API keys, no extra bills.

Start chatting with Llama 3.3 70B → Go Pro · $14.99/mo