Chat with Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context.

No signup required — try it as a guest. 30,000 free tokens every day once you sign up.

Provider: NVIDIA
Model slug: nvidia/llama-3.3-nemotron-super-49b-v1.5
Typical cost: Around 270–675 tokens per typical message. 15M Pro tokens buy roughly 22,222–55,…
Availability: On Faceb.ai · chat + API

About Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code

What it's good at

1. 131,072-token context window, enough for long documents.
2. Extremely low per-token price compared to frontier models, good for high-volume workloads.
3. Hosted by NVIDIA; you can access it here alongside GPT-4o, Claude, Gemini and 100+ more on one plan.
4. Switch to any other model mid-conversation from the picker.

Pricing on Faceb.ai

Around 270–675 tokens per typical message. 15M Pro tokens buy roughly 22,222–55,555 messages.
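The message estimate above is the token allowance divided by the per-message cost. A minimal sketch of that arithmetic, assuming the 270–675 range and the 15M allowance quoted on this page; the function name is illustrative and not part of any Faceb.ai API:

```python
def messages_for_budget(token_budget: int, tokens_per_message: int) -> int:
    """Whole messages a token budget covers at a fixed per-message cost."""
    if tokens_per_message <= 0:
        raise ValueError("tokens_per_message must be positive")
    return token_budget // tokens_per_message


PRO_TOKENS = 15_000_000          # Pro allowance quoted on this page
LOW_COST, HIGH_COST = 270, 675   # typical tokens per message quoted on this page

fewest = messages_for_budget(PRO_TOKENS, HIGH_COST)  # every message at the high end
most = messages_for_budget(PRO_TOKENS, LOW_COST)     # every message at the low end
print(f"{fewest:,} to {most:,} messages")            # 22,222 to 55,555 messages
```

Applied to the 30,000 free daily tokens, the same function gives 44 to 111 messages.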

Use Llama 3.3 Nemotron Super 49B V1.5 from the API

OpenAI-compatible. Same Faceb.ai tokens cover chat and API. Drop-in replacement for the OpenAI SDK.

curl:

curl https://api.faceb.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-faceb-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/llama-3.3-nemotron-super-49b-v1.5",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'

Python:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.faceb.ai/v1",
    api_key="sk-faceb-YOUR_KEY",
)

stream = client.chat.completions.create(
    model="nvidia/llama-3.3-nemotron-super-49b-v1.5",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

JavaScript:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.faceb.ai/v1",
  apiKey: "sk-faceb-YOUR_KEY",
});

const stream = await client.chat.completions.create({
  model: "nvidia/llama-3.3-nemotron-super-49b-v1.5",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
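
The streaming examples print each delta as it arrives. If you want the full reply as one string instead, a small helper can join the deltas. This is a sketch that assumes chunks have the `choices[0].delta.content` shape used above; the helper names are illustrative, and the fake chunks only stand in for a real stream so the snippet runs offline:

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:      # some chunks may carry no choices at all
            continue
        text = chunk.choices[0].delta.content
        if text:                   # content can be None on role or stop chunks
            parts.append(text)
    return "".join(parts)


def fake_chunk(text):
    """Stand-in with the same attribute shape as an SDK stream chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Hel"), fake_chunk(None), SimpleNamespace(choices=[]), fake_chunk("lo!")]
print(collect_stream(demo))  # Hello!
```

With a real client, `collect_stream(stream)` would replace the `for chunk in stream` loop in the Python example.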

Also works for image generation (image-output model slugs return image_url content parts) and web search (add "web_search": true to the payload). Same endpoint, same wallet.
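
As a sketch of the web search option described above, the snippet below builds the JSON body for the same chat completions endpoint with "web_search": true added. Only the `web_search` field and the model slug come from this page; the helper name and prompt are illustrative, and nothing is sent over the network here:

```python
import json

MODEL = "nvidia/llama-3.3-nemotron-super-49b-v1.5"  # slug from this page


def build_payload(prompt: str, web_search: bool = False) -> dict:
    """Build a chat completions body, optionally adding the web_search flag."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    if web_search:
        # Field name taken from this page; its exact behaviour is not described here.
        payload["web_search"] = True
    return payload


body = json.dumps(build_payload("What changed in the latest CUDA release?", web_search=True))
print(body)
```

The resulting string can be used as the `-d` argument of the curl example. With the OpenAI Python SDK, non-standard fields are normally passed through the `extra_body` argument of `client.chat.completions.create`; whether Faceb.ai expects the flag that way is an assumption to check against the full API docs.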

Full API docs → · Get an API key →

Frequently asked — Llama 3.3 Nemotron Super 49B V1.5

What is Llama 3.3 Nemotron Super 49B V1.5?

Llama 3.3 Nemotron Super 49B V1.5 is a chat/completion model served by NVIDIA and accessed through Faceb.ai. Try it without signing up — guest users get a small pool of session tokens to experiment.

Is Llama 3.3 Nemotron Super 49B V1.5 free on Faceb.ai?

You get 30,000 tokens free every day. At the 270–675 tokens per typical message quoted for this model, that is roughly 44–111 messages a day. Need more? Pro is $14.99/month for 15M tokens, or top-ups from $5.

What's Llama 3.3 Nemotron Super 49B V1.5's context window?

131,072 tokens (128K). That is enough to paste in long source material without truncating it, provided the prompt and the reply together stay within the window.
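
A rough way to check whether a prompt leaves room for a reply inside the 131,072-token window. Token counts depend on the model's tokenizer, so this sketch takes them as inputs. The helper name, the 4,096-token reply reserve, and the assumption that the reply shares the window with the prompt are all illustrative:

```python
CONTEXT_WINDOW = 131_072  # tokens, as stated on this page (128 * 1024)


def remaining_tokens(prompt_tokens: int, reserved_for_reply: int = 4_096) -> int:
    """Tokens left in the window after the prompt and a reserved reply budget.

    A negative result means the prompt should be shortened.
    """
    return CONTEXT_WINDOW - prompt_tokens - reserved_for_reply


print(remaining_tokens(100_000))  # 26976
print(remaining_tokens(130_000))  # -3024
```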

Can I call Llama 3.3 Nemotron Super 49B V1.5 from the API?

Yes. Any API key from /account/api/ works with model slug `nvidia/llama-3.3-nemotron-super-49b-v1.5`. The OpenAI SDK works with base_url=https://api.faceb.ai/v1.

How does Llama 3.3 Nemotron Super 49B V1.5 compare to GPT-4o or Claude?

Depends on the task. Faceb.ai lets you switch models per message — benchmark side-by-side by asking both the same prompt, which is more reliable than abstract comparisons.
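
Over the API, a side-by-side comparison can be sketched as one request body per model, all carrying the same prompt. The second slug below is hypothetical and only shows the shape; check the catalog for the real identifier. The helper name is also illustrative:

```python
def side_by_side_payloads(prompt: str, model_slugs: list[str]) -> list[dict]:
    """One chat completions body per model, all sharing the same prompt."""
    return [
        {"model": slug, "messages": [{"role": "user", "content": prompt}]}
        for slug in model_slugs
    ]


slugs = [
    "nvidia/llama-3.3-nemotron-super-49b-v1.5",  # slug from this page
    "openai/gpt-4o",                             # hypothetical slug, check the catalog
]
for payload in side_by_side_payloads("Summarise RFC 2119 in two sentences.", slugs):
    print(payload["model"], "->", payload["messages"][0]["content"])
```

Each payload can then be sent with `client.chat.completions.create(**payload)` using a client configured as in the API section, and the replies compared by hand.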

How much does Llama 3.3 Nemotron Super 49B V1.5 cost per message here?

Around 270–675 tokens per typical message. 15M Pro tokens buy roughly 22,222–55,555 messages.

Does Faceb.ai train on my Llama 3.3 Nemotron Super 49B V1.5 prompts?

No. We contractually request that upstream providers not train on content routed through us. Your chat history lives only on your account.

Is Llama 3.3 Nemotron Super 49B V1.5 good for coding?

It depends on the model size and training mix. For serious code work the developer favourites are Claude 3.5 Sonnet and DeepSeek V3; for quick edits, most capable models work fine.

Can I use Llama 3.3 Nemotron Super 49B V1.5 on the API with the OpenAI SDK?

Yes — point your SDK at https://api.faceb.ai/v1 and use this model's slug as the model parameter. Everything else works as normal.

Does Llama 3.3 Nemotron Super 49B V1.5 support image inputs?

Check the model catalog — multimodal models are marked in the picker. If it accepts images, you can drop screenshots and diagrams straight into the chat.

Can I switch from Llama 3.3 Nemotron Super 49B V1.5 to another model mid-chat?

Yes — the picker is always at the top of the chat. Previous context carries over.

Will newer versions of Llama 3.3 Nemotron Super 49B V1.5 show up here automatically?

Yes. Our catalog auto-fetches from the upstream aggregator, so provider updates and new versions appear in the picker as soon as they're available.

Or try a different model

Your Faceb.ai tokens work for every model — switch per message, no extra subscriptions.

Ready to chat?

One subscription covers every frontier model — switch between them per message. No extra API keys, no extra bills.

Start chatting with Llama 3.3 Nemotron Super 49B V1.5 → Go Pro · $14.99/mo