Every frontier AI model, one chat

GPT-4o, Claude 3.5 Sonnet, Gemini 2.0 Flash, Llama 3.3, DeepSeek V3, Grok 2, Mistral, Qwen, and more — from a single account, one credit balance, one bill. Switch models per message. 345 models indexed and kept current.
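Per-message switching amounts to changing the model slug on each request while the conversation history carries over. A minimal sketch against a generic OpenAI-compatible chat-completions payload (the helper, endpoint shape, and model slugs below are illustrative assumptions, not this service's documented API):

```python
def build_request(model: str, history: list, user_message: str) -> dict:
    """Build one chat-completions payload. The model slug can change on
    every call while the shared conversation history carries over."""
    return {
        "model": model,
        "messages": history + [{"role": "user", "content": user_message}],
    }

# One conversation, a different model each turn (slugs are illustrative).
history = [{"role": "system", "content": "You are a helpful assistant."}]
first = build_request("openai/gpt-4o", history, "Draft a regex for ISO dates.")
second = build_request("anthropic/claude-3.5-sonnet", history, "Now explain it.")

print(first["model"])   # openai/gpt-4o
print(second["model"])  # anthropic/claude-3.5-sonnet
```

Because only the `model` field changes, the same history can be replayed to any of the listed models.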

OpenAI · 61

OpenAI

GPT-4o

OpenAI's flagship multimodal model — text, vision, and code.

Chat with GPT-4o →
OpenAI

GPT-4o mini

OpenAI's fastest, cheapest frontier model — great default.

Chat with GPT-4o mini →
OpenAI

OpenAI: gpt-oss-120b (free)

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases.

Chat with OpenAI: gpt-oss-120b (free) →
OpenAI

OpenAI: gpt-oss-20b (free)

gpt-oss-20b is an open-weight 21B-parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass.

Chat with OpenAI: gpt-oss-20b (free) →
OpenAI

OpenAI: GPT Audio

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural-sounding voices and maintains better voice consistency.

Chat with OpenAI: GPT Audio →
OpenAI

OpenAI: GPT Audio Mini

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural-sounding voices and maintains better voice consistency.

Chat with OpenAI: GPT Audio Mini →
OpenAI

OpenAI: GPT-3.5 Turbo

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks.

Chat with OpenAI: GPT-3.5 Turbo →
OpenAI

OpenAI: GPT-3.5 Turbo (older v0613)

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks.

Chat with OpenAI: GPT-3.5 Turbo (older v0613) →
OpenAI

OpenAI: GPT-3.5 Turbo 16k

This model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single request at a higher cost. Training data: up to Sep 2021.

Chat with OpenAI: GPT-3.5 Turbo 16k →
OpenAI

OpenAI: GPT-3.5 Turbo Instruct

This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-related optimizations. Training data: up to Sep 2021.

Chat with OpenAI: GPT-3.5 Turbo Instruct →
OpenAI

OpenAI: GPT-4

OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capable of solving difficult problems with greater accuracy than previous models, due to its broader general knowledge and advanced reasoning capabilities.

Chat with OpenAI: GPT-4 →
OpenAI

OpenAI: GPT-4 (older v0314)

GPT-4-0314 is the first version of GPT-4 released, with a context length of 8,192 tokens, and was supported until June 14. Training data: up to Sep 2021.

Chat with OpenAI: GPT-4 (older v0314) →
OpenAI

OpenAI: GPT-4 Turbo

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling.

Chat with OpenAI: GPT-4 Turbo →
OpenAI

OpenAI: GPT-4 Turbo (older v1106)

The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling.

Chat with OpenAI: GPT-4 Turbo (older v1106) →
OpenAI

OpenAI: GPT-4 Turbo Preview

The preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Dec 2023.

Chat with OpenAI: GPT-4 Turbo Preview →
OpenAI

OpenAI: GPT-4.1

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning.

Chat with OpenAI: GPT-4.1 →
OpenAI

OpenAI: GPT-4.1 Mini

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window.

Chat with OpenAI: GPT-4.1 Mini →
OpenAI

OpenAI: GPT-4.1 Nano

For tasks that demand low latency, GPT-4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size.

Chat with OpenAI: GPT-4.1 Nano →
OpenAI

OpenAI: GPT-4o (2024-05-13)

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of GPT-4 Turbo.

Chat with OpenAI: GPT-4o (2024-05-13) →
OpenAI

OpenAI: GPT-4o (2024-08-06)

The 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with the ability to supply a JSON schema in the response_format parameter.

Chat with OpenAI: GPT-4o (2024-08-06) →
OpenAI

OpenAI: GPT-4o (2024-11-20)

The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability, with more natural, engaging, and tailored writing to improve relevance and readability.

Chat with OpenAI: GPT-4o (2024-11-20) →
OpenAI

OpenAI: GPT-4o Audio

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings.

Chat with OpenAI: GPT-4o Audio →
OpenAI

OpenAI: GPT-4o Search Preview

GPT-4o Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.

Chat with OpenAI: GPT-4o Search Preview →
OpenAI

OpenAI: GPT-4o-mini (2024-07-18)

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs.

Chat with OpenAI: GPT-4o-mini (2024-07-18) →
OpenAI

OpenAI: GPT-4o-mini Search Preview

GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.

Chat with OpenAI: GPT-4o-mini Search Preview →
OpenAI

OpenAI: GPT-5

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning.

Chat with OpenAI: GPT-5 →
OpenAI

OpenAI: GPT-5 Chat

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.

Chat with OpenAI: GPT-5 Chat →
OpenAI

OpenAI: GPT-5 Codex

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows.

Chat with OpenAI: GPT-5 Codex →
OpenAI

OpenAI: GPT-5 Image

[GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model with state-of-the-art image generation capabilities.

Chat with OpenAI: GPT-5 Image →
OpenAI

OpenAI: GPT-5 Image Mini

GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [GPT-5 Mini](https://openrouter.ai/openai/gpt-5-mini), with GPT Image 1 Mini for efficient image generation.

Chat with OpenAI: GPT-5 Image Mini →
OpenAI

OpenAI: GPT-5 Mini

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5.

Chat with OpenAI: GPT-5 Mini →
OpenAI

OpenAI: GPT-5 Nano

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low-latency environments.

Chat with OpenAI: GPT-5 Nano →
OpenAI

OpenAI: GPT-5 Pro

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning.

Chat with OpenAI: GPT-5 Pro →
OpenAI

OpenAI: GPT-5.1

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style.

Chat with OpenAI: GPT-5.1 →
OpenAI

OpenAI: GPT-5.1 Chat

GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence.

Chat with OpenAI: GPT-5.1 Chat →
OpenAI

OpenAI: GPT-5.1-Codex

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows.

Chat with OpenAI: GPT-5.1-Codex →
OpenAI

OpenAI: GPT-5.1-Codex-Max

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks.

Chat with OpenAI: GPT-5.1-Codex-Max →
OpenAI

OpenAI: GPT-5.1-Codex-Mini

GPT-5.1-Codex-Mini is a smaller, faster version of GPT-5.1-Codex.

Chat with OpenAI: GPT-5.1-Codex-Mini →
OpenAI

OpenAI: GPT-5.2

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1.

Chat with OpenAI: GPT-5.2 →
OpenAI

OpenAI: GPT-5.2 Chat

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence.

Chat with OpenAI: GPT-5.2 Chat →
OpenAI

OpenAI: GPT-5.2 Pro

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long-context performance over GPT-5 Pro. It is optimized for complex tasks.

Chat with OpenAI: GPT-5.2 Pro →
OpenAI

OpenAI: GPT-5.2-Codex

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows.

Chat with OpenAI: GPT-5.2-Codex →
OpenAI

OpenAI: GPT-5.3 Chat

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful.

Chat with OpenAI: GPT-5.3 Chat →
OpenAI

OpenAI: GPT-5.3-Codex

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning capabilities of the GPT series.

Chat with OpenAI: GPT-5.3-Codex →
OpenAI

OpenAI: GPT-5.4

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output).

Chat with OpenAI: GPT-5.4 →
OpenAI

OpenAI: GPT-5.4 Mini

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inputs.

Chat with OpenAI: GPT-5.4 Mini →
OpenAI

OpenAI: GPT-5.4 Nano

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks. It supports text and image inputs.

Chat with OpenAI: GPT-5.4 Nano →
OpenAI

OpenAI: GPT-5.4 Pro

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks.

Chat with OpenAI: GPT-5.4 Pro →
OpenAI

OpenAI: gpt-oss-120b

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases.

Chat with OpenAI: gpt-oss-120b →
OpenAI

OpenAI: gpt-oss-20b

gpt-oss-20b is an open-weight 21B-parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass.

Chat with OpenAI: gpt-oss-20b →
OpenAI

OpenAI: gpt-oss-safeguard-20b

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. It is an open-weight, 21B-parameter Mixture-of-Experts (MoE) model.

Chat with OpenAI: gpt-oss-safeguard-20b →
OpenAI

OpenAI: o1

The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought.

Chat with OpenAI: o1 →
OpenAI

OpenAI: o1-pro

The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o1-pro model uses more compute to think harder and provide consistently better answers.

Chat with OpenAI: o1-pro →
OpenAI

OpenAI: o3

o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technica

Chat with OpenAI: o3 →
OpenAI

OpenAI: o3 Deep Research

o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks.

Chat with OpenAI: o3 Deep Research →
OpenAI

OpenAI: o3 Mini

OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding.

Chat with OpenAI: o3 Mini →
OpenAI

OpenAI: o3 Mini High

OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effort set to high. o3-mini is a cost-efficient language model optimized for STEM reasoning tasks.

Chat with OpenAI: o3 Mini High →
OpenAI

OpenAI: o3 Pro

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently better answers.

Chat with OpenAI: o3 Pro →
OpenAI

OpenAI: o4 Mini

OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities.

Chat with OpenAI: o4 Mini →
OpenAI

OpenAI: o4 Mini Deep Research

o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for tackling complex, multi-step research tasks.

Chat with OpenAI: o4 Mini Deep Research →
OpenAI

OpenAI: o4 Mini High

OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort set to high. OpenAI o4-mini is a compact reasoning model in the o-series.

Chat with OpenAI: o4 Mini High →

Anthropic · 15

Anthropic

Claude 3.5 Sonnet

Anthropic's best balance of quality and cost — a coder favourite.

Chat with Claude 3.5 Sonnet →
Anthropic

Claude 3 Haiku

Anthropic's fastest, cheapest model — near-instant replies.

Chat with Claude 3 Haiku →
Anthropic

Anthropic: Claude 3.5 Haiku

Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times.

Chat with Anthropic: Claude 3.5 Haiku →
Anthropic

Anthropic: Claude 3.7 Sonnet

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach.

Chat with Anthropic: Claude 3.7 Sonnet →
Anthropic

Anthropic: Claude 3.7 Sonnet (thinking)

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach.

Chat with Anthropic: Claude 3.7 Sonnet (thinking) →
Anthropic

Anthropic: Claude Haiku 4.5

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models.

Chat with Anthropic: Claude Haiku 4.5 →
Anthropic

Anthropic: Claude Opus 4

Claude Opus 4 is benchmarked as the world’s best coding model at time of release, bringing sustained performance on complex, long-running tasks and agent workflows.

Chat with Anthropic: Claude Opus 4 →
Anthropic

Anthropic: Claude Opus 4.1

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified.

Chat with Anthropic: Claude Opus 4.1 →
Anthropic

Anthropic: Claude Opus 4.5

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use.

Chat with Anthropic: Claude Opus 4.5 →
Anthropic

Anthropic: Claude Opus 4.6

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than single tasks.

Chat with Anthropic: Claude Opus 4.6 →
Anthropic

Anthropic: Claude Opus 4.6 (Fast)

Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6): identical capabilities with higher output speed at a premium 6x price.

Chat with Anthropic: Claude Opus 4.6 (Fast) →
Anthropic

Anthropic: Claude Opus 4.7

Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. It builds on the coding and agentic strengths of Opus 4.6.

Chat with Anthropic: Claude Opus 4.7 →
Anthropic

Anthropic: Claude Sonnet 4

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision.

Chat with Anthropic: Claude Sonnet 4 →
Anthropic

Anthropic: Claude Sonnet 4.5

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance.

Chat with Anthropic: Claude Sonnet 4.5 →
Anthropic

Anthropic: Claude Sonnet 4.6

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development.

Chat with Anthropic: Claude Sonnet 4.6 →

Google · 32

Google

Gemini 2.0 Flash

Google's fast multimodal model with a 1M-token context window.

Chat with Gemini 2.0 Flash →
Google

Google: Gemma 3 12B (free)

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens and understands over 140 languages.

Chat with Google: Gemma 3 12B (free) →
Google

Google: Gemma 3 27B (free)

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens and understands over 140 languages.

Chat with Google: Gemma 3 27B (free) →
Google

Google: Gemma 3 4B (free)

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens and understands over 140 languages.

Chat with Google: Gemma 3 4B (free) →
Google

Google: Gemma 3n 2B (free)

Gemma 3n E2B IT is a multimodal, instruction-tuned model developed by Google DeepMind, designed to operate efficiently at an effective parameter size of 2B.

Chat with Google: Gemma 3n 2B (free) →
Google

Google: Gemma 3n 4B (free)

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs.

Chat with Google: Gemma 3n 4B (free) →
Google

Google: Gemma 4 26B A4B (free)

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference.

Chat with Google: Gemma 4 26B A4B (free) →
Google

Google: Gemma 4 31B (free)

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. It features a 256K token context window.

Chat with Google: Gemma 4 31B (free) →
Google

Google: Lyria 3 Clip Preview

30-second clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API.

Chat with Google: Lyria 3 Clip Preview →
Google

Google: Lyria 3 Pro Preview

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API.

Chat with Google: Lyria 3 Pro Preview →
Google

Google: Gemini 2.0 Flash

Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while maintaining quality on par with larger models.

Chat with Google: Gemini 2.0 Flash →
Google

Google: Gemini 2.0 Flash Lite

Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while maintaining quality.

Chat with Google: Gemini 2.0 Flash Lite →
Google

Google: Gemini 2.5 Flash

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities.

Chat with Google: Gemini 2.5 Flash →
Google

Google: Gemini 2.5 Flash Lite

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput.

Chat with Google: Gemini 2.5 Flash Lite →
Google

Google: Gemini 2.5 Flash Lite Preview 09-2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput.

Chat with Google: Gemini 2.5 Flash Lite Preview 09-2025 →
Google

Google: Gemini 2.5 Pro

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities.

Chat with Google: Gemini 2.5 Pro →
Google

Google: Gemini 2.5 Pro Preview 05-06

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities.

Chat with Google: Gemini 2.5 Pro Preview 05-06 →
Google

Google: Gemini 2.5 Pro Preview 06-05

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities.

Chat with Google: Gemini 2.5 Pro Preview 06-05 →
Google

Google: Gemini 3 Flash Preview

Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance. It delivers near-Pro-level performance.

Chat with Google: Gemini 3 Flash Preview →
Google

Google: Gemini 3.1 Flash Lite Preview

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality.

Chat with Google: Gemini 3.1 Flash Lite Preview →
Google

Google: Gemini 3.1 Pro Preview

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient reasoning.

Chat with Google: Gemini 3.1 Pro Preview →
Google

Google: Gemini 3.1 Pro Preview Custom Tools

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more specialized tools are available.

Chat with Google: Gemini 3.1 Pro Preview Custom Tools →
Google

Google: Gemma 2 27B

Gemma 2 27B by Google is an open model built from the same research and technology used to create the [Gemini models](/models?q=gemini). Gemma models are well-suited for a variety of text generation tasks.

Chat with Google: Gemma 2 27B →
Google

Google: Gemma 3 12B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens and understands over 140 languages.

Chat with Google: Gemma 3 12B →
Google

Google: Gemma 3 27B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens and understands over 140 languages.

Chat with Google: Gemma 3 27B →
Google

Google: Gemma 3 4B

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens and understands over 140 languages.

Chat with Google: Gemma 3 4B →
Google

Google: Gemma 3n 4B

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs.

Chat with Google: Gemma 3n 4B →
Google

Google: Gemma 4 26B A4B

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference.

Chat with Google: Gemma 4 26B A4B →
Google

Google: Gemma 4 31B

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. It features a 256K token context window.

Chat with Google: Gemma 4 31B →
Google

Google: Nano Banana (Gemini 2.5 Flash Image)

Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is a state-of-the-art image generation model with contextual understanding.

Chat with Google: Nano Banana (Gemini 2.5 Flash Image) →
Google

Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state-of-the-art image generation and editing model, delivering Pro-level visual quality.

Chat with Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview) →
Google

Google: Nano Banana Pro (Gemini 3 Pro Image Preview)

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved capabilities.

Chat with Google: Nano Banana Pro (Gemini 3 Pro Image Preview) →

Meta · 14

Meta

Llama 3.3 70B

Meta's best open-weight model — runs on community hosts.

Chat with Llama 3.3 70B →
Meta

Meta: Llama 3.2 3B Instruct (free)

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization.

Chat with Meta: Llama 3.2 3B Instruct (free) →
Meta

Meta: Llama 3.3 70B Instruct (free)

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model in 70B (text in/text out). The Llama 3.3 instruction-tuned model is optimized for multilingual dialogue use cases.

Chat with Meta: Llama 3.3 70B Instruct (free) →
Meta

Llama Guard 3 8B

Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs and LLM responses.

Chat with Llama Guard 3 8B →
Meta

Meta: Llama 3 70B Instruct

Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This 70B instruct-tuned version was optimized for high-quality dialogue use cases.

Chat with Meta: Llama 3 70B Instruct →
Meta

Meta: Llama 3 8B Instruct

Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This 8B instruct-tuned version was optimized for high-quality dialogue use cases.

Chat with Meta: Llama 3 8B Instruct →
Meta

Meta: Llama 3.1 70B Instruct

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 70B instruct-tuned version is optimized for high-quality dialogue use cases.

Chat with Meta: Llama 3.1 70B Instruct →
Meta

Meta: Llama 3.1 8B Instruct

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This 8B instruct-tuned version is fast, efficient, and has demonstrated strong performance.

Chat with Meta: Llama 3.1 8B Instruct →
Meta

Meta: Llama 3.2 11B Vision Instruct

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and visual question answering.

Chat with Meta: Llama 3.2 11B Vision Instruct →
Meta

Meta: Llama 3.2 1B Instruct

Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as summarization, dialogue, and multilingual text analysis.

Chat with Meta: Llama 3.2 1B Instruct →
Meta

Meta: Llama 3.2 3B Instruct

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization.

Chat with Meta: Llama 3.2 3B Instruct →
Meta

Meta: Llama 4 Maverick

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward pass.

Chat with Meta: Llama 4 Maverick →
Meta

Meta: Llama 4 Scout

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B.

Chat with Meta: Llama 4 Scout →
Meta

Meta: Llama Guard 4 12B

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs and LLM responses.

Chat with Meta: Llama Guard 4 12B →

xAI · 11

xAI

Grok 2

xAI's conversational model — blunt, fast, current.

Chat with Grok 2 →
xAI

xAI: Grok 3

Grok 3 is the latest model from xAI. It's their flagship model that excels at enterprise use cases like data extraction, coding, and text summarization.

Chat with xAI: Grok 3 →
xAI

xAI: Grok 3 Beta

Grok 3 is the latest model from xAI. It's their flagship model that excels at enterprise use cases like data extraction, coding, and text summarization.

Chat with xAI: Grok 3 Beta →
xAI

xAI: Grok 3 Mini

A lightweight model that thinks before responding. Fast, smart, and great for logic-based tasks that do not require deep domain knowledge. The raw thinking traces are accessible.

Chat with xAI: Grok 3 Mini →
xAI

xAI: Grok 3 Mini Beta

Grok 3 Mini is a lightweight, smaller thinking model. Unlike traditional models that generate answers immediately, Grok 3 Mini thinks before responding.

Chat with xAI: Grok 3 Mini Beta →
xAI

xAI: Grok 4

Grok 4 is xAI's latest reasoning model with a 256k context window. It supports parallel tool calling, structured outputs, and both image and text inputs.

Chat with xAI: Grok 4 →
xAI

xAI: Grok 4 Fast

Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token context window. It comes in two flavors: non-reasoning and reasoning.

Chat with xAI: Grok 4 Fast →
xAI

xAI: Grok 4.1 Fast

Grok 4.1 Fast is xAI's best agentic tool-calling model, shining in real-world use cases like customer support and deep research. 2M context window.

Chat with xAI: Grok 4.1 Fast →
xAI

xAI: Grok 4.20

Grok 4.20 is xAI's newest flagship model with industry-leading speed and agentic tool-calling capabilities. It delivers the lowest hallucination rate on the market.

Chat with xAI: Grok 4.20 →
xAI

xAI: Grok 4.20 Multi-Agent

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research.

Chat with xAI: Grok 4.20 Multi-Agent →
xAI

xAI: Grok Code Fast 1

Grok Code Fast 1 is a speedy and economical reasoning model that excels at agentic coding. With reasoning traces visible in the response, developers can steer Grok Code Fast 1 with precision.

Chat with xAI: Grok Code Fast 1 →

DeepSeek · 11

DeepSeek

DeepSeek V3

The open-weight model punching way above its price tag.

Chat with DeepSeek V3 →
DeepSeek

DeepSeek: DeepSeek V3 0324

DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the original DeepSeek V3 model.

Chat with DeepSeek: DeepSeek V3 0324 →
DeepSeek

DeepSeek: DeepSeek V3.1

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates.

Chat with DeepSeek: DeepSeek V3.1 →
DeepSeek

DeepSeek: DeepSeek V3.1 Terminus

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing issues reported by users.

Chat with DeepSeek: DeepSeek V3.1 Terminus →
DeepSeek

DeepSeek: DeepSeek V3.2

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance.

Chat with DeepSeek: DeepSeek V3.2 →
DeepSeek

DeepSeek: DeepSeek V3.2 Exp

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA).

Chat with DeepSeek: DeepSeek V3.2 Exp →
DeepSeek

DeepSeek: DeepSeek V3.2 Speciale

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance. It builds on DeepSeek Sparse Attention (DSA).

Chat with DeepSeek: DeepSeek V3.2 Speciale →
DeepSeek

DeepSeek: R1

DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass.

Chat with DeepSeek: R1 →
DeepSeek

DeepSeek: R1 0528

May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens.

Chat with DeepSeek: R1 0528 →
DeepSeek

DeepSeek: R1 Distill Llama 70B

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [Dee

Chat with DeepSeek: R1 Distill Llama 70B →
DeepSeek

DeepSeek: R1 Distill Qwen 32B

DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), using outputs from [DeepSeek

Chat with DeepSeek: R1 Distill Qwen 32B →

Mistral AI · 25

Mistral AI

Mistral Large

This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's a proprietary weights-available model and excels at reasoning, code, J

Chat with Mistral Large →
Mistral AI

Mistral Large 2407

This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSO

Chat with Mistral Large 2407 →
Mistral AI

Mistral Large 2411

Mistral Large 2 2411 is an update of [Mistral Large 2](/mistralai/mistral-large) released together with [Pixtral Large 2411](/mistralai/pixtral-large-2411). It p

Chat with Mistral Large 2411 →
Mistral AI

Mistral: Codestral 2508

Mistral's cutting-edge language model for coding released end of July 2025. Codestral specializes in low-latency, high-frequency tasks such as fill-in-the-middl

Chat with Mistral: Codestral 2508 →
Mistral AI

Mistral: Devstral 2 2512

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 256

Chat with Mistral: Devstral 2 2512 →
Mistral AI

Mistral: Devstral Medium

Devstral Medium is a high-performance code generation and agentic reasoning model developed jointly by Mistral AI and All Hands AI. Positioned as a step up from

Chat with Mistral: Devstral Medium →
Mistral AI

Mistral: Devstral Small 1.1

Devstral Small 1.1 is a 24B parameter open-weight language model for software engineering agents, developed by Mistral AI in collaboration with All Hands AI. Fi

Chat with Mistral: Devstral Small 1.1 →
Mistral AI

Mistral: Ministral 3 14B 2512

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counter

Chat with Mistral: Ministral 3 14B 2512 →
Mistral AI

Mistral: Ministral 3 3B 2512

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

Chat with Mistral: Ministral 3 3B 2512 →
Mistral AI

Mistral: Ministral 3 8B 2512

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient small language model with vision capabilities.

Chat with Mistral: Ministral 3 8B 2512 →
Mistral AI

Mistral: Mistral 7B Instruct v0.1

A 7.3B parameter model that outperforms Llama 2 13B on all benchmarks, with optimizations for speed and context length.

Chat with Mistral: Mistral 7B Instruct v0.1 →
Mistral AI

Mistral: Mistral Large 3 2512

Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-experts architecture with 41B active parameters (675B total), and re

Chat with Mistral: Mistral Large 3 2512 →
Mistral AI

Mistral: Mistral Medium 3

Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost

Chat with Mistral: Mistral Medium 3 →
Mistral AI

Mistral: Mistral Medium 3.1

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level cap

Chat with Mistral: Mistral Medium 3.1 →
Mistral AI

Mistral: Mistral Nemo

A 12B parameter model with a 128k token context length built by Mistral in collaboration with NVIDIA. The model is multilingual, supporting English, French, Ger

Chat with Mistral: Mistral Nemo →
Mistral AI

Mistral: Mistral Small 3

Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks. Released under the Apache 2.0 license, it featur

Chat with Mistral: Mistral Small 3 →
Mistral AI

Mistral: Mistral Small 3.1 24B

Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring 24 billion parameters with advanced multimodal capabilities. It provi

Chat with Mistral: Mistral Small 3.1 24B →
Mistral AI

Mistral: Mistral Small 3.2 24B

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved funct

Chat with Mistral: Mistral Small 3.2 24B →
Mistral AI

Mistral: Mistral Small 4

Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into a single system. It com

Chat with Mistral: Mistral Small 4 →
Mistral AI

Mistral: Mistral Small Creative

Mistral Small Creative is an experimental small model designed for creative writing, narrative generation, roleplay and character-driven dialogue, general-purpo

Chat with Mistral: Mistral Small Creative →
Mistral AI

Mistral: Mixtral 8x22B Instruct

Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active parameters out of 141B, offering unparall

Chat with Mistral: Mixtral 8x22B Instruct →
Mistral AI

Mistral: Mixtral 8x7B Instruct

Mixtral 8x7B Instruct is a pretrained generative Sparse Mixture-of-Experts model by Mistral AI for chat and instruction use. It incorporates 8 experts (feed-forward ne

Chat with Mistral: Mixtral 8x7B Instruct →
Mistral AI

Mistral: Pixtral Large 2411

Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral Large 2](/mistralai/mistral-large-2411). The model is able to understa

Chat with Mistral: Pixtral Large 2411 →
Mistral AI

Mistral: Saba

Mistral Saba is a 24B-parameter language model specifically designed for the Middle East and South Asia, delivering accurate and contextually relevant responses

Chat with Mistral: Saba →
Mistral AI

Mistral: Voxtral Small 24B 2507

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It e

Chat with Mistral: Voxtral Small 24B 2507 →

Alibaba Qwen · 47

Alibaba Qwen

Qwen: Qwen3 Coder 480B A35B (free)

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as

Chat with Qwen: Qwen3 Coder 480B A35B (free) →
Alibaba Qwen

Qwen: Qwen3 Next 80B A3B Instruct (free)

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targe

Chat with Qwen: Qwen3 Next 80B A3B Instruct (free) →
Alibaba Qwen

Qwen2.5 72B Instruct

Qwen2.5 72B is part of the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2: significantly more knowledge and gre

Chat with Qwen2.5 72B Instruct →
Alibaba Qwen

Qwen2.5 Coder 32B Instruct

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder brings the following improvements upo

Chat with Qwen2.5 Coder 32B Instruct →
Alibaba Qwen

Qwen: Qwen Plus 0728

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.

Chat with Qwen: Qwen Plus 0728 →
Alibaba Qwen

Qwen: Qwen Plus 0728 (thinking)

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.

Chat with Qwen: Qwen Plus 0728 (thinking) →
Alibaba Qwen

Qwen: Qwen VL Max

Qwen VL Max is a visual understanding model with a 7,500-token context length. It excels in delivering optimal performance for a broader spectrum of complex tasks

Chat with Qwen: Qwen VL Max →
Alibaba Qwen

Qwen: Qwen VL Plus

Qwen's Enhanced Large Visual Language Model. Significantly upgraded for detailed recognition capabilities and text recognition abilities, supporting ultra-high

Chat with Qwen: Qwen VL Plus →
Alibaba Qwen

Qwen: Qwen-Max

Qwen-Max, based on Qwen2.5, provides the best inference performance among [Qwen models](/qwen), especially for complex multi-step tasks. It's a large-scale MoE

Chat with Qwen: Qwen-Max →
Alibaba Qwen

Qwen: Qwen-Plus

Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced performance, speed, and cost combination.

Chat with Qwen: Qwen-Plus →
Alibaba Qwen

Qwen: Qwen-Turbo

Qwen-Turbo, based on Qwen2.5, is a 1M context model that provides fast speed and low cost, suitable for simple tasks.

Chat with Qwen: Qwen-Turbo →
Alibaba Qwen

Qwen: Qwen2.5 7B Instruct

Qwen2.5 7B is part of the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2: significantly more knowledge and grea

Chat with Qwen: Qwen2.5 7B Instruct →
Alibaba Qwen

Qwen: Qwen2.5 VL 72B Instruct

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, g

Chat with Qwen: Qwen2.5 VL 72B Instruct →
Alibaba Qwen

Qwen: Qwen3 14B

Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamle

Chat with Qwen: Qwen3 14B →
Alibaba Qwen

Qwen: Qwen3 235B A22B

Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass. It supports seamless switching

Chat with Qwen: Qwen3 235B A22B →
Alibaba Qwen

Qwen: Qwen3 235B A22B Instruct 2507

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active param

Chat with Qwen: Qwen3 235B A22B Instruct 2507 →
Alibaba Qwen

Qwen: Qwen3 235B A22B Thinking 2507

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B

Chat with Qwen: Qwen3 235B A22B Thinking 2507 →
Alibaba Qwen

Qwen: Qwen3 30B A3B

Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, mult

Chat with Qwen: Qwen3 30B A3B →
Alibaba Qwen

Qwen: Qwen3 30B A3B Instruct 2507

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, with 3.3B active parameters per inference. It operates in non-thin

Chat with Qwen: Qwen3 30B A3B Instruct 2507 →
Alibaba Qwen

Qwen: Qwen3 30B A3B Thinking 2507

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring extended multi-step thinking. The model

Chat with Qwen: Qwen3 30B A3B Thinking 2507 →
Alibaba Qwen

Qwen: Qwen3 32B

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seaml

Chat with Qwen: Qwen3 32B →
Alibaba Qwen

Qwen: Qwen3 8B

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seam

Chat with Qwen: Qwen3 8B →
Alibaba Qwen

Qwen: Qwen3 Coder 30B A3B Instruct

Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code genera

Chat with Qwen: Qwen3 Coder 30B A3B Instruct →
Alibaba Qwen

Qwen: Qwen3 Coder 480B A35B

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as

Chat with Qwen: Qwen3 Coder 480B A35B →
Alibaba Qwen

Qwen: Qwen3 Coder Flash

Qwen3 Coder Flash is Alibaba's fast, cost-efficient version of its proprietary Qwen3 Coder Plus. It is a powerful coding agent model specializing in autono

Chat with Qwen: Qwen3 Coder Flash →
Alibaba Qwen

Qwen: Qwen3 Coder Next

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B total

Chat with Qwen: Qwen3 Coder Next →
Alibaba Qwen

Qwen: Qwen3 Coder Plus

Qwen3 Coder Plus is Alibaba's proprietary version of the open-source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous progr

Chat with Qwen: Qwen3 Coder Plus →
Alibaba Qwen

Qwen: Qwen3 Max

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail

Chat with Qwen: Qwen3 Max →
Alibaba Qwen

Qwen: Qwen3 Max Thinking

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By sig

Chat with Qwen: Qwen3 Max Thinking →
Alibaba Qwen

Qwen: Qwen3 Next 80B A3B Instruct

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targe

Chat with Qwen: Qwen3 Next 80B A3B Instruct →
Alibaba Qwen

Qwen: Qwen3 Next 80B A3B Thinking

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default. It’s designed for hard

Chat with Qwen: Qwen3 Next 80B A3B Thinking →
Alibaba Qwen

Qwen: Qwen3 VL 235B A22B Instruct

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instru

Chat with Qwen: Qwen3 VL 235B A22B Instruct →
Alibaba Qwen

Qwen: Qwen3 VL 235B A22B Thinking

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is o

Chat with Qwen: Qwen3 VL 235B A22B Thinking →
Alibaba Qwen

Qwen: Qwen3 VL 30B A3B Instruct

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimi

Chat with Qwen: Qwen3 VL 30B A3B Instruct →
Alibaba Qwen

Qwen: Qwen3 VL 30B A3B Thinking

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhanc

Chat with Qwen: Qwen3 VL 30B A3B Thinking →
Alibaba Qwen

Qwen: Qwen3 VL 32B Instruct

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video.

Chat with Qwen: Qwen3 VL 32B Instruct →
Alibaba Qwen

Qwen: Qwen3 VL 8B Instruct

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, an

Chat with Qwen: Qwen3 VL 8B Instruct →
Alibaba Qwen

Qwen: Qwen3 VL 8B Thinking

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex s

Chat with Qwen: Qwen3 VL 8B Thinking →
Alibaba Qwen

Qwen: Qwen3.5 397B A17B

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-o

Chat with Qwen: Qwen3.5 397B A17B →
Alibaba Qwen

Qwen: Qwen3.5 Plus 2026-02-15

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-expe

Chat with Qwen: Qwen3.5 Plus 2026-02-15 →
Alibaba Qwen

Qwen: Qwen3.5-122B-A10B

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-exper

Chat with Qwen: Qwen3.5-122B-A10B →
Alibaba Qwen

Qwen: Qwen3.5-27B

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and

Chat with Qwen: Qwen3.5-27B →
Alibaba Qwen

Qwen: Qwen3.5-35B-A3B

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixtur

Chat with Qwen: Qwen3.5-35B-A3B →
Alibaba Qwen

Qwen: Qwen3.5-9B

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-p

Chat with Qwen: Qwen3.5-9B →
Alibaba Qwen

Qwen: Qwen3.5-Flash

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts

Chat with Qwen: Qwen3.5-Flash →
Alibaba Qwen

Qwen: Qwen3.6 Plus

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and h

Chat with Qwen: Qwen3.6 Plus →
Alibaba Qwen

Qwen: QwQ 32B

QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve

Chat with Qwen: QwQ 32B →

AI21 · 1

Aion Labs · 4

Alfredpros · 1

Alibaba · 1

AllenAI · 2

Alpindale · 1

Amazon · 5

Anthracite Org · 1

Arcee AI · 7

Arcee AI

Arcee AI: Trinity Large Preview (free)

Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as a 400B-parameter sparse Mixture-of-Experts with 13B active parameters

Chat with Arcee AI: Trinity Large Preview (free) →
Arcee AI

Arcee AI: Coder Large

Coder-Large is a 32B-parameter offspring of Qwen 2.5-Instruct that has been further trained on permissively-licensed GitHub, CodeSearchNet and synthetic bug-fi

Chat with Arcee AI: Coder Large →
Arcee AI

Arcee AI: Maestro Reasoning

Maestro Reasoning is Arcee's flagship analysis model: a 32B-parameter derivative of Qwen 2.5-32B tuned with DPO and chain-of-thought RL for step-by-step logic

Chat with Arcee AI: Maestro Reasoning →
Arcee AI

Arcee AI: Spotlight

Spotlight is a 7-billion-parameter vision-language model derived from Qwen 2.5-VL and fine-tuned by Arcee AI for tight image-text grounding tasks. It offers a 3

Chat with Arcee AI: Spotlight →
Arcee AI

Arcee AI: Trinity Large Thinking

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and re

Chat with Arcee AI: Trinity Large Thinking →
Arcee AI

Arcee AI: Trinity Mini

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient rea

Chat with Arcee AI: Trinity Mini →
Arcee AI

Arcee AI: Virtuoso Large

Virtuoso-Large is Arcee's top-tier general-purpose LLM at 72B parameters, tuned to tackle cross-domain reasoning, creative writing and enterprise QA. Unlike ma

Chat with Arcee AI: Virtuoso Large →

Baidu · 5

ByteDance · 1

ByteDance Seed · 4

Cognitive Computations · 1

Cohere · 4

Deep Cogito · 1

Essential AI · 1

Gryphe · 1

IBM Granite · 1

Inception · 1

Inflection · 2

Kwaipilot · 1

Liquid · 3

Mancer · 1

Microsoft · 2

MiniMax · 8

MiniMax

MiniMax: MiniMax M2.5 (free)

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments,

Chat with MiniMax: MiniMax M2.5 (free) →
MiniMax

MiniMax: MiniMax M1

MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and high-efficiency inference. It leverages a hybrid Mixture-of-Experts (

Chat with MiniMax: MiniMax M1 →
MiniMax

MiniMax: MiniMax M2

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 b

Chat with MiniMax: MiniMax M2 →
MiniMax

MiniMax: MiniMax M2-her

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and expressive multi-turn conversations. Designed t

Chat with MiniMax: MiniMax M2-her →
MiniMax

MiniMax: MiniMax M2.1

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 b

Chat with MiniMax: MiniMax M2.1 →
MiniMax

MiniMax: MiniMax M2.5

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex real-world digital working environments,

Chat with MiniMax: MiniMax M2.5 →
MiniMax

MiniMax: MiniMax M2.7

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous improvement. Built to actively participat

Chat with MiniMax: MiniMax M2.7 →
MiniMax

MiniMax: MiniMax-01

MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image understanding. It has 456 billion parameters, with 45.9 billion paramet

Chat with MiniMax: MiniMax-01 →

Moonshot AI · 4

Morph · 2

Nex Agi · 1

Nous Research · 6

NVIDIA · 10

NVIDIA

NVIDIA: Nemotron 3 Nano 30B A3B (free)

NVIDIA Nemotron 3 Nano 30B A3B is a small MoE language model offering the highest compute efficiency and accuracy for developers building specialized agentic AI systems

Chat with NVIDIA: Nemotron 3 Nano 30B A3B (free) →
NVIDIA

NVIDIA: Nemotron 3 Super (free)

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-a

Chat with NVIDIA: Nemotron 3 Super (free) →
NVIDIA

NVIDIA: Nemotron Nano 12B 2 VL (free)

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a

Chat with NVIDIA: Nemotron Nano 12B 2 VL (free) →
NVIDIA

NVIDIA: Nemotron Nano 9B V2 (free)

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning

Chat with NVIDIA: Nemotron Nano 9B V2 (free) →
NVIDIA

NVIDIA: Llama 3.1 Nemotron 70B Instruct

NVIDIA's Llama 3.1 Nemotron 70B is a language model designed for generating precise and useful responses. Leveraging [Llama 3.1 70B](/models/meta-llama/llama-3.

Chat with NVIDIA: Llama 3.1 Nemotron 70B Instruct →
NVIDIA

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s

Chat with NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 →
NVIDIA

NVIDIA: Nemotron 3 Nano 30B A3B

NVIDIA Nemotron 3 Nano 30B A3B is a small MoE language model offering the highest compute efficiency and accuracy for developers building specialized agentic AI systems

Chat with NVIDIA: Nemotron 3 Nano 30B A3B →
NVIDIA

NVIDIA: Nemotron 3 Super

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute efficiency and accuracy in complex multi-a

Chat with NVIDIA: Nemotron 3 Super →
NVIDIA

NVIDIA: Nemotron Nano 12B 2 VL

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence. It introduces a

Chat with NVIDIA: Nemotron Nano 12B 2 VL →
NVIDIA

NVIDIA: Nemotron Nano 9B V2

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning

Chat with NVIDIA: Nemotron Nano 9B V2 →

OpenRouter · 4

Perplexity · 5

Prime Intellect · 1

Reka AI · 2

Relace · 2

Sao10K · 5

Stepfun · 1

Switchpoint · 1

Tencent · 1

TheDrummer · 4

Tngtech · 1

Undi95 · 1

Upstage · 1

Writer · 1

Xiaomi · 3

Z.ai · 13

Z.ai

Z.ai: GLM 4.5 Air (free)

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixtu

Chat with Z.ai: GLM 4.5 Air (free) →
Z.ai

Z.ai: GLM 4 32B

GLM 4 32B is a cost-effective foundation language model. It can efficiently perform complex tasks and has significantly enhanced capabilities in tool use, onlin

Chat with Z.ai: GLM 4 32B →
Z.ai

Z.ai: GLM 4.5

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-Experts (MoE) architecture and supports a

Chat with Z.ai: GLM 4.5 →
Z.ai

Z.ai: GLM 4.5 Air

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixtu

Chat with Z.ai: GLM 4.5 Air →
Z.ai

Z.ai: GLM 4.5V

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B parameters and 12B

Chat with Z.ai: GLM 4.5V →
Z.ai

Z.ai: GLM 4.6

Compared with GLM-4.5, this generation brings several key improvements. Longer context window: expanded from 128K to 200K tokens, en

Chat with Z.ai: GLM 4.6 →
Z.ai

Z.ai: GLM 4.6V

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It su

Chat with Z.ai: GLM 4.6V →
Z.ai

Z.ai: GLM 4.7

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution.

Chat with Z.ai: GLM 4.7 →
Z.ai

Z.ai: GLM 4.7 Flash

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, str

Chat with Z.ai: GLM 4.7 Flash →
Z.ai

Z.ai: GLM 5

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workflows. Built for expert developers, it de

Chat with Z.ai: GLM 5 →
Z.ai

Z.ai: GLM 5 Turbo

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments such as OpenClaw scenarios. It is deeply op

Chat with Z.ai: GLM 5 Turbo →
Z.ai

Z.ai: GLM 5.1

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minu

Chat with Z.ai: GLM 5.1 →
Z.ai

Z.ai: GLM 5V Turbo

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video, a

Chat with Z.ai: GLM 5V Turbo →

Or skip the picking and just start chatting — you can change models any time.
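Under the hood, switching models per message comes down to changing one field in an OpenAI-style chat request. A minimal Python sketch, with a hypothetical endpoint URL and a routing table of our own devising (the model slugs are examples, not a guaranteed catalog):

```python
# Sketch: one conversation, a different model per message.
# The endpoint URL and model slugs below are illustrative assumptions.
API_BASE = "https://api.example.com/v1/chat/completions"  # hypothetical

# Route each message to a model by task type (example slugs).
MODEL_BY_TASK = {
    "code": "qwen/qwen3-coder-480b-a35b",
    "reasoning": "deepseek/deepseek-r1",
    "chat": "openai/gpt-4o-mini",
}

def build_request(task: str, history: list, user_msg: str) -> dict:
    """Build a chat-completions payload; only `model` changes per message."""
    model = MODEL_BY_TASK.get(task, MODEL_BY_TASK["chat"])
    return {
        "model": model,
        "messages": history + [{"role": "user", "content": user_msg}],
    }

history = []
req = build_request("code", history, "Write a binary search in Go.")
print(req["model"])  # -> qwen/qwen3-coder-480b-a35b
# POST `req` as JSON to API_BASE with your single API key; the next
# message can name a different model while reusing the same history.
```

Because only the `model` field varies, the same conversation history and the same credentials carry across every switch.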

Open chat