Question 1

What is the episode “Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini” about?

Accepted Answer

On a recent episode of The New Stack Agents, Inception Labs CEO Stefano Ermon introduced Mercury 2, a large language model built on diffusion rather than the standard autoregressive approach. Traditional LLMs generate text token by token from left to right, which Ermon describes as “fancy autocomplete.” Diffusion models instead begin with a rough draft and refine it in parallel, similar to image systems like Stable Diffusion. According to the company’s own tests, this parallel process lets Mercury 2 produce over 1,000 tokens per second, five to ten times faster than optimized models from labs such as OpenAI, Anthropic, and Google. Ermon argues that diffusion models make better use of GPUs, and says investor Nvidia is supporting work to optimize performance. Mercury 2 matches mid-tier models like Claude Haiku and Google Flash rather than top systems such as Claude Opus or GPT-4, but Ermon believes diffusion’s speed and economic advantages will become increasingly compelling as AI applications scale.

Learn more from The New Stack about large language models built on diffusion:
How Diffusion-Based LLM AI Speeds Up Reasoning
Get Ready for Faster Text Generation With Diffusion LLMs

Question 2

Where can I listen to “Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini”?

Accepted Answer

You can listen to “Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini” online on Radio and Podcast. Open the player on this page to stream the available audio.

Question 3

Which podcast is “Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini” from?

Accepted Answer

“Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini” is an episode of The New Stack Podcast.

Question 4

How long is this episode?

Accepted Answer

This episode is 00:43:41 long.

Question 5

When was this episode published?

Accepted Answer

This episode was published on Mar 2, 2026.

Question 6

Can I save “Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini” for later?

Accepted Answer

Yes. Use the heart button on the episode page to add it to your favorite episodes list.

Question 7

Are there related episodes from The New Stack Podcast?

Accepted Answer

Yes. This page shows related episodes from The New Stack Podcast when more episodes are available from the podcast feed.
