Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini

The New Stack Podcast by The New Stack Podcast

Mar 2, 2026 | 00:43:41 | Technology

Questions About This Episode

What is Inception Labs says its diffusion LLM is 10x faster than Claude, ChatGPT, Gemini about?

On a recent episode of The New Stack Agents, Inception Labs CEO Stefano Ermon introduced Mercury 2, a large language model built on diffusion rather than the standard autoregressive approach. Traditional LLMs generate text token by token from left to right, which Ermon describes as “fancy autocomplete.” In contrast, diffusion models begin with a rough draft and refine it in parallel, similar to image systems like Stable Diffusion. According to company tests, this parallel process allows Mercury 2 to produce over 1,000 tokens per second, five to ten times faster than optimized models from labs such as OpenAI, Anthropic, and Google. Ermon argues that diffusion models make better use of GPUs, with investor Nvidia helping to optimize performance. While Mercury 2 matches mid-tier models like Claude Haiku and Google Flash rather than top systems such as Claude Opus or GPT-4, Ermon believes diffusion’s speed and economic advantages will become increasingly compelling as AI applications scale.

Learn more from The New Stack about the latest developments around large language models built on diffusion:

How Diffusion-Based LLM AI Speeds Up Reasoning
Get Ready for Faster Text Generation With Diffusion LLMs
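
The contrast between token-by-token generation and parallel refinement can be illustrated with a toy sketch. This is not Mercury 2's algorithm, which the episode summary does not specify. It assumes a masked-draft formulation in which every refinement step fills in several positions at once, and a random word choice stands in for a real model's forward pass. All names here (`VOCAB`, `MASK`, `autoregressive_generate`, `diffusion_style_generate`) are hypothetical.

```python
import random

# Toy vocabulary; a real model predicts tokens with a neural network.
VOCAB = ["the", "model", "writes", "text", "fast"]
MASK = "<mask>"


def autoregressive_generate(length, rng):
    """Emit one token per sequential step, left to right."""
    tokens, steps = [], 0
    for _ in range(length):
        tokens.append(rng.choice(VOCAB))  # stand-in for one forward pass
        steps += 1
    return tokens, steps


def diffusion_style_generate(length, num_steps, rng):
    """Start from a fully masked draft and fill many positions per step.

    Assumption: a masked-draft scheme chosen for illustration only.
    """
    draft = [MASK] * length
    masked = list(range(length))
    rng.shuffle(masked)                  # positions are filled in no fixed order
    per_step = -(-length // num_steps)   # ceiling division
    steps = 0
    while masked:
        chunk, masked = masked[:per_step], masked[per_step:]
        for pos in chunk:                # conceptually done in parallel
            draft[pos] = rng.choice(VOCAB)
        steps += 1
    return draft, steps


if __name__ == "__main__":
    rng = random.Random(0)
    _, ar_steps = autoregressive_generate(32, rng)
    _, diff_steps = diffusion_style_generate(32, 4, rng)
    print(f"autoregressive: {ar_steps} sequential steps")
    print(f"diffusion-style: {diff_steps} sequential steps")
```

For a 32-token output and 4 refinement steps, the sketch performs 32 sequential steps in the first function and 4 in the second. It only counts sequential steps. It says nothing about real throughput, which depends on the cost of each step and on the hardware.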
