
Dataflow Computing for AI Inference with Kunle Olukotun - #751 is an episode from The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) by TWIML.
Published Oct 14, 2025, 57:37 long, audio available.
In this episode, we're joined by Kunle Olukotun, professor of electrical engineering and computer science at Stanford University and co-founder and chief technologist at SambaNova Systems, to discuss reconfigurable dataflow architectures for AI inference. Kunle explains the core idea of building computers that are dynamically configured to match the dataflow graph of an AI model, moving beyond the traditional instruction-fetch paradigm of CPUs and GPUs. We explore how this architecture is well-suited for LLM inference, reducing memory bandwidth bottlenecks and improving performance. Kunle reviews how this system also enables efficient multi-model serving and agentic workflows through its large, tiered memory and fast model-switching capabilities. Finally, we discuss his research into future dynamic reconfigurable architectures, and the use of AI agents to build compilers for new hardware. The complete show notes for this episode can be found at