120. Liam Fedus and Barrett Zoph - AI scaling with mixture of expert models

Towards Data Science by The TDS team

Apr 20, 2022 | 00:40:47 | Technology

Questions About This Episode

What is 120. Liam Fedus and Barrett Zoph - AI scaling with mixture of expert models about?

AI scaling has really taken off. Ever since GPT-3 came out, it’s become clear that one of the things we’ll need to do to move beyond narrow AI and towards more generally intelligent systems is to massively scale up the size of our models, the amount of processing power they consume and the amount of data they’re trained on, all at the same time. That’s led to a huge wave of highly scaled models that are incredibly expensive to train, largely because of their enormous compute budgets.

But what if there were a more flexible way to scale AI, one that allowed us to decouple model size from compute budgets, so that we can track a more compute-efficient course to scale?

That’s the promise of so-called mixture of experts models, or MoEs. Unlike more traditional transformers, MoEs don’t update all of their parameters on every training pass. Instead, they route inputs intelligently to sub-models called experts, which can each specialize in different tasks. On a given training pass, only those experts have their parameters updated. The result is a sparse model, a more compute-efficient training process, and a new potential path to scale.

Google has been pushing the frontier of research on MoEs, and my two guests today in particular have been involved in pioneering work on that strategy (among many others!). Liam Fedus and Barrett Zoph are research scientists at Google Brain, and they joined me to talk about AI scaling, sparsity and the present and future of MoE models on this episode of the TDS podcast.

***

Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track:

***

Chapters:
- 2:15 Guests’ backgrounds
- 8:00 Understanding specialization
- 13:45 Speculations for the future
- 21:45 Switch transformer versus dense net
- 27:30 More interpretable models
- 33:30 Assumptions and biology
- 39:15 Wrap-up
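
The routing idea in the episode summary above (each input goes to a selected expert, and only that expert does any work) can be illustrated with a small sketch. This is a toy, forward-pass-only illustration written for this page, not the guests' implementation: it assumes top-1 routing with a softmax router, and the names `route_top1`, `moe_layer`, `router_weights` and the toy "experts" are all hypothetical. In real training, gradients would flow only to the selected expert (and the router), which is where the compute savings come from.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top1(token, router_weights):
    """Return (expert_index, gate_probability) for one token vector.

    Assumption: the router is a single linear map with one row per expert.
    """
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


def moe_layer(tokens, router_weights, experts):
    """Run exactly one expert per token; return outputs and the set of experts used."""
    outputs, used = [], set()
    for token in tokens:
        idx, gate = route_top1(token, router_weights)
        used.add(idx)
        # Only the chosen expert is evaluated; its output is scaled by the gate value.
        outputs.append([gate * y for y in experts[idx](token)])
    return outputs, used


# Hypothetical toy setup: 3 "experts" that simply scale their input differently.
random.seed(0)
dim, num_experts = 4, 3
router_weights = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_experts)]
experts = [lambda x, k=k: [(k + 1) * v for v in x] for k in range(num_experts)]
tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(5)]

outputs, used = moe_layer(tokens, router_weights, experts)
print("experts used:", sorted(used))
```

Adding more experts in this sketch increases the total parameter count, but the work per token stays at one router call plus one expert call, which is the sense in which model size is decoupled from compute.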

Which podcast is 120. Liam Fedus and Barrett Zoph - AI scaling with mixture of expert models from?

120. Liam Fedus and Barrett Zoph - AI scaling with mixture of expert models is an episode from Towards Data Science by The TDS team.

How long is this episode?

This episode is 00:40:47 long.

When was this episode published?

This episode was published on Apr 20, 2022.
