Why Vision Language Models Ignore What They See with Munawar Hayat - #758

This Week in Machine Learning & Artificial Intelligence (AI) Podcast by TWIML

Dec 9, 2025 | 57:40 | Technology

In this episode, we’re joined by Munawar Hayat, researcher at Qualcomm AI Research, to discuss a series of papers presented at NeurIPS 2025 focusing on multimodal and generative AI. We dive into the persistent challenge...

Questions About This Episode

What is Why Vision Language Models Ignore What They See with Munawar Hayat - #758 about?

In this episode, we’re joined by Munawar Hayat, researcher at Qualcomm AI Research, to discuss a series of papers presented at NeurIPS 2025 focusing on multimodal and generative AI. We dive into the persistent challenge of object hallucination in Vision-Language Models (VLMs), why models often discard visual information in favor of pre-trained language priors, and how his team used attention-guided alignment to enforce better visual grounding. We also explore a novel approach to generalized contrastive learning designed to solve complex, composed retrieval tasks—such as searching via combined text and image queries—without increasing inference costs. Finally, we cover the difficulties generative models face when rendering multiple human subjects, and the new "MultiHuman Testbench" his team created to measure and mitigate issues like identity leakage and attribute blending. Throughout the discussion, we examine how these innovations align with the need for efficient, on-device AI deployment. The complete show notes for this episode can be found at

Which podcast is Why Vision Language Models Ignore What They See with Munawar Hayat - #758 from?

Why Vision Language Models Ignore What They See with Munawar Hayat - #758 is an episode from This Week in Machine Learning & Artificial Intelligence (AI) Podcast by TWIML.

How long is this episode?

This episode is 57:40 long.

When was this episode published?

This episode was published on Dec 9, 2025.
