Best of 2025 - Benchmarking Legal AI: Measuring the Delta Between Man and Machine (Anna Guo Legalbenchmarks.ai)

Technically Legal - A Legal Technology and Innovation Podcast by Percipient. LLC

Dec 18, 2025 | 27:34 | Technology

In one of the most popular episodes of the year, Legalbenchmarks.ai Founder Anna Guo discusses her organization's research that tests whether artificial intelligence custom-made for legal tasks is better than general AI too...

About This Episode

Best of 2025 - Benchmarking Legal AI: Measuring the Delta Between Man and Machine (Anna Guo Legalbenchmarks.ai) is an episode from Technically Legal - A Legal Technology and Innovation Podcast by Percipient. LLC. In one of the most popular...

Listen Online

Use the player on this page to stream the episode online.

Episode Details

Published Dec 18, 2025, 27:34 long, audio available.

Questions About This Episode

What is Best of 2025 - Benchmarking Legal AI: Measuring the Delta Between Man and Machine (Anna Guo Legalbenchmarks.ai) about?

In one of the most popular episodes of the year, Legalbenchmarks.ai Founder Anna Guo discusses her organization's research that tests whether artificial intelligence custom-made for legal tasks is better than general AI tools. Anna is a former BigLaw lawyer who left the practice to become an entrepreneur and now focuses her energies on quantifying the utility of AI in the legal industry.

Anna's initial anecdotal research for colleagues quickly revealed a strong community interest in a systematic approach to evaluating legal AI tools. This led to the creation of Legalbenchmarks.ai, dedicated to finding out where the promise of humans plus AI is truly better than humans alone or AI alone. The core of the research involves measuring the "delta," or the extent to which AI can elevate human performance. To date, Legalbenchmarks.ai has conducted two major studies: one on information extraction from legal sources and a second on contract review and redlining.

Key Findings from the Studies:

Accuracy vs. Qualitative Usefulness: The highest-performing general-purpose AI tools (like Gemini) were often found to be more accurate and consistent. However, the legal-specific AI tools often received higher marks in qualitative usefulness and helpfulness, as they align more closely with existing legal workflows.

Methodology: The testing goes beyond simple accuracy. It includes a three-part assessment: Reliability (objective accuracy and legal adequacy), Usability (qualitative metrics like helpfulness and coherence for tasks such as brainstorming), and Platform Workflow Support (integration, citation checks, and other features).

Human-AI Performance: In the contract analysis study, AI tools matched or exceeded the human baseline for reliability in producing first drafts. Crucially, the data demonstrated that the common belief that "human plus AI will always outperform AI alone" was false; the top-performing AI tool alone still had a higher accuracy rate than the human-plus-AI combo.

Risk Analysis: A significant finding was that legal AI tools were better at flagging material risks, such as compliance or unenforceability issues in high-risk scenarios, that human lawyers missed entirely. This suggests AI can act as a crucial safety net.

Strengths Comparison: AI excels at brainstorming, challenging human bias, and performing mass-scale routine tasks (e.g., mass contract review for simple terms). Humans retain a significant edge in ingesting nuanced context and making commercially reasonable decisions that AI's instruction-following can sometimes lack.
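The "delta" idea described above can be illustrated with a small sketch. This is not Legalbenchmarks.ai's actual scoring method: the class, the function names, and the numbers are all hypothetical placeholders, chosen only to show how a human-plus-AI score can improve on the human baseline while still trailing the AI tool on its own.

```python
from dataclasses import dataclass


@dataclass
class AccuracyScores:
    """Accuracy in whole percentage points for one task (hypothetical values)."""
    human_alone: int
    ai_alone: int
    human_plus_ai: int


def delta(scores: AccuracyScores) -> int:
    """How far AI assistance lifts human performance over the human baseline."""
    return scores.human_plus_ai - scores.human_alone


def best_setup(scores: AccuracyScores) -> str:
    """Return the label of the highest-scoring setup."""
    options = {
        "human alone": scores.human_alone,
        "AI alone": scores.ai_alone,
        "human plus AI": scores.human_plus_ai,
    }
    return max(options, key=options.get)


# Placeholder numbers, NOT results from the studies.
example = AccuracyScores(human_alone=60, ai_alone=75, human_plus_ai=70)

print(delta(example))       # positive: AI assistance helped the human
print(best_setup(example))  # the top setup need not be the combined one
```

Under these made-up inputs the delta is positive, yet the best setup is the AI tool alone, which mirrors the shape of the finding reported in the episode without reproducing its data.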

Where can I listen to Best of 2025 - Benchmarking Legal AI: Measuring the Delta Between Man and Machine (Anna Guo Legalbenchmarks.ai)?

You can listen to Best of 2025 - Benchmarking Legal AI: Measuring the Delta Between Man and Machine (Anna Guo Legalbenchmarks.ai) online on Radio and Podcast. Open the player on this page to stream the available audio.

Which podcast is Best of 2025 - Benchmarking Legal AI: Measuring the Delta Between Man and Machine (Anna Guo Legalbenchmarks.ai) from?

Best of 2025 - Benchmarking Legal AI: Measuring the Delta Between Man and Machine (Anna Guo Legalbenchmarks.ai) is an episode from Technically Legal - A Legal Technology and Innovation Podcast by Percipient. LLC.

How long is this episode?

This episode is 27:34 long.

When was this episode published?

This episode was published on Dec 18, 2025.

Can I save Best of 2025 - Benchmarking Legal AI: Measuring the Delta Between Man and Machine (Anna Guo Legalbenchmarks.ai) for later?

Yes. Use the heart button on the episode page to add it to your favorite episodes list.

Are there related episodes from Technically Legal - A Legal Technology and Innovation Podcast?

Yes. This page shows related episodes from Technically Legal - A Legal Technology and Innovation Podcast when more episodes are available from the podcast feed.
