
Jul 25, 2025
Written by: Muthu Alagappan, Tony Sun, Jessica Fan
Yesterday, Google released the full results of its Articulate Medical Intelligence Explorer (AMIE) study, which tested whether LLMs could support diagnostic interviews under asynchronous physician oversight. The findings were impressive. AMIE outperformed both early-career physicians and experienced nurse practitioners across a range of simulated primary care scenarios. It produced more accurate differential diagnoses, more appropriate clinical management plans, and patient messages rated higher than those written by humans. Most importantly, oversight physicians consistently preferred reviewing AMIE-generated cases, and composite quality scores were highest for the AI-assisted workflow across all groups.
These results show that, with the right context engineering, LLMs can already serve as powerful physician-augmentation tools. At Counsel, that finding feels deeply validating, because it mirrors what we're already seeing.
For the past two years, we've been building what Google has now demonstrated in its AMIE simulations. Like AMIE, our system runs under physician oversight. We also call ours the Clinician Cockpit, and it is similarly centered on the SOAP note: much as AMIE proposes, we capture complex patient histories during intake and then generate tailored patient messages for physician review. Our system is already operational with real patients, ingesting clinical records from electronic health records to ground the guidance we generate. Every day, Counsel physicians supervise structured asynchronous encounters and make the clinical decisions that affect our patients: which tests to order, which specialists to involve, and when to escalate to a higher level of care. Google validated its Cockpit through a participatory study with 10 outpatient physicians; we have validated ours across thousands of live encounters.
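To make that workflow concrete, here is a minimal sketch of a SOAP-note-centered asynchronous encounter. This is an illustration under our own assumptions, not AMIE's code or Counsel's production system; every class, field, and function name below is hypothetical.

```python
from dataclasses import dataclass

# Hypothetical sketch only: neither AMIE's code nor Counsel's production
# system is public, so every name and field here is illustrative.

@dataclass
class SoapNote:
    subjective: str  # patient-reported history captured during intake
    objective: str   # data ingested from the EHR (labs, vitals, prior notes)
    assessment: str  # draft differential diagnosis for physician review
    plan: str        # proposed management plan, pending physician sign-off

@dataclass
class AsyncEncounter:
    patient_id: str
    note: SoapNote
    physician_approved: bool = False  # set only by a licensed physician

def release_patient_message(encounter: AsyncEncounter) -> str:
    """Release the drafted message to the patient, but only after approval."""
    if not encounter.physician_approved:
        raise PermissionError("Encounter is still awaiting physician review.")
    return f"Your care plan: {encounter.note.plan}"
```

The shape matters more than the names: the note is fully drafted before a physician ever sees it, so review starts from structure rather than from a blank chart.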
We also share Google's commitment to safety. AMIE's standout architectural feature is its strict separation between intake and medical advice, which defers the final decision to a licensed physician. Counsel's system follows the same principle. We found that this decoupling led to a 79% improvement in physician efficiency and a 41% reduction in time to clinical resolution. The AMIE study found a similar effect: oversight physicians spent about 40% less time than they would in traditional consultations.
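The decoupling itself can be pictured as a small state machine. Again, this is a hedged sketch: the stage names and the gating rule are assumptions used to illustrate the intake/advice separation, not a description of either system's actual implementation.

```python
from enum import Enum, auto

# Hypothetical stages for illustrating the intake/advice separation.
class Stage(Enum):
    INTAKE = auto()            # the model may ask questions and summarize
    PHYSICIAN_REVIEW = auto()  # a licensed physician reviews the draft
    ADVICE = auto()            # advice reaches the patient only after approval

def next_stage(current: Stage, physician_signed_off: bool = False) -> Stage:
    # The intake model can never jump straight to ADVICE; the only path
    # there runs through an explicit physician decision.
    if current is Stage.INTAKE:
        return Stage.PHYSICIAN_REVIEW
    if current is Stage.PHYSICIAN_REVIEW and physician_signed_off:
        return Stage.ADVICE
    return current
```

The property worth preserving is that no path leads from intake to advice without an explicit physician decision in between, which is the separation both teams associate with their efficiency gains.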
What the AMIE research team has done is essential. They have created a rigorous, externally validated framework for evaluating LLMs in clinical settings, with clear rubrics and benchmarks for measuring quality and safety. It's a valuable step forward for the field.
The study also underscores why our work at Counsel matters. While AMIE defines what’s possible in simulation, we’re already building the infrastructure to support it in production. We’ve been heads-down building what we believe is the future of care: AI-powered, physician-led, asynchronous medicine that scales without compromising quality. AMIE’s results validate that vision, and we’re excited to build the system that defines the next era of clinical care.