On Diagnosing Disease from Pixels: What Radiograph AI Needs to Get Right
Diagnostic AI in imaging has attracted extraordinary investment (over $14 billion globally in the past five years, by some estimates) and equally extraordinary expectations. Yet a consequential gap remains between what models can do in controlled research environments and what they can do reliably across the full distribution of clinical practice. Understanding what it takes to close that gap matters for anyone building or investing in clinical imaging AI.
The research-to-clinical gap has two main components. The first is distribution shift: models trained on imaging data from academic medical centers perform differently when deployed in community hospitals or outpatient radiology settings, where patient populations, imaging equipment, and clinical protocols differ from the training distribution. The second is the edge case problem: research evaluates clinical imaging AI on aggregate performance metrics (sensitivity, specificity, AUC), but clinical deployment is a sequence of individual decisions, each of which may be an edge case. A model with 95% sensitivity on a balanced test set still misses 5% of positive cases, and a balanced test set says little about how the model behaves at real-world prevalence. For a rare disease, where true positives are few, each missed case is costly, and even a modest false-positive rate can outnumber the true findings, that performance profile may be clinically unacceptable.
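The arithmetic behind that last point can be made concrete. The sketch below is illustrative only: the 95% specificity, the prevalence values, and the cohort size of 10,000 are assumptions chosen for the example, not figures from any particular model or study. It computes the expected number of missed cases and the positive predictive value (the share of positive calls that are correct) as prevalence falls.

```python
def screening_outcomes(sensitivity, specificity, prevalence, n_patients):
    """Expected counts for a screened cohort, given aggregate metrics.

    All inputs are hypothetical; prevalence is the fraction of the
    cohort that truly has the condition.
    """
    positives = prevalence * n_patients
    negatives = n_patients - positives
    true_pos = sensitivity * positives
    missed = positives - true_pos                  # false negatives
    false_alarms = (1 - specificity) * negatives   # false positives
    flagged = true_pos + false_alarms
    ppv = true_pos / flagged if flagged else float("nan")
    return {"missed": missed, "false_alarms": false_alarms, "ppv": ppv}


if __name__ == "__main__":
    # Same model, three assumed prevalence levels: balanced, 5%, 0.5%.
    for prevalence in (0.5, 0.05, 0.005):
        r = screening_outcomes(0.95, 0.95, prevalence, 10_000)
        print(
            f"prevalence={prevalence:.3f}  "
            f"missed={r['missed']:.1f}  "
            f"false_alarms={r['false_alarms']:.1f}  "
            f"ppv={r['ppv']:.3f}"
        )
```

Under these assumed numbers, the same model that shows a 95% positive predictive value on the balanced set falls below 10% at 0.5% prevalence, because false alarms from the large healthy population outnumber the true findings roughly ten to one.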
Our investment in Overjet is an instructive case study in how to close this gap in a specific imaging domain. Dental radiograph AI has several features that made the research-to-clinical transition more tractable than in general medical imaging: relatively standardized imaging protocols, a more concentrated dental software vendor ecosystem that simplified integration, and a specific regulatory pathway (FDA De Novo authorization) that provided a defined bar for clinical deployment. The company spent years building its imaging dataset specifically to include the range of equipment types, image qualities, and patient populations present in real dental practice.
The more general lesson: the clinical imaging AI companies that will make it across the gap between research performance and clinical validation are the ones that treat the regulatory submission and post-market surveillance processes as product design inputs, not as compliance hurdles to be navigated after the model is built. FDA's 2023 draft guidance on AI-enabled device functions explicitly calls for predetermined change control plans and performance monitoring protocols. Companies that have built these into their development process from the start are accumulating a regulatory track record that becomes a durable competitive asset.
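One way to picture performance monitoring as a design input rather than an afterthought is a threshold fixed before deployment and a check run on each batch of adjudicated cases, broken out by deployment site so that distribution shift shows up early. The sketch below is hypothetical: the MonitoringPlan class, the 0.90 sensitivity floor, the minimum sample size, and the site labels are assumptions for illustration, not requirements taken from FDA guidance or a description of any company's actual process.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoringPlan:
    """Hypothetical pre-specified monitoring criteria."""
    sensitivity_floor: float  # minimum acceptable per-site sensitivity
    min_positives: int        # skip sites with too few confirmed positives


def sites_below_floor(cases, plan):
    """Return {site: sensitivity} for sites under the pre-specified floor.

    `cases` is an iterable of (site, model_positive, truth_positive)
    tuples, where truth_positive is the adjudicated reference label.
    """
    detected = defaultdict(int)
    confirmed = defaultdict(int)
    for site, model_positive, truth_positive in cases:
        if truth_positive:
            confirmed[site] += 1
            if model_positive:
                detected[site] += 1

    flagged = {}
    for site, n_confirmed in confirmed.items():
        if n_confirmed < plan.min_positives:
            continue  # not enough evidence to judge this site yet
        sensitivity = detected[site] / n_confirmed
        if sensitivity < plan.sensitivity_floor:
            flagged[site] = sensitivity
    return flagged


if __name__ == "__main__":
    plan = MonitoringPlan(sensitivity_floor=0.90, min_positives=10)
    # Made-up batch: 19/20 detected at one site, 16/20 at another.
    batch = (
        [("academic", True, True)] * 19 + [("academic", False, True)]
        + [("community", True, True)] * 16 + [("community", False, True)] * 4
    )
    print(sites_below_floor(batch, plan))
```

The point of the design is that the floor and the minimum sample size are written down before the data arrives, so a site falling below the floor triggers a predefined response instead of an ad hoc debate.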