Jun 25, 2025 Clinithink

Four Peer-Reviewed Use Cases for Clinical AI

The pharmaceutical and life sciences industry has reached an inflection point. With R&D costs soaring and timelines stretching, companies are under immense pressure to demonstrate that AI investments deliver real value today—not just promising pilots for the future. The exploratory phase is over. What matters now is proven impact at scale. 

Across dozens of peer-reviewed papers and live deployments, we see four specific use cases where AI—anchored in clinical natural language processing (CNLP)—already produces measurable results: 

  1. Patient finding for undiagnosed rare disease: using AI-enabled deep phenotyping to identify patients who may have an undiagnosed, treatable rare genetic disorder.
  2. Automated pre-screening for clinical trial patient matching: using AI to automate pre-screening, which significantly reduces screen failure rates, accelerates recruitment, and lowers cost.
  3. Care gaps and care journey mapping: understanding what happens before and after diagnosis and why patients fall through the cracks.
  4. Predictive identification of patients with early-stage disease: using novel methods to develop predictive models that identify patients who may have early-stage disease and are at high risk of developing later, more serious conditions.

These aren't theoretical applications—they're proven approaches transforming how companies understand and serve patients. The key to each use case is the baseline understanding that 80% of healthcare's most valuable information lives in unstructured clinical narratives.

1. Patient Finding for Undiagnosed Rare Disease

The Challenge

This is where most rare disease work begins—identifying patients who meet the clinical profile for a condition but haven't been diagnosed, or whose diagnosis hasn't made it into structured fields like ICD codes or claims. The signs are there, buried in clinical notes—repeated infections, unexplained seizures, diagnostic dead ends—but the system hasn't connected the dots. 

The rare disease challenge is particularly stark: epidemiological models might suggest 20,000 patients exist, but claims databases show only 10,000 diagnosed or even fewer. The other 10,000 are documented in clinical notes, presenting with symptoms, seeing specialists—but never receiving the right ICD code. Traditional methods miss them entirely. 

Our Approach 

Our engine is a three-part system: natural language processing (NLP), an ontology knowledge graph, and contextualization. This ontology-aware NLP (CLiX) extracts thousands of granular phenotypic features from each note, then compares every patient record against a disease-specific signature built from HPO (Human Phenotype Ontology) and SNOMED terms. The result is a rank-ordered shortlist a clinician can review in minutes.

Working with experts to configure specific phenotypes, the technology scans large volumes of notes and generates patient lists that can be reviewed by clinicians. It transforms what would typically be months of manual chart review into actionable insights delivered in days.  
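To make the ranking idea concrete, here is a minimal sketch of matching patients against a disease signature. It is an illustration under stated assumptions, not the CLiX implementation: the phenotype labels, weights, `PatientFeatures` class, and `min_score` threshold are all hypothetical, and a real signature would use HPO or SNOMED codes with clinically validated weights.

```python
from dataclasses import dataclass

# Hypothetical signature: phenotype label -> weight (illustrative values only).
SIGNATURE = {
    "angiokeratoma": 3.0,
    "neuropathic_pain": 2.0,
    "hypohidrosis": 2.0,
    "proteinuria": 1.0,
}


@dataclass(frozen=True)
class PatientFeatures:
    patient_id: str
    present: frozenset                 # phenotypes asserted somewhere in the notes
    negated: frozenset = frozenset()   # phenotypes explicitly ruled out in context


def signature_score(patient, signature):
    """Weighted share of the signature supported by the notes (0.0 to 1.0)."""
    total = sum(signature.values())
    matched = sum(
        weight
        for label, weight in signature.items()
        if label in patient.present and label not in patient.negated
    )
    return matched / total if total else 0.0


def rank_patients(patients, signature, min_score=0.3):
    """Return (patient_id, score) pairs above the threshold, best match first."""
    scored = [(p.patient_id, round(signature_score(p, signature), 3)) for p in patients]
    shortlisted = [pair for pair in scored if pair[1] >= min_score]
    return sorted(shortlisted, key=lambda pair: pair[1], reverse=True)


PATIENTS = [
    PatientFeatures("A", frozenset({"angiokeratoma", "neuropathic_pain", "proteinuria"})),
    PatientFeatures("B", frozenset({"proteinuria"})),
    # Angiokeratoma is mentioned in one note but ruled out in another, so it is not counted.
    PatientFeatures(
        "C",
        frozenset({"angiokeratoma", "neuropathic_pain", "hypohidrosis"}),
        negated=frozenset({"angiokeratoma"}),
    ),
]

shortlist = rank_patients(PATIENTS, SIGNATURE)
```

In this toy data, patient B falls below the threshold and drops out, while A and C are returned in descending score order for clinician review. The negation check stands in for the contextualization step: a phenotype that the notes rule out should not count toward a match.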

This approach does more than shorten the timeline; it makes a previously impossible task practical. Manually reviewing millions of clinical documents would take years, which no team can realistically do. The real breakthrough is therefore not an efficiency gain but a fundamental expansion of what’s possible.

Real-World Impact: 

  • Mount Sinai Health System improved the identification of Fabry disease candidates 10-fold, compressing three months of review into one afternoon. 
  • At Rady Children's Hospital, subtle clinical signals surfaced through NLP resulted in 600% more newborns being prioritized for genetic sequencing at a single site—setting a Guinness World Record of 19 hours from admission to molecular diagnosis. 

These aren't just numbers—they represent patients who can now access life-changing therapies.

2. Automated Pre-Screening for Clinical Trial Patient Matching

The Challenge 

Recruiting patients for clinical trials is often like searching for a needle in a haystack. Sites must sift through thousands of unstructured clinical notes to find candidates who meet dozens of strict inclusion and exclusion criteria. This manual chart review process is painfully slow and labor-intensive, and it simply cannot keep pace with the growing complexity of trial protocols. As eligibility criteria proliferate—they have nearly doubled in the past decade—crucial details get missed and many eligible patients slip through the cracks, driving up operational burden and frustration. 

The operational and financial impact of these traditional methods is enormous. Patient recruitment can consume roughly one-third of a trial's budget, yet much of this effort is wasted: screening success rates have been reported as low as 23%, meaning more than three out of four patients found by pre-screening ultimately fail to qualify.

The consequences are costly. Trial delays are rampant, with up to 45% of lengthy delays attributed to recruitment problems, and about 46% of trials ultimately fall short of enrollment goals due to recruitment shortfalls. Each lost month in a trial's timeline can forfeit an estimated $25 million in revenue for the sponsor, so every day of delay hurts. Individual study sites also struggle under this paradigm—11% of sites fail to enroll a single patient and 37% under-enroll despite exhaustive chart reviews. In short, manual pre-screening has become a major bottleneck and financial drain in clinical trials, leading to high failure rates, extended timelines, and unsustainable costs. 

Our Approach 

AI reads millions of documents per hour, applying precise inclusion/exclusion logic directly to contemporaneous EHR notes. Sites receive evidence-linked candidate lists and supporting context, enabling "consent-ready" outreach. 
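The sketch below shows one way inclusion/exclusion logic could be applied to facts already extracted from notes, keeping the supporting snippet for each criterion. Everything in it is an assumption for illustration: the `Finding` and `Criterion` classes, the concept names, and the example criteria are hypothetical and do not reflect any particular protocol or product API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Finding:
    concept: str                   # normalized concept extracted from a note
    note_id: str                   # where the evidence came from
    snippet: str                   # text a reviewer can check
    value: Optional[float] = None  # numeric value, if the concept has one


@dataclass
class Criterion:
    name: str
    concept: str
    test: Callable[[Finding], bool]


def first_match(findings, criterion):
    """Return the first finding that satisfies the criterion, or None."""
    for finding in findings:
        if finding.concept == criterion.concept and criterion.test(finding):
            return finding
    return None


def prescreen(findings_by_patient, inclusion, exclusion):
    """Keep patients who meet every inclusion criterion and no exclusion criterion."""
    candidates = []
    for patient_id, findings in findings_by_patient.items():
        met = {c.name: first_match(findings, c) for c in inclusion}
        if any(f is None for f in met.values()):
            continue  # at least one inclusion criterion has no supporting evidence
        if any(first_match(findings, c) is not None for c in exclusion):
            continue  # an exclusion criterion is documented
        evidence = {name: (f.note_id, f.snippet) for name, f in met.items()}
        candidates.append({"patient_id": patient_id, "evidence": evidence})
    return candidates


INCLUSION = [
    Criterion("diagnosis", "ulcerative_colitis", lambda f: True),
    Criterion("adult", "age", lambda f: f.value is not None and f.value >= 18),
]
EXCLUSION = [Criterion("pregnant", "pregnancy", lambda f: True)]

FINDINGS = {
    "P1": [Finding("ulcerative_colitis", "n1", "known UC, on mesalazine"),
           Finding("age", "n1", "45-year-old", 45)],
    "P2": [Finding("ulcerative_colitis", "n2", "UC diagnosed last year"),
           Finding("age", "n2", "16-year-old", 16)],
    "P3": [Finding("ulcerative_colitis", "n3", "UC flare"),
           Finding("age", "n3", "30-year-old", 30),
           Finding("pregnancy", "n4", "currently 20 weeks pregnant")],
}

candidates = prescreen(FINDINGS, INCLUSION, EXCLUSION)
```

Only P1 survives here, and the result carries the note ID and snippet behind each satisfied criterion. Keeping that evidence attached is what lets a site coordinator verify a candidate quickly instead of reopening the full chart.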

Real-World Impact:

  • Newcastle upon Tyne Hospitals NHS Foundation Trust processed 270,000 records in less than 24 hours and produced 100 high-fidelity candidates for a GI trial—after three years of manual review had surfaced only 30. 
  • At a U.S. academic center, automated NLP pre-screening reduced on-site screen failure by 42%, saving an estimated $1.2M in recruitment costs for a phase III oncology study. 

Automated pre-screening transforms recruitment from a bottleneck into a competitive advantage, allowing sponsors to lock sites earlier and complete enrollment months ahead of plan.

3. Care Gaps and Care Journey Mapping

The Challenge 

This is the next level up—not just whether a patient has been diagnosed, but what happens after. Are they being treated? Are they falling through the cracks? Why is there a delay between diagnosis and therapy? 

Medical Affairs and Market Access teams have shown particular interest in this use case recently. It's about surfacing patterns in how patients are managed and where systems diverge—whether that's geographic variation, differences between community and academic settings, or even socioeconomic disparities. For example, it helps teams understand why some patients get on therapy and others don't. 

The key questions being addressed: 

  • Are patients being treated after diagnosis, or are they falling through the cracks? 
  • Why is there a delay between diagnosis and therapy initiation? 
  • What drives the differences in care between different settings and populations? 

These insights come from analyzing the narrative context in clinical notes that structured data completely misses—the social determinants, the clinical reasoning, the barriers to care that physicians document but never code. 
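Once diagnosis and therapy-start dates have been extracted, the first two questions reduce to simple arithmetic. The following sketch assumes a hypothetical record layout (`setting`, `diagnosed`, `therapy_started`) and made-up dates; it shows the calculation, not a real dataset or a specific product feature.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical journeys; therapy_started is None when no treatment is documented.
JOURNEYS = [
    {"patient_id": "A", "setting": "academic",
     "diagnosed": date(2024, 1, 10), "therapy_started": date(2024, 2, 1)},
    {"patient_id": "B", "setting": "academic",
     "diagnosed": date(2024, 3, 1), "therapy_started": date(2024, 3, 15)},
    {"patient_id": "C", "setting": "community",
     "diagnosed": date(2024, 1, 5), "therapy_started": date(2024, 3, 5)},
    {"patient_id": "D", "setting": "community",
     "diagnosed": date(2024, 2, 20), "therapy_started": None},
    {"patient_id": "E", "setting": "community",
     "diagnosed": date(2024, 2, 1), "therapy_started": date(2024, 4, 1)},
]


def summarize_gaps(journeys):
    """Per care setting: median days from diagnosis to therapy, and untreated count."""
    delays = defaultdict(list)
    untreated = defaultdict(int)
    for journey in journeys:
        setting = journey["setting"]
        if journey["therapy_started"] is None:
            untreated[setting] += 1
        else:
            gap = journey["therapy_started"] - journey["diagnosed"]
            delays[setting].append(gap.days)
    summary = {}
    for setting in sorted(set(delays) | set(untreated)):
        summary[setting] = {
            "median_delay_days": median(delays[setting]) if delays[setting] else None,
            "untreated": untreated[setting],
        }
    return summary


summary = summarize_gaps(JOURNEYS)
```

A summary like this only shows where the gaps are. The reasons behind them, such as a documented insurance denial or a patient declining therapy, still have to come from the narrative text itself.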

Real-World Impact: 

  • An analysis of 150,000 oncology notes presented at ASCO 2023 identified critical gaps in patient follow-up. These patterns, invisible in claims data, revealed striking disparities across health systems and socioeconomic groups. The insights directly improved both clinical outcomes and operational efficiency, with East London NHS Foundation Trust documenting over £840,000 in annual savings through care optimization. Teams are using these insights to understand why patients in community settings experience different outcomes than those in academic centers and how social determinants affect treatment adherence.

4. Predictive Factor Discovery

The Challenge 

This is the most research-heavy use case, and indeed, one of the most exciting. Instead of confirming what's already known, pharmaceutical companies are uncovering early, often unexpected phenotypic signals that can inform risk models and early detection strategies. The goal is to surface what no one thought to look for. 

It's a discovery process—extracting hundreds of clinical features from unstructured data, then using AI/ML to find the signals that matter. When physicians document subtle observations—"unusual fatigue pattern," "family history suggestive of genetic predisposition," "atypical presentation"—they're capturing signals that structured data never will. 

Some partners use these insights to train predictive models, others to design screening programs or stratify populations for downstream analytics. The power lies in discovering patterns that haven't been recognized before. 
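As a rough illustration of the discovery step, the sketch below ranks extracted features by how strongly each is associated with an outcome, using a smoothed log odds ratio. This univariate screen is a deliberately simple stand-in and is not the method used in any study cited here; the feature names and records are invented, and a real project would add multivariable modeling and validation on held-out data.

```python
import math

# Hypothetical records: (set of extracted features, whether the patient is a case).
RECORDS = [
    ({"fatigue", "weight_loss"}, True),
    ({"fatigue", "cough"}, True),
    ({"fatigue"}, True),
    ({"cough"}, False),
    ({"cough", "weight_loss"}, False),
    (set(), False),
]


def log_odds_ratio(records, feature):
    """Log odds ratio of a feature for cases vs. controls.

    Each cell starts at 0.5 (Haldane-Anscombe correction) so that an empty
    cell never causes a division by zero.
    """
    a = b = c = d = 0.5
    for features, is_case in records:
        has_feature = feature in features
        if has_feature and is_case:
            a += 1
        elif has_feature:
            b += 1
        elif is_case:
            c += 1
        else:
            d += 1
    return math.log((a * d) / (b * c))


def top_factors(records, k):
    """Return the k features most positively associated with being a case."""
    vocabulary = set().union(*(features for features, _ in records))
    scores = {feature: log_odds_ratio(records, feature) for feature in vocabulary}
    # Sort by score (descending), then by name so ties are deterministic.
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]


ranked = top_factors(RECORDS, 2)
```

In this toy data, "fatigue" appears in every case and no control, so it ranks first, while "weight_loss" is evenly split and scores zero. With hundreds of real features, the same ranking step narrows the field to a short list that clinicians and modelers can examine in depth.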

Real-World Impact: 

  • Our ASCO 2024 study extracted over 800 phenotypic signals from unstructured data, using machine learning to identify the 15 most predictive factors for early lung cancer detection. Several of these had never been recognized as risk factors before. Partners are using these discoveries in different ways—some to train predictive models for their own patient populations, others to design targeted screening programs. Collaborations like the one with AstraZeneca are identifying patients with early-stage lung cancer who fall outside traditional screening criteria, catching cancer when it's still treatable. 

Why Clinical Narratives Hold the Key 

The industry is increasingly recognizing a fundamental truth: 80% of healthcare information is unstructured. When physicians write "patient struggling with medication due to work schedule" or "mother had similar symptoms in her 40s, never diagnosed," that context drives real clinical decisions but remains invisible to traditional analytics. 

The distinction is clear: claims data reveals what happened; clinical notes explain why. 

This depth of insight enables organizations to: 

  • Find patients who match complex clinical profiles but lack proper diagnoses 
  • Map the real barriers and gaps preventing patients from accessing treatment 
  • Discover early warning signals that predict disease years before diagnosis 
  • Generate evidence that captures the full patient story for stakeholders 

From Pilot to Production: Making It Real 

The organizations seeing real results share common characteristics: 

  • Strategic focus. They select one specific use case—patient finding for a single indication, care gaps for one therapeutic area, or predictive factors for a targeted population. 
  • Smart partnerships. Choosing the right health system partners—those with engaged teams and accessible, high-fidelity data—matters more than raw scale. These collaborations enable faster, more actionable insights. 
  • Clinical validation. Success depends on clinician trust in AI-generated outputs, particularly when reviewing patient lists or care recommendations. Clinician review ensures relevance and adoption. 
  • Workflow integration. Technology enhances existing processes, whether that's trial recruitment, medical affairs analysis, or population health management. Success comes from fitting naturally into daily operations.
  • Measurable outcomes. Successful projects define what matters from the start, whether that is the number of identified patients, care gaps uncovered, or model accuracy, and track it throughout to show real-world impact.

The Bottom Line 

The life sciences and pharmaceutical research industry has reached a pivotal moment, not because of AI hype, but because the technology to unlock rich clinical narratives is already delivering measurable results, backed by peer-reviewed publications. 

The organizations succeeding aren't those with the biggest AI budgets or the flashiest technology. They're the ones who understand that the most valuable healthcare insights have been there all along, written in the words of clinicians caring for patients. 

Whether identifying hard-to-find patients for clinical trials, mapping where care breaks down, or discovering predictive signals for early intervention, the patterns in clinical narratives hold the answers.

The question isn't whether this technology can help—it's whether organizations are ready to scale access to latent insights already in their data. For those who are, the competitive advantage is clear: more patients identified, better understanding of care delivery, and earlier intervention—all from data that already exists. 

At Clinithink, we're seeing these use cases transform from promising concepts to proven strategies that deliver measurable impact. The future of clinical research isn't about collecting more data—it's about understanding the data we already have. 

Sources 

  1. Sedlakova, J., Daniore, P., Horn Wintsch, A., et al. (2023). Challenges and best practices for digital unstructured data enrichment in health research: A systematic narrative review. PLOS Digital Health, 2(10), e0000347. https://doi.org/10.1371/journal.pdig.0000347
  2. Michalski, A. A., Lis, K., Stankiewicz, J., et al. (2023). Supporting the diagnosis of Fabry disease using a natural language processing–based approach. Journal of Clinical Medicine, 12(10), 3599. https://doi.org/10.3390/jcm12103599
  3. Clark, M. M., Hildreth, A., Batalov, S., et al. (2019). Diagnosis of genetic diseases in seriously ill children by rapid whole-genome sequencing and automated phenotyping and interpretation. Science Translational Medicine, 11(489), eaat6177. https://doi.org/10.1126/scitranslmed.aat6177
  4. Meystre, S. M., Heider, P. M., Kim, Y., Aruch, D. B., & Britten, C. D. (2019). Automatic trial eligibility surveillance based on unstructured clinical data. International Journal of Medical Informatics, 129, 13–19. https://doi.org/10.1016/j.ijmedinf.2019.05.018 
  5. Beck, J. T., et al. (2020). Artificial intelligence tool for optimizing eligibility screening for clinical trials in a large community cancer center. JCO Clinical Cancer Informatics, 4, 50–59. https://doi.org/10.1200/CCI.19.00079
  6. Schut, M. C., Luik, T. T., Vagliano, I., et al. (2025). Artificial intelligence for early detection of lung cancer in GPs’ clinical notes: A retrospective observational cohort study. British Journal of General Practice, 75(754), e316–e322. https://doi.org/10.3399/BJGP.2023.0489 
  7. Zheng, C., et al. (2017). Natural language processing to identify pulmonary nodules and extract nodule characteristics from radiology reports. Radiology, 284(3), 870–878. https://doi.org/10.1148/radiol.2017161659
  8. Gould, M. K., Huang, B. Z., Tammemagi, M. C., et al. (2021). Machine learning for early lung cancer identification using routine clinical and laboratory data. American Journal of Respiratory and Critical Care Medicine, 204(4), 445–453. https://doi.org/10.1164/rccm.202007-2791OC
  9. Clinithink. (n.d.). Newcastle Hospitals: Accelerating trial recruitment with CNLP [White paper]. https://www.clinithink.com/ (accessed June 2025). 