A Framework for
Fit-for-Purpose Real-World Imaging Data
EHRs capture what was documented.
Imaging captures what was biologically observed.
This white paper defines what it takes to make that imaging data decision-grade.

The Imaging Gap in Real-World Evidence
Real-world evidence without the scan is structurally incomplete
The life-sciences industry has invested heavily in claims-based and EHR-based real-world evidence. But claims capture what was billed, not what was biologically observed. For the therapeutic areas where imaging is the evidentiary standard (oncology staging, neurodegenerative biomarker confirmation, cardiac phenotyping), generic datasets leave a structural gap.
This Segmed White Paper defines what fit-for-purpose imaging data actually requires, and where generic approaches fall short.

From Framework to Practice


Frequently Asked Questions (FAQ)

What does fit-for-purpose mean for real-world imaging data, and how is it different from general data quality?
Fit-for-purpose means the imaging data has been selected, validated, and linked for a specific context of use, whether that is a regulatory submission, a clinical trial, or a real-world evidence study. The distinction from general "data quality" is important: the life-sciences industry has no consensus definition of data quality, and the term means different things depending on the stakeholder's role, regulatory context, and analytic objective. From a biopharma decision-making perspective, the most actionable definition is the simplest: did the data fit the purpose it was selected for?
In practice, this means imaging data must meet five essential requirements simultaneously:
• Context-specific validation: the data has been verified for its intended use, covering technical performance, biological relevance, and clinical utility.
• Relevance and reliability: the right patients, exposures, outcomes, and covariates are present, with consistent cohort definitions applied across sites.
• Standardization balanced with real-world variability: acquisition protocols are harmonized where necessary, while preserving the diversity of scanners, institutions, and care settings that makes findings generalizable.
• Accuracy and completeness: the data reflects the measured variable without gaps that would compromise analysis.
• Regulatory framework alignment: the data aligns with structured frameworks such as the FDA-NIH BEST (Biomarkers, EndpointS, and other Tools) resource and the SPIFD (Structured Process to Identify Fit-for-Purpose Data) methodology.
A prebuilt dataset that offers scale and diversity but lacks clinical linkage or context-specific validation may be appropriate for foundation model training, but it is not fit-for-purpose for regulatory evidence, clinical trial design, or biomarker development, where the research question demands precise cohort definition and validated relevance to a defined decision.
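The "simultaneously" condition above can be sketched as a simple all-or-nothing check. This is an illustrative sketch only: the requirement keys and the function names are hypothetical labels for the five requirements listed here, not part of BEST, SPIFD, or any Segmed tooling.

```python
from typing import Dict, List

# Hypothetical keys, one per requirement in the list above.
REQUIREMENTS = (
    "context_specific_validation",
    "relevance_and_reliability",
    "standardization_with_variability",
    "accuracy_and_completeness",
    "regulatory_framework_alignment",
)


def unmet_requirements(assessment: Dict[str, bool]) -> List[str]:
    """Return every requirement that is missing or assessed as False."""
    return [r for r in REQUIREMENTS if not assessment.get(r, False)]


def is_fit_for_purpose(assessment: Dict[str, bool]) -> bool:
    """All five requirements must hold at once for a given context of use."""
    return not unmet_requirements(assessment)


# A large prebuilt dataset that was never validated for the intended use.
prebuilt = {r: True for r in REQUIREMENTS}
prebuilt["context_specific_validation"] = False

print(is_fit_for_purpose(prebuilt))   # False
print(unmet_requirements(prebuilt))   # ['context_specific_validation']
```

The assessment is made per context of use, so the same dataset could pass for one research question and fail for another.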
Why is multimodal data integration necessary for real-world imaging data to be decision-grade?
Imaging alone provides anatomical, pathological, and functional information, but without clinical context it cannot support the precise cohort construction that biopharma research requires. Consider a concrete example: analyzing lung nodules on chest CT in non-small cell lung cancer (NSCLC) patients requires confirming eligibility through linked clinical data, specific drug exposure (such as pembrolizumab), smoking history (defined pack-years), and confirmed histological diagnosis. Without this linkage, the cohort cannot be reliably defined, and imaging-only analysis lacks the clinical context required for meaningful interpretation.
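The NSCLC example above can be sketched as a patient-level eligibility filter. This is a minimal illustration under stated assumptions: the `LinkedPatient` record, its field names, and the 20 pack-year threshold are hypothetical and are not drawn from the white paper or from any Segmed schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class LinkedPatient:
    """Hypothetical patient-level record joining imaging with clinical data."""
    patient_id: str
    imaging: List[Tuple[str, str]] = field(default_factory=list)  # (modality, body region)
    histology: Optional[str] = None       # confirmed histological diagnosis, if linked
    drugs: Set[str] = field(default_factory=set)  # documented drug exposures
    pack_years: Optional[float] = None    # smoking history, if linked


def is_eligible(p: LinkedPatient, drug: str = "pembrolizumab",
                min_pack_years: float = 20.0) -> bool:
    """True only when the scan AND every linked clinical criterion are present."""
    return (
        ("CT", "chest") in p.imaging
        and p.histology == "NSCLC"
        and drug in p.drugs
        and p.pack_years is not None
        and p.pack_years >= min_pack_years  # assumed threshold, for illustration
    )


def build_cohort(patients: List[LinkedPatient]) -> List[str]:
    """Return the IDs of patients who meet all eligibility criteria."""
    return [p.patient_id for p in patients if is_eligible(p)]


patients = [
    LinkedPatient("P1", [("CT", "chest")], "NSCLC", {"pembrolizumab"}, 30.0),
    LinkedPatient("P2", [("CT", "chest")]),  # imaging only, no clinical linkage
    LinkedPatient("P3", [("CT", "chest")], "NSCLC", {"pembrolizumab"}, None),
]
print(build_cohort(patients))  # ['P1']
```

P2 and P3 both have a chest CT, yet neither can be placed in the cohort: the scan exists, but the clinical fields needed to confirm eligibility do not.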
Multimodal integration (connecting imaging studies to electronic health records, pathology, molecular biomarkers, treatment history, and outcomes at the patient level) is what makes fit-for-purpose assessment possible. Each data type independently provides a partial view; combining them yields a more complete characterization of patient health and disease phenotypes. This requirement is becoming more acute as the industry invests in foundation models, where imaging paired with clinical context (diagnosis, treatment exposure, molecular markers, and outcomes) is necessary to learn representations that generalize across populations and care settings.
Segmed's datasets are structured around this principle: imaging is the foundation, with clinical, molecular, treatment, and outcomes data linked at the patient level across 3.5 million patients, enabling longitudinal, multimodal analysis that neither imaging nor clinical data can support alone.
How does fit-for-purpose real-world imaging data support biopharma drug development and regulatory submission?
Fit-for-purpose real-world imaging data (RWiD) provides distinct evidentiary value at every stage of the biopharma lifecycle. In discovery, curated imaging cohorts enriched with clinical and molecular metadata enable characterization of disease biology and identification of biomarker candidates using real clinical phenotypes from routine care settings. In translational research, molecular imaging (PET, SPECT) interrogates target engagement, pharmacokinetics, and drug biodistribution, and real-world datasets contextualize these signals across diverse populations beyond narrow trial cohorts.
In clinical development, decision-grade RWiD informs trial design by contextualizing disease prevalence, progression patterns, and endpoint variability in routine care. It supports patient recruitment through imaging-defined eligibility screening and enables construction of external control arms where full randomization is not feasible, which is particularly relevant for rare diseases and accelerated approval pathways.
For regulatory submissions, the FDA's guidance on real-world evidence emphasizes data provenance, completeness, consistency, and alignment with a defined context of use. When curated with documented quality, multimodal real-world imaging datasets can contribute supportive evidence across safety, effectiveness, and comparative effectiveness evaluations. After approval, longitudinal imaging cohorts linked to clinical outcomes support post-market surveillance, treatment optimization, and label expansion evidence across populations more diverse than those enrolled in the original trials.
Segmed's white paper details these applications with indication-specific requirements for oncology (RECIST-standardized longitudinal imaging linked to molecular and outcomes data), neurology (amyloid and tau PET with longitudinal clinical follow-up), and cardiology (functional and structural imaging for phenotype differentiation). It also includes a published NSCLC case study in which Segmed's imaging data contributed to the discovery of a biomarker that outperformed PD-L1 expression in predicting immunotherapy outcomes (Velcheti et al., ASCO 2025).
