Srinivasan, Gokul; Le, Minh-Khang; Azher, Zarif L.; Liu, Xiaoying; Vaickus, Louis J.; Kaur, Harsimran; Kolling, Fred W.; Palisoul, Scott M.; Perreard, Laurent; Lau, Ken S.; Yao, Keluo; Levy, Joshua J. (2025). Histology-Based Virtual RNA Inference Identifies Pathways Associated With Metastasis Risk in Colorectal Cancer. Modern Pathology, 38(11), 100866. https://doi.org/10.1016/j.modpat.2025.100866
Colorectal cancer (CRC) remains a major health issue, with over 150,000 new cases and 50,000 deaths each year in the United States. The tumor microenvironment (TME)—the mix of cancer and immune cells within a tumor—plays a key role in disease progression and treatment response. However, traditional methods to study the TME lack detailed spatial information, and current spatial transcriptomics (ST) technologies are costly, slow, and difficult to reproduce.
This study introduces virtual RNA inference (VRI), a method that uses standard hematoxylin and eosin (H&E)-stained tissue images to estimate detailed molecular information similar to that from ST. Trained on data from 45 CRC patients with over 300,000 Visium spots, VRI achieved a median Spearman’s correlation of 0.546 between predicted and measured gene expression.
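The headline metric above, a median Spearman's correlation between predicted and measured expression, can be illustrated with a minimal sketch. It assumes two spots-by-genes matrices and computes one correlation per gene across spots before taking the median over genes. The function names and matrix layout are hypothetical, and the study's actual evaluation protocol (for example, how spots were held out) may differ.

```python
import numpy as np


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    values = np.asarray(values).ravel()
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)  # highest rank within each tie group
    return (last_rank - (counts - 1) / 2.0)[inverse]


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    rx = rx - rx.mean()
    ry = ry - ry.mean()
    denom = np.sqrt((rx ** 2).sum() * (ry ** 2).sum())
    # A gene with constant expression has no defined correlation.
    return float((rx * ry).sum() / denom) if denom > 0 else float("nan")


def per_gene_spearman(predicted, measured):
    """One correlation per gene (column), computed across spots (rows)."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return np.array([
        spearman(predicted[:, g], measured[:, g])
        for g in range(predicted.shape[1])
    ])


def median_spearman(predicted, measured):
    """Median of the per-gene correlations, ignoring undefined (NaN) genes."""
    return float(np.nanmedian(per_gene_spearman(predicted, measured)))
```

The same per-gene values could also be sorted to rank genes by how well histology predicts them.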
VRI accurately identified region-specific gene signatures and spatial cell-type patterns that matched those from direct ST, and in a larger patient group, these signatures were linked to clinical outcomes such as metastasis. While some tumor pathways are not fully captured by histology alone, VRI can reveal a wide range of biological signals at near-cellular resolution without expensive ST profiling, supporting broader molecular studies in colorectal cancer.

Figure 1
Overview of the virtual RNA inference (VRI) approach. (A) For model training and validation, paired spatial transcriptomics (ST) data and hematoxylin and eosin whole-slide images were collected from 45 primary tumors (development cohort) from patients diagnosed with colorectal cancer (CRC). (B) Image patches were extracted, centered around each Visium spot. (C) Neural networks were trained to infer expression for a panel of genes at that spot. (D) Performance of VRI for each gene is reported as the correlation between VRI-inferred ST and Visium-ST. Genes were ranked by predictive performance, and pathway analyses were conducted on top-performing genes to identify “histology-associated” pathways. (E) VRI was applied to unseen tissue slides (not profiled with ST) from our expanded cohort. (F) VRI-inferred ST was used to associate with and predict spatial cell types, pathway activity, and histologies for both the development and expanded cohorts. (G) VRI-inferred ST was aggregated within specific tissue regions (eg, tumor interface) across patients within the expanded cohort to identify metastasis-related pathways through differential expression analysis. ECM, extracellular matrix; EMT, epithelial-mesenchymal transition; FC, fold-change expression.
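The patch extraction step in panel B can be sketched as follows. This is a minimal illustration, not the study's pipeline: it assumes the slide is already loaded as a NumPy array and that spot centers are given as (row, column) pixel coordinates. The function name, the 224-pixel default patch size, and the zero padding at slide edges are all assumptions for illustration.

```python
import numpy as np


def extract_patches(image, spot_centers, patch_size=224):
    """Cut a square patch centered on each spot.

    image: array of shape (H, W) or (H, W, C).
    spot_centers: iterable of (row, col) pixel coordinates.
    patch_size: side length in pixels (224 is an assumed value).
    """
    half = patch_size // 2
    # Zero-pad so that spots near the slide border still yield full patches.
    pad = ((half, patch_size - half), (half, patch_size - half))
    pad += ((0, 0),) * (image.ndim - 2)
    padded = np.pad(image, pad, mode="constant")

    patches = []
    for row, col in spot_centers:
        r, c = int(round(row)), int(round(col))
        # After padding, original pixel (r, c) sits at (r + half, c + half),
        # so a window starting at (r, c) places the spot at index `half`.
        patches.append(padded[r:r + patch_size, c:c + patch_size])
    return np.stack(patches)
```

Each returned patch places its spot at index `patch_size // 2` along both axes, so patches can be paired one-to-one with the expression vector measured at that spot.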