Smith, David S., Ramadass, Karthik, Jones, Laura M., Morse, Jennifer L., Fabbri, Daniel V., Coco, Joseph R., Bao, Shunxing, Basford, Melissa A., Embi, Peter J., & Omary, Reed A. (2025). Secondary use of radiological imaging data: Vanderbilt’s ImageVU approach. Journal of Biomedical Informatics, 170, 104905. https://doi.org/10.1016/j.jbi.2025.104905
This study describes the development of ImageVU, a scalable research imaging system that integrates clinical imaging data with metadata-driven cohort discovery, enabling secure, efficient, and compliant access to imaging for secondary and opportunistic research use. ImageVU was designed to support the reuse of radiological imaging data through a dedicated research imaging store. The system includes four interconnected components: a Research PACS, an Ad Hoc Backfill Host, a Cloud Storage System, and a De-Identification System. Imaging metadata are extracted and stored in both the Research Derivative (RD), an identified clinical data repository, and the Synthetic Derivative (SD), a de-identified research data repository, with access provided through the RD Discover web portal. Researchers interact with the system through structured metadata queries and multiple data delivery options, including web-based viewing, bulk downloads, and preparation of datasets for high-performance computing environments. The integration of metadata-driven search has streamlined cohort discovery and improved imaging data accessibility. As of December 2024, ImageVU has processed 12.9 million MRI and CT series from 1.36 million studies across 453,403 patients. The system has supported 75 project requests, delivering more than 50 TB of imaging data to 55 investigators, resulting in 66 published research papers. Overall, ImageVU demonstrates a scalable and efficient approach for integrating clinical imaging into research workflows. By combining institutional data infrastructure with cloud-based storage and metadata-driven cohort identification, the platform enables secure and compliant access to imaging for translational research.
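
The metadata-driven cohort discovery described above can be illustrated with a minimal sketch. It is not the RD Discover implementation: the `SeriesRecord` fields, the `find_cohort` helper, and the sample rows are all hypothetical, chosen only to show how filtering series-level metadata yields series, study, and patient counts for a candidate cohort.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeriesRecord:
    """One row of series-level imaging metadata (hypothetical schema)."""
    patient_id: str
    study_uid: str
    series_uid: str
    modality: str
    body_part: str


def find_cohort(records, modality, body_part):
    """Count the series, studies, and patients matching a metadata filter."""
    matches = [
        r for r in records
        if r.modality == modality and r.body_part == body_part
    ]
    return {
        "series": len(matches),
        "studies": len({r.study_uid for r in matches}),
        "patients": len({r.patient_id for r in matches}),
    }


# Illustrative rows only; not real ImageVU data.
RECORDS = [
    SeriesRecord("P1", "ST1", "SE1", "CT", "CHEST"),
    SeriesRecord("P1", "ST1", "SE2", "CT", "CHEST"),
    SeriesRecord("P1", "ST2", "SE3", "MR", "BRAIN"),
    SeriesRecord("P2", "ST3", "SE4", "CT", "CHEST"),
    SeriesRecord("P2", "ST4", "SE5", "CT", "ABDOMEN"),
]

chest_ct = find_cohort(RECORDS, modality="CT", body_part="CHEST")
```

Because the counts come from metadata alone, a query of this kind can size a cohort before any imaging files are retrieved or de-identified.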

Fig. 1. Host and network architecture of the ImageVU infrastructure service. The clinical radiology PACS feeds downstream research servers that do not have permission to send studies to the clinical PACS. A primary nightly stream of all CT, MR, and PET studies feeds the research PACS, while missing modalities and studies that predate the nightly copy are backfilled on an ad hoc basis by a separate server that can send DICOM C-FIND queries to the clinical PACS and retrieve studies via DICOM C-MOVE. Imaging files are stored in Google Cloud Platform (GCP) in their identified form. Clinical metadata are archived in ImageVU Campus Storage, which maintains ImageVU RD (identified metadata) and ImageVU SD (de-identified metadata) for further cloud-based indexing. When researchers request access, the ImageVU Curation Host retrieves the requested imaging from ImageVU Cloud Storage and de-identifies it (RSNA CTP, custom filters, and manual review) before releasing the data for research use. The internal research servers coordinate storage, preparation, and delivery of research studies. Researchers use RD Discover to query metadata before retrieving imaging data through web-based viewers, bulk downloads, or high-performance computing (HPC) processing.
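
The split between the nightly stream and the ad hoc backfill path in Fig. 1 amounts to a simple routing rule, sketched below under stated assumptions. The start date of the nightly copy is not given in the source, so `NIGHTLY_START` is a placeholder, and `needs_backfill` is a hypothetical helper rather than part of ImageVU. The modality set uses DICOM Modality codes, in which PET is written `PT`.

```python
from datetime import date

# Modalities carried by the nightly stream per Fig. 1 (CT, MR, PET).
# DICOM encodes PET as "PT" in the Modality attribute (0008,0060).
NIGHTLY_MODALITIES = frozenset({"CT", "MR", "PT"})

# Placeholder only: the source does not state when the nightly copy began.
NIGHTLY_START = date(2020, 1, 1)


def needs_backfill(modality, study_date, nightly_start=NIGHTLY_START):
    """Return True if a study must come through the ad hoc backfill host.

    A study needs backfill when its modality is outside the nightly stream
    or when it was acquired before the nightly copy started.
    """
    return modality not in NIGHTLY_MODALITIES or study_date < nightly_start
```

For example, `needs_backfill("US", date(2022, 6, 1))` is true because ultrasound is outside the nightly stream, while a CT study acquired after the placeholder start date would already be in the research PACS.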