An evaluation of image-based and statistical techniques for harmonizing brain volume measurements

Lu, Yuanchiao, Zuo, Lianrui, Chou, Yiyu, Dewey, Blake E., Remedios, Samuel W., Shinohara, Russell Takeshi, Steele, Sonya Ulrike, Nair, Govind, Reich, Daniel S., & Prince, Jerry L. (2025). “An evaluation of image-based and statistical techniques for harmonizing brain volume measurements.” Imaging Neuroscience, 3, IMAG.a.73. https://doi.org/10.1162/IMAG.a.73

Measuring brain volumes from MRI scans can be tricky because differences in scanner hardware, software, and acquisition settings introduce inconsistencies between measurements. In recent years, researchers have developed “harmonization” methods to correct for these differences. This study compares three such methods: neuroCombat (a statistical correction tool), DeepHarmony (a supervised deep learning method), and HACA3 (an unsupervised deep learning approach). The study tested how well these methods produce consistent brain volume measurements across two types of MRI scans (GRE and MPRAGE) and how accurately they detect simulated brain shrinkage (atrophy).

All three methods improved the consistency of brain volume measurements compared to uncorrected scans. Among them, HACA3 performed the best, showing the smallest measurement differences across all brain regions (<3%) and the highest agreement between GRE and MPRAGE scans. HACA3 also had the highest reliability across regions. In tests simulating atrophy, HACA3 most accurately preserved unchanged brain regions, DeepHarmony improved results in several regions, and neuroCombat showed more variability. Notably, neuroCombat could detect hippocampal atrophy only when trained on sample data, highlighting a limitation when training data are unavailable.
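
To make the "measurement difference" figure concrete, here is a minimal sketch of one way a percent volume difference between paired GRE and MPRAGE measurements could be computed. This summary does not state the study's exact formula, so the mean-normalized definition, the function name `abs_percent_diff`, and the volumes below are all illustrative assumptions, not values or code from the paper.

```python
def abs_percent_diff(vol_a, vol_b):
    """Absolute difference between two volumes as a percent of their mean.

    Assumed definition for illustration; the study may normalize differently.
    """
    mean = (vol_a + vol_b) / 2.0
    if mean == 0:
        raise ValueError("volumes must not both be zero")
    return 100.0 * abs(vol_a - vol_b) / mean


# Toy volumes in mm^3 for one region in three subjects (made up, NOT study data).
gre_volumes = [4000.0, 4200.0, 3900.0]
mprage_volumes = [4100.0, 4250.0, 3950.0]

# One percent difference per subject, then the average across subjects.
diffs = [abs_percent_diff(g, m) for g, m in zip(gre_volumes, mprage_volumes)]
mean_diff = sum(diffs) / len(diffs)

print(f"mean absolute percent difference: {mean_diff:.2f}%")
```

Under this assumed definition, a harmonization method that works well would push the per-region mean difference toward zero, since the same brain is being measured in both scan types.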

Overall, HACA3 was the most effective method for harmonizing MRI scans, followed by DeepHarmony, with neuroCombat showing improvements over uncorrected scans but more variability.

Fig. 1. Harmonization procedures using the neuroCombat, DeepHarmony, and HACA3 methods.
