Liu, Quan; Deng, Ruining; Cui, Can; Yao, Tianyuan; Yang, Yuechen; Nath, Vishwesh; Li, Bingshan; Chen, You; Tang, Yucheng; Huo, Yuankai. “mTREE: Multi-Level Text-Guided Representation End-to-End Learning for Whole Slide Image Analysis.” IS&T International Symposium on Electronic Imaging Science and Technology 37, no. 12 (2025): HPCI-183. https://doi.org/10.2352/EI.2025.37.12.HPCI-183.
Researchers have developed a new way to help computers analyze large, detailed images of human tissue, such as those used to diagnose disease, by combining the image data with written medical notes.
In medicine, especially in fields like cancer diagnosis, doctors often study massive, high-resolution digital scans of tissue samples, known as whole slide images. These images are so large (they are sometimes called “gigapixel images”) that computers struggle to analyze them in one pass. Existing methods usually break an image into smaller pieces, study those, and then try to piece together the whole picture afterward. Combining that image data with written reports from pathologists in a smooth, efficient way remains difficult.
This study introduces a new method called mTREE (Multi-Level Text-Guided Representation End-to-End Learning). It uses written descriptions—like those found in pathology reports—to guide the computer in figuring out which parts of a tissue image are important. The model learns to focus on both small, detailed parts of the image and the big picture all at once, using the text to improve its understanding of both.
The written notes help in two ways: first, they guide the model to zoom in on key areas, and second, they help it blend that information into a full understanding of the image. The researchers tested their approach on tasks like predicting what kind of disease is present and estimating patient survival, and their new method outperformed other existing tools.
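For readers who think in code, the two roles of the text can be illustrated with a toy sketch. This is not the authors' implementation: the function name `text_guided_pool`, the dot-product scoring, the `top_k` cutoff, and the tiny two-dimensional vectors are all assumptions made for illustration. The sketch assumes each image piece and the report text have already been turned into numeric vectors; it then keeps the pieces that best match the text (the "zoom in" step) and averages them with text-derived weights (the "blend" step).

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def text_guided_pool(patch_embeddings, text_embedding, top_k):
    """Hypothetical helper: select patches by text similarity, then pool them."""
    # Step 1 ("zoom in"): score every patch against the text and keep the best top_k.
    scores = [dot(p, text_embedding) for p in patch_embeddings]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    selected = sorted(ranked[:top_k])

    # Step 2 ("blend"): weight the kept patches by their text scores and sum them.
    weights = softmax([scores[i] for i in selected])
    dim = len(text_embedding)
    slide_vector = [
        sum(w * patch_embeddings[i][d] for w, i in zip(weights, selected))
        for d in range(dim)
    ]
    return selected, weights, slide_vector


# Made-up toy data: four patch vectors and one text vector.
patches = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
text = [1.0, 0.0]
selected, weights, slide_vector = text_guided_pool(patches, text, top_k=2)
print(selected)  # indices of the patches most aligned with the text
```

In this toy example the first and third patches point in nearly the same direction as the text vector, so they are the ones kept and blended. In a real system both the vectors and the scoring would be learned during training rather than fixed by hand.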
This work could make it easier for doctors and researchers to get useful information from complex medical images by making better use of the text that already comes with them. The tool is publicly available for others to use and build upon.
