Big Healthcare Data Learning (DSI-SRP)
This DSI-SRP fellowship funded Yuzhe Lu to work in the laboratory of Professor Yuankai Huo in the Department of Electrical Engineering and Computer Science during the summer of 2020. Yuzhe anticipates graduating in May 2022 with majors in Math, Computer Science, and Medicine, Health, and Society.
The project funded by this fellowship proposed CircleMix, a novel data augmentation method that is tailored to biomedical objects and improves the classification of glomerulosclerosis. The resulting paper, “Improve Global Glomerulosclerosis Classification with Imbalanced Data using CircleMix Augmentation,” was accepted at SPIE 2021 for oral presentation. The data Lu worked with included more than 8,000 glomeruli images with labels stored in XML format, which he had to process at scale. He built the sclerosis classifier with a deep convolutional neural network. Because the medical dataset is highly imbalanced, he learned to use resampling techniques and the loss function during model training to counter this inherent bias. When running the experiments, Lu applied rigorous methodologies such as cross-validation and explored metrics such as the F1 score to evaluate the algorithm's performance. Of all the DSI-SRP programming, Lu found the weekly update the most interesting, because it gave him a chance not only to share his progress but also to learn how other fellows were applying their data science skills to interesting projects.
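To illustrate why imbalance matters for both the loss function and the choice of metric, here is a minimal Python sketch. It is not the project's code: the 90/10 toy labels, the inverse-frequency weighting scheme, and the helper names `inverse_frequency_weights` and `f1_score` are assumptions chosen for illustration, since the text above does not say which resampling or loss adjustments were actually used.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get larger weights, which is one common way to
    rebalance a loss function (an assumed scheme, for illustration).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def f1_score(y_true, y_pred, positive=1):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up toy labels: 0 = majority class, 1 = minority class.
y_true = [0] * 90 + [1] * 10

# A trivial classifier that always predicts the majority class.
always_majority = [0] * 100

weights = inverse_frequency_weights(y_true)
accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)
minority_f1 = f1_score(y_true, always_majority, positive=1)

print("class weights:", weights)
print("accuracy:", accuracy)
print("minority-class F1:", minority_f1)
```

On this toy data the always-majority classifier scores high on accuracy while its F1 score on the minority class is zero, which is the kind of gap that makes F1 a more informative metric than accuracy when classes are imbalanced.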
In addition to receiving support through a DSI-SRP fellowship, this project was supported and facilitated by the DSI Data Science Team through their regular summer workshops and demo sessions.