
Graduate Student Uncovers New Galactose Assimilation Pathway

Posted on Friday, April 26, 2024 in featured.

By: Andy Flick, Evolutionary Studies Scientific Coordinator

Marie-Claire Harrison sitting on a couch smiling

Not to be outdone by her significant contributions to a study that appeared in the April 25 issue of Science, Marie-Claire Harrison, a graduate student in the Rokas Lab, published a first-author article the very next day in the journal Proceedings of the National Academy of Sciences, titled “Machine learning enables identification of an alternative yeast galactose utilization pathway.”

Harrison used machine learning – a subfield of artificial intelligence that develops predictive models that learn from data – to discover that a group of ancient yeasts has evolved a novel, alternative pathway to metabolize galactose, a food source. The broader significance of Harrison’s work is that machine learning may be widely applicable for understanding the genetic basis of trait variation.

Harrison created a model using environmental, metabolic, and genomic data to predict whether each of nearly 1,000 yeasts could grow on more than twenty carbon sources. After some fine-tuning, she dove deeper into galactose utilization. The initial testing indicated that an alternative pathway for using galactose existed in yeasts that lack the canonical, so-called GAL gene pathway. Harrison then teamed up with Emily Ubbelohde, a graduate student in Chris Todd Hittinger’s lab at the University of Wisconsin-Madison. In the lab, Ubbelohde grew the yeast species that lack the canonical GAL pathway on galactose and found that a second pathway did exist, just as the model had predicted.

When asked about her favorite part of this work, Harrison noted, “it was so cool to be able to generate a hypothesis while testing out a method that was new to me and for the Hittinger lab to be able to test out that hypothesis by confirming the alternative galactose-degrading pathway in several species. I also loved being able to apply the method to 28 other carbon substrates to see how predictable metabolic pathways generally are in Saccharomycotina yeasts when a machine learning model is trained on metabolic, genomic and environmental datasets—there’s more variation in accuracy than I would have previously expected!”

Harrison, who received the 2023 Brighter Ventures Student Award that supports graduate students interested in the application of artificial intelligence in the biomedical research field, found her way to the Rokas Lab by pursuing a wide range of biological topics in college.

“I worked in a lab that studied C. elegans genetics and eggshell structure but found myself drawn to more computational methods after taking a ‘Python for Biologists’ class my senior year,” she said. “I started the Biological Sciences Ph.D. program at Vanderbilt in 2019 with an interest in microbiology and computational biology. I chose to rotate in the Rokas Lab to gain computational skills, with Dr. Abigail LaBella as my mentor, but ended up falling in love with the lab and my project at the time, which looked at the evolution of regulation of galactose metabolism in Saccharomycotina yeasts. I also enjoyed that my work in this lab encompasses a lot of my former interests from undergrad, from ecology to biochemistry.”

Up next, Harrison is continuing to pursue machine learning. She said, “I’ve been expanding my machine learning models to include gene sequence data as well as just predicted gene presence/absence, and I’m currently working on a project where I’m trying to predict antifungal drug resistance in yeasts, which is a major medical challenge, with the great dataset from the Y1000+ project.”

Citation: Harrison, M.C., Ubbelohde, E.J., LaBella, A.L., Opulente, D.A., Wolters, J.F., Zhou, X., Shen, X.X., Groenewald, M., Hittinger, C.T. and Rokas, A. 2024. Machine learning enables identification of an alternative yeast galactose utilization pathway. Proceedings of the National Academy of Sciences. 121 (18) e2315314121.

Funding Source: This work was performed using resources contained within the Advanced Computing Center for Research and Education at Vanderbilt University in Nashville, TN. This work was funded by grants from: the NSF (LR23C140001, DEB-2110403, DEB-2110404), the Fundamental Research Funds for the Central Universities (226-2023-00021), the key research project of Zhejiang Lab (2021PE0AC04), the USDA National Institute of Food and Agriculture (Hatch Projects 1020204 and 7005101), the DOE Great Lakes Bioenergy Research Center (DOE BER Office of Science DE-SC0018409), an H.I. Romnes Faculty Fellowship (Office of the Vice Chancellor for Research and Graduate Education with funding from the Wisconsin Alumni Research Foundation), the NIH/National Institute of Allergy and Infectious Diseases (R01 AI153356), and the Burroughs Wellcome Fund.
