Bridging the Language Gap: Unlocking the Potential of Generative AI in Hindi Instruction

The Data Science Institute (DSI) at Vanderbilt University, in collaboration with Dr. Elliott McCarter, a Senior Lecturer in Asian Studies at Vanderbilt, is conducting a research project as part of the Data Science for Social Good summer program. The team hopes to narrow the language acquisition gap by using generative AI, specifically ChatGPT.

The project focuses on developing learning materials for teaching Hindi as a second language. One part helps instructors create vocabulary sets tailored to specific topics. Another part is a tool that lets users generate vocabulary sets from documents or articles, and that is also intended to help them understand the grammar concepts needed to comprehend Hindi documents. By leveraging generative AI, the project aims to give learners access to data-supported instructional materials that improve their language proficiency.
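To make the idea of generating a vocabulary set from a document concrete, here is a minimal sketch in Python. It is not the project's tool and does not use ChatGPT; it simply counts word frequencies in a Hindi text and drops common function words. The function name `build_vocabulary`, the tiny stopword list, and the sample sentence are all assumptions made for illustration.

```python
import re
import unicodedata
from collections import Counter

# Devanagari letters and combining signs. The danda (U+0964), double danda
# (U+0965), and Devanagari digits (U+0966-U+096F) are deliberately left out.
DEVANAGARI_WORD = re.compile(r"[\u0900-\u0963\u0971-\u097F]+")

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"है", "हैं", "की", "का", "के", "एक", "में", "और", "से", "को"}


def build_vocabulary(text, top_n=10):
    """Return the top_n most frequent non-stopword Hindi words with counts."""
    # Normalize so that visually identical words compare as equal.
    normalized = unicodedata.normalize("NFC", text)
    stopwords = {unicodedata.normalize("NFC", w) for w in STOPWORDS}
    words = DEVANAGARI_WORD.findall(normalized)
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(top_n)


# Illustrative sample text, not taken from the project's corpus.
sample = "भारत एक बड़ा देश है। भारत की राजधानी दिल्ली है।"
vocabulary = build_vocabulary(sample, top_n=3)
print(vocabulary)
```

A frequency count like this is only a starting point. Hindi words inflect for gender, number, and case, so a practical tool would likely need to group inflected forms under one headword before presenting a vocabulary list to learners.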

This initiative serves as a proof of concept, laying the foundation for future projects in linguistics and in other low-resource languages. By applying generative AI techniques, the team aspires to provide a model for supporting instruction in less commonly taught languages. To support this research, the team has acquired a corpus of 10,000 Hindi newspaper articles. This corpus serves as the basis for developing learning materials that help students learn the language and understand these articles.

Through the collaboration between the DSI and Dr. McCarter and the use of generative AI techniques, this project strives to advance language instruction, making it more accessible and impactful for learners of Hindi and other languages.