Masters in Data Science

Curriculum

  1. Computation, which focuses on programming, data structures, computer systems, and methods.
  2. Data Analysis, which focuses on data exploration, analysis, prediction, inference, and algorithms.
  3. Practice, which aims to impart workplace skills, ethical standards, and awareness of the current state of data science.

The Vanderbilt Master of Science in Data Science is an in-person, 9-month, 4-mod, 30-credit program that includes the completion and presentation of a capstone project. Students are trained in the three core sequences, gain practical experience, and sharpen workplace skills (teamwork, communication, leadership). Below is an example of course sequencing.

Plan of Study 2026 - 2027

Capstone Project Assignments 

DS 5210 Intro to Programming (1cr)

DS 5110 Intro to Probability and Statistics (1cr)

DS 5310 Mathematics of Machine Learning (1cr)

*Mod 0 courses can be waived based on the results of assessments taken over the summer. These courses do not increase your overall tuition cost.

What is the Mod System?

Vanderbilt’s M.S. in Data Science runs on a four-module (“Mod”) calendar: short, intensive terms that let you master one cluster of skills before moving to the next. It’s fast, focused, and built for momentum.

Course Descriptions

  • DS 6110 Probability & Statistical Inference

    This course covers the fundamentals of probability theory and statistical inference. Topics in probability include random variables, distributions, expectations, the central limit theorem, and the law of large numbers. Topics in inference include maximum likelihood, point estimation, hypothesis and significance testing, and re-sampling techniques. Complex mathematical proofs will be illustrated with computational solutions.

    Mod 1, [2cr]

  • DS 6210 Principles in Programming and Simulation

    Students learn the foundations of effective software design and programming practice, including key concepts such as classes, recursion, packages, modules, and vectorized computation (with examples using PyTorch). The course also emphasizes practical skills such as web scraping, model deployment with Docker and AWS, and building interactive data science web applications using Streamlit. Students gain experience with workflow and collaboration tools including Jupyter Notebooks, Markdown, GitHub, and version control. Reproducible programming and data processing methods are emphasized throughout, helping students build strong computational foundations and preparing them for advanced coursework in machine learning and deep learning.

    Mod 1, [2cr]

  • DS 6410 Business Foundations for Data Scientists

    This course introduces the business foundations necessary to design and execute impactful data science projects. Students learn to define project scope, formulate problem statements, and connect analytical work to organizational objectives. Emphasis is placed on understanding core business domains—marketing, operations, finance, accounting, and management—and how data science contributes to decision-making within each. Through structured analysis and applied exercises, students practice translating business questions into data-driven approaches and communicating insights clearly to diverse audiences.

    Mod 1, [2cr]

  • DS 6120 Exploratory Data Analysis

    This course will teach students how to explore, summarize, and graph data (big and small). Topics include principles of perception, how to display data, scatterplots, histograms, boxplots, bar charts, dynamite plots, proper data summaries, and dimensionality reduction and unsupervised learning methods such as multidimensional scaling, principal component analysis, k-means clustering, and nearest neighbor algorithms.

    Mod 1, [2cr]

  • DS 6420 Data Governance and Ethical Leadership in Data Science

    This course examines the frameworks, principles, and practices that ensure data is managed responsibly and effectively across its lifecycle, as well as current and emerging themes in AI and data ethics. Students will explore topics including data quality, stewardship, privacy, security, and ethical analysis, as well as the organizational policies and regulatory environments that shape responsible data and AI governance. Emphasis is placed on ethical decision-making, transparency, and accountability in the use of data and AI systems. Through case studies and applied exercises, students will learn how to identify and discuss key ethical issues in data science, and to design and implement governance strategies that balance innovation, compliance, and societal impact.

    Mod 2, [2cr]

  • DS 6310 Machine Learning

    This course provides a comprehensive introduction to statistical modeling and machine learning, serving as the first in a two-course sequence focused on both foundational and advanced predictive modeling techniques. It emphasizes key concepts such as prediction and calibration, model evaluation, cross-validation, and the bias–variance trade-off, while covering methods including regression (linear, logistic, and regularized), gradient descent optimization, decision trees, ensemble learning, clustering techniques, and dimensionality reduction using PCA. Students will apply these approaches to real-world problems such as customer segmentation, churn prediction, and fraud detection, gaining both theoretical understanding and practical experience in modern machine learning.

    Mod 2, [2cr]

  • DS 6220 Data Management Systems

    This course covers database management systems, e.g., relational databases, data architecture, and security. Topics include entity-relationship models and relational theory, storage and access of data, and complex SQL queries. Students are exposed to database architectures as time allows.

    Mod 2, [2cr]

  • DS 6320 Data Science Algorithms

    This course explores some of the algorithmic aspects central to data science (such as optimization, learning theory, backpropagation, and Markov chains) from a bottom-up approach. Students will practice implementing these methods using NumPy, Python, and the Jupyter notebook platform. By the end of the course, students should understand and be able to implement the foundational algorithms of data science from the bottom up, be able to explain to stakeholders how certain algorithms work and suggest methods for a given problem, and anticipate how a change in parameters changes an algorithm’s outcome. Additionally, students will gain the skills to interpret published algorithms.

    Mod 3, [2cr]

  • DS 6230 Big Data Scaling

    This course will address key challenges that arise when working with big data and parallel processing. Practical techniques for storing, retrieving, and scaling are discussed. Topics include high-performance computing, parallel processing, commercial cloud architectures, and mapping of data science algorithms onto scalable computing platforms.

    Mod 3, [2cr]

  • DS 6430 Data Science Teamwork in Practice

    Students will work in teams and learn to use team collaboration technology to tackle real-world data science problems. In a first end-to-end project, teams apply their skills in a supervised environment where active learning is reinforced and learn to make practical decisions. In a second project, proposed by a partnering client, students practice the soft skills needed to contribute successfully to team projects and learn how to participate in and support teams as the primary data curator and data analyst. Throughout, students gain practical experience with teamwork tools and commonly used data science technologies.

    Mod 3, [2cr]

  • DS 6340 Generative AI Models in Theory and Practice

    Transformer models are finding wide application (NLP, audio analysis or “textless NLP”, computer vision, and more) and are achieving state-of-the-art performance across multiple tasks. In this course we will discuss the theoretical underpinnings of transformers, cover the skills and tools needed to use them, and gain hands-on experience. Students will be assigned two papers to present over the semester, complete self-guided training using the Huggingface.co training material, and apply a transformer-based model to solve a research problem.

    Mod 3, [2cr]

  • DS 6330 Deep Learning

    Deep Learning expands upon topics learned in Machine Learning and Data Science Algorithms, focusing on applying advanced machine learning techniques. Students will study modern deep learning techniques such as neural networks, convolutional/recurrent neural networks, and reinforcement learning. The course will focus on applications of these models and how to use them in the real world.

    Mod 4, [2cr]

  • DS 6440 Data Storytelling and Communication

    In this course, students will learn how to effectively communicate data insights to stakeholders through written, verbal, and visual means. The course integrates two key skills: communicating with data and data visualization. Students will explore different storytelling techniques and learn how to design effective visualizations that are both aesthetically pleasing and informative. By the end of the course, students will have the skills they need to become effective data storytellers and communicators, able to craft compelling narratives that convey complex data insights in a way that is both informative and engaging for a wide range of audiences.

    Mod 4, [2cr]

  • DS 6670 Capstone Development

    A yearlong, team-based experiential project in which students apply data science methods to a substantive, real-world problem. Projects are sponsored by industry partners, Vanderbilt faculty, or professional contacts identified by students. Over the course of the academic year, students define the problem and scope, secure and manage data and required permissions, develop analytical or product solutions, and deliver results to stakeholders. The capstone is integrated across the curriculum and culminates in formal written and oral presentations. Emphasis is placed on project management, professional communication, and the synthesis of technical and domain knowledge.

    Mod 4, [0cr]

Elective Course Descriptions

The following pre-approved electives are some of the courses you may take to fulfill your elective credit requirements. Many of these courses have prerequisites or require instructor approval, so please contact the individual department before enrolling. You may also complete an internship or a research practicum for credit, which will count toward your elective credits.

  • DS 6360 Context-Augmented Generative AI Applications

    This course explores techniques for leveraging generative AI to build real-world LLM-based solutions. Students will gain a deep understanding of foundational AI approaches, including the varieties of retrieval augmented generation, agent-based systems, and model fine-tuning. Through practical implementation of various AI solutions, students will learn to evaluate and select the most appropriate methods to enhance system performance based on specific client demands. By the end of the course, students will be equipped to effectively apply generative AI in practical settings, with an emphasis on operationalizing and monitoring LLM applications requiring proprietary data.

    [2cr]

  • DS 6240 NoSQL for Modern Data Science Applications

    This course introduces students to current and emerging practices for handling unstructured and semi-structured data. Many modern data science applications are highly data intensive, involve heavy read/write workloads, and often work with unstructured data, which requires storage and processing beyond relational databases and their management methodologies. NoSQL (Not-Only SQL) databases are not schema-oriented and provide additional capabilities that support these types of applications. This course will introduce NoSQL systems such as MongoDB (document), Redis (key-value), and Neo4j (graph) databases. Students will gain hands-on experience deploying these databases using Docker and will learn the data modeling, querying, and integration techniques essential for modern analytics and data-driven applications.

    [2cr]

  • DS 6130 Time Series Analysis

    Time series analysis uses patterns of past performance to make informed predictions about the future. This course breaks down the fundamentals of time series analysis up to multivariate forecasting. It focuses on foundational models (exponential smoothing, ARIMA and variants, vector autoregression including error-correction) to build knowledge for more complex time series modeling.

    [2cr]

  • DS 6350 Natural Language Processing

    This course focuses on using computers to automatically analyze language data using word embedding models. The goal of the course is to provide students with the background and computing skills necessary to independently analyze and assess language in naturally occurring text. The course is broken into three parts, with the first part focusing on data acquisition, cleaning, provisioning, and documentation. The second part of the course focuses on early word embedding approaches to capture meaning. This will include identifying key words and terms using TF-IDF, topic modeling, and identifying semantic relationships between words using high-dimensional vectors. The final part of the class introduces transformer-based approaches for language modeling, including pre-trained transformers, fine-tuned transformers, sentence transformers, and generative models. The course will be taught using the Python programming language and appropriate Python packages (e.g., Pandas, spaCy, PyTorch, Gensim, Transformers). The course will be practice-based, and the course assignments will require students to analyze language data appropriate to their field of interest, discuss the data from an analytical language perspective, and model and interpret outcomes.

    [2cr]

  • DS 6520 Leveraging NLP in Asset Management

    This is a specialized, project-driven course that dives into the intersection of finance and natural language processing (NLP). It is designed for students who are passionate about applying cutting-edge machine learning techniques to real-world applications in finance. It first introduces foundational NLP concepts such as text processing, word representations, and model evaluation. Students will then engage in hands-on projects tailored to the unique needs of asset management. These projects, in collaboration with industry experts from AB, include harnessing the power of Large Language Models (LLMs) for investment prompts, mastering the art of financial document summarization, exploring classification challenges in finance, and delving into question answering and speech recognition in a financial context. Prerequisite: Completion of the Transformer course. Ideal for those seeking to elevate their financial prowess with AI-driven insights.

    [2cr]

  • DS 6620 Data Science Internship

    A supervised, for-credit internship in which students apply data science knowledge and skills in a professional setting external to Vanderbilt. Internships may be full-time or part-time and may occur during the summer or academic year. While engaged in their internship duties, students complete weekly reflective journal entries through the course portal and submit a final report synthesizing their technical, professional, and experiential learnings. Emphasis is placed on the practical application of classroom concepts, development of professional competencies, and effective communication of lessons gained from the internship experience.

  • DS 6610 Data Science Practicum

    A supervised research experience conducted under the direction of a Vanderbilt or VUMC faculty member. Students engage in data-intensive research activities—such as analysis, modeling, or methodological development—within an established lab or research group. The faculty supervisor oversees the student’s work, expectations, and any deliverables produced as part of the research experience. Prior approval from the supervising faculty member is required. The practicum provides an opportunity to apply data science concepts in an academic research environment while developing technical and professional skills.

    [2cr]

Electives Students Have Taken in the Past

  • PSY 8551. Bayesian Cognitive Modeling

    Offered in Fall.

  • PSY 6775. Models of Human Memory

    Mathematical and computational models of the cognitive processes underlying human memory. Attribute-based models, instance theories, neural network models, retrieved-context models, executive function and working memory models. Methods of fitting models to empirical data.

    Offered in Fall, alternate years. [3]

  • MGT 6509. Health Care Data Analysis

    This course will focus on the key managerial questions in the health care industry, the unique institutional data that is available, and how to develop models to address these questions. Topics will include benchmarking financial, operational, and clinical performance at both the organizational and market levels. Students will be required to develop a basic familiarity with SAS programming.

    [2]

  • NSC 5270. Computational Neuroscience

    Theoretical, mathematical, and simulation models of neurons, neural networks, or brain systems. Computational approaches to analyzing and understanding data such as neurophysiological, electrophysiological, or brain imaging. Demonstrations simulating neural models. No credit for students who have earned credit for 3270.

    Offered in Spring. [3]

  • MGT 6461. Consumer Insight for Marketing Decision Making

    This course is designed to provide an overview of marketing research that yields consumer insights for use in effective marketing decision making. The course emphasizes two things that are very relevant for a marketing manager: 1) how to evaluate the design of research studies to assess whether the results are valid and meaningful, and 2) how to analyze and interpret market research data for marketing decision making. Towards this end, we will examine a variety of marketing research techniques, including focus groups, projective techniques, depth interviews, observation, ethnography, and survey design. This course will provide students with “hands-on” experience with these various marketing research techniques, through case discussions and group projects.

    Mod 2, offered second half of Fall semester.

  • MGT 6463. Quantitative Methods for Marketing Decision Making

    The broad objective of this course is to provide a fundamental understanding of the quantitative marketing research methods employed by well-managed firms. The course is aimed at the manager who is the ultimate user of the research and is thus responsible for determining the scope and direction of research conducted. In the course, we will cover different types of research designs, techniques of data collection, and data analysis. Emphasis will be on the interpretation and use of results rather than on mathematical derivations. The course focuses on helping managers recognize the role of systematic information gathering and analysis in making marketing decisions, in addition to developing an appreciation for the potential contributions and limitations of marketing research data.

    Mod 3, offered first half of Spring semester. [2]

  • MGT 6465. Marketing Analytics

    Marketing decisions are primarily the purview of CEOs, CMOs, consultants, and marketing managers, but, increasingly, marketing has permeated throughout companies such that all managers must consider their customers. Marketing decisions are optimal when they are fact based, and marketing models are informed by both data and judgment. Models will be studied, created, and tested for all elements of marketing: clustering customers into segments, forecasting market sizes, customer relationship management database systems, diffusion rates for new products, advertising budgeting, pricing models, etc.

    Mod 4, offered second half of Spring semester. [2]

  • MGT 6331. Managerial Finance

    This class provides the framework for analyzing the various components needed to value real assets, as well as an introduction to the valuation of financial assets. Topics include the time value of money, capital budgeting, measuring risk in financial markets, market efficiency and an introduction to options.

    Mod 1, offered first half of Fall semester. [2]

  • MGT 6431. Corporate Valuation

    This course focuses on providing students with a strong theoretical and applied understanding of the key tools used in equity valuation and stock selection. Approaches to valuation include dividend discount models, cash flow models, and valuation by multiples. Financial statement data are used in developing cash flow forecasts, and market data are used in estimating the cost of capital. The effects of firm financing policy, corporate taxes, and potential investment options are given special consideration. Applications include capital budgeting, the evaluation of potential mergers and acquisitions, and corporate restructuring. The objective of the course is to show how to manage companies to add value.

    Mod 1, offered first half of Fall semester; Mod 2, offered second half of Fall semester; and Mod 3, offered first half of Spring semester. [2]

    Pre-requisite: MGT 6331

  • MGT 6430. Investments

    Studies solutions to fundamental problems faced by individual and institutional investors. First, we cover a number of topics in fixed income markets including the different ways of computing bond yields, forecasts of interest rates using the yield curve, and duration and convexity as measures of bond risk. Second, we solve the asset allocation problem to determine an optimal portfolio mix. We review the relevant theory, use an advanced spreadsheet to find an answer, and discuss issues faced by portfolio managers. Third, we use two methods to value options, the Black-Scholes formula and the binomial tree, and show how investors can use options to customize their risk-reward profile. This course is equivalent to MGT 6404 so it is not available for MSF students.

    Mod 1, offered first half of Fall semester; and Mod 3, offered first half of Spring semester. [2]

    Pre-requisite: MGT 6331

  • MGT 6420. The Future of Energy Markets in a Low Carbon Economy

    Every company, regardless of size or industry, relies upon energy and creates carbon emissions in a variety of ways. Increasing demand for energy from emerging economies coupled with the concern over climate change is rapidly changing the nature of energy supply and demand. Companies are increasingly turning to energy conservation, energy efficiency and renewable energy to both create new business opportunities as well as reduce the risks of increasing costs or disruptions in supply. This course will focus on this critical sector of the economy and examine how leading businesses are acting to address this topic.

    Mod 2, offered second half of Fall semester. [2]

  • MGT 6428. Social Enterprise and Entrepreneurship

    Social Enterprise & Entrepreneurship will explore the spectrum of activity in the growing social enterprise arena, where business models and entrepreneurial approaches are increasingly being used to directly address social and environmental issues. Topics addressed will explore nonprofit, hybrid, and for-profit social enterprise models, and the intersection of social entrepreneurship with capital formation issues, international development, technology & innovation, global health, cross-sector models, and microfinance as a case study in social enterprise & innovation. Course content will include a combination of instructor lecture, readings on focus areas, guest speakers representing the leading social entrepreneurs and social enterprises in the field, and a group project that will be integrated with the other course curriculum.

    Mod 4, offered second half of Spring semester. [2]

  • MGT 6472. Supply Chain Management

    This course builds upon the business process innovation concepts introduced in the introductory operations management course and examines material, information, and cash flows between firms within a supply chain. Topics include supply chain strategy, demand forecasting and inventory management methods for short and long lifecycle products, supply chain collaboration and coordination, and operational methods for managing supply chain risk.

    Mod 4, offered second half of Spring semester. [2]

    Prerequisite: MGT 6371.

  • MGT 6473. Health Care Operations

    In the U.S., the health care sector accounts for 17% of gross domestic product. Facing decreasing reimbursements and ever-increasing costs, coupled with demands to deliver quality under pay-for-performance and bundled payment schemes, health care organizations are under unprecedented pressure to improve efficiency and quality. Consequently, health care organizations need to adopt well-proven operations management concepts to better manage their processes. In this course, we will analyze health care organizations using both qualitative and quantitative principles of operations management to address issues around patient flows, capacity and staff planning, and process failure and learning. The course is based on reading current articles, solving case studies, and hands-on, data-driven exercises. The final project involves students deploying operations management concepts to propose solutions to problems currently faced by a real hospital. The course builds on the core course in operations management and will benefit students interested in consulting, operations management, and/or health care.

    Mod 1, offered first half of Fall semester. [2]

    Prerequisite: MGT 6371.

  • HGEN. Practical Python Programming and Algorithms for Data Analysis

    This course is intended for students who are focused on big data analysis in the Python programming language, from large-scale epidemiologic datasets, electronic medical records, or next-generation sequence data. It will cover basic programming, including strings, arrays, dictionaries, conditional statements, data visualization, external data sources, and algorithms, with a focus on using programming to solve challenges within the students’ own research projects. At the end of the course, students will have an understanding of the foundation of programming in Python. They will understand the importance and use of regular expressions and efficient data search tools and will demonstrate proficiency in algorithms and data visualization. Evaluations will be based on a midterm exam, homework, a final project, and class participation. The proposed course is not for undergraduates or professional credit.

    Offered in Spring [3]

  • EECE. Signal Processing and Communications

    AM and FM modulation. Also, advanced topics in signal processing are treated.

    Offered in Spring [3]

    No credit for students who have earned credit for EECE 4252.

*Taking classes from Owen (MGT) may be the most convenient; we share the same academic schedule and their classes are also 2 credits. 

Pre-Approved Computer Science Electives

  • CS 5260. Artificial Intelligence

    Principles and programming techniques of artificial intelligence. Strategies for searching, representation of knowledge and automatic deduction, learning, and adaptive systems. Survey of applications.

    Offered in Fall [3]

    No credit for students who have earned credit for CS 4260.

  • CS 5278. Principles of Software Engineering

    The nature of software. The object-oriented paradigm. Software life-cycle models. Requirements, specification, design, implementation, documentation, and testing of software. Object-oriented analysis and design. Software maintenance.

    Offered in Fall [3]

    No credit for students who have earned credit for CS 4278.

  • CS 5288. Web-based System Architecture

    Core concepts necessary to architect, build, test, and deploy complex web-based systems; analysis of key domain requirements in security, robustness, performance, and scalability.

    Offered in Fall [3]

    No credit for students who have earned credit for CS 4288.

  • CS 5289. Project in Web-based Software Architectures

    Project-based course building on core concepts necessary to architect, build, test, and deploy complex web-based systems. Students form teams, propose project ideas, architect their solutions, and build the initial minimum viable project for their application. In-class discussions focus on advanced topics in web-development.

    Offered in Spring [3]

    No credit for students who have earned credit for CS 4289.

  • CS 5266. Big Data

    Principles and practices of big data processing and analytics. Data storage databases and data modeling techniques, data processing and querying, data analytics and applications of machine learning using these systems.

    Offered in Spring [3]

    Pre-requisite: CS 3251

  • CS 6310. Design and Analysis of Algorithms

    Set manipulation techniques, divide-and-conquer methods, the greedy method, dynamic programming, algorithms on graphs, backtracking, branch-and-bound, lower bound theory, NP-hard and NP-complete problems, approximation algorithms.

    Offered in Spring [3]

    Pre-requisite: CS 3250

  • CS 6311. Graph Algorithms

    Algorithms for dealing with special classes of graphs. Particular emphasis is given to subclasses of perfect graphs and graphs that can be stored in a small amount of space. Interval, chordal, permutation, comparability, and circular-arc graphs; graph decomposition.

    [3]

    Pre-requisite: CS 6310 or Math 4710.

  • CS 6320. Algorithms for Parallel Computing

    Design and analysis of parallel algorithms for sorting, searching, matrix processing, FFT, optimization, and other problems. Existing and proposed parallel architectures, including SIMD machines, MIMD machines, and VLSI systolic arrays.

    [3]

    Pre-requisite: CS 6310

  • CS 5262. Foundations of Machine Learning

    Theoretical and algorithmic foundations of supervised learning, unsupervised learning, and reinforcement learning. Linear and nonlinear regression, kernel methods, support vector machines, neural networks and deep learning methods, instance-based methods, ensemble classifiers, clustering and dimensionality reduction, value and policy iteration. Explainable AI, ethics, and data privacy.

    Offered in Spring [3]

  • CS 6360. Advanced Artificial Intelligence

    Discussion of state-of-the-art and current research issues in heuristic search, knowledge representation, deduction, and reasoning. Related application areas include: planning systems, qualitative reasoning, cognitive models of human memory, user modeling in ICAI, reasoning with uncertainty, knowledge-based system design, and language comprehension.

    [3]

    Pre-requisite: CS 4260 or equivalent

  • CS 6362. Advanced Machine Learning

    Theory and algorithms for designing systems that learn from data including modern machine learning methods that take advantage of increased complexity to provide improved performance. Data types, data pre-processing, measures of similarity and dissimilarity. Supervised learning: decision trees, logistic regression, support vector machines, Bayesian methods, and neural networks; unsupervised learning: partitional, hierarchical, density-based, and graph clustering algorithms. Feature selection for classification and clustering. Evaluation methods. Reinforcement learning: Markov Decision processes, dynamic programming, Monte Carlo methods, TD-learning.

    Offered in Fall [3]

CS classes are not recommended for anyone without a CS background.

Please note: Course descriptions and numbers are tentative and subject to change.