# General Course Descriptions

*In all the required labs, students will review relevant theory and work on applications as a group. Computing solutions and extensions will be emphasized.*

**BIOS 6301. Introduction to Statistical Computing.** This course is for students who seek to develop skills in statistical computing. Students will learn how to use R and STATA for data management, database querying, report generation, data presentation, and data tabulation and summarization. Topics will include organization and documentation of data, input and export of data sets, methods of cleaning data, tabulation and graphing of data, programming capabilities, and simulations and bootstrapping. Students will also be introduced to LaTeX and Sweave for report writing and to SAS and SQL programming.

**BIOS 6306. Introduction to Study Design.** This course will introduce principles of study design in medical and health statistics. The designs considered will be case series, ecologic studies, matched and unmatched case-control studies, observational cohort studies, historically controlled clinical trials, screening trials, and randomized clinical trials. The goal is to examine critical design challenges that ultimately impact the ability to make statistical inferences from observed samples to the target populations. Concepts such as internal and external validity, bias identification and control, and confounding and effect modification will be discussed and illustrated with examples from the medical literature. The dependence of traditional univariate measures of statistical association (absolute risk, relative risk, and odds ratios) on critical design elements will be highlighted. Statistical evaluation of diagnostic tests will also be introduced, along with causal inference. Prerequisites: permission of the instructor; access to STATA statistical software.

**BIOS 6311. Principles of Modern Biostatistics.** This is Part 1 of a two-course series for students who seek to develop skills in modern biostatistical reasoning and data analysis. Students learn the statistical principles that govern the analysis of data in the health sciences and biomedical research. Traditional probabilistic concepts and modern computational techniques will be integrated with applied examples from biomedical and health sciences. Statistical computing uses software packages STATA and R; prior familiarity with these packages is helpful but not required. Topics include types of data, tabulation of data, methods of exploring and presenting data, graphing techniques (boxplots, q-q plots, histograms), indirect and direct standardization of rates, axioms of probability, probability distributions and their moments, properties of estimators, the Law of Large Numbers, the Central Limit Theorem, theory of confidence intervals and hypothesis testing (one sample and two sample problems), paradigms of statistical inference (Frequentist, Bayesian, Likelihood), introduction to nonparametric techniques, bootstrapping and simulation, sample size calculations, and basic study design issues. Students are required to take 6311L (the hourlong discussion section/lab for this course) concurrently. Prerequisite: Calculus I.

**BIOS 6312 & 6312L. Modern Regression Analysis.** This is Part 2 of a two-course series for students who seek to develop skills in modern biostatistical reasoning and data analysis. Students learn modern regression analysis and model-building techniques from an applied perspective. Theoretical principles will be demonstrated with real-world examples from biomedical studies. This course requires substantial statistical computing in STATA and R. The course covers regression modeling for continuous outcomes, including simple linear regression, multiple linear regression, and analysis of variance with one-way, two-way, and three-way analysis of covariance models. It also provides a brief introduction to models for binary outcomes (logistic models), ordinal outcomes (proportional odds models), count outcomes (Poisson/negative binomial models), and time-to-event outcomes (Kaplan-Meier curves, Cox proportional hazards modeling). Incorporated into the presentation of these models are topics such as regression diagnostics, nonparametric regression, splines, data reduction techniques, model validation, and parametric bootstrapping, plus a very brief look at methods for handling missing data. Students are required to take 6312L (the hourlong discussion section/lab for this course) concurrently. Prerequisites: Biostatistics 6311 or equivalent; familiarity with STATA and R software packages.

**BIOS 6321. Clinical Trials and Experimental Design.** This course covers the statistical aspects of study design, monitoring, and analysis. Emphasis is on studies of human subjects, i.e., clinical trials. Topics include principles of measurement, selection of endpoints, bias, masking, randomization and balance, blocking, study designs, sample size projections, interim monitoring of accumulating results, flexible and adaptive designs, sequential analysis, analysis principles, data and safety monitoring boards (DSMB), Institutional Review Boards (IRB), the ethics of animal and human subject experimentation, the history of clinical trials, and the Belmont Report.

**BIOS 6341 & 6341L. Fundamentals of Probability.** The first in a two-course series (6341–6342), Fundamentals of Probability introduces and explores the probabilistic framework underlying statistical theory. Students learn probability theory—the formal language of uncertainty—and its application to everyday statistical concepts and analysis methods. Students will validate analytical solutions and explore limit theorems using R software. This course covers probability axioms, probability and sample space, events and random variables, transformation of random variables, probability inequalities, independence, discrete and continuous distributions, expectations and variances, conditional expectation, moment generating functions, random vectors, convergence concepts (in probability, in law, almost surely), Central Limit Theorem, weak and strong Law of Large Numbers, extreme value distributions, order statistics, and exponential families. Students are required to take 6341L (the hourlong discussion section/lab for this course) concurrently.

**BIOS 6342 & 6342L. Contemporary Statistical Inference.** The second in a two-course series (6341–6342), Contemporary Statistical Inference introduces and explores the fundamental inferential framework for parameter estimation, testing hypotheses, and interval estimation. Students learn classical methods of inference (hypothesis testing) and modes of inference (Frequentist, Bayesian, and Likelihood approaches) and their surrounding controversies. Topics include the delta method, sufficiency, minimal sufficiency, exponential families, ancillarity, completeness, the conditionality principle, Fisher information, the Cramér-Rao inequality, hypothesis testing (likelihood ratio tests, most powerful tests, optimality, Neyman-Pearson lemma, inversion of test statistics), the Likelihood principle, the Law of Likelihood, Bayesian posterior estimation, interval estimation (confidence intervals, support intervals, credible intervals), basic asymptotic and large sample theory, maximum likelihood estimation, and resampling techniques (e.g., bootstrap). Students are required to take 6342L (the hourlong discussion section/lab for this course) concurrently. Prerequisite: Biostatistics 6341 or equivalent.

**BIOS 7323 & 7323L. Applied Survival Analysis.** This course provides an applied introduction to methods for time-to-event data with censoring mechanisms. Topics include life tables, nonparametric approaches (e.g., Kaplan-Meier, log-rank), semiparametric approaches (e.g., Cox model), parametric approaches (e.g., Weibull, gamma, frailty), competing risks (introducing Poisson regression and its connection to the Cox model), and time-dependent covariates. The focus is on fitting the models and the relevance of those models for biomedical application. Students are required to take 7323L (the hourlong discussion section/lab for this course) concurrently.

**BIOS 7330. Regression Modeling Strategies.** The course presents strategies for building predictive models and surveys current thinking on them. Its discussion of multivariable predictive modeling for a single response variable will include using regression splines to relax linearity assumptions, the perils of variable selection and overfitting, where to spend degrees of freedom, shrinkage, imputation of missing data, data reduction, and interaction surfaces. The course will also cover methods for graphically understanding models (e.g., using nomograms), using resampling to estimate a model’s likely performance on new data, and statistical methods related to binary logistic models and ordinal logistic and survival models. Students will develop, validate, and graphically describe multivariable regression models. Prerequisites: BIOS 6311 and 6312 or permission of the instructor.

**BIOS 7345 & 7345L. Advanced Regression Analysis I (Linear and General Linear Models).** Students are exposed to a theoretical framework for linear and generalized linear models. The first half of the semester covers linear models: multivariate normal theory, least squares estimation, limiting chi-square and F-distributions, sum of squares (partial, sequential) and expected sum of squares, weighted least squares, orthogonality, and analysis of variance (ANOVA). The second half of the semester focuses on generalized linear models (e.g., binomial, Poisson, multinomial errors) and introduces categorical data analysis, conditional likelihoods, quasi-likelihoods, and model checking. Students are required to take 7345L (the hourlong discussion section/lab for this course) concurrently. Prerequisites: BIOS 6341 and 6342.

**BIOS 7346 & 7346L. Advanced Regression Analysis II (General Linear Models and Longitudinal Data Analysis).** During this second course in a yearlong series, students are exposed to a theoretical framework for generalized linear and longitudinal models. The course covers classic repeated measures models, random effect models, generalized estimating equations (GEEs), hierarchical models, transitional models for binary data, marginal vs. mixed effects models, model fitting, model checking, clustering, and implications for study design. There will also be discussion of missing data techniques, Bayesian and Likelihood methods for GLMs, and various fitting algorithms such as maximum likelihood and generalized least squares. Students are required to take 7346L (the hourlong discussion section/lab for this course) concurrently. Prerequisite: BIOS 7345.

**BIOS 7351. Statistical Collaboration in Health Sciences I.** This is the first course of two on collaboration in statistical science and the variety of problems that arise in collaborative arrangements. The goal is to sharpen students’ consulting skills while exposing them to the application of advanced statistical techniques in routine health science applications. The importance of understanding and learning the science underlying collaborations will be emphasized. Students will role-play with real investigators, discuss real consulting projects that have gone awry, and face real-life problems such as opaque scientific direction, poor scientific formulation, lack of time, and ill-formulated messy data. Students will engage in several consulting projects that will involve the use of a wide range of biostatistics methods, from design to analysis. Course content will also make use of departmental clinics that are run concurrently.

**BIOS 7352. Statistical Collaboration in Health Sciences II.** This is the second course of a yearlong sequence on collaboration in statistical science and the variety of problems that arise in collaborative arrangements. The goal is to sharpen students’ consulting skills while exposing them to the application of advanced statistical techniques in routine health science applications. The importance of understanding and learning the science underlying collaborations will be emphasized. Students will role-play with real investigators, discuss real consulting projects that have gone awry, and face real-life problems such as opaque scientific direction, poor scientific formulation, lack of time, and ill-formulated messy data. Students will engage in several consulting projects that will involve the use of a wide range of biostatistics methods, from design to analysis. Course content will also make use of departmental clinics that are run concurrently. Prerequisite: BIOS 7351.

**BIOS 7361. Advanced Concepts in Probability and Real Analysis for Biostatisticians.** Topics include characteristic functions, modes of convergence, uniform integrability, Brownian motion, classical limit theorems, L^{p} spaces, projections, sigma-algebras and random variables, martingales, random walks, Markov chains, and probabilistic asymptotics. Emphasis on measure theory is minimal. Concepts are illustrated in biomedical applications whenever possible.

**BIOS 7362 & 7362L. Advanced Statistical Learning and Inference.** This course is an in-depth examination of modern inferential tools. Topics include high-order asymptotics, Edgeworth expansions, nonparametric statistics, quasi-likelihood and estimating equations theory, multivariate classification methods, resampling techniques, statistical learning, methods and theory of high-dimensional data, expectation-maximization (EM) algorithms, and Gibbs sampling. Concepts are illustrated in biomedical applications whenever possible. Students are required to take 7362L (the hourlong discussion section/lab for this course) concurrently.

**BIOS 8366. Advanced Statistical Computing.** This course covers numerical optimization, Markov chain Monte Carlo (MCMC), expectation-maximization (EM) algorithms, Gaussian processes, the Hamiltonian Monte Carlo method, and data augmentation algorithms, with applications for model fitting and techniques for dealing with missing data. Prerequisites: BIOS 6341 and 6342 or permission of the instructor. Offered biennially.

**BIOS 8370. Foundations of Statistical Inference.** This course examines the foundations of statistical inference as viewed from Frequentist, Bayesian, and Likelihood approaches. Famous papers and controversies are discussed, along with statistical theories of evidence and decision theory and their historic significance.

**BIOS 8372. Bayesian Methods.** This course covers the methodology and rationale for Bayesian methods and their applications. Statistical topics include the historical development of hierarchical models, Markov chain Monte Carlo (MCMC) and related sampling methods, specification of priors, sensitivity analysis, and model checking and comparison. This course features applications of Bayesian methods to biomedical research. Prerequisites: BIOS 6301, 6312, 6341, 6342, 7330, and 7345, or the equivalent; for non-biostatistics students, permission of the instructor is required. Offered biennially.

**BIOS 8375. Causal Inference.** This course provides an introduction to causal inference methods for observational data and randomized studies. Topics include the Rubin causal model, directed acyclic graphs, propensity scores, inverse probability weighting, instrumental variables, causal mediation analysis, marginal structural models, g-computation, and sensitivity analyses to examine robustness to untestable assumptions. Students will learn the basic theory behind the methods and apply them to biomedical data examples. Prerequisites: BIOS 6341, 6342, 7323, and 7346, or permission of the instructor. Offered biennially.