Stefan Th. Gries
Contact information
Last updated: 31 May 2023

Teaching at the University of California, Santa Barbara

Ling 202: Advanced research methods and statistics in linguistics (Spring 2023)

Syllabus and overview

This course is a selective introduction to predictive modeling applications in linguistics. We start with a one-session introduction to predictive modeling with an emphasis on regression modeling, which surveys model formulation, model selection, multifactoriality, and validation. We then work our way through a variety of regression modeling applications: linear, binary logistic, multinomial, and ordinal regression models. After that, one session is concerned with model diagnostics and, perhaps, model validation. Finally, there is a session on classification and regression trees. Like its prerequisite course Ling 201, this course is based on the third edition of my textbook Statistics for linguistics with R: a practical introduction (2021) and uses the open-source programming language R.

Downloads for class sessions
(files will be made available when appropriate)


Course materials (a zipped folder into which you should add the following files as they become available)
Session 01: slides and code/answer key (HTML)
Session 02: code/answer key (HTML)
Session 03: code/answer key (HTML)
Session 04: code/answer key (HTML)
Session 05: code/answer key (HTML)
Session 06: code/answer key (HTML)
Session 07: no class
Session 08: code/answer key (HTML)
Session 09: code/answer key (HTML)
Session 10: code/answer key (HTML)


Graded assignments: Pick two of these 10 assignments and analyze the data comprehensively (as if they were your own); note the difficulty levels, which also correspond to weights: If you do equally well on two assignments with different difficulty levels, you'll get more points for the one with the higher difficulty level.
Deadline for final submission: 17 June 2023, 23:59:59 PDT (no extensions!)

Links to relevant software and sites

R (at least version 4.2.2, and make sure you update all packages before the course starts)
RStudio (at least version 2022.07.2 Build 576); installing Quarto might also be useful
My 2021 statistics textbook, its companion website, and its StatForLing with R newsgroup, which I moderate