Stefan Th. Gries
Teaching at UCSB (2006-)
(F/W/S = fall/winter/spring quarter)

Last updated: W 2025
Ling 201 / 104: Research methodology and statistics in linguistics (2006, 2008, 2014: S; 2008-12, 2014-18, 2020, 2023: F; 2025: W)
This course is a hands-on introduction to fundamental aspects of quantitative/statistical methodology. We begin by looking at a few basic notions such as variables and hypotheses. We then discuss the logic of quantitative studies. We deal with how data from experiments and corpora should be set up for subsequent statistical evaluation. In terms of analysis and evaluation, we explore a variety of descriptive graphs and statistics for frequency data, averages, dispersions, distributions, and correlations. The largest part of the course is concerned with hands-on practice on a variety of statistical tests: Practicing different methods each session, we work on distribution fitting tests, tests for independence, and tests for differences for frequencies, means, dispersions, distributions, and correlations. We use corpus- and psycholinguistic example data, sometimes from published research. This course uses the open source programming language R and is based on the third edition (2021) of my book Statistics for Linguistics with R: […], which comes with sample data, exercises, answer keys, etc. Since the class requires no prior knowledge of statistics and only very little knowledge of mathematics, it is an entry class for absolute beginners from degree programs esp. in the humanities and social sciences.
Ling 104: Statistical methods in linguistics (2015-17: F; 2022, 2024: W)
This course was a hands-on introduction to fundamentals of quantitative/statistical methodology in linguistics. It was based on the third (2021) edition of my textbook Statistics for linguistics with R: […]. We began by looking at a few basic notions such as variables and hypotheses, familiarized ourselves with how data from experiments and corpora should be set up for subsequent statistical evaluation, and discussed the logic of quantitative studies using the null-hypothesis falsification approach. Then, we were concerned with a variety of descriptive graphs and statistics for frequency data, averages, dispersions, and correlations. The largest part was concerned with a variety of statistical tests: distribution fitting tests, tests for independence, and tests for differences for frequencies, means, dispersions, and correlations. We ended with a small primer for the kind of multifactorial methods that are the subject of Ling 105. We used the open source software tool R.
Ling 105: Predictive modeling in linguistics (2022, 2024: S)
This course was a selective introduction to predictive modeling applications in linguistics. We started with a one-session introduction to predictive modeling with an emphasis on regression modeling, which surveyed model formulation, model selection, multifactoriality, and validation. Then, we worked our way through a variety of regression modeling applications: linear regression, binary logistic regression, and multinomial and ordinal regression models. Then, one session was concerned with model diagnostics and model validation. Finally, there was a session on tree-based approaches, specifically classification and regression trees. Like its prerequisite course Ling 104, this course used the open source programming language R and was based on the third edition (2021) of my book Statistics for Linguistics with R: […], which comes with sample data, exercises, answer keys, etc., plus additional materials.
Ling 110/210: Computational linguistics (2007: W; 2010: S)
This course was a (highly selective) introduction to a discipline known as Computational Linguistics. It featured (i) a brief general introduction to some main areas of research within this field, (ii) an introduction to the programming language R based on my book Quantitative Corpus Linguistics with R: […], with which we worked on linguistic data, and (iii) hands-on work in a computer lab on a variety of case studies from domains such as computational lexicography as well as word sense and synonym disambiguation, information retrieval, automatic text processing, and a few other things such as orthographic similarities of words and spell-checking, computational methods for authorship attribution, and others. Given the practical orientation of the course, this course was ideally suited for students who were thinking of practical applications and wanted to acquire some first computational programming experience (prior experience with R was not necessary, but a larger-than-average computer savviness was recommended). Reading assignments included parts of Manning and Schütze's (2000) Foundations of Statistical Natural Language Processing as well as Jurafsky and Martin's (2000) Speech and Language Processing, supplemented with a variety of introductory chapters and research articles.
Ling 113: Introduction to semantics (2008: W; 2011: F)
This course was an introduction to the linguistic subdiscipline of semantics. After a very brief general introduction to the course and some main semantic concepts, we looked at definitional approaches to word meaning, lexical relations, and cognitive/psycholinguistic approaches to meaning. We then covered sentence meaning, utterances, and propositions as well as logical relations between sentences. Finally, we considered selected aspects of the acquisition of word meaning by children and explored a few central notions of pragmatics (or utterance meaning).
Ling 120: Corpus linguistics (2008, 2014, 2018-19, 2021: S; 2010, 2023, 2025: W)
This course was an introduction to computerized research methods, which are applied to large databases of language used in natural communicative settings to supplement more traditional ways of linguistic analysis in all linguistic sub-disciplines. In the first part of this particular class, we began with a theoretical introduction: what is a corpus / what are corpora, what kinds of corpora are there and how are they created/compiled, and why would one use corpora in the first place? In the second part, we familiarized ourselves with the open source programming language and environment R. In the third part, we read a variety of published corpus-linguistic studies and replicated, modified, or extended them. The topics covered included syntax (patterns and alternations), lexis/semantics (key words in different cultures and near synonymy), psycholinguistics (disfluencies), and others. Note: This course was based on the second edition (2016) of my textbook Quantitative corpus linguistics with R: […]. New York: Routledge, Taylor & Francis Group, which one needed to have: it teaches most fundamentals of R programming for text analysis (and can therefore be useful way beyond this course) and contains all readings for the first half of the course as well as additional answer keys and exercises for parts of the second half.
Ling 127 / Psy 127: Psychology of language (2006: F)
This course was an introduction to psycholinguistics concerned with various aspects of language comprehension, production, and acquisition. It was broadly based on Carroll's (2004) Psychology of Language, but also incorporated a variety of additional information/materials.
Ling 137(/237): Introduction to first language acquisition (2007-08: F; 2010, 2018: W)
This course was a selective introduction to the interdisciplinary enterprise of research on first language acquisition. It covered several different though interrelated topics: an introduction to 'the problem of language acquisition', overviews of different theoretical and methodological approaches towards first language acquisition, and introductions to aspects and processes of first language acquisition in different linguistic subdisciplines: phonology/morphology, semantics/lexicon, syntax.
Ling 194: Group studies in linguistics (2006: W)
This course was an introduction to corpus linguistics, involving simple computerized research methods applied to large databases of language used in natural communicative settings.
Ling 202: Advanced research methods and statistics in linguistics (2009: F; 2011, 2013, 2015, 2017, 2021: W; 2019, 2023, 2025: S)
This course was a hands-on introduction to more advanced statistical methods for analyzing observational and experimental data. After a small recap of monofactorial methods and graphs and an introduction to a process called modeling or model selection, we systematically extended monofactorial tests to their multifactorial and multivariate counterparts. We began with the linear model and extended correlations and t-tests to multiple linear regression, ANOVAs, and ANCOVAs. We then broadened the scope to the powerful methods included in generalized linear modeling (such as binomial logistic regression for binary dependent variables) as well as multinomial and ordinal logistic regression. There was also one session on model assumptions and diagnostics, and we concluded with a session on classification and regression trees. We used the open source software tool R and the third edition (2021) of my book Statistics for Linguistics with R: […].
Ling 204: Statistical methodology (2014, 2016, 2020: W; 2018: S; 2023: F)
This course was a more advanced course on statistical modeling with an emphasis on more sophisticated aspects of regression modeling and other multivariate methods; it presupposed a good understanding of Chapters 1 to 5 of the third edition (2021) of my Statistics for Linguistics with R: [...]. We began with a first brief recap of linear and generalized linear regression modeling. We then discussed the use of contrasts and general linear hypothesis tests for linear and generalized linear regression models, followed by some ideas on how to explore curvature in data (regressions with breakpoints, polynomial regressions, and generalized additive models). This was followed by a larger chunk on linear and generalized linear mixed-effects (or multilevel) modeling, where we reanalyzed published data and discussed numerical and visual exploration of regression results. The last parts were then devoted to random forests and clustering as well as other similarity-based methods. We used the open source software tool R.
Ling 210/110: Computational linguistics (2007: W; 2010: S)
This course was a (highly selective) introduction to a discipline known as Computational Linguistics. It featured (i) a brief general introduction to some main areas of research within this field, (ii) an introduction to the programming language R based on my book Quantitative Corpus Linguistics with R: […], with which we worked on linguistic data, and (iii) hands-on work in a computer lab on a variety of case studies from domains such as computational lexicography as well as word sense and synonym disambiguation, information retrieval, automatic text processing, and a few other things such as orthographic similarities of words and spell-checking, computational methods for authorship attribution, and others. Given the practical orientation of the course, this course was ideally suited for students who were thinking of practical applications and wanted to acquire some first computational programming experience (prior experience with R was not necessary, but a larger-than-average computer savviness was recommended). Reading assignments included parts of Manning and Schütze's (2000) Foundations of Statistical Natural Language Processing as well as Jurafsky and Martin's (2000) Speech and Language Processing, supplemented with a variety of introductory chapters and research articles.
Ling 218: Corpus linguistics (2007, 2020: S; 2012: F; 2024: W)
This course was an introduction to computerized research methods, which are applied to large databases of language used in natural communicative settings to supplement more traditional ways of linguistic analysis in all linguistic sub-disciplines. There was a bit of 'theoretical' reading on what a corpus is, what kinds of corpora there are, and how they are created/compiled, but nearly all of the course was hands-on practice with the open source programming language and environment R. We began with some basics of R and then dealt with how to write R scripts for the four main corpus-linguistic methods – frequencies, dispersions, association, and keyness – as well as some other applications. For that, we read a few overview articles and a few corpus-linguistic studies that covered topics including syntax (patterns and alternations), lexis/semantics (key words in different cultures and near synonymy), psycholinguistics (disfluencies), and others. Note: This course was based on the second edition of my textbook Quantitative corpus linguistics with R: […]. New York: Routledge, Taylor & Francis Group, which one needed to have: it teaches most fundamentals of R programming for text analysis (and can therefore be useful way beyond this course).
Ling 225: Semantics (2011: S; 2015: W)
In this course, we explored a small range of topics in semantics. Topics we dealt with were structuralist approaches involving necessary and sufficient conditions, the Natural Semantic Metalanguage approach, lexical relations, cognitive semantics (esp. with regard to polysemy and prototypes), computational / distributional semantics, and the acquisition of meaning.
Ling 237(/137): Introduction to first language acquisition (2007-08: F; 2010: W)
This course was a selective introduction to the interdisciplinary enterprise of research on first language acquisition. It covered several different though interrelated topics: an introduction to 'the problem of language acquisition', overviews of different theoretical and methodological approaches towards first language acquisition, and introductions to aspects and processes of first language acquisition in different linguistic subdisciplines: phonology/morphology, semantics/lexicon, syntax.
Ling 252-A/B: Cognitive Linguistics (2006: F; 2007: W)
In the first quarter of this two-quarter research seminar, we explored the set of related approaches known as Cognitive Linguistics. The course provided a brief general introduction to the assumptions governing or underlying most of the field, followed by a variety of case studies focusing on central notions of, and areas of research within, Cognitive Linguistics; these notions and areas of research included metaphor/metonymy, polysemy, Cognitive Grammar, (argument structure) constructions, etc.
Ling 253A/B: Quantitative corpus linguistics and legal applications/interpretation (2022: W, S)
In the first quarter of this two-quarter research seminar, we were concerned with the intersection of (quantitative) (corpus) linguistics, lexical semantics, and legal/statutory interpretation. I took some corpus-linguistic knowledge for granted (nothing technical, just the general ideas that are covered in Chapter 2 of the second edition of my corpus textbook). The first three weeks were a slow introduction to various aspects of semantics, in particular (dictionary) definitions, Natural Semantic Metalanguage, sentence semantics, and some aspects of cognitive semantics. After that, we considered several applications of corpus-linguistic methods to legal/statutory interpretation; students read (and partially presented) (i) some Supreme Court opinions; (ii) some corpus-linguistic papers discussing legal applications; (iii) a famous/infamous report currently under litigation; (iv) legal critiques of corpus-linguistic applications; and (v) discussions of experimental jurisprudence; with some discussion of corpus-linguistic methods on the side. The goals were to familiarize participants with at least some aspects of legal/forensic linguistics in order for them to be able to write a (likely empirical) paper in the second quarter and maybe be a good expert witness at some point.
Ling 257A/B: Psycholinguistics (2010: F; 2011: W)
In the first quarter of this two-quarter seminar, we explored topics in psycholinguistics from (i) the theoretical perspective of exemplar-/usage-based cognitive/functional linguistics and (ii) the methodological perspective of experimental and observational data and analysis. We read and discussed a variety of papers on topics in language acquisition, language production, 'distributional linguistics', and, depending on participants' choices, language change and sociolinguistics.