Stefan Th. Gries
Contact information
Last updated: 29 April 2021

Teaching at the University of California, Santa Barbara

Ling 120: Corpus linguistics (S2021)

Syllabus and overview

In general, this course is an introduction to computerized research methods that are applied to large databases of language used in natural communicative settings and that supplement more traditional methods of linguistic analysis in all linguistic sub-disciplines. In the first part of this class, we will begin with a theoretical introduction: what is a corpus / what are corpora, what kinds of corpora are there, how are they created/compiled, and why would one use corpora in the first place? In the second part, we will familiarize ourselves with the open-source programming language and environment R. In the third part, we will read a variety of simple but published corpus-linguistic studies and replicate, modify, or extend them. The topics to be covered include syntax (patterns and alternations), lexis/semantics (key words in different cultures and near synonymy), psycholinguistics (disfluencies), and others.

Note1: This course is based on the second edition of my textbook Quantitative corpus linguistics with R: a practical introduction (New York: Routledge, Taylor & Francis Group), which you will need to have. It will teach you most fundamentals of R programming for text analysis (and can therefore be useful well beyond this course), and it contains all readings for the first half of the course as well as additional answer keys and exercises for parts of the second half.

Note2, and this is very important: We will be using a programming language, which means that the course absolutely requires computer literacy beyond swiping, pinching, long-tapping, and uploading/sending something to/via Facebook, Instagram, Pinterest, Snapchat, or whatever. If you cannot install software, or if you can install software but then don't know 'where the program is', and/or if you download a file on your own personal computer but then ask me where it went, and/or if you do not know what unzipping a file means (not just opening it, unzipping!), you will not be happy in this course!

Downloads: slides, worksheets, code, data

File(s) for session 01
File(s) for sessions 02-05
File(s) for session 06
File(s) for session 07
File(s) for session 08
File(s) for session 09
File(s) for session 10

Corpus data

Course-final assignments

Links to relevant software

R from CRAN