Stefan Th. Gries
Last updated: 18 March 2020

Teaching at the University of California, Santa Barbara


Ling 218: Corpus Linguistics (S 2020)

Syllabus and overview


This course is a hands-on introduction to advanced corpus-linguistic research methods. These methods are applied to large databases of language used in natural communicative settings and supplement more traditional ways of linguistic analysis in all linguistic sub-disciplines. The course is broadly based on my (2016) textbook Quantitative corpus linguistics with R: a practical introduction and McEnery & Hardie's (2012) Corpus Linguistics, supplemented with a variety of research articles. We begin with an introduction to R programming, especially for textual data. We then read a wide variety of papers on corpus-linguistic applications, in particular in usage-/exemplar-based and psycholinguistic approaches, while writing R scripts that cover the four main corpus-linguistic methods (frequency, dispersion, co-occurrence, and concordancing) on the basis of a variety of different corpora and corpus formats. We conclude by looking at slightly more advanced applications involving anonymous functions and scripts using parallel execution. We use the open-source software tool R.


Course downloads



Course folder

Files for sessions 01-04
Files for session 05
Files for session 06
Files for session 07
Files for session 08
Files for session 09
Files for session 10


Software



R from CRAN (make sure you have version 3.6.3)
RStudio (make sure you have version 1.2.5033 or 1.3.919-2)
LibreOffice (make sure you have version 6.4.1)