Syllabus and overview
In general, this course is an introduction to computerized research methods that are applied to large databases of language used in natural communicative settings, supplementing more traditional ways of linguistic analysis in all linguistic sub-disciplines.

In the first part of this particular class, we will begin with a theoretical introduction: what is a corpus / what are corpora, what kinds of corpora are there, how are they created/compiled, and why would one use corpora in the first place? In the second part, we will familiarize ourselves with the open-source programming language and environment R. In the third part, we will read a variety of simple but published corpus-linguistic studies and replicate, modify, or extend them. The topics to be covered include syntax (patterns and alternations), lexis/semantics (key words in different cultures and near synonymy), psycholinguistics (disfluencies), and others.

Note 1: This course is based on the second edition of my textbook, Quantitative corpus linguistics with R: a practical introduction (New York: Routledge, Taylor & Francis Group), which you will need: it contains all readings for the first half of the course as well as additional answer keys and exercises for parts of the second half.

Note 2, and this is very important: We will be using a programming language, which means that the course absolutely requires computer literacy beyond swiping, pinching, long-tapping, and uploading/sending something to/via Facebook, Instagram, Pinterest, Snapchat, or whatever. If you cannot install software, or if you can install software but then don't know 'where the program is', and/or if you download a file on your own personal computer but then ask me where it went, and/or if you do not know what unzipping a file means (not just opening it, unzipping!), you will not be happy in this course!