Central question: How many X does the phrase some X next to a Y refer to? Your predictors are
OBJECT: the sizes of the objects X: large vs. small;
REFPOINT: the sizes of the reference points Y: large vs. small.
Analyze the data properly with a regression model and summarize the results (briefly). [Difficulty level: 1]
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(               # summarize d, the result of loading
   file="_input/quantifyingsome.csv",  # this file
   stringsAsFactors=TRUE))             # change categorical variables into factors
CASE OBJECT REFPOINT ESTIMATE
Min. : 1.00 large:8 large:8 Min. : 2.0
1st Qu.: 4.75 small:8 small:8 1st Qu.:38.5
Median : 8.50 Median :44.0
Mean : 8.50 Mean :51.5
3rd Qu.:12.25 3rd Qu.:73.0
Max. :16.00 Max. :91.0
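As a possible starting point (a minimal sketch, not the intended solution), the following fits a linear model with both predictors and their interaction. The data frame `toy` and all its `ESTIMATE` values are invented for illustration; it only mimics the 2×2 design shown in the summary above.

```r
# Hypothetical stand-in data: same design as d (2 x 2, 4 observations per cell),
# but the ESTIMATE values are invented for illustration only
toy <- expand.grid(
   REP=1:4,
   OBJECT=c("large", "small"),
   REFPOINT=c("large", "small"),
   stringsAsFactors=TRUE)
toy$ESTIMATE <- c(10, 12, 14, 16,  40, 42, 44, 46,   # REFPOINT: large
                  20, 22, 24, 26,  70, 72, 74, 76)   # REFPOINT: small

# linear model with both main effects and their interaction
m <- lm(ESTIMATE ~ 1 + OBJECT * REFPOINT, data=toy)
coef(m)   # intercept, 2 main effects, 1 interaction

# predictions for all four combinations of the predictors
nd <- expand.grid(
   OBJECT=levels(toy$OBJECT),
   REFPOINT=levels(toy$REFPOINT))
nd$PRED <- predict(m, newdata=nd)
nd
```

With the real data, `toy` would be replaced by `d`; whether the interaction is needed is something the model (not this sketch) has to tell you.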
2 Assignment 02
Central question: What determines the number of praises in child-caretaker interaction? The data come from recordings of different children and contain the following variables:
PRAISES: the response variable, the number of times the children are praised by their caretakers;
CHILD: the name of each child;
SEX: the sex of each child;
CAN: the number of verb phrases where the caretakers use can when speaking about actions of the child;
WANT: the number of verb phrases where the caretakers use want when speaking about actions of the child;
SHOULD_SHALL: the number of verb phrases where the caretakers use should/shall when speaking about actions of the child;
DIRECTIVE: the number of verb phrases where the caretakers use a directive when speaking about actions of the child;
SUCCESS: the number of times the child does something as intended;
FAILURE: the number of times the child does something not as intended.
You now want to determine to what degree the number of praises is a function of
all predictors as main effects
and any interaction of a predictor with SEX.
Analyze the data properly and summarize the results (briefly). [Difficulty level: 3]
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(        # summarize d, the result of loading
   file="_input/praises.csv",   # this file
   stringsAsFactors=TRUE))      # change categorical variables into factors
CHILD SEX PRAISES CAN WANT
aRetha : 1 f:15 Min. : 0.0 Min. : 0.000 Min. : 0.00
aRnold : 1 m:13 1st Qu.: 2.0 1st Qu.: 1.000 1st Qu.: 0.75
baRbara: 1 Median : 5.0 Median : 4.000 Median : 2.00
beRnard: 1 Mean : 5.5 Mean : 4.321 Mean : 3.25
chRis : 1 3rd Qu.: 7.5 3rd Qu.: 5.250 3rd Qu.: 6.00
chRissy: 1 Max. :13.0 Max. :18.000 Max. :10.00
(Other):22
SHOULD_SHALL DIRECTIVE SUCCESS FAILURE
Min. :0.0000 Min. : 0.00 Min. : 0.000 Min. :0.000
1st Qu.:0.0000 1st Qu.: 9.00 1st Qu.: 4.000 1st Qu.:1.000
Median :0.0000 Median :12.00 Median : 6.500 Median :3.000
Mean :0.8929 Mean :15.61 Mean : 7.679 Mean :3.286
3rd Qu.:1.2500 3rd Qu.:19.50 3rd Qu.:10.000 3rd Qu.:5.250
Max. :6.0000 Max. :46.00 Max. :18.000 Max. :8.000
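The model structure described above (main effects plus interactions with `SEX`) can be written compactly in R's formula notation. The sketch below first checks which terms a formula expands to and then fits it to simulated data. Two assumptions are mine, not the assignment's: only two of the numeric predictors are used for brevity, and a Poisson model is chosen because the response is a count. The data frame `toy` is simulated and has nothing to do with the real file.

```r
# Which terms does a formula expand to? (no data needed for this check)
f <- PRAISES ~ 1 + SEX * (CAN + WANT)   # only two of the predictors, for brevity
attr(terms(f), "term.labels")           # main effects plus SEX:CAN and SEX:WANT

# Simulated stand-in data (invented, only to make the sketch runnable)
set.seed(1)
toy <- data.frame(
   SEX=factor(rep(c("f", "m"), each=14)),
   CAN=rpois(28, 4),
   WANT=rpois(28, 3))
toy$PRAISES <- rpois(28, lambda=exp(0.5 + 0.1*toy$CAN))

# Poisson regression as one common choice for count responses (an assumption)
m <- glm(f, data=toy, family=poisson)
drop1(m, test="Chisq")   # likelihood ratio tests for the droppable terms
```

`drop1()` respects marginality, so at this stage it only tests the interaction terms.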
3 Assignment 03
Central question: is the choice of of- vs. s-genitives (the car of my father vs. my father’s car) dependent in some way on the animacy of the possessor (my father) and/or the possessed (the car)? Your predictors are
POSSESSOR: the animacy of the possessor: abstract vs. animate vs. concrete;
POSSESSED: the animacy of the possessed: abstract vs. animate vs. concrete.
Analyze the data properly and summarize the results (briefly). [Difficulty level: 2]
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(           # summarize d, the result of loading
   file="_input/genitivesem.csv",  # this file
   stringsAsFactors=TRUE))         # change categorical variables into factors
CASE GENITIVE POSSESSOR POSSESSED
Min. : 1.00 of:150 abstract:139 abstract:206
1st Qu.: 75.75 s :150 animate :118 animate : 20
Median :150.50 concrete: 43 concrete: 74
Mean :150.50
3rd Qu.:225.25
Max. :300.00
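For a binary response like this one, a binary logistic regression is one option. The following is a minimal sketch on aggregated, invented counts (the data frame `toy` and its frequencies are hypothetical and only one predictor is used); it shows the mechanics, not the analysis of the real data.

```r
# Hypothetical cell counts (invented): one row per POSSESSOR level
toy <- data.frame(
   POSSESSOR=factor(c("abstract", "animate", "concrete")),
   OF=c(90, 20, 40),     # invented frequencies of of-genitives
   S =c(50, 100, 10))    # invented frequencies of s-genitives

# binary logistic regression on aggregated data: the response is a 2-column
# matrix of (successes, failures); here 'success' = s-genitive
m <- glm(cbind(S, OF) ~ 1 + POSSESSOR, data=toy, family=binomial)

# predicted probabilities of s-genitives per POSSESSOR level
toy$PRED_S <- unname(predict(m, type="response"))
toy
```

With case-by-case data such as `d`, the factor `GENITIVE` can be used as the response directly; `glm()` then predicts the second level of that factor (here *s*).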
4 Assignment 04
Central question: is the choice of try to- vs. try and-constructions (I’m gonna try to fix this problem vs. I’m gonna try and fix this problem, which is in the column TRY) dependent in some way on the following 3 predictors and all their interactions:
MODE: whether the data represent spoken (spk) or written (wrt) English;
VARIETY: whether the data represent American (amer) or British English (brit);
CLAUSE: does the clause in which try is used with to or and already involve another to (as in we’re going -> to <- try and beat this thing) or not (other)?
(Source: Hommerberg, Charlotte & Gunnel Tottie. 2007. Try to or Try and? Verb complementation in British and American English. ICAME Journal 31. 45-64.)
Analyze the data like we discussed and summarize the results (briefly). [Difficulty level: 1]
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(           # summarize d, the result of loading
   file="_input/tryandtryto.csv",  # this file
   stringsAsFactors=TRUE))         # change categorical variables into factors
CASE TRY VARIETY MODE CLAUSE
Min. : 1 and:1631 amer:1187 spk:2257 other:1662
1st Qu.: 808 to :1598 brit:2042 wrt: 972 to :1567
Median :1615
Mean :1615
3rd Qu.:2422
Max. :3229
5 Assignment 05
Central question: is the choice of I vs. you, which is represented in the column MATCH, dependent in some way on the following 3 predictors and all their pairwise interactions:
SEX: whether the speaker is female or male;
SENTENCE: where in the file I or you was used on a scale from 0 (first sentence) to 1 (last sentence);
DISTANCE: where in the sentence I or you was used on a scale from 0 (first character) to ≈1 (last character).
The following loads the data and prepares the variable DISTANCE:
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(      # summarize d, the result of loading
   file="_input/IvsYou.csv",  # this file
   stringsAsFactors=FALSE))   # don't change categorical variables into factors (!)
CASE FILE SPEAKER SEX
Min. : 1 Length:21102 Length:21102 Length:21102
1st Qu.: 5276 Class :character Class :character Class :character
Median :10552 Mode :character Mode :character Mode :character
Mean :10552
3rd Qu.:15827
Max. :21102
SENTENCE PRECEDING MATCH SUBSEQUENT
Min. :0.0000 Length:21102 Length:21102 Length:21102
1st Qu.:0.2394 Class :character Class :character Class :character
Median :0.5147 Mode :character Mode :character Mode :character
Mean :0.5014
3rd Qu.:0.7573
Max. :1.0000
d$SENTLENGTH <- nchar(d$PRECEDING) + nchar(d$MATCH) + nchar(d$SUBSEQUENT)
d$DISTANCE <- nchar(d$PRECEDING)/d$SENTLENGTH
d <- d[,c(1:3,7,4:5,9:10)]; d[,2:5] <- lapply(d[,2:5], as.factor)
summary(d)
CASE FILE SPEAKER MATCH SEX
Min. : 1 KRL :4610 PS5VN : 1248 i : 2 : 1043
1st Qu.: 5276 KRH :3590 PS62L : 852 I :11637 f: 6676
Median :10552 KRT :3093 PS63K : 785 you: 8619 m:12480
Mean :10552 KRP :1997 PS5T8 : 655 You: 844 u: 903
3rd Qu.:15827 KR0 :1445 PS5VL : 647
Max. :21102 KRG :1385 PS59B : 632
(Other):4982 (Other):16283
SENTENCE SENTLENGTH DISTANCE
Min. :0.0000 Min. : 1.0 Min. :0.0000
1st Qu.:0.2394 1st Qu.: 65.0 1st Qu.:0.0351
Median :0.5147 Median : 141.0 Median :0.2453
Mean :0.5014 Mean : 181.3 Mean :0.3197
3rd Qu.:0.7573 3rd Qu.: 250.0 3rd Qu.:0.5600
Max. :1.0000 Max. :1353.0 Max. :0.9978
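To see what the preparation of `DISTANCE` does, here is a small sketch on two invented sentences (the data frame `toy` is hypothetical). It also shows one way of conflating upper- and lower-case variants of the response; that this conflation is wanted is my assumption, suggested by the four levels of `MATCH` in the summary above.

```r
# Hypothetical two-row example (invented sentences)
toy <- data.frame(
   PRECEDING =c("well ", ""),
   MATCH     =c("I", "You"),
   SUBSEQUENT=c(" think so", " know that"),
   stringsAsFactors=FALSE)

# sentence length in characters and relative position of the match
toy$SENTLENGTH <- nchar(toy$PRECEDING) + nchar(toy$MATCH) + nchar(toy$SUBSEQUENT)
toy$DISTANCE <- nchar(toy$PRECEDING) / toy$SENTLENGTH
toy[, c("SENTLENGTH", "DISTANCE")]

# conflating case variants of the response (an assumption about what is wanted)
toy$MATCH.norm <- factor(tolower(toy$MATCH))
levels(toy$MATCH.norm)
```

Because the match itself takes up at least one character, `DISTANCE` is 0 for a sentence-initial match but always stays below 1, which is why the scale above ends at ≈1.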
Analyze the data properly and summarize the results (briefly). [Difficulty level: 4]
6 Assignment 06
Central question: Do n-grams returned early by an algorithm (BINRANK: early) get rated better (ordinal response: RATING) than n-grams returned late by that algorithm (BINRANK: late) if one controls for the length of the n-gram (SIZE)? The data frame contains the following variables:
RATING: the response variable, integers from 1 to 7;
SIZE: the number of parts of each n-gram;
BINRANK: the main predictor as per the above.
Analyze the data properly and summarize the results (briefly). [Difficulty level: 2]
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(           # summarize d, the result of loading
   file="_input/MERGErating.csv",  # this file
   stringsAsFactors=TRUE))         # change categorical variables into factors
CASE GRAM PARTICIPANT SCORE SIZE
Min. : 1.0 GRAM001: 5 A1 : 80 Min. :1.000 Min. :2.00
1st Qu.: 400.8 GRAM002: 5 A2 : 80 1st Qu.:1.000 1st Qu.:2.75
Median : 800.5 GRAM003: 5 A3 : 80 Median :3.000 Median :3.50
Mean : 800.5 GRAM004: 5 A4 : 80 Mean :3.758 Mean :3.50
3rd Qu.:1200.2 GRAM005: 5 A5 : 80 3rd Qu.:7.000 3rd Qu.:4.25
Max. :1600.0 GRAM006: 5 B1 : 80 Max. :7.000 Max. :5.00
(Other):1570 (Other):1120
BINRANK
early:800
late :800
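For an ordinal response, a proportional-odds model is one option. The sketch below uses `polr()` from the package MASS on simulated data; the data frame `toy`, the 4-level rating scale (instead of 7 levels), and the size of the simulated effect are all invented for illustration.

```r
library(MASS)   # for polr(); MASS is a recommended package that ships with R

# Simulated stand-in data (invented): 'early' items tend to get higher ratings
set.seed(42)
toy <- data.frame(
   BINRANK=factor(rep(c("early", "late"), each=100)),
   SIZE=rep(2:5, times=50))
latent <- ifelse(toy$BINRANK=="early", 1, 0) + rnorm(200)
toy$RATING <- cut(latent,          # the response must be an ordered factor
   breaks=c(-Inf, -1, 0, 1, Inf),
   labels=1:4, ordered_result=TRUE)

# ordinal regression: the main predictor BINRANK, controlling for SIZE
m <- polr(RATING ~ 1 + SIZE + BINRANK, data=toy, Hess=TRUE)
summary(m)$coefficients
```

A negative coefficient for `BINRANKlate` means that late n-grams are predicted to receive lower ratings than early ones, everything else being equal.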
7 Assignment 07
Central question: Are results on subordinate clause ordering from the studies of Hampe and Diessel comparable/compatible? Here are the data:
CASE: the usual numbering column;
STUDY: a column indicating to which study a data point in a row belongs: diessel vs. hampe;
ORDER: the response variable in each study, the order of main and subordinate clause (and you know this response from another study in the book);
CONJ: the predictor in each study, the subordinate conjunction used in the subordinate clause:
rm(list=ls(all.names=TRUE))
d <- data.frame(
   STUDY=rep(c("diessel", "hampe"), 8),
   ORDER=rep(c("sc-mc", "mc-sc"), each=8),
   CONJ =rep(rep(c("after", "before", "once", "until"), each=2), 2),
   FREQ =c(27, 82, 6, 105, 77, 236, 5, 41, 70, 200, 81, 425, 21, 74, 94, 346))
d <- data.frame(lapply(d[, -4], \(af) { rep(af, d$FREQ) }))
d <- data.frame(lapply(d, as.factor))
summary(d <- cbind(CASE=seq(nrow(d)), d))
CASE STUDY ORDER CONJ
Min. : 1.0 diessel: 381 mc-sc:1311 after :379
1st Qu.: 473.2 hampe :1509 sc-mc: 579 before:617
Median : 945.5 once :408
Mean : 945.5 until :486
3rd Qu.:1417.8
Max. :1890.0
Are Hampe’s and Diessel’s findings ‘the same’? Analyze the data properly and summarize the results (briefly). [Difficulty level: 2]
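The data for this assignment are in the case-by-case format, i.e. one row per observation, even though they originate from a frequency table. As a minimal sketch of how such a conversion works, here is a tiny invented frequency table (the object names `freqs` and `cases` and all counts are hypothetical) that gets expanded by repeating each row as often as its frequency says.

```r
# Hypothetical mini frequency table (counts invented)
freqs <- data.frame(
   STUDY=c("a", "a", "b", "b"),
   ORDER=c("mc-sc", "sc-mc", "mc-sc", "sc-mc"),
   FREQ =c(3, 1, 2, 2))

# repeat each row index FREQ times to get one row per observation
cases <- freqs[rep(seq(nrow(freqs)), freqs$FREQ), c("STUDY", "ORDER")]
rownames(cases) <- NULL

# cross-tabulating the expanded data recovers the original counts
table(cases$STUDY, cases$ORDER)
```

One way to think about whether two studies found 'the same' is to ask whether the effect of `CONJ` on `ORDER` differs between the levels of `STUDY`, i.e. whether a model needs the interaction of the two; that framing is a suggestion, not the only possible one.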
8 Assignment 08
Central question: What determines how speakers rate the acceptability (the 7-level response variable RATING) of to- vs. -ing complementation (as in I like to swim vs. I like swimming) in an experiment?
CXNOW: whether the current experimental stimulus is a to or an -ing construction?
VNOW_PREF: whether the verb in the current experimental stimulus generally prefers to appear with to or -ing constructions?
CXPREV: whether the previous experimental stimulus was a to or an -ing construction?
any interactions of these predictors?
Analyze the data properly and summarize the results (briefly). [Difficulty level: 2]
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(            # summarize d, the result of loading
   file="_input/toingpriming.csv",  # this file
   stringsAsFactors=TRUE))          # change categorical variables into factors
CASE RATING CXNOW VNOW_PREF CXPREV
Min. : 1.0 Min. :-3.0000 ing:270 ing:280 ing:278
1st Qu.:139.8 1st Qu.:-1.0000 to :286 to :276 to :278
Median :278.5 Median : 0.0000
Mean :278.5 Mean : 0.3705
3rd Qu.:417.2 3rd Qu.: 2.0000
Max. :556.0 Max. : 3.0000
9 Assignment 09
Central question: Do children and their caretakers exhibit different correlations (measured in Cramer’s V values) between tense (past vs. non-past) and aspect (perfective vs. imperfective) such that
adults’ correlation values don’t change over time anymore;
children’s correlation values change over time and approximate the adults’ value(s).
You have data from a corpus study and these are the variables in the data frame:
AGE: the age of the child at recording time: YEAR;MONTH.DAY;
KID: the Cramer’s V value for the child’s tense-aspect correlation in this recording;
CARETAKER: the Cramer’s V value for the caretaker’s tense-aspect correlation in this recording.
Note: Whatever graphs involving time you use, the axis representing the age of the child must of course be proportional to the age, not just to the position of an age in the vector of ages. I don’t care how you do that; if you do it in spreadsheet software, that’s fine, too.
Analyze the data properly and summarize the results (briefly). [Difficulty level: 3]
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(          # summarize d, the result of loading
   file="_input/russaspect.csv",  # this file
   stringsAsFactors=FALSE))       # don't change categorical variables into factors (!)
CASE AGE KID CARETAKER
Min. : 1.00 Length:80 Min. :0.01645 Min. :0.1627
1st Qu.:20.75 Class :character 1st Qu.:0.31861 1st Qu.:0.3004
Median :40.50 Mode :character Median :0.44217 Median :0.3554
Mean :40.50 Mean :0.45170 Mean :0.3640
3rd Qu.:60.25 3rd Qu.:0.57247 3rd Qu.:0.4355
Max. :80.00 Max. :1.00000 Max. :0.5586
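Since `AGE` is loaded as a character vector, it needs to be converted into a number before it can serve as a proportional time axis. The following is one possible sketch; the example strings in `ages` are invented, and the conversion assumes 12 months per year and 30 days per month, which is a simplification of mine.

```r
# Hypothetical AGE strings in the YEAR;MONTH.DAY format described above
ages <- c("1;06.00", "2;00.15", "2;11.30")

# split each string at ';' and '.' into year, month, and day
parts <- strsplit(ages, "[;.]")

# assumption for this sketch: 12 months per year, 30 days per month
age.in.months <- sapply(parts, \(p) {
   p <- as.numeric(p)
   p[1]*12 + p[2] + p[3]/30
})
age.in.months
```

Plotting against a numeric variable like `age.in.months` (rather than against the position of each recording in the data frame) makes the axis proportional to the child's age.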
10 Assignment 10
Central question: what factors co-determine how English changed from a 3rd-person singular -th (e.g., He giveth) to the current 3rd-person singular -s (e.g., He gives)? You have data from a corpus study on how the third person singular form in English changed across five time periods (from P1 at about 1480 to P5 at about 1700). This data set contains annotation for third person singular verbs (extracted from letters) with regard to the following variables:
VARIANT: the response variable: the third person singular form as found in the corpus file: es vs. th;
TIME5: the time period: P1 vs. P2 vs. P3 vs. P4 vs. P5;
SENGEND: the sex of the sender of the letter: female vs. male;
RECGEND: the sex of the recipient of the letter: female vs. male;
CLOSEFAM: whether the recipient of the letter is a close family member of the sender or not: no vs. yes;
FINSYB: whether the verb stem ends in a sibilant: no (e.g., see) vs. yes (e.g., seize);
FOLFRIC: what the word following the third person singular form begins with: s (e.g., he sees seagulls) vs. th (e.g., he sees the seagulls) vs. other (e.g., he sees many seagulls);
GRAM: whether the verb in question is used as a grammatical or a lexical verb: yes (grammatical, i.e. be, do and aux. have) vs. no (lexical/other).
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(         # summarize d, the result of loading
   file="_input/thirdpers.csv",  # this file
   stringsAsFactors=TRUE))       # change categorical variables into factors
VARIANT AUTH_GEND REC_SAME_GEND CLOSE_FAM VNCPERIOD FIN_SYB
es:1524 female: 784 no :1210 no :1917 P1: 505 no :3953
th:2619 male :3359 yes:2933 yes:2226 P2: 99 yes: 190
P3:1508
P4:1096
P5: 935
FOL_FRIC GRAM
es : 189 no :2867
other:3666 yes:1276
th : 288
You want to characterize how the predictors and their pairwise interactions with TIME are correlated with the change from -(e)th to -(e)s. Analyze the data properly and summarize the results (briefly). Note: you must conflate the 3 early time periods into one, but once you’re done with everything, you should figure out why. [Difficulty level: 4]
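Conflating factor levels, as the note above requires for the three early periods, can be done by assigning a list to the levels of a factor. This is a minimal sketch on an invented vector; the object names `period` and `period3` and the new level label `P1-3` are hypothetical.

```r
# Hypothetical period vector with the five levels described above
period <- factor(c("P1", "P2", "P3", "P4", "P5", "P3", "P1"))

# conflate the three early periods into one level (the label "P1-3" is made up)
period3 <- period
levels(period3) <- list("P1-3"=c("P1", "P2", "P3"), "P4"="P4", "P5"="P5")

# cross-tabulate old against new to check that the recoding did what it should
table(period, period3)
```

Keeping the original factor and creating a new one (rather than overwriting) makes it easy to verify the recoding with a cross-tabulation like the one above.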
11 Session info
sessionInfo()
R version 4.4.3 (2025-02-28)
Platform: x86_64-pc-linux-gnu
Running under: Pop!_OS 22.04 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/Los_Angeles
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets compiler methods
[8] base
other attached packages:
[1] STGmisc_1.0 Rcpp_1.0.14 magrittr_2.0.3
loaded via a namespace (and not attached):
[1] digest_0.6.37 fastmap_1.2.0 xfun_0.51 knitr_1.50
[5] htmltools_0.5.8.1 rmarkdown_2.29 cli_3.6.4 rstudioapi_0.17.1
[9] tools_4.4.3 evaluate_1.0.3 yaml_2.3.10 rlang_1.1.5
[13] jsonlite_2.0.0 htmlwidgets_1.6.4 MASS_7.3-65
Source Code
---
title: "Ling 202: all assignments"
author:
  - name: "[Stefan Th. Gries](https://www.stgries.info)"
    affiliation:
      - UC Santa Barbara
      - JLU Giessen
    orcid: 0000-0002-6497-3958
date: "2025-04-02 12:34:56"
date-format: "DD MMM YYYY HH-mm-ss"
editor: source
format:
  html:
    page-layout: full
    code-fold: false
    code-link: true
    code-copy: true
    code-tools: true
    code-line-numbers: true
    code-overflow: scroll
    number-sections: true
    smooth-scroll: true
    toc: true
    toc-depth: 4
    number-depth: 4
    toc-location: left
    monofont: lucida console
    tbl-cap-location: top
    fig-cap-location: bottom
    fig-width: 5
    fig-height: 5
    fig-format: png
    fig-dpi: 300
    fig-align: center
    embed-resources: true
execute:
  cache: false
  echo: true
  eval: true
  warning: false
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(eval=TRUE, echo=TRUE, error=FALSE, warning=FALSE, message=FALSE, fig.align="center", cache=FALSE, cache.lazy=FALSE)
```

# Assignment 01

Central question: How many X does the phrase *some X next to a Y* refer to? Your predictors are

* `OBJECT`: the sizes of the objects X: *large* vs. *small*;
* `REFPOINT`: the sizes of the reference points Y: *large* vs. *small*.

Analyze the data properly with a regression model and summarize the results (briefly). [Difficulty level: 1]

```{r}
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(               # summarize d, the result of loading
   file="_input/quantifyingsome.csv",  # this file
   stringsAsFactors=TRUE))             # change categorical variables into factors
```

# Assignment 02

Central question: What determines the number of praises in child-caretaker interaction? The data come from recordings of different children and contain the following variables:

* `PRAISES`: the response variable, the number of times the children are praised by their caretakers;
* `CHILD`: the name of each child;
* `SEX`: the sex of each child;
* `CAN`: the number of verb phrases where the caretakers use *can* when speaking about actions of the child;
* `WANT`: the number of verb phrases where the caretakers use *want* when speaking about actions of the child;
* `SHOULD_SHALL`: the number of verb phrases where the caretakers use *should*/*shall* when speaking about actions of the child;
* `DIRECTIVE`: the number of verb phrases where the caretakers use a directive when speaking about actions of the child;
* `SUCCESS`: the number of times the child does something as intended;
* `FAILURE`: the number of times the child does something not as intended.

You now want to determine to what degree the number of praises is a function of

* all predictors as main effects
* and any interaction of a predictor with `SEX`.

Analyze the data properly and summarize the results (briefly). [Difficulty level: 3]

```{r}
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(        # summarize d, the result of loading
   file="_input/praises.csv",   # this file
   stringsAsFactors=TRUE))      # change categorical variables into factors
```

# Assignment 03

Central question: is the choice of *of*- vs. *s*-genitives (*the car of my father* vs. *my father's car*) dependent in some way on the animacy of the possessor (*my father*) and/or the possessed (*the car*)? Your predictors are

* `POSSESSOR`: the animacy of the possessor: *abstract* vs. *animate* vs. *concrete*;
* `POSSESSED`: the animacy of the possessed: *abstract* vs. *animate* vs. *concrete*.

Analyze the data properly and summarize the results (briefly). [Difficulty level: 2]

```{r}
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(           # summarize d, the result of loading
   file="_input/genitivesem.csv",  # this file
   stringsAsFactors=TRUE))         # change categorical variables into factors
```

# Assignment 04

Central question: is the choice of *try to*- vs. *try and*-constructions (*I'm gonna try to fix this problem* vs. *I'm gonna try and fix this problem*, which is in the column `TRY`) dependent in some way on the following 3 predictors and all their interactions:

* `MODE`: whether the data represent spoken (*spk*) or written (*wrt*) English;
* `VARIETY`: whether the data represent American (*amer*) or British English (*brit*);
* `CLAUSE`: does the clause in which *try* is used with *to* or *and* already involve another *to* (as in *we're going -> to <- **try and** beat this thing*) or not (*other*)?

(Source: Hommerberg, Charlotte & Gunnel Tottie. 2007. *Try to* or *Try and*? Verb complementation in British and American English. *ICAME Journal* 31. 45-64.)

Analyze the data like we discussed and summarize the results (briefly). [Difficulty level: 1]

```{r}
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(           # summarize d, the result of loading
   file="_input/tryandtryto.csv",  # this file
   stringsAsFactors=TRUE))         # change categorical variables into factors
```

# Assignment 05

Central question: is the choice of *I* vs. *you*, which is represented in the column `MATCH`, dependent in some way on the following 3 predictors and all their pairwise interactions:

* `SEX`: whether the speaker is *female* or *male*;
* `SENTENCE`: where in the file *I* or *you* was used on a scale from 0 (first sentence) to 1 (last sentence);
* `DISTANCE`: where in the sentence *I* or *you* was used on a scale from 0 (first character) to ≈1 (last character).

The following loads the data and prepares the variable `DISTANCE`:

```{r}
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(      # summarize d, the result of loading
   file="_input/IvsYou.csv",  # this file
   stringsAsFactors=FALSE))   # don't change categorical variables into factors (!)
d$SENTLENGTH <- nchar(d$PRECEDING) + nchar(d$MATCH) + nchar(d$SUBSEQUENT)
d$DISTANCE <- nchar(d$PRECEDING)/d$SENTLENGTH
d <- d[,c(1:3,7,4:5,9:10)]; d[,2:5] <- lapply(d[,2:5], as.factor)
summary(d)
```

Analyze the data properly and summarize the results (briefly). [Difficulty level: 4]

# Assignment 06

Central question: Do *n*-grams returned early by an algorithm (`BINRANK`: *early*) get rated better (ordinal response: `RATING`) than *n*-grams returned late by that algorithm (`BINRANK`: *late*) if one controls for the length of the *n*-gram (`SIZE`)? The data frame contains the following variables:

* `RATING`: the response variable, integers from 1 to 7;
* `SIZE`: the number of parts of each *n*-gram;
* `BINRANK`: the main predictor as per the above.

Analyze the data properly and summarize the results (briefly). [Difficulty level: 2]

```{r}
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(           # summarize d, the result of loading
   file="_input/MERGErating.csv",  # this file
   stringsAsFactors=TRUE))         # change categorical variables into factors
```

# Assignment 07

Central question: Are results on subordinate clause ordering from the studies of Hampe and Diessel comparable/compatible? Here are the data:

* `CASE`: the usual numbering column;
* `STUDY`: a column indicating to which study a data point in a row belongs: *diessel* vs. *hampe*;
* `ORDER`: the response variable in each study, the order of main and subordinate clause (and you know this response from another study in the book);
* `CONJ`: the predictor in each study, the subordinate conjunction used in the subordinate clause:

```{r}
rm(list=ls(all.names=TRUE))
d <- data.frame(
   STUDY=rep(c("diessel", "hampe"), 8),
   ORDER=rep(c("sc-mc", "mc-sc"), each=8),
   CONJ =rep(rep(c("after", "before", "once", "until"), each=2), 2),
   FREQ =c(27, 82, 6, 105, 77, 236, 5, 41, 70, 200, 81, 425, 21, 74, 94, 346))
d <- data.frame(lapply(d[, -4], \(af) { rep(af, d$FREQ) }))
d <- data.frame(lapply(d, as.factor))
summary(d <- cbind(CASE=seq(nrow(d)), d))
```

Are Hampe's and Diessel's findings 'the same'? Analyze the data properly and summarize the results (briefly). [Difficulty level: 2]

# Assignment 08

Central question: What determines how speakers rate the acceptability (the 7-level response variable `RATING`) of *to*- vs. -*ing* complementation (as in *I like to swim* vs. *I like swimming*) in an experiment?

* `CXNOW`: whether the current experimental stimulus is a *to* or an -*ing* construction?
* `VNOW_PREF`: whether the verb in the current experimental stimulus generally prefers to appear with *to* or -*ing* constructions?
* `CXPREV`: whether the previous experimental stimulus was a *to* or an -*ing* construction?
* any interactions of these predictors?

Analyze the data properly and summarize the results (briefly). [Difficulty level: 2]

```{r}
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(            # summarize d, the result of loading
   file="_input/toingpriming.csv",  # this file
   stringsAsFactors=TRUE))          # change categorical variables into factors
```

# Assignment 09

Central question: Do children and their caretakers exhibit different correlations (measured in Cramer's *V* values) between tense (past vs. non-past) and aspect (perfective vs. imperfective) such that

* adults' correlation values don't change over time anymore;
* children's correlation values change over time and approximate the adults' value(s).

You have data from a corpus study and these are the variables in the data frame:

* `AGE`: the age of the child at recording time: YEAR;MONTH.DAY;
* `KID`: the Cramer's *V* value for the child's tense-aspect correlation in this recording;
* `CARETAKER`: the Cramer's *V* value for the caretaker's tense-aspect correlation in this recording.

Note: Whatever graphs involving time you use, the axis representing the age of the child must of course be proportional to the age, not just to the position of an age in the vector of ages. I don't care how you do that; if you do it in spreadsheet software, that's fine, too.

Analyze the data properly and summarize the results (briefly). [Difficulty level: 3]

```{r}
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(          # summarize d, the result of loading
   file="_input/russaspect.csv",  # this file
   stringsAsFactors=FALSE))       # don't change categorical variables into factors (!)
```

# Assignment 10

Central question: what factors co-determine how English changed from a 3rd-person singular -*th* (e.g., *He giveth*) to the current 3rd-person singular -*s* (e.g., *He gives*)? You have data from a corpus study on how the third person singular form in English changed across five time periods (from *P1* at about 1480 to *P5* at about 1700). This data set contains annotation for third person singular verbs (extracted from letters) with regard to the following variables:

* `VARIANT`: the response variable: the third person singular form as found in the corpus file: *es* vs. *th*;
* `TIME5`: the time period: *P1* vs. *P2* vs. *P3* vs. *P4* vs. *P5*;
* `SENGEND`: the sex of the sender of the letter: *female* vs. *male*;
* `RECGEND`: the sex of the recipient of the letter: *female* vs. *male*;
* `CLOSEFAM`: whether the recipient of the letter is a close family member of the sender or not: *no* vs. *yes*;
* `FINSYB`: whether the verb stem ends in a sibilant: *no* (e.g., *see*) vs. *yes* (e.g., *seize*);
* `FOLFRIC`: what the word following the third person singular form begins with: *s* (e.g., *he sees seagulls*) vs. *th* (e.g., *he sees the seagulls*) vs. *other* (e.g., *he sees many seagulls*);
* `GRAM`: whether the verb in question is used as a grammatical or a lexical verb: *yes* (grammatical, i.e. *be*, *do* and aux. *have*) vs. *no* (lexical/other).

```{r}
rm(list=ls(all.names=TRUE))
summary(d <- read.delim(         # summarize d, the result of loading
   file="_input/thirdpers.csv",  # this file
   stringsAsFactors=TRUE))       # change categorical variables into factors
```

You want to characterize how the predictors and their pairwise interactions with `TIME` are correlated with the change from -*(e)th* to -*(e)s*. Analyze the data properly and summarize the results (briefly). Note: you must conflate the 3 early time periods into one, but once you're done with everything, you should figure out why. [Difficulty level: 4]

# Session info

```{r}
sessionInfo()
```