Not just frequency:
Keyness should integrate frequency, association, and dispersion
1 Introduction
1.1 General introduction
This is how one would compute G2 for the word type of faith in lower-cased Brown D (T) vs. the rest of Brown (R):
(Table1.obs <- matrix(c(43, 34547, 68, 978821), nrow=2,
dimnames=list(FAITH=c("yes", "no"), CORPUS=c("TARGET", "REFERENCE"))))
CORPUS
FAITH TARGET REFERENCE
yes 43 68
no 34547 978821
temp <- chisq.test(Table1.obs, correct=FALSE) # for expecteds & residuals
c("G-squared"="*"( # multiply
2 * sign(temp$residuals[1,1]), # 2 (signed by the top-left residual) by ...
sum(Table1.obs * log(Table1.obs/temp$expected)))) # ... this sum
G-squared
147.0412
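To make explicit what chisq.test contributes here, the following is a minimal sketch that computes the same signed G2 from hand-computed expected frequencies. The object names (obs, expd, G2.signed) are mine, not from the paper, and the sketch assumes that no observed cell is 0 (otherwise log(0) would need special handling).

```r
# Observed 2x2 table from the text: faith (yes/no) by corpus (T/R)
obs <- matrix(c(43, 34547, 68, 978821), nrow=2,
   dimnames=list(FAITH=c("yes", "no"), CORPUS=c("TARGET", "REFERENCE")))
# Expected frequencies under independence: row total * column total / N
expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)
# Unsigned G-squared: 2 * sum of observed * log(observed/expected)
G2.unsigned <- 2 * sum(obs * log(obs / expd))
# Sign it: positive if faith is more frequent in T than expected
G2.signed <- G2.unsigned * sign(obs[1,1] - expd[1,1])
round(G2.signed, 4)
```

The sign of obs[1,1] - expd[1,1] is the same as the sign of the Pearson residual used above, since the residual only divides that difference by a positive number.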
1.2 Overview of the present paper
2 Methods
2.1 Data
Let’s create the small example corpus:
x.tar <- data.frame(
WORD=rep(c("a","b","a","x","z","i","a","c","d","f","g","h","x","a","c",
"b","f","g","i","z","x","a","c","e","g","i","x","z"),
c(1,4,1,2,1,1,1,1,1,1,1,3,2,1,1,1,1,1,2,1,2,1,1,2,1,1,3,1)),
PART=rep(c("tar1","tar2","tar3","tar4"), c(9,10,10,11)))
x.ref <- data.frame(
WORD=rep(c("a","b","c","a","h","y","x","a","b","d","e","f","e","y","x",
"a","b","d","e","g","i","y","x","a","b","d","e","g","x"),
c(1,1,3,1,1,1,2,1,1,1,1,3,1,1,1,1,1,1,1,1,2,1,2,1,1,1,1,1,5)),
PART=rep(c("ref1","ref2","ref3","ref4"), c(11,10,10,9)))
x <- rbind(x.tar, x.ref)
Let’s compute a word-by-corpus matrix with absolute frequencies:
Show it as it is shown in the paper:
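The code for these two steps is not reproduced in this version of the document. What follows is a minimal sketch of one way to build such a matrix, assuming that the first three characters of PART identify the corpus; the names CORPUS and CORP are my choices (the latter mirrors the dimension label in the output further below), and the original code may well differ.

```r
# Rebuild the small example corpus from above
x.tar <- data.frame(
   WORD=rep(c("a","b","a","x","z","i","a","c","d","f","g","h","x","a","c",
              "b","f","g","i","z","x","a","c","e","g","i","x","z"),
            c(1,4,1,2,1,1,1,1,1,1,1,3,2,1,1,1,1,1,2,1,2,1,1,2,1,1,3,1)),
   PART=rep(c("tar1","tar2","tar3","tar4"), c(9,10,10,11)))
x.ref <- data.frame(
   WORD=rep(c("a","b","c","a","h","y","x","a","b","d","e","f","e","y","x",
              "a","b","d","e","g","i","y","x","a","b","d","e","g","x"),
            c(1,1,3,1,1,1,2,1,1,1,1,3,1,1,1,1,1,1,1,1,2,1,2,1,1,1,1,1,5)),
   PART=rep(c("ref1","ref2","ref3","ref4"), c(11,10,10,9)))
x <- rbind(x.tar, x.ref)
# Assumption: "tar1" -> "tar", "ref3" -> "ref", etc.
CORPUS <- factor(substr(x$PART, 1, 3), levels=c("tar", "ref"))
# Cross-tabulate word types with corpora (absolute frequencies)
WORD.by.CORPUS.abs <- table(WORD=x$WORD, CORP=CORPUS)
WORD.by.CORPUS.abs
```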
2.2 The three components of keyness
2.2.1 The frequency component
For the frequency component, we
- take each word type ever attested in T or R;
- take its frequency in T (which might be 0), add 1, and compute the binary log of that;
- just for record-keeping,
  - take its frequency in R (which might be 0), add 1, and compute the binary log of that;
  - take its frequency in T and R combined, add 1 (just for homogeneity), and compute the binary log of that;
- min-max transform both vectors of logged values;
- begin to store everything in a data frame results:
results <- data.frame(
WORD=rownames(WORD.by.CORPUS.abs), # word types
FREQTAR=WORD.by.CORPUS.abs[,"tar"], # their freqs in T
FREQREF=WORD.by.CORPUS.abs[,"ref"], # their freqs in R
KEYFREQTAR=WORD.by.CORPUS.abs[,"tar"] %>% "+"(1) %>% log2 %>% zero2one,
KEYFREQALL=rowSums(WORD.by.CORPUS.abs) %>% "+"(1) %>% log2 %>% zero2one)
results # show results
  WORD FREQTAR FREQREF KEYFREQTAR KEYFREQALL
a a 5 5 0.7781513 0.6285430
b b 5 4 0.7781513 0.5693234
c c 3 3 0.6020600 0.3477088
d d 1 3 0.3010300 0.1386469
e e 2 4 0.4771213 0.3477088
f f 2 3 0.4771213 0.2519296
g g 3 2 0.6020600 0.2519296
h h 3 1 0.6020600 0.1386469
i i 4 2 0.6989700 0.3477088
x x 9 10 1.0000000 1.0000000
y y 0 3 0.0000000 0.0000000
z z 3 0 0.6020600 0.0000000
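The helper zero2one is not defined in this document (it presumably comes from an attached package). Here is a minimal sketch of the two frequency columns, assuming zero2one is the usual min-max transformation (x - min)/(max - min). The frequencies are typed in from the table above, and base R is used instead of the pipe so that the sketch runs without any packages.

```r
# Assumed definition: min-max transformation to the interval [0, 1]
zero2one <- function(x) (x - min(x)) / (max(x) - min(x))
# Frequencies in T and R, copied from the results table
FREQTAR <- c(a=5, b=5, c=3, d=1, e=2, f=2, g=3, h=3, i=4, x=9,  y=0, z=3)
FREQREF <- c(a=5, b=4, c=3, d=3, e=4, f=3, g=2, h=1, i=2, x=10, y=3, z=0)
# Add 1, take the binary log, then min-max transform
KEYFREQTAR <- zero2one(log2(FREQTAR + 1))
KEYFREQALL <- zero2one(log2(FREQTAR + FREQREF + 1))
round(cbind(KEYFREQTAR, KEYFREQALL), 7)
```

Adding 1 before logging keeps unattested words (frequency 0) at a finite value, namely log2(1) = 0.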
2.2.2 The association component
Since we will use the KLD to compute associations between word and corpus, we first define a small function to compute the KLD:
KLD <- function (posterior, prior) {
if (sum(posterior) > 1) { posterior <- posterior / sum(posterior) } # turn frequencies into proportions
if (sum(prior) > 1) { prior <- prior / sum(prior) } # same for the prior
logged.fractions <- log2(posterior / prior) # binary logs of posterior/prior
logged.fractions[posterior == 0] <- 0 # define 0 * log2(0) as 0
contributions.to.KLD <- posterior * logged.fractions # weight the logs by the posterior
return(sum(contributions.to.KLD)) # the KLD is the sum of the contributions
}
There are two possible directions of association one could compute:
- one quantifies how much the distribution of a word over T and R diverges from the corpus sizes, i.e., in a sense,
  - how much each word changes the proportional distribution of the two corpora;
  - how much a word type of interest changes the probability that one is looking at T;
  - how much better you can predict whether you’re looking at T when you know the word;
- one quantifies how much the presence or absence of a word in T diverges from presence or absence of a word in T and R combined, i.e., in a sense,
  - how much the corpus being T changes the proportional distribution of each word, or, typically,
  - how much T increases the probability of occurrence of a word;
  - how much better you can predict the presence or absence of a word when you know you’re looking at T.
The function allows both directions of computation, but in this paper, we will use the former. For it, we create a word-by-corpus matrix WORD.by.CORPUS.rel with row proportions, i.e. the proportions of each word type’s occurrences that fall into T and into R:
CORP
WORD tar ref
a 0.5000 0.5000
b 0.5556 0.4444
c 0.5000 0.5000
d 0.2500 0.7500
e 0.3333 0.6667
f 0.4000 0.6000
g 0.6000 0.4000
h 0.7500 0.2500
i 0.6667 0.3333
x 0.4737 0.5263
y 0.0000 1.0000
z 1.0000 0.0000
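To make the two directions concrete, here is a small sketch for the word h (3 occurrences in T, 1 in R, with both corpora containing 40 tokens). The KLD function is a compact restatement of the one above, with the one difference that it always normalizes its inputs; the names dir1 and dir2 are mine.

```r
# Compact restatement of KLD (always normalizes posterior and prior)
KLD <- function(posterior, prior) {
   posterior <- posterior / sum(posterior)
   prior <- prior / sum(prior)
   logged <- log2(posterior / prior)
   logged[posterior == 0] <- 0   # define 0 * log2(0) as 0
   sum(posterior * logged)
}
# Direction 1 (used in this paper): the word's distribution over T and R
# (posterior) against the corpus sizes (prior)
dir1 <- KLD(posterior=c(tar=3, ref=1), prior=c(tar=40, ref=40))
# Direction 2: presence/absence of the word in T (posterior) against
# presence/absence of the word in T and R combined (prior)
dir2 <- KLD(posterior=c(h=3, other=40-3), prior=c(h=3+1, other=80-(3+1)))
c(direction1=dir1, direction2=dir2)
```

Direction 1 comes out at about 0.1887 for h; direction 2 is much smaller because h accounts for only a small share of the tokens in either corpus.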
After that, we
- compute the KLD for each word type (its row of WORD.by.CORPUS.rel as the posterior, the corpus proportions as the prior) and store it in results$KEYASSOC;
- normalize the KLD-values with the odds-to-probabilities transformation (KLD/(1+KLD));
- multiply the normalized KLD with
  - +1 if the KLD represents an attraction to T;
  - -1 if the KLD represents an attraction to R;
- divide the normalized KLDs by their max to ‘stretch the value range’ of word types key for
  - T such that it exhausts the complete range of (0, 1];
  - R such that it exhausts the range of [corresponding minimum, 0]:
attr.to.T <- sign("-"( # make attr.to.T the sign of the difference between
WORD.by.CORPUS.rel[,"tar"], # the observed proportion in T
prop.table(colSums(WORD.by.CORPUS.abs))["tar"])) # and the proportion of T in T+R
results$KEYASSOC %<>% KLD.norm %>% "*"(attr.to.T) %>% "/"(max(.))
results
  WORD FREQTAR FREQREF KEYFREQTAR KEYFREQALL KEYASSOC
a a 5 5 0.7781513 0.6285430 0.000000000
b b 5 4 0.7781513 0.5693234 0.017690016
c c 3 3 0.6020600 0.3477088 0.000000000
d d 1 3 0.3010300 0.1386469 -0.317520657
e e 2 4 0.4771213 0.3477088 -0.151065640
f f 2 3 0.4771213 0.2519296 -0.056458719
g g 3 2 0.6020600 0.2519296 0.056458719
h h 3 1 0.6020600 0.1386469 0.317520657
i i 4 2 0.6989700 0.3477088 0.151065640
x x 9 10 1.0000000 1.0000000 -0.003990255
y y 0 3 0.0000000 0.0000000 -1.000000000
z z 3 0 0.6020600 0.0000000 1.000000000
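The steps above can be sketched end to end in base R as follows. KLD.norm is not defined in this document, so its definition here (KLD/(1+KLD)) is an assumption based on the description of the transformation; the object names FREQ, REL, and KLDs are mine, and the frequencies are typed in from the results table.

```r
# Compact restatement of KLD (always normalizes posterior and prior)
KLD <- function(posterior, prior) {
   posterior <- posterior / sum(posterior)
   prior <- prior / sum(prior)
   logged <- log2(posterior / prior)
   logged[posterior == 0] <- 0   # define 0 * log2(0) as 0
   sum(posterior * logged)
}
KLD.norm <- function(kld) kld / (1 + kld)   # assumed definition
# Word-by-corpus frequencies, copied from the results table
FREQ <- cbind(
   tar=c(a=5, b=5, c=3, d=1, e=2, f=2, g=3, h=3, i=4, x=9,  y=0, z=3),
   ref=c(a=5, b=4, c=3, d=3, e=4, f=3, g=2, h=1, i=2, x=10, y=3, z=0))
prior <- colSums(FREQ) / sum(FREQ)   # corpus proportions (here 0.5 and 0.5)
REL <- prop.table(FREQ, 1)           # row proportions per word type
KLDs <- apply(REL, 1, KLD, prior)    # one KLD per word type
# +1 if the word is attracted to T, -1 if to R, 0 if neither
attr.to.T <- sign(REL[, "tar"] - prior["tar"])
KEYASSOC <- KLD.norm(KLDs) * attr.to.T   # normalize and sign
KEYASSOC <- KEYASSOC / max(KEYASSOC)     # stretch so that the max is 1
round(KEYASSOC, 4)
```

Under these assumptions, the values agree with the KEYASSOC column shown above, e.g. about 0.3175 for h and about -0.004 for x.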
2.2.3 The dispersion component
For dispersion, we first need to compute each word type’s dispersion in each of the two corpora T and R. Here are the relevant steps for T: we
- compute two word-by-part matrices for T:
  - one with absolute frequencies (which will mostly be used to get the prior: the corpus part sizes);
  - one with relative frequencies (the posteriors: namely the proportions of word types across the corpus parts);
- compute the KLD for each word type;
- normalize the KLD-values with the odds-to-probabilities transformation (KLD/(1+KLD));
- min-max transform the values;
- subtract them from 1:
WORD.by.PART.abs <- with(x.tar, table(WORD, PART)) # compute word-by-part matrix (abs. freqs)
WORD.by.PART.rel <- prop.table(WORD.by.PART.abs, 1) # compute word-by-part matrix (row props.)
KEYDISPTAR <- apply( # make KEYDISPTAR the result of applying to
WORD.by.PART.rel, # WORD.by.PART.rel (for T)
1, # row by row
KLD, colSums(WORD.by.PART.abs)) %>% # the function KLD w/ the col sums as the prior
KLD.norm %>% zero2one %>% "-"(1,.)
Then we do the same for R:
WORD.by.PART.abs <- with(x.ref, table(WORD, PART)) # compute word-by-part matrix (abs. freqs)
WORD.by.PART.rel <- prop.table(WORD.by.PART.abs, 1) # compute word-by-part matrix (row props.)
KEYDISPREF <- apply( # make KEYDISPREF the result of applying to
WORD.by.PART.rel, # WORD.by.PART.rel (for R)
1, # row by row
KLD, colSums(WORD.by.PART.abs)) %>% # the function KLD w/ the col sums as the prior
KLD.norm %>% zero2one %>% "-"(1,.)
To add all these results to results, we first add two placeholder columns KEYDISPTAR and KEYDISPREF, which for now only contain 0s:
Then, we insert the computed dispersion values for all word types attested in T and/or R:
After that, we compute for each word type the difference between its dispersion in T and its dispersion in R (T minus R) and again ‘stretch’ these differences by dividing them by their max. This is because then
- high values will represent word types that are evenly distributed in T but clumpily distributed or unattested in R (i.e. words that are dispersionally key for T);
- low values will represent word types that are evenly distributed in R but clumpily distributed or unattested in T (i.e. words that are dispersionally key for R).
WORD FREQTAR FREQREF KEYFREQTAR KEYFREQALL KEYASSOC KEYDISPTAR KEYDISPREF KEYDISP
a a 5 5 0.7781513 0.6285430 0.000000000 1.00000000 0.47246210 0.77953532
b b 5 4 0.7781513 0.5693234 0.017690016 0.14721358 1.00000000 -1.26015049
c c 3 3 0.6020600 0.3477088 0.000000000 0.70088274 0.02414894 1.00000000
d d 1 3 0.3010300 0.1386469 -0.317520657 0.00000000 0.52624922 -0.77763106
e e 2 4 0.4771213 0.3477088 -0.151065640 0.02826716 0.47788111 -0.66438821
f f 2 3 0.4771213 0.2519296 -0.056458719 0.29422753 0.00000000 0.43477587
g g 3 2 0.6020600 0.2519296 0.056458719 0.70088274 0.22375505 0.70504487
h h 3 1 0.6020600 0.1386469 0.317520657 0.00000000 0.02414894 -0.03568455
i i 4 2 0.6989700 0.3477088 0.151065640 0.61605922 0.00000000 0.91034203
x x 9 10 1.0000000 1.0000000 -0.003990255 0.96546219 0.66863773 0.43861333
y y 0 3 0.0000000 0.0000000 -1.000000000 0.00000000 0.59877181 -0.88479667
z z 3 0 0.6020600 0.0000000 1.000000000 0.65487307 0.00000000 0.96769672
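The code for the placeholder columns, the insertion, and the final difference is not reproduced in this version of the document. The following base-R sketch runs the whole dispersion pipeline on the small example corpus; the helper disp and the definitions of KLD.norm and zero2one are assumptions of mine (based on the descriptions above), not the original code.

```r
# Rebuild the two small example corpora from above
x.tar <- data.frame(
   WORD=rep(c("a","b","a","x","z","i","a","c","d","f","g","h","x","a","c",
              "b","f","g","i","z","x","a","c","e","g","i","x","z"),
            c(1,4,1,2,1,1,1,1,1,1,1,3,2,1,1,1,1,1,2,1,2,1,1,2,1,1,3,1)),
   PART=rep(c("tar1","tar2","tar3","tar4"), c(9,10,10,11)))
x.ref <- data.frame(
   WORD=rep(c("a","b","c","a","h","y","x","a","b","d","e","f","e","y","x",
              "a","b","d","e","g","i","y","x","a","b","d","e","g","x"),
            c(1,1,3,1,1,1,2,1,1,1,1,3,1,1,1,1,1,1,1,1,2,1,2,1,1,1,1,1,5)),
   PART=rep(c("ref1","ref2","ref3","ref4"), c(11,10,10,9)))
KLD <- function(posterior, prior) {   # compact restatement of KLD
   posterior <- posterior / sum(posterior); prior <- prior / sum(prior)
   logged <- log2(posterior / prior)
   logged[posterior == 0] <- 0
   sum(posterior * logged)
}
KLD.norm <- function(kld) kld / (1 + kld)                  # assumed definition
zero2one <- function(x) (x - min(x)) / (max(x) - min(x))   # assumed definition
# Hypothetical helper: dispersion of every word type attested in one corpus
disp <- function(corpus) {
   abs.freqs <- table(corpus$WORD, corpus$PART)   # word-by-part, absolute
   rel.freqs <- prop.table(abs.freqs, 1)          # word-by-part, row proportions
   klds <- apply(rel.freqs, 1, KLD, colSums(abs.freqs))   # part sizes as prior
   1 - zero2one(KLD.norm(klds))                   # 1 = most evenly dispersed
}
words <- sort(unique(c(as.character(x.tar$WORD), as.character(x.ref$WORD))))
# Placeholders: 0 for every word type, i.e. also for unattested ones
KEYDISPTAR <- KEYDISPREF <- setNames(numeric(length(words)), words)
d.tar <- disp(x.tar); KEYDISPTAR[names(d.tar)] <- d.tar
d.ref <- disp(x.ref); KEYDISPREF[names(d.ref)] <- d.ref
KEYDISP <- KEYDISPTAR - KEYDISPREF   # dispersion in T minus dispersion in R
KEYDISP <- KEYDISP / max(KEYDISP)    # stretch so that the max is 1
round(cbind(KEYDISPTAR, KEYDISPREF, KEYDISP), 4)
```

Note that dividing by the maximum only bounds the positive side at 1; negative values can fall below -1, as b does in the table above.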
2.3 What to do with those values
2.3.1 Keeping dimensions separate
We first can add for each word type how many of the two dimensions of association and dispersion represent the word as key for T:
results$KEYONHOWMANY <- "+"( # add
pmax(0, sign(results$KEYASSOC)), # 1 if KEYASSOC is >0
pmax(0, sign(results$KEYDISP))) # 1 if KEYDISP is >0
results
  WORD FREQTAR FREQREF KEYFREQTAR KEYFREQALL KEYASSOC KEYDISPTAR KEYDISPREF KEYDISP KEYONHOWMANY
a a 5 5 0.7781513 0.6285430 0.000000000 1.00000000 0.47246210 0.77953532 1
b b 5 4 0.7781513 0.5693234 0.017690016 0.14721358 1.00000000 -1.26015049 1
c c 3 3 0.6020600 0.3477088 0.000000000 0.70088274 0.02414894 1.00000000 1
d d 1 3 0.3010300 0.1386469 -0.317520657 0.00000000 0.52624922 -0.77763106 0
e e 2 4 0.4771213 0.3477088 -0.151065640 0.02826716 0.47788111 -0.66438821 0
f f 2 3 0.4771213 0.2519296 -0.056458719 0.29422753 0.00000000 0.43477587 1
g g 3 2 0.6020600 0.2519296 0.056458719 0.70088274 0.22375505 0.70504487 2
h h 3 1 0.6020600 0.1386469 0.317520657 0.00000000 0.02414894 -0.03568455 1
i i 4 2 0.6989700 0.3477088 0.151065640 0.61605922 0.00000000 0.91034203 2
x x 9 10 1.0000000 1.0000000 -0.003990255 0.96546219 0.66863773 0.43861333 1
y y 0 3 0.0000000 0.0000000 -1.000000000 0.00000000 0.59877181 -0.88479667 0
z z 3 0 0.6020600 0.0000000 1.000000000 0.65487307 0.00000000 0.96769672 2
2.3.2 Amalgamations
The first two different amalgamations can be implemented very straightforwardly as follows:
- the former weighs the association component by the frequency of the word in T (by multiplication) and adds it to the dispersion component;
- the latter weighs both the association and dispersion components by frequency (by multiplication):
results$AMALGAM1 <- with(results, "+"( # add
"*"(KEYASSOC, # the product of KEYASSOC
KEYFREQTAR), # and KEYFREQTAR
KEYDISP)) # to KEYDISP
results$AMALGAM2 <- with(results, "*"( # multiply
KEYASSOC + KEYDISP, # the sum of KEYASSOC & KEYDISP
KEYFREQTAR)) # by KEYFREQTAR
results
  WORD FREQTAR FREQREF KEYFREQTAR KEYFREQALL KEYASSOC KEYDISPTAR KEYDISPREF KEYDISP KEYONHOWMANY
a a 5 5 0.7781513 0.6285430 0.000000000 1.00000000 0.47246210 0.77953532 1
b b 5 4 0.7781513 0.5693234 0.017690016 0.14721358 1.00000000 -1.26015049 1
c c 3 3 0.6020600 0.3477088 0.000000000 0.70088274 0.02414894 1.00000000 1
d d 1 3 0.3010300 0.1386469 -0.317520657 0.00000000 0.52624922 -0.77763106 0
e e 2 4 0.4771213 0.3477088 -0.151065640 0.02826716 0.47788111 -0.66438821 0
f f 2 3 0.4771213 0.2519296 -0.056458719 0.29422753 0.00000000 0.43477587 1
g g 3 2 0.6020600 0.2519296 0.056458719 0.70088274 0.22375505 0.70504487 2
h h 3 1 0.6020600 0.1386469 0.317520657 0.00000000 0.02414894 -0.03568455 1
i i 4 2 0.6989700 0.3477088 0.151065640 0.61605922 0.00000000 0.91034203 2
x x 9 10 1.0000000 1.0000000 -0.003990255 0.96546219 0.66863773 0.43861333 1
y y 0 3 0.0000000 0.0000000 -1.000000000 0.00000000 0.59877181 -0.88479667 0
z z 3 0 0.6020600 0.0000000 1.000000000 0.65487307 0.00000000 0.96769672 2
AMALGAM1 AMALGAM2
a 0.7795353 0.6065964
b -1.2463850 -0.9668222
c 1.0000000 0.6020600
d -0.8732143 -0.3296735
e -0.7364648 -0.3890704
f 0.4078382 0.1805032
g 0.7390364 0.4584708
h 0.1554819 0.1696822
i 1.0159324 0.7418921
x 0.4346231 0.4346231
y -0.8847967 0.0000000
z 1.5697567 1.1846715
The Euclidean distance can be computed as follows:
Finally, the Mahalanobis distance could be computed with the standard R function as follows:
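The code for both distances is not reproduced in this version of the document. Below is a sketch under two assumptions that I inferred from the EUCLID and MAHAL values printed further below: the Euclidean distance is measured from the origin in the three dimensions KEYFREQTAR, KEYASSOC, and KEYDISP, and the Mahalanobis distance is computed on KEYASSOC and KEYDISP only, with the column means as the center. The input values are typed in from the results table, so tiny rounding differences are possible.

```r
results <- data.frame(
   WORD=c("a","b","c","d","e","f","g","h","i","x","y","z"),
   KEYFREQTAR=c(0.7781513, 0.7781513, 0.6020600, 0.3010300, 0.4771213, 0.4771213,
                0.6020600, 0.6020600, 0.6989700, 1.0000000, 0.0000000, 0.6020600),
   KEYASSOC=c(0, 0.017690016, 0, -0.317520657, -0.151065640, -0.056458719,
              0.056458719, 0.317520657, 0.151065640, -0.003990255, -1, 1),
   KEYDISP=c(0.77953532, -1.26015049, 1, -0.77763106, -0.66438821, 0.43477587,
             0.70504487, -0.03568455, 0.91034203, 0.43861333, -0.88479667, 0.96769672))
# Euclidean distance from the origin (assumption: all three dimensions)
three.dims <- results[, c("KEYFREQTAR", "KEYASSOC", "KEYDISP")]
results$EUCLID <- sqrt(rowSums(three.dims^2))
# Mahalanobis distance (assumption: association & dispersion only,
# centered on the column means); note that stats::mahalanobis returns
# the SQUARED distance
two.dims <- as.matrix(results[, c("KEYASSOC", "KEYDISP")])
results$MAHAL <- mahalanobis(two.dims, center=colMeans(two.dims), cov=cov(two.dims))
results[, c("WORD", "EUCLID", "MAHAL")]
```

Unlike the Euclidean distance, the Mahalanobis distance takes the correlation between the two components into account, which is why words like x, which sit close to the bulk of the data, get very small values.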
The most meaningful way to then return the output might be to sort it in decreasing order, first by the number of dimensions on which a word is key for T (because one really only wants those words that are key on both association and dispersion) and then by one amalgamation score (e.g., the Mahalanobis distance):
WORD FREQTAR FREQREF KEYFREQTAR KEYFREQALL KEYASSOC KEYDISPTAR KEYDISPREF KEYDISP KEYONHOWMANY
z z 3 0 0.6020600 0.0000000 1.000000000 0.65487307 0.00000000 0.96769672 2
i i 4 2 0.6989700 0.3477088 0.151065640 0.61605922 0.00000000 0.91034203 2
g g 3 2 0.6020600 0.2519296 0.056458719 0.70088274 0.22375505 0.70504487 2
b b 5 4 0.7781513 0.5693234 0.017690016 0.14721358 1.00000000 -1.26015049 1
c c 3 3 0.6020600 0.3477088 0.000000000 0.70088274 0.02414894 1.00000000 1
h h 3 1 0.6020600 0.1386469 0.317520657 0.00000000 0.02414894 -0.03568455 1
a a 5 5 0.7781513 0.6285430 0.000000000 1.00000000 0.47246210 0.77953532 1
f f 2 3 0.4771213 0.2519296 -0.056458719 0.29422753 0.00000000 0.43477587 1
x x 9 10 1.0000000 1.0000000 -0.003990255 0.96546219 0.66863773 0.43861333 1
y y 0 3 0.0000000 0.0000000 -1.000000000 0.00000000 0.59877181 -0.88479667 0
d d 1 3 0.3010300 0.1386469 -0.317520657 0.00000000 0.52624922 -0.77763106 0
e e 2 4 0.4771213 0.3477088 -0.151065640 0.02826716 0.47788111 -0.66438821 0
AMALGAM1 AMALGAM2 EUCLID MAHAL
z 1.5697567 1.1846715 1.5162167 4.9497286
i 1.0159324 0.7418921 1.1576280 0.9480818
g 0.7390364 0.4584708 0.9288445 0.5871551
b -1.2463850 -0.9668222 1.4811521 4.3208884
c 1.0000000 0.6020600 1.1672516 1.6284844
h 0.1554819 0.1696822 0.6815930 1.0199254
a 0.7795353 0.6065964 1.1014512 0.9053943
f 0.4078382 0.1805032 0.6479679 0.2963551
x 0.4346231 0.4346231 1.0919696 0.2076996
y -0.8847967 0.0000000 1.3352397 4.8916118
d -0.8732143 -0.3296735 0.8922715 1.2367820
e -0.7364648 -0.3890704 0.8317916 1.0078934
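The sorting code itself is not shown here; a minimal sketch of a two-key sort that yields the row order of the table above could look like this. The values are typed in from that table, and the name results.sorted is mine.

```r
results <- data.frame(
   WORD=c("a","b","c","d","e","f","g","h","i","x","y","z"),
   KEYONHOWMANY=c(1, 1, 1, 0, 0, 1, 2, 1, 2, 1, 0, 2),
   MAHAL=c(0.9053943, 4.3208884, 1.6284844, 1.2367820, 1.0078934, 0.2963551,
           0.5871551, 1.0199254, 0.9480818, 0.2076996, 4.8916118, 4.9497286))
# Sort by KEYONHOWMANY (decreasing), break ties by MAHAL (decreasing);
# negating numeric keys makes order() sort them in decreasing order
results.sorted <- results[order(-results$KEYONHOWMANY, -results$MAHAL), ]
results.sorted$WORD
```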
Here’s a visual representation of all three dimensions:
- the association component on the x-axis;
- the dispersion component on the y-axis;
- the frequency component in the font size:
plot(type="n",
xlab="Association component of keyness", x=results$KEYASSOC,
ylab="Dispersion component of keyness" , y=results$KEYDISP)
grid(); abline(h=0, lty=2); abline(v=0, lty=2)
text(x=results$KEYASSOC, y=results$KEYDISP,
labels=results$WORD, cex=0.5+results$KEYFREQTAR) # font size reflects the frequency component
2.4 Bins
This is how one might create frequency, association, and dispersion bins such that each word type is grouped in one bin in the three-dimensional cube; here, I show for every word which combination of association and dispersion components values it scores:
results$KEYASSOCbin <- cut(
results$KEYASSOC,
breaks=c(-100, seq(0, 1, 0.1)),
include.lowest=TRUE,
labels=-1:9)
results$KEYDISPbin <- cut(
results$KEYDISP,
breaks=c(-100, seq(0, 1, 0.1)),
include.lowest=TRUE,
labels=-1:9)
print(tapply(results$WORD,
list(ASSOC=results$KEYASSOCbin, DISP=results$KEYDISPbin),
paste, collapse=", "), na.print=".")
     DISP
ASSOC -1 0 1 2 3 4 5 6 7 8 9
-1 "y, d, e" . . . . "f, x" . . "a" . "c"
0 "b" . . . . . . . "g" . .
1 . . . . . . . . . . "i"
2 . . . . . . . . . . .
3 "h" . . . . . . . . . .
4 . . . . . . . . . . .
5 . . . . . . . . . . .
6 . . . . . . . . . . .
7 . . . . . . . . . . .
8 . . . . . . . . . . .
9 . . . . . . . . . . "z"
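The table above crosses only the association and dispersion bins. As a sketch of how the frequency component could be binned in the same way to locate each word in a cell of the three-dimensional cube, consider the following; the column names KEYFREQTARbin and CELL as well as the helper signed.bin are hypothetical, and the input values are typed in from the results table.

```r
results <- data.frame(
   WORD=c("a","b","c","d","e","f","g","h","i","x","y","z"),
   KEYFREQTAR=c(0.7781513, 0.7781513, 0.6020600, 0.3010300, 0.4771213, 0.4771213,
                0.6020600, 0.6020600, 0.6989700, 1.0000000, 0.0000000, 0.6020600),
   KEYASSOC=c(0, 0.017690016, 0, -0.317520657, -0.151065640, -0.056458719,
              0.056458719, 0.317520657, 0.151065640, -0.003990255, -1, 1),
   KEYDISP=c(0.77953532, -1.26015049, 1, -0.77763106, -0.66438821, 0.43477587,
             0.70504487, -0.03568455, 0.91034203, 0.43861333, -0.88479667, 0.96769672))
# Frequency is never negative, so ten bins from 0 to 1 suffice
results$KEYFREQTARbin <- cut(results$KEYFREQTAR,
   breaks=seq(0, 1, 0.1), include.lowest=TRUE, labels=0:9)
# Same binning as above for the two signed components (bin -1: values <= 0)
signed.bin <- function(v) cut(v,
   breaks=c(-100, seq(0, 1, 0.1)), include.lowest=TRUE, labels=-1:9)
# One cell label per word: frequency/association/dispersion bin
results$CELL <- paste(results$KEYFREQTARbin,
   signed.bin(results$KEYASSOC), signed.bin(results$KEYDISP), sep="/")
split(as.character(results$WORD), results$CELL)
```

For instance, z ends up in cell 6/9/9: moderately frequent in T, but in the top bin on both association and dispersion.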
3 Case study: ‘learned’ in Brown
We clear memory, source the function Keyness3D, and load the Brown corpus, which here already comes in a format that facilitates its use with Keyness3D:
WORD PART
1 the a01
2 fulton a01
3 county a01
4 grand a01
5 jury a01
6 said a01
The learned category is represented in the corpus by part names beginning with “j”, so we make all of those the target corpus tar and everything else the reference corpus ref; after that, we can apply the function to the two corpora:
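The code for this step is not reproduced in this version of the document. The split itself can be sketched as follows on a toy stand-in for the corpus data frame (the six rows below are made up for illustration and are NOT the real Brown data; the name BROWN.df is taken from the G2 code further below). The call to Keyness3D is deliberately left out because its arguments are not documented here.

```r
# Toy stand-in with the same two columns as the corpus data frame
BROWN.df <- data.frame(
   WORD=c("the", "fulton", "county", "results", "such", "may"),
   PART=c("a01", "a01", "a01", "j01", "j01", "j02"))
# Parts whose names begin with "j" form the learned category
is.learned <- substr(BROWN.df$PART, 1, 1) == "j"
tar <- BROWN.df[ is.learned, ]   # target corpus T
ref <- BROWN.df[!is.learned, ]   # reference corpus R
table(CORPUS=ifelse(is.learned, "tar", "ref"))
```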
The top 50 based on word types’ association to T:
set.seed(1)
(top50.assc <- results$WORD[order(
results$KEYASSOC,
sample(nrow(results)),
decreasing=TRUE)] %>% head(50))
[1] "brucellosis" "biopsy" "respondent's"
[4] "height-to-diameter" "optics" "zero-magnitude"
[7] "unpaired" "gyro-stabilized" "ebb"
[10] "classifying" "synergistic" "nonequivalent"
[13] "celso" "butchered" "iodinate"
[16] "volts" "jurisprudentially" "exogamy"
[19] "bereavements" "argon" "2.405"
[22] "rumscheidt" "electrolysis" "epitomize"
[25] "nakamura" "poland's" "agriculture's"
[28] "haupts'" "dubin" "proteolytic"
[31] "categorizing" "nonspecifically" "misnamed"
[34] "oxygens" "plastering" "echelons"
[37] "3,450" "**zq" "no-valued"
[40] "cardiomegaly" "geatish" "glycerolized"
[43] "interference-like" "disentangle" "solvents"
[46] "discolors" "torsion" "scalar"
[49] "tangent" "diffusely"
The top 50 based on word types’ dispersion in T relative to R:
[1] "results" "such" "may" "these" "1"
[6] "2" "relatively" "various" "possible" "similar"
[11] "method" "amount" "conditions" "however" "distribution"
[16] "assumed" "basis" "due" "types" "essentially"
[21] "therefore" "appears" "af" "whereas" "differences"
[26] "are" "methods" "per" "has" "cases"
[31] "thus" "considerable" "described" "which" "extent"
[36] "used" "ratio" "addition" "defined" "related"
[41] "values" "permit" "isolated" "cannot" "necessary"
[46] "latter" "3" "experimental" "same" "certain"
The top 50 based on word types’ first amalgamation score:
[1] "results" "af" "1" "distribution" "2"
[6] "such" "relatively" "these" "various" "may"
[11] "conditions" "method" "assumed" "differences" "similar"
[16] "experimental" "essentially" "types" "whereas" "defined"
[21] "possible" "appears" "values" "amount" "methods"
[26] "isolated" "however" "described" "measurements" "basis"
[31] "therefore" "analysis" "cases" "systems" "calculated"
[36] "data" "due" "thus" "occurring" "parameters"
[41] "q" "related" "sample" "follows" "thermal"
[46] "variables" "detected" "3" "extent" "proportional"
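The code for the dispersion-based and the amalgamation-based lists is not shown here. By analogy with the association-based list above, it could be wrapped into a small helper like the following sketch; top.n and the toy data frame are hypothetical, and the random second sort key only serves to break ties in an arbitrary but reproducible way.

```r
# Hypothetical helper: the n word types scoring highest on one column
top.n <- function(results, column, n=50) {
   ranking <- order(
      results[[column]],         # primary key: the score
      sample(nrow(results)),     # secondary key: random tie-breaker
      decreasing=TRUE)
   head(results$WORD[ranking], n)
}
# Toy data (made up for illustration)
toy <- data.frame(
   WORD=c("w1", "w2", "w3", "w4"),
   KEYDISP=c(0.2, 0.9, 0.9, -0.5),
   AMALGAM1=c(0.1, 1.2, 0.8, -0.3))
set.seed(1)   # make the tie-breaking reproducible
top.n(toy, "KEYDISP", n=3)
top.n(toy, "AMALGAM1", n=2)
```

With real results, one would call top.n(results, "KEYDISP") or top.n(results, "AMALGAM1") to get the top 50.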
On the relation between the association (x-axis) and the dispersion (y-axis) components of keyness, with point size reflecting the frequency component:
plot(pch=16, col="#00000010", cex=0.5+1.5*results$KEYFREQTAR,
xlab="Association component", x=results$KEYASSOC,
ylab="Dispersion component", y=results$KEYDISP); grid()
text(0.6, -0.8, paste(
"Spearman's rho",
round(cor(results$KEYASSOC, results$KEYDISP, method="spearman"), 4), sep="="))
These results seem quite a bit better than those of G2:
G2 <- function (a.2by2.table) {
temp <- chisq.test(a.2by2.table, correct=FALSE)
output <- a.2by2.table * log(a.2by2.table/temp$expected)
return(2 * sum(output, na.rm=TRUE) * sign(t(temp$residuals)[1,1]))
}
WORD.by.CORPUS.abs <- table(BROWN.df$WORD, substr(BROWN.df$PART, 1, 1)=="j")[,2:1]
G.squareds <- apply(
WORD.by.CORPUS.abs, 1, \(af) {
G2(cbind(af, colSums(WORD.by.CORPUS.abs)-af))})
G.squareds %>% sort(TRUE) %>% names %>% head(50)
[1] "af" "of" "is" "anode"
[5] "t" "1" "data" "index"
[9] "the" "2" "surface" "cells"
[13] "system" "stress" "function" "by"
[17] "q" "dictionary" "rate" "reaction"
[21] "temperature" "in" "platform" "sections"
[25] "information" "analysis" "results" "values"
[29] "staining" "which" "binomial" "elections"
[33] "cell" "are" "sample" "be"
[37] "onset" "c" "shear" "systems"
[41] "number" "these" "**zg" "emission"
[45] "wage" "curve" "bronchial" "used"
[49] "questionnaire" "operator"
Here are words scoring relatively high on both the association and the dispersion components:
plot(type="n", # plot nothing
xlab="Association component", xlim=c(0.2, 1), x=results$KEYASSOC,
ylab="Dispersion component", ylim=c(0.2, 1), y=results$KEYDISP)
grid() # add a grid and then the words (w/ their sizes reflecting frequency):
text(results$KEYASSOC, results$KEYDISP, results$WORD, cex=0.5+results$KEYFREQTAR)
4 Concluding remarks
R version 4.4.0 (2024-04-24)
Platform: x86_64-pc-linux-gnu
Running under: Pop!_OS 22.04 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/Los_Angeles
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets compiler methods
[8] base
other attached packages:
[1] data.table_1.15.4 STGmisc_1.0 Rcpp_1.0.12 magrittr_2.0.3
loaded via a namespace (and not attached):
[1] digest_0.6.35 fastmap_1.1.1 xfun_0.43 knitr_1.46
[5] htmltools_0.5.8.1 rmarkdown_2.26 cli_3.6.2 rstudioapi_0.16.0
[9] tools_4.4.0 evaluate_0.23 yaml_2.3.8 rlang_1.1.3
[13] jsonlite_1.8.8 htmlwidgets_1.6.4