---
title: "Ling 104, session 08: plotting practice (class)"
author:
   - name: "[Stefan Th. Gries](https://www.stgries.info)"
     affiliation: "UC Santa Barbara & JLU Giessen"
     orcid: 0000-0002-6497-3958
date: "2026-04-01 12:34:56"
date-format: "DD MMM YYYY HH-mm-ss"
editor: source
format:
   html:
      page-layout: full
      code-fold: true
      code-link: true
      code-copy: true
      code-tools: true
      code-line-numbers: true
      code-overflow: scroll
      number-sections: true
      smooth-scroll: true
      toc: true
      toc-depth: 4
      number-depth: 4
      toc-location: left
      monofont: lucida console
      tbl-cap-location: top
      fig-cap-location: bottom
      fig-width: 4
      fig-height: 4
      fig-format: png
      fig-dpi: 300
      fig-align: center
      embed-resources: true
execute:
   cache: false
   echo: true
   eval: true
   warning: false
---

# Introduction

In today's session, we will get some more intensive plotting practice. We'll use a small data set on the 2004 US Presidential election, which was obtained [here](https://web.archive.org/web/20041106061214/http://www.commonalty.com/iq.txt) (from the Wayback Machine) and was referred to in an article in a U.K.-based [tabloid](https://en.wikipedia.org/wiki/Tabloid_journalism), the [Daily Mirror (from the Wayback Machine)](https://web.archive.org/web/20051215000102/http://www.mirror.co.uk/news/allnews/page.cfm?objectid=14848162&method=full&siteid=50143). The data are in [_input/us_election_2004.csv](_input/us_election_2004.csv) and you can find information on the nature of the data (and their, let's very generously call it 'credibility') in [_input/us_election_2004.r](_input/us_election_2004.r).

Let's load the data:

```{r prepworkspace}
rm(list=ls(all=TRUE)); library(magrittr)
summary(d <- read.delim(          # make/summarize d, the result of loading
   "_input/us_election_2004.csv", # this file
   stringsAsFactors=TRUE))        # change categorical variables into factors
```

Let's do some general exploratory plotting to find out whether

1. IQ predicts which candidate was elected in a state;
2. average income predicts which candidate was elected in a state;
3. IQ and average income are related (such that the former statistically predicts the latter);
4. population density and average income are related (such that the former statistically predicts the latter);
5. population density and IQ are related.

(I am using *predict* in the statistical sense.) How would we do that?
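As a rough orientation, the choice of plot usually follows from the types of the two variables involved. The sketch below uses a small toy data frame with invented values, so it illustrates the plot types without answering the questions for the actual data. Only the column names `CAND_ELECT`, `AVE_IQ`, and `AVE_INCOME` come from the real data; the level names `Bush` and `Kerry` are an assumption about how the file codes the candidates.

```r
# toy data with INVENTED values; only the column names follow the real file,
# and the level names "Bush"/"Kerry" are an assumption
toy <- data.frame(
   CAND_ELECT = factor(c("Bush", "Kerry", "Bush", "Kerry", "Bush", "Kerry")),
   AVE_IQ     = c(95, 104, 98, 101, 93, 106),
   AVE_INCOME = c(28000, 36000, 30000, 33000, 27000, 38000))

# categorical response ~ numeric predictor: a spineplot
spineplot(CAND_ELECT ~ AVE_IQ, data=toy)
# the same two variables with the axes swapped: one box per candidate
boxplot(AVE_IQ ~ CAND_ELECT, data=toy)
# numeric response ~ numeric predictor: a scatterplot
plot(AVE_INCOME ~ AVE_IQ, data=toy)
```

The formula notation `response ~ predictor` is the same in all three calls, which makes it easy to swap one plot type for another.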

# Plotting the data, part 1
## Simplest exploratory plots
### Candidate as a function of IQ

Question 1: does IQ predict the elected candidate? How can you generate the following plot?

```{r candAFAiq2}
# replace this with your answers/code
```

### Candidate as a function of income

Question 2: does income predict the elected candidate? How can you generate the following plot?

```{r candAFAincome2}
# replace this with your answers/code
```

### Income as a function of IQ

Question 3: does IQ predict income? How can you generate the following plot?

```{r incomeAFAiq2}
# replace this with your answers/code
```

### Income as a function of pop. density

Question 4: does population density predict average income? How can you generate the following plot?

```{r incomeAFApopdens2}
# replace this with your answers/code
```

### Pop. density and IQ

Question 5: are population density and IQ correlated? How can you generate the following plot?

```{r popdensANDiq2}
# replace this with your answers/code
```

# Plotting the data, part 2

Let's now try and generate a plot that actually shows the relation of all four variables -- `AVE_IQ`, `AVE_INCOME`, `POPDENSSQM`, and `CAND_ELECT` -- in one graph that also shows what each state did. How can we do that?

We

1. plot the relationship between `AVE_IQ` and `AVE_INCOME`;
2. use proper labeling of various plot elements;
3. plot (a) regression line(s) to represent the relationship;
4. use the abbreviated states' names instead of points (check `?abbreviate`);
5. use colors to indicate which candidate won in which state;
6. use font size to represent the states' population densities;
7. use horizontal and vertical lines to indicate means of the numeric variables
   a) in general
   b) per candidate.

## Step 1: basic relationship

We plot the relationship between `AVE_IQ` and `AVE_INCOME`. How can you generate the following plot?

```{r buildupstep1b}
# replace this with your answers/code
```

## Step 2: add labels

We add proper labeling of various plot elements. How can you generate the following plot?

```{r buildupstep2b}
# replace this with your answers/code
```
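One possible way to label a base-R scatterplot is via the `xlab`, `ylab`, and `main` arguments of `plot()`. The following is only a sketch on toy data with invented values; the label texts are placeholders, not the ones of the target plot.

```r
# toy data with INVENTED values; only the column names follow the real file
toy <- data.frame(
   AVE_IQ     = c(95, 104, 98, 101, 93, 106),
   AVE_INCOME = c(28000, 36000, 30000, 33000, 27000, 38000))

plot(AVE_INCOME ~ AVE_IQ, data=toy, pch=16,
   xlab="Average IQ",                            # x-axis label
   ylab="Average income",                        # y-axis label
   main="Income as a function of IQ (toy data)") # main heading
grid() # add a light reference grid
```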

## Step 3: add regression line(s)

We add (a) regression line(s) to represent the relationship. How can you generate the following plot?

```{r buildupstep3b}
# replace this with your answers/code
```
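A regression line can be added to an existing scatterplot by passing a fitted `lm` object to `abline()`; a smoother such as `lowess()` is one option for a second, non-linear line. This is a sketch on toy data with invented values, and the choice of a smoother as the second line is an assumption about what "(a) regression line(s)" is meant to cover.

```r
# toy data with INVENTED values; only the column names follow the real file
toy <- data.frame(
   AVE_IQ     = c(95, 104, 98, 101, 93, 106),
   AVE_INCOME = c(28000, 36000, 30000, 33000, 27000, 38000))

plot(AVE_INCOME ~ AVE_IQ, data=toy, pch=16)
m <- lm(AVE_INCOME ~ AVE_IQ, data=toy) # straight-line regression
abline(m, lty=2)                       # add its line to the plot
lines(lowess(toy$AVE_IQ, toy$AVE_INCOME), col="grey50") # add a smoother
```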

## Step 4: use states' names

We use the abbreviated states' names instead of points. How can you generate the following plot?

```{r buildupstep4b}
# replace this with your answers/code
```
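To plot names instead of points, one common approach is to set up an empty coordinate system with `type="n"` and then write the labels with `text()`. The sketch below uses toy data with invented values and made-up place names; the column name `STATE` is hypothetical, so check `names(d)` for the actual name in the real data.

```r
# toy data with INVENTED values and made-up place names;
# the column name STATE is hypothetical
toy <- data.frame(
   STATE      = c("Northland", "Eastmoor", "Westmark", "Southvale"),
   AVE_IQ     = c(95, 104, 98, 101),
   AVE_INCOME = c(28000, 36000, 30000, 33000))

labs <- abbreviate(toy$STATE, minlength=4)      # shorten the names
plot(AVE_INCOME ~ AVE_IQ, data=toy, type="n")   # empty plot, axes only
text(toy$AVE_IQ, toy$AVE_INCOME, labels=labs)   # names at the coordinates
```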

## Step 5: use colors for candidates

We use colors to indicate which candidate won in which state. How can you generate the following plot?

```{r buildupstep5b}
# replace this with your answers/code
```
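Colors per group can be derived by looking up each observation's factor level in a named vector of colors. This is a sketch on toy data with invented values; the level names `Bush` and `Kerry`, the column name `STATE`, and the colors themselves are assumptions.

```r
# toy data with INVENTED values; level names, STATE, and colors are assumptions
toy <- data.frame(
   STATE      = c("Northland", "Eastmoor", "Westmark", "Southvale", "Midfield", "Lakeshire"),
   CAND_ELECT = factor(c("Bush", "Kerry", "Bush", "Kerry", "Bush", "Kerry")),
   AVE_IQ     = c(95, 104, 98, 101, 93, 106),
   AVE_INCOME = c(28000, 36000, 30000, 33000, 27000, 38000))

pal  <- c(Bush="red", Kerry="blue")         # one color per candidate
cols <- pal[as.character(toy$CAND_ELECT)]   # one color per state
plot(AVE_INCOME ~ AVE_IQ, data=toy, type="n")
text(toy$AVE_IQ, toy$AVE_INCOME, labels=abbreviate(toy$STATE, 4), col=cols)
legend("topleft", legend=names(pal), text.col=pal, bty="n")
```

Looking colors up by level name rather than by position keeps the mapping correct even if the order of the factor levels changes.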

## Step 6: use font sizes for pop. densities

We use font size to represent the states' population densities. How can you generate the following plot?

```{r buildupstep6b}
# replace this with your answers/code
```
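Since `text()` accepts a vector for `cex`, font sizes can encode a numeric variable once it has been rescaled to a sensible range. The sketch below uses toy data with invented values. Logging the densities first and the target range of 0.6 to 2 are assumptions, not necessarily the choices behind the target plot, and the column name `STATE` is hypothetical.

```r
# toy data with INVENTED values; STATE is a hypothetical column name
toy <- data.frame(
   STATE      = c("Northland", "Eastmoor", "Westmark", "Southvale", "Midfield", "Lakeshire"),
   AVE_IQ     = c(95, 104, 98, 101, 93, 106),
   AVE_INCOME = c(28000, 36000, 30000, 33000, 27000, 38000),
   POPDENSSQM = c(20, 400, 55, 150, 10, 900))

logdens <- log(toy$POPDENSSQM) # densities are typically skewed, hence the log
# rescale to font sizes between 0.6 (least dense) and 2 (most dense)
sizes <- 0.6 + 1.4 * (logdens - min(logdens)) / diff(range(logdens))
plot(AVE_INCOME ~ AVE_IQ, data=toy, type="n")
text(toy$AVE_IQ, toy$AVE_INCOME, labels=abbreviate(toy$STATE, 4), cex=sizes)
```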

## Step 7: add lines for means

We use horizontal and vertical lines to indicate means of the numeric variables (in general and per candidate). How can you generate the following plot?

```{r buildupstep7b}
# replace this with your answers/code
```
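Means can be shown with `abline()`, whose `v` and `h` arguments draw vertical and horizontal lines, and the per-candidate means can be computed with `tapply()`. This is a sketch on toy data with invented values; the level names, colors, and line types are assumptions.

```r
# toy data with INVENTED values; level names and colors are assumptions
toy <- data.frame(
   CAND_ELECT = factor(c("Bush", "Kerry", "Bush", "Kerry", "Bush", "Kerry")),
   AVE_IQ     = c(95, 104, 98, 101, 93, 106),
   AVE_INCOME = c(28000, 36000, 30000, 33000, 27000, 38000))
pal <- c(Bush="red", Kerry="blue")

plot(AVE_INCOME ~ AVE_IQ, data=toy, pch=16, col=pal[as.character(toy$CAND_ELECT)])
# overall means: one vertical and one horizontal dashed line
abline(v=mean(toy$AVE_IQ), h=mean(toy$AVE_INCOME), lty=2)
# means per candidate: dotted lines in the candidates' colors
iq_means  <- tapply(toy$AVE_IQ,     toy$CAND_ELECT, mean)
inc_means <- tapply(toy$AVE_INCOME, toy$CAND_ELECT, mean)
abline(v=iq_means,  col=pal[names(iq_means)],  lty=3)
abline(h=inc_means, col=pal[names(inc_means)], lty=3)
```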

## Step 8: add statistical results

We add to the plot

* the *p*-value of the *t*-test for independent samples comparing the average incomes per elected candidate (for which we also use italics and subscript);
* the *p*-value of the *t*-test for independent samples comparing the average IQs per elected candidate (for which we also use italics and subscript);
* the adjusted *R*^2^ for the overall plotted correlation into the main heading (for which we also use italics and superscript).

How can you generate the following plot?

```{r buildupstep8b}
# replace this with your answers/code
```
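The statistics can be extracted from the test and model objects and then typeset with plotmath via `bquote()`, where `italic()` gives italics, `[ ]` a subscript, and `^` a superscript. The sketch below uses toy data with invented values. It relies on the default of `t.test()`, which is the Welch version of the test, and it places the *p*-values in the top margin with `mtext()`; both are assumptions rather than the target plot's settings.

```r
# toy data with INVENTED values; level names are an assumption
toy <- data.frame(
   CAND_ELECT = factor(c("Bush", "Kerry", "Bush", "Kerry", "Bush", "Kerry")),
   AVE_IQ     = c(95, 104, 98, 101, 93, 106),
   AVE_INCOME = c(28000, 36000, 30000, 33000, 27000, 38000))

# p-values of the two t-tests (default: Welch) and adjusted R-squared
p_inc <- t.test(AVE_INCOME ~ CAND_ELECT, data=toy)$p.value
p_iq  <- t.test(AVE_IQ     ~ CAND_ELECT, data=toy)$p.value
r2adj <- summary(lm(AVE_INCOME ~ AVE_IQ, data=toy))$adj.r.squared

plot(AVE_INCOME ~ AVE_IQ, data=toy, pch=16,
   main=bquote("Income ~ IQ (adj. " * italic(R)^2 * " = " * .(round(r2adj, 3)) * ")"))
# italic p with a subscript, left- and right-aligned above the plot region
mtext(bquote(italic(p)[income] == .(signif(p_inc, 2))), side=3, line=0.2, adj=0)
mtext(bquote(italic(p)[IQ]     == .(signif(p_iq,  2))), side=3, line=0.2, adj=1)
```

Within `bquote()`, `.( )` inserts the computed value into the expression, so the annotation updates whenever the data change.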

# Session info

```{r sessionInfo}
sessionInfo()
```
