# Chapter 2 Arrays

In R, arrays are commonly called “vectors”. R likes to be special.

## 2.1 Everything is an array

In R, even single values are arrays. That’s why you see `[1]` in front of results: it is the index of the first item printed on that line, and a single value is the first item in an array of length one.
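
A quick way to convince yourself is to ask R for the length of a single value. This is a small sketch using only base R; the name `x` is just for illustration.

```r
x = 42
length(x)       # 1: a single value is an array of length one
is.vector(x)    # TRUE
x[1]            # 42: indexing works like on any other array

# With longer output, the bracketed number at the start of each
# printed line is the index of the first element on that line.
print(101:130)
```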

## 2.2 Creation

`c()` is some sort of legacy holdover from the S language. It stands for “combine”, not “character”: it combines its arguments into a single array, whatever their type.

I pronounce it “CAW”. Like the sound a crow makes.

Simple array

``c(8, 6, 7, 5)``
``#> [1] 8 6 7 5``

For multiple types, R converts every element to the most general type present (often a string). For a real multi-typed collection, see lists.

``c(9, 'hello', 7)``
``#> [1] "9"     "hello" "7"``
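
To see the conversion happen, `class()` reports the type R settled on. The sketch below uses only base R; `mixed` is a name made up for this example.

```r
class(c(1L, 2.5))        # "numeric": the integer becomes a double
class(c(TRUE, 2))        # "numeric": TRUE becomes 1
class(c(9, 'hello', 7))  # "character": everything becomes a string

# A list keeps each element's own type
mixed = list(9, 'hello', 7)
class(mixed[[1]])        # "numeric"
class(mixed[[2]])        # "character"
```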

## 2.3 Array generators

R has a cultural fear of complete words. Many terms are shortcuts or acronyms.

Repeat

``rep(0, 4)``
``#> [1] 0 0 0 0``
``rep(c(1,2,3), 4) # repeat the whole array``
``#>  [1] 1 2 3 1 2 3 1 2 3 1 2 3``
``rep(c(1,2,3), each=4) # repeat each item in the array before moving to the next``
``#>  [1] 1 1 1 1 2 2 2 2 3 3 3 3``
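
`rep()` has a few more arguments that may be handy. A short sketch of `times` as an array, `length.out`, and `each` combined with `times`, all from base R:

```r
rep(c("a", "b"), times=c(2, 3))   # "a" "a" "b" "b" "b": one count per item
rep(1:2, length.out=5)            # 1 2 1 2 1: repeat until 5 values exist
rep(c(1, 2), times=2, each=2)     # 1 1 2 2 1 1 2 2: each first, then times
```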

Sequence

``````#increment by 1
4:10``````
``#> [1]  4  5  6  7  8  9 10``
``````#increment by any other value
seq(from=10, to=50, by=5)``````
``#> [1] 10 15 20 25 30 35 40 45 50``
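
A few related variations, sketched with base R only: counting down, asking for a number of values instead of a step size, and `seq_len()` for the indices 1 through n.

```r
10:4                              # 10 9 8 7 6 5 4
seq(from=10, to=1, by=-3)         # 10 7 4 1
seq(from=0, to=1, length.out=5)   # 0.00 0.25 0.50 0.75 1.00
seq_len(4)                        # 1 2 3 4
```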

Randomly sample from a given distribution

``````# uniform distribution (not 'run if')
runif(n=5, min=0, max=1)``````
``#> [1] 0.1594836 0.4781883 0.7647987 0.7696877 0.2685485``
``````# normal distribution
rnorm(n=5, mean=0, sd=1)``````
``#> [1]  0.4483395  1.0208067 -0.1378989  0.2103863 -0.6428271``
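
The values above will differ every time you run them. If you need the same “random” numbers twice, one option is to set the seed first. A minimal sketch; the seed `123` and the names `first` and `second` are arbitrary choices for this example.

```r
set.seed(123)             # any fixed number works
first = runif(n=3, min=0, max=1)

set.seed(123)             # same seed again
second = runif(n=3, min=0, max=1)

identical(first, second)  # TRUE: same seed, same values
```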

## 2.4 Combining arrays

An array made up of smaller arrays concatenates them. A basic R array can’t hold other arrays; nesting needs a list.

``````x = 1:3
y = c(10, 11)
z = 500

c(x, y, z) # arrays of arrays get flattened``````
``#> [1]   1   2   3  10  11 500``

Note: `z` is technically an array of length 1
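
To keep the pieces separate instead of flattening them, a list is the usual tool. A small sketch comparing the two; `flat` and `nested` are names invented here.

```r
x = 1:3
y = c(10, 11)

flat = c(x, y)        # one array of length 5
nested = list(x, y)   # a list holding two separate arrays

length(flat)          # 5
length(nested)        # 2
nested[[2]]           # 10 11
unlist(nested)        # 1 2 3 10 11: flattened again
```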

Collapse an array into a string

``paste(1:5, collapse=", ")``
``#> [1] "1, 2, 3, 4, 5"``
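
`paste()` has two separator arguments that are easy to mix up: `sep` goes between the arguments within each element, and `collapse` joins the resulting elements into one string. A short sketch:

```r
paste("x", 1:3, sep="_")                  # "x_1" "x_2" "x_3"
paste("x", 1:3, sep="_", collapse=" + ")  # "x_1 + x_2 + x_3"
```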

## 2.5 Indexing

``a = 10:20``

Get the first value - Indices start at 1, not 0

``a[1]``
``#> [1] 10``

2nd and 6th values

``a[c(2,6)]``
``#> [1] 11 15``

Exclude the 2nd and 6th values

``a[c(-2,-6)]``
``#> [1] 10 12 13 14 16 17 18 19 20``

Range of values

``a[2:6]``
``#> [1] 11 12 13 14 15``

Any order or number of repetitions

``a[c(2, 4, 6, 6, 6)]``
``#> [1] 11 13 15 15 15``

Specify values to keep or drop using booleans (keep this in mind for the “Array operations” section)

``a[c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE)]``
``#> [1] 10 12 14 16 18 20``
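
A few more indexing behaviors worth knowing, sketched on the same `a`. A boolean index shorter than the array is reused from the start until it covers every element, and asking for a position past the end gives `NA` rather than an error.

```r
a = 10:20
a[c(TRUE, FALSE)]   # 10 12 14 16 18 20: the pattern repeats along the array
a[length(a)]        # 20: the last value
a[-1]               # 11 12 13 14 15 16 17 18 19 20: everything but the first
a[12]               # NA: there is no 12th value
```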

## 2.6 Sampling from an Array

Randomly sample from an array. Elements may repeat.

``sample(1:3, size=10, replace=TRUE)``
``#>  [1] 1 1 2 3 2 2 2 2 3 1``

`replace=TRUE` means “sample with replacement”, so an element can be sampled more than once.

Sample without replacement. Elements will not repeat.

``sample(1:5, size=4, replace=FALSE)``
``#> [1] 4 3 1 5``

Shuffle the order of an array

``sample(a, size=length(a), replace=FALSE)``
``#>  [1] 15 13 17 20 18 10 12 16 14 11 19``

Without replacement, you can’t sample more elements than the array holds

``sample(1:5, size=10, replace=FALSE)``
``#> Error in sample.int(length(x), size, replace, prob): cannot take a sample larger than the population when 'replace = FALSE'``
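
Since `size` defaults to the length of the array and `replace` defaults to `FALSE`, shuffling can be written more briefly. A sketch, with an arbitrary seed so the run is repeatable; `shuffled` is a name made up here.

```r
a = 10:20
set.seed(1)             # arbitrary seed for this example
shuffled = sample(a)    # same as sample(a, size=length(a), replace=FALSE)

length(shuffled)        # 11
all(sort(shuffled) == a)  # TRUE: same values, different order

# Watch out: a single number n is treated as 1:n
length(sample(5))       # 5: a shuffle of 1 2 3 4 5, not just the value 5
```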

## 2.7 Array constants

The `letters` and `LETTERS` constants hold the lowercase and uppercase letters.

``letters[1:5]``
``#> [1] "a" "b" "c" "d" "e"``
``LETTERS[1:5]``
``#> [1] "A" "B" "C" "D" "E"``

## 2.8 Array operations

Many functions and operators in R are vectorized, so they apply to every element of an array at once.

``a * 2``
``#>  [1] 20 22 24 26 28 30 32 34 36 38 40``

Compare individual elements

``a > 15``
``#>  [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE``

Compare each element across arrays

``a == c(10, 9, 12, 13, 14, 15, 16, 17, 18, 19, 20)``
``#>  [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE``

Select elements using boolean array

``a[a>15]``
``#> [1] 16 17 18 19 20``
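
Boolean arrays are useful beyond selection. A short sketch of counting matches, finding their positions, and combining conditions with `&` (and) and `|` (or):

```r
a = 10:20
sum(a > 15)          # 5: each TRUE counts as 1
which(a > 15)        # 7 8 9 10 11: positions of the TRUE values
a[a > 12 & a < 16]   # 13 14 15
a[a < 12 | a > 18]   # 10 11 19 20
```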

You can perform operations on the elements of two arrays even if they are different sizes. The shorter one wraps around (R calls this recycling), and R warns you if the longer length isn’t a multiple of the shorter one.

``````a = 1:5
b = rep(1, 8)
a + b``````
``````#> Warning in a + b: longer object length is not a multiple of shorter object
#> length``````
``#> [1] 2 3 4 5 6 2 3 4``
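
When the longer length is an exact multiple of the shorter one, the wrap-around happens silently. A small sketch; `x` and `y` are names chosen for this example so they don't clobber `a` and `b`.

```r
x = 1:6
y = c(10, 20)
x + y     # 11 22 13 24 15 26: y is reused as 10 20 10 20 10 20
x * 2     # 2 4 6 8 10 12: a single value is reused for every element
```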

## 2.9 Array functions

Length

``length(20:50)``
``#> [1] 31``

Reverse

``rev(1:5)``
``#> [1] 5 4 3 2 1``

Math

``sum(1:5)``
``#> [1] 15``
``min(1:5)``
``#> [1] 1``
``max(1:5)``
``#> [1] 5``

`min()` and `max()` are not vectorized. They reduce all of their arguments to a single value.

``max(1:5, 11:15, 21:25)``
``#> [1] 25``

For a vectorized min and max, use `pmin()` and `pmax()`. The “p” stands for “parallel”, not “vectorized”, but let’s pretend it does.

``pmax(1:5, 11:15, 21:25)``
``#> [1] 21 22 23 24 25``
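
A couple more cases, as a sketch: `pmin()` works the same way, the element-wise versions accept a single value to compare against, and `range()` returns the min and max together.

```r
pmin(c(1, 9, 3), c(5, 2, 8))   # 1 2 3: the smaller value at each position
pmax(c(1, 9, 3), 4)            # 4 9 4: nothing in the result is below 4
range(c(7, 2, 9))              # 2 9: min and max in one call
```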

## 2.10 Array sorting

Sort

``````a = c(70, 20, 80, 20, 10, 40)
sort(a)``````
``#> [1] 10 20 20 40 70 80``

Sort in descending order

``sort(a, decreasing=TRUE)``
``#> [1] 80 70 40 20 20 10``

Get the indices that would put the values in sorted order

``order(a)``
``#> [1] 5 2 4 6 1 3``
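
The main use of `order()` is indexing: applied to the array itself it reproduces `sort()`, and applied to a second array it rearranges that array to match. A sketch, where `tag` is a hypothetical array of labels lined up with `a`.

```r
a = c(70, 20, 80, 20, 10, 40)
a[order(a)]            # 10 20 20 40 70 80: same as sort(a)

# Hypothetical labels, one per value in a
tag = c("g", "b", "h", "c", "a", "d")
tag[order(a)]          # "a" "b" "c" "d" "g" "h": tags in order of their values
```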

## 2.11 Test membership

To see if an item is in an array, use `%in%`.

``9 %in% 1:10``
``#> [1] TRUE``
``9:11 %in% 1:10``
``#> [1]  TRUE  TRUE FALSE``
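
Because `%in%` returns a boolean array, it can be used to filter. If you want the position of each match rather than just yes or no, `match()` does that. A sketch using the built-in `letters`; `wanted` is a name made up here.

```r
wanted = c("b", "z", "d")
wanted %in% letters            # TRUE FALSE TRUE
wanted[wanted %in% letters]    # "b" "d": keep only the items that were found
match(wanted, letters)         # 2 NA 4: positions in letters, NA if missing
```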