# Chapter 3 Types

For this chapter, you’ll need to load the tidyverse library. More on libraries.

R is dramatic and gives “conflict” messages that look like errors. Ignore them.

``library(tidyverse)``

## 3.1 Numbers

R has integers, but number literals default to `numeric`, which is a double-precision float.

``````x = 5 # no decimal but still a double
y = x + 1``````
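If you really want an integer, add an `L` suffix to the literal. A small sketch of how the two kinds of number report themselves:

```r
class(5)       # a plain literal is a double
#>  "numeric"
class(5L)      # the L suffix makes an integer
#>  "integer"
typeof(5)      # the underlying storage type
#>  "double"
is.integer(5)  # FALSE even though there is no decimal part
#>  FALSE
class(5L + 1)  # mixing an integer with a double gives a double
#>  "numeric"
```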

Most types can be checked with `is.[type]()` and can be converted with `as.[type]()`.

``is.numeric(5)``
``#>  TRUE``
``is.numeric("5") # a string is not a numeric``
``#>  FALSE``
``as.numeric("5") # convert the string to a number``
``#>  5``

Good ol’ floating point comparison

``````x = .58
y = .08
x - y == 0.5``````
``#>  FALSE``
``near(x-y, 0.5) # checks if numbers are very close``
``#>  TRUE``
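`near()` comes from dplyr, which the tidyverse loads. If you only have base R, a comparison with a tolerance might look like the sketch below. The `1e-8` tolerance is an arbitrary choice for illustration, not a recommended value.

```r
x = .58
y = .08
x - y                          # prints as 0.5, which hides the rounding error
#>  0.5
x - y == 0.5                   # but it is not exactly 0.5
#>  FALSE
isTRUE(all.equal(x - y, 0.5))  # base R comparison with a built-in tolerance
#>  TRUE
abs((x - y) - 0.5) < 1e-8      # the same idea written by hand
#>  TRUE
```

Wrap `all.equal()` in `isTRUE()` because it returns a description of the difference, not `FALSE`, when the values differ.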

Numeric division returns a double

``9 / 2 # double precision float``
``#>  4.5``
``9 %/% 2 # integer division: rounds down to a whole number``
``#>  4``
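Integer division rounds down, which only matters for negative numbers. A short sketch alongside the remainder operator `%%`:

```r
9 %% 2         # remainder
#>  1
-9 %/% 2       # rounds down (toward negative infinity), not toward zero
#>  -5
-9 %% 2        # the remainder takes the sign of the divisor
#>  1
trunc(-9 / 2)  # truncate toward zero instead
#>  -4
```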

## 3.2 Strings

Single and double quotes are interchangeable in R, but a string must open and close with the same kind of quote.

``"hello world"``
``#>  "hello world"``
``'hello world'``
``#>  "hello world"``
``"single quote ' in a string"``
``#>  "single quote ' in a string"``
``'double quote " in a string'``
``#>  "double quote \" in a string"``

R calls a string a “character”. Notice that it doesn’t call a number a “digit”.

``is.character("hello world")``
``#>  TRUE``

Strings are not character arrays in R, so array techniques may not work as expected.

String length

``length('hello world') # R sees this as one string, not many characters``
``#>  1``
``str_length('hello world') # the actual length of the string``
``#>  11``

Substring

``str_sub('hello world', 2, 10)``
``#>  "ello worl"``
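If you do want array techniques, split the string into single characters first. The sketch below uses only base R (`nchar()` and `substr()` are the base counterparts of `str_length()` and `str_sub()`); the variable names are just for illustration.

```r
s = 'hello world'
nchar(s)                          # base R version of str_length()
#>  11
substr(s, 2, 10)                  # base R version of str_sub()
#>  "ello worl"
chars = strsplit(s, '')[[1]]      # split into a vector of single characters
length(chars)                     # now length() counts characters
#>  11
chars[1]                          # and indexing picks out one character
#>  "h"
paste(rev(chars), collapse = '')  # reverse the characters and join them back
#>  "dlrow olleh"
```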

Comparison

``'hello' == "hello"``
``#>  TRUE``

### 3.2.1 Strings with special characters

If you want to use special characters in a string, you need to “escape” them by putting a `\` in front

``"string with backslashes \\, double quote \", and unicode \u263A"``
``#>  "string with backslashes \\, double quote \", and unicode <U+263A>"``

Or you can use the raw string literal `r"(text)"` (R 4.0 or later), which takes backslashes literally and is useful for a Windows path or regular expression

``r"(c:\hello\world)"``
``#>  "c:\\hello\\world"``
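The doubled backslashes in the printed output are only how R displays the string. A quick sketch to check what the string really contains (raw strings need R 4.0 or later; the variable name is just for illustration):

```r
path = r"(c:\hello\world)"
identical(path, "c:\\hello\\world")  # same string, two ways to type it
#>  TRUE
nchar(path)                          # each backslash counts once
#>  14
writeLines(path)                     # show the actual contents
#> c:\hello\world
grepl(r"(^\d+$)", "2024")            # a regular expression without doubled backslashes
#>  TRUE
```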

### 3.2.2 String Concatenation

Concatenate with a space in between

``paste('hello', 'world')``
``#>  "hello world"``

Use a different separator

``paste('hello', 'world', sep='_')``
``#>  "hello_world"``

No separator

``paste('hello', 'world', sep='')``
``#>  "helloworld"``
``paste0('hello', 'world') # a shortcut function for no separator``
``#>  "helloworld"``

Combine a set of strings into one

``paste(c("apple", "orange", "banana"), collapse = ", ")``
``#>  "apple, orange, banana"``
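`paste()` also works element by element on arrays, and `sep` and `collapse` can be combined. A short sketch:

```r
paste0('item', 1:3)                         # one result per element
#>  "item1" "item2" "item3"
paste(c('a', 'b'), c('x', 'y'), sep = '-')  # pairs up the elements
#>  "a-x" "b-y"
paste0('item', 1:3, collapse = ' + ')       # build the pieces, then join them
#>  "item1 + item2 + item3"
```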

See the strings chapter in R4DS for more.

## 3.3 Dates

See the dates chapter in R4DS.

## 3.4 Checking the type

What’s the type?

``class(5)``
``#>  "numeric"``

Remember, a single value is just an array of length one, so `class()` works the same way on arrays. Note that `:` produces integers.

``class(1:5)``
``#>  "integer"``

An array can only hold one type, so mixing types converts every element to the most general type

``class(c(5, 'hi', TRUE))``
``#>  "character"``
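The conversion follows a fixed order: logical, then integer, then double, then character. A short sketch of each step:

```r
class(c(TRUE, 5L))   # logical + integer
#>  "integer"
class(c(5L, 2.5))    # integer + double
#>  "numeric"
class(c(2.5, 'hi'))  # anything + string
#>  "character"
c(5, 'hi', TRUE)     # every element became a string
#>  "5"    "hi"   "TRUE"
```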

Test if numeric

``is.numeric(5)``
``#>  TRUE``

Test if string

``is.character('hi')``
``#>  TRUE``

Test if boolean

``is.logical(TRUE)``
``#>  TRUE``

## 3.5 Converting and parsing

Parse or convert to numeric

``as.numeric(c("5", TRUE, 1:3, "abc"))``
``#> Warning: NAs introduced by coercion``
``#>   5 NA  1  2  3 NA``
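Values that can't be parsed become NA, so you can use `is.na()` to find out which inputs failed. A minimal sketch with made-up input (the variable names are just for illustration):

```r
raw = c("5", "1e3", "abc", "")
parsed = suppressWarnings(as.numeric(raw))  # silence the coercion warning
parsed
#>     5 1000   NA   NA
raw[is.na(parsed)]                          # the inputs that failed to parse
#>  "abc" ""
```

Only suppress the warning when you check for NAs yourself afterward, as done here.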

To string

``as.character(5)``
``#>  "5"``
``format(1/3)``
``#>  "0.3333333"``
``format(1/3 , digits = 16)``
``#>  "0.3333333333333333"``
``as.character(TRUE)``
``#>  "TRUE"``

Convert to boolean. Zero is false. Other numbers are true.

``as.logical(0:2)``
``#>  FALSE  TRUE  TRUE``
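Strings convert too, but only a few spellings are recognized. A short sketch:

```r
as.logical(c("TRUE", "true", "T", "FALSE", "false", "F"))
#>   TRUE  TRUE  TRUE FALSE FALSE FALSE
as.logical(c("yes", "no", "1", "0"))  # not recognized, so NA
#>  NA NA NA NA
as.logical(as.numeric("0"))           # go through numeric for "0" and "1"
#>  FALSE
```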

## 3.6 Missing values (NA)

Any type can have missing values.

``class(c(1, 2, 3, NA, 5))``
``#>  "numeric"``

Missing values are very common in datasets.

``is.na(c(NA, 1, ""))``
``#>   TRUE FALSE FALSE``

Almost any operation performed on NA also yields NA, so you can operate on arrays with missing values and the NAs carry through.

``c(5, NA, 7) + 1``
``#>   6 NA  8``

Be careful with aggregation functions like `min()`, `max()`, and `mean()`: a single NA makes the whole result NA. To ignore NAs, set the `na.rm` parameter to `TRUE`.

``mean(c(5, NA, 7), na.rm=TRUE)``
``#>  6``
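A short sketch of what happens with and without `na.rm`, plus two related checks:

```r
x = c(5, NA, 7)
mean(x)               # one NA makes the whole result NA
#>  NA
max(x, na.rm = TRUE)  # ignore the NA
#>  7
sum(is.na(x))         # count the missing values
#>  1
NA == NA              # comparing with NA gives NA, so test with is.na()
#>  NA
```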

## 3.7 Factor

A factor is like an enum in other languages. It encodes strings as integers via a dictionary.

Create an array with many repeating values

``````data = sample(c("hello", "cruel", "world"), 12, replace=TRUE)
data``````
``````#>   "world" "cruel" "world" "hello" "world" "hello" "cruel" "world" "world"
#>  "hello" "cruel" "world"``````

Make it into a `factor`

``````data = factor(data)
data``````
``````#>   world cruel world hello world hello cruel world world hello cruel world
#> Levels: cruel hello world``````

The array is now an integer array with a dictionary

``as.numeric(data)``
``#>   3 1 3 2 3 2 1 3 3 2 1 3``
``data[1] # a single element still carries the dictionary``
``````#>  world
#> Levels: cruel hello world``````

See the different values in the array

``levels(data)``
``#>  "cruel" "hello" "world"``
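By default the levels are sorted alphabetically, which is why `cruel` got code 1 above. You can set the order yourself with `levels`. The sketch below uses made-up data to show that, and a common pitfall when a factor holds numbers stored as strings.

```r
sizes = factor(c("small", "large", "medium", "small"),
               levels = c("small", "medium", "large"))  # explicit level order
levels(sizes)
#>  "small"  "medium" "large"
as.integer(sizes)            # codes follow the level order
#>  1 3 2 1
as.character(sizes)          # back to plain strings
#>  "small"  "large"  "medium" "small"

f = factor(c("10", "20", "10"))
as.numeric(f)                # the integer codes, not the values
#>  1 2 1
as.numeric(as.character(f))  # convert via character to get the values
#>  10 20 10
```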