mean
Basics
mean is a function that calculates the arithmetic mean (average) of a vector of values.
You will often find yourself using the na.rm argument, short for NA value removal. Most real-life data will contain missing values somewhere, and na.rm = TRUE drops those values (both NA and NaN) before the mean is computed. na.rm = FALSE is the default, so make sure to include na.rm = TRUE if you’re unsure whether your data contains missing values.
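One way to check a vector's composition before averaging is sketched below. The vector x and its values are made up for illustration; anyNA, is.na, and sum are base R functions.

```r
# Hypothetical example vector; the values are made up for illustration.
x <- c(4, 8, NA, 15, NaN)

# anyNA() returns TRUE if at least one element is NA or NaN.
anyNA(x)

# is.na() flags NA and NaN alike, so summing it counts the missing values.
sum(is.na(x))

# With the default na.rm = FALSE, the missing values propagate to the result.
mean(x)

# With na.rm = TRUE, only 4, 8, and 15 are averaged.
mean(x, na.rm = TRUE)
```

If anyNA(x) is FALSE, the two mean calls return the same value, so adding na.rm = TRUE does no harm.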
Examples
How do I get the average of the values in a vector when some of the values are NA or NaN? What happens if I want to include those values?
First, we show the implication of not including na.rm = TRUE:
mean(c(1,2,3,NaN))
[1] NaN
That’s obviously not what we want. We would only ever want na.rm = FALSE if we were checking whether missing values (NA or NaN) are present in the data.
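A minimal sketch of that kind of check, assuming a made-up vector named scores that otherwise holds ordinary finite numbers: because the default call returns NA or NaN whenever a missing value is present, wrapping the result in is.na() gives a quick yes/no answer.

```r
# Hypothetical data; 'scores' is an illustrative name and the values are made up.
scores <- c(90, 85, NA, 70)

# Default na.rm = FALSE: the missing value propagates to the result.
result <- mean(scores)

# is.na() is TRUE for both NA and NaN results.
if (is.na(result)) {
  message("scores contains at least one missing value")
}
```

For a direct check that does not go through mean at all, base R's anyNA(scores) answers the same question.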
Now, the rest of the examples, executed properly:
mean(c(1,2,3,NaN), na.rm=TRUE)
[1] 2
mean(c(1,2,3,NA), na.rm=TRUE)
[1] 2
mean(c(1,2,NA,NaN,4), na.rm=TRUE)
[1] 2.333333
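To make the effect of na.rm = TRUE concrete, the sketch below reproduces the last result by hand and then shows one edge case. The names kept and manual are introduced here for illustration only.

```r
# Same vector as the last example above.
x <- c(1, 2, NA, NaN, 4)

# Keep only the non-missing values; is.na() is TRUE for NA and NaN.
kept <- x[!is.na(x)]

# Averaging the kept values by hand matches mean(x, na.rm = TRUE).
manual <- sum(kept) / length(kept)
all.equal(manual, mean(x, na.rm = TRUE))

# Edge case: if every value is missing, nothing is left to average,
# and the result is NaN rather than an error.
mean(c(NA, NaN), na.rm = TRUE)
```

Note that the denominator is the number of values kept (3 here), not the original length of the vector (5).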