Unraveling R data types – The Vectors – A gentle introduction

A Calm Watering Place–Extensive and Boundless Scene with Cattle
Image Source: Cleveland Museum of Art, https://www.jstor.org/stable/community.24620423

The first step in learning R programming is getting familiar with basic R objects and their structure. The fundamental data object in R is a vector. In this article we will define R objects, create different types of vector objects, and look at some of their properties.

What is an R Object?

When R is running, variables, data, functions, results, etc. are stored in the active memory of the computer in the form of objects, each of which has a name.

“Everything that exists is an object. Everything that happens is a function call.”

John Chambers

In R, objects are omnipresent. From a single number or a simple addition to complex graphics, everything that is created in R is an object. For example, when we feed data into a linear regression model, we get the regression coefficients as output. When R executes the linear regression command, it passes the data table object to the linear regression function (which is also an object) and gets back a list of objects containing the output of the regression model. Three objects are involved in this process:

  • Input object – a data set
  • Function object – a regression function
  • Output object – a list of regression output
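As a rough illustration of these three objects, the sketch below fits a linear regression on R's built-in cars data set. The data set and the variable names input_data and fit are assumptions chosen only for this example.

```r
# Input object: the built-in 'cars' data set (an assumption for illustration)
input_data <- cars
class(input_data)   # "data.frame"

# Function object: lm() is itself an object
class(lm)           # "function"

# Output object: the fitted model is stored as a list of results
fit <- lm(dist ~ speed, data = input_data)
class(fit)          # "lm"
is.list(fit)        # TRUE
names(fit)          # components such as "coefficients" and "residuals"
coef(fit)           # the regression coefficients
```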

Every analysis involves different types of objects, and each object behaves differently. The behavior of an object is decided by its class. To take an example from automobiles, a Range Rover is an object belonging to the SUV class. The object Range Rover will exhibit different behavior (characteristics) than the object Audi A4, which belongs to the Sedan class.

R has five basic classes of objects:

  • numeric
  • integer
  • character
  • logical (TRUE/FALSE)
  • complex
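A quick way to see these five classes is to ask R for the class of one literal value of each kind. The vector name basic_classes is just a label used for this sketch.

```r
# One literal of each basic class, checked with class()
basic_classes <- c(
  numeric   = class(3.14),
  integer   = class(3L),
  character = class("three"),
  logical   = class(TRUE),
  complex   = class(3 + 2i)
)
basic_classes
```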

It is important to understand how objects work in order to solve real-world complex analysis tasks with more elegant code and fewer steps. In the following sections, we will create a basic R object: the vector. The vector is the fundamental object in R on which more complex objects and data structures are built.

How to create an Object?

Let’s key in the following expression in the R console.

(cos(15) + sin(15))*log(25)

[1] -0.3521452

Evaluating the expression produces an object, but R only prints it and does not keep it. If we want to use the value later, we will have to type the expression again unless we store it in a variable. Let’s store it in a variable temp using the assignment operator <-

temp <- (cos(15) + sin(15))*log(25)

Notice the assignment operator, which consists of the two characters < (“less than”) and - (“minus”) written side by side (without a space between them). It is used to create an object, and it points to the object receiving the value of the expression. Once the above expression is evaluated, the object temp is created with the value of the expression stored in it.

temp

[1] -0.3521452
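Once the value is stored, it can be reused without retyping the expression. A minimal sketch, where temp_doubled and temp_rounded are names made up for this example:

```r
# Store the result once
temp <- (cos(15) + sin(15)) * log(25)

# Reuse the stored value in further calculations
temp_doubled <- temp * 2
temp_rounded <- round(temp, 2)

# The object now exists in the current environment
exists("temp")   # TRUE
```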

Vectors

The most basic type of R object is the vector. A vector is a group of primitive values of the same type. A vector can be a group of numbers, characters, or logical (Boolean) values. It is one of the building blocks of R objects. There are several types of vectors in R, which differ from each other in the type of elements they store. The most commonly used are numeric vectors, character vectors, and logical vectors.
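Because all elements of a vector must share one type, R converts mixed inputs to a common type when a vector is built. The short sketch below shows this behavior; the variable names are illustrative only.

```r
# Mixing a number, a string and a logical gives a character vector
mixed_chr <- c(1, "a", TRUE)
mixed_chr          # "1" "a" "TRUE"
class(mixed_chr)   # "character"

# Mixing numbers and logicals gives a numeric vector (TRUE -> 1, FALSE -> 0)
mixed_num <- c(1.5, TRUE, FALSE)
mixed_num          # 1.5 1.0 0.0
class(mixed_num)   # "numeric"
```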

Numeric Vector

A numeric vector is a vector of numeric values. Numbers are generally treated in R as numeric objects, which are double-precision real numbers. This means that even if you see a number as, say, 15, behind the scenes it is treated as a double-precision numeric object (such as 15.00). Open RStudio and type the following number in the console.

15

[1] 15

A numeric vector is the most frequently used data type and is the foundation of nearly all data analysis.

Please note that the above example may look like a scalar; however, in R there are no scalar objects. Scalars are treated as vectors of length 1.
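The claim that a single number is really a vector of length 1 can be checked directly. In this sketch, typeof() is used to show the underlying storage type; it is not introduced elsewhere in this article.

```r
x <- 15

is.vector(x)   # TRUE: a single number is still a vector
length(x)      # 1
typeof(x)      # "double": stored as a double-precision number
x[1]           # 15: it can be indexed like any other vector
```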

To create a vector named vec consisting of six numbers, namely 10.3, 3.4, 51, -34.1, 21, and -4.67, use the R function c( ).

vec <- c(10.3,3.4,51,-34.1,21,-4.67)
vec

[1]  10.30   3.40  51.00 -34.10
[5]  21.00  -4.67

The function c( ) is called the combine (or concatenate) function; it joins all of its elements end to end into a vector. There are different ways to use the c( ) function to create a vector.

We can create a single vector by combining multiple vectors:

combine <- c(c(1.5,2.34),c(-3.9,4.21))
combine

[1] 1.50 2.34 -3.90 4.21
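Another common use of c( ) is to extend an existing vector with extra values. A small sketch, with vec_longer and vec_front as hypothetical names:

```r
vec <- c(10.3, 3.4, 51)

# Append values to the end of an existing vector
vec_longer <- c(vec, 100, 200)
vec_longer           # 10.3 3.4 51.0 100.0 200.0

# Prepend a value to the front
vec_front <- c(0, vec)
vec_front            # 0.0 10.3 3.4 51.0

length(vec_longer)   # 5
```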

Integer Vector

If we explicitly want an integer, we need to specify the L suffix. So entering 1 in R gives a numeric object; entering 1L explicitly gives an integer object.

vec_int <- c(1L,2L,3L)
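The difference between a plain number and one written with the L suffix can be checked as follows. This is a small sketch using only base R functions.

```r
vec_int <- c(1L, 2L, 3L)
class(vec_int)       # "integer"
class(c(1, 2, 3))    # "numeric": no L suffix, so stored as doubles

is.integer(1)        # FALSE
is.integer(1L)       # TRUE

# The colon operator also produces integers
class(1:3)           # "integer"

# Mixing integers with a double gives a numeric vector
class(c(1L, 2.5))    # "numeric"
```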

Logical Vector

A logical vector stores a group of TRUE or FALSE values. We can use the c( ) function to create a logical vector.

logic <- c(TRUE,FALSE)
logic

[1] TRUE FALSE

If we use comparison operators, the output is a logical vector:

1 > 2

[1] FALSE

c(3,4,5)>c(1,2,10)

[1] TRUE TRUE FALSE
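Logical vectors produced by comparisons can be summarised directly, because TRUE counts as 1 and FALSE as 0 in arithmetic. A brief sketch, with values, limits and above as made-up names:

```r
values <- c(3, 4, 5)
limits <- c(1, 2, 10)

above <- values > limits
above        # TRUE TRUE FALSE

sum(above)   # 2: number of TRUE values
any(above)   # TRUE: at least one comparison holds
all(above)   # FALSE: not every comparison holds
```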

Character Vector

A character vector is a group of strings. Here, a character does not literally mean a single letter or symbol in a language; it means a string, that is, a group of letters and symbols. Both double and single quotation marks can be used to create a character vector, as follows:

"Plotly Analytics" # A Single string

[1] "Plotly Analytics"

char <- c('Plotly','Analytics') # character vector of multiple strings
char

[1] "Plotly" "Analytics"

Please note that if a number is enclosed in quotes (single or double), it is treated as a character object.

char_num <- c("1","2")
char_num

[1] "1" "2"
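Since each element of a character vector is a whole string, the number of elements and the number of letters are two different things. The sketch below contrasts length() with nchar().

```r
char <- c('Plotly', 'Analytics')

length(char)   # 2: number of strings in the vector
nchar(char)    # 6 9: number of characters in each string

# Single and double quotes create the same string
identical("1", '1')   # TRUE
```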

How to find out the class of a Vector?

The class of an object can be found using the class() function.

class(vec)

[1] "numeric"

class(vec_int)

[1] "integer"

class(char)

[1] "character"

class(logic)

[1] "logical"

If we need to check that an object is indeed a vector of a specific class, we can use the is.* family of functions.

is.numeric(vec)

[1] TRUE

is.character(vec)

[1] FALSE

is.logical(logic)

[1] TRUE
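One typical use of these checks is to test an input before computing with it. The following is a minimal sketch under the assumption that the input may arrive as text; input and total are hypothetical names.

```r
input <- c("4", "8", "15")

# Only sum directly when the input is already numeric
if (is.numeric(input)) {
  total <- sum(input)
} else {
  # Otherwise convert the strings to numbers first
  total <- sum(as.numeric(input))
}

total   # 27
```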

Object Attributes

Object attributes specify the kind of data represented by an object and give additional information about it. All objects have two intrinsic attributes: mode and length.

mode(vec)

[1] "numeric"

length(logic)

[1] 2

The length is the number of elements in the object. The mode is the basic data type of the elements of the object; there are four main modes: numeric, character, complex, and logical (FALSE or TRUE). Apart from the intrinsic attributes, some objects may have additional attributes, such as the dimension in the case of a matrix object.

temp_matrix <- matrix(c(1:6),nrow=2)
temp_matrix

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

attributes(temp_matrix)
$dim
[1] 2 3 # two rows and three columns

The action of a function on an object depends on the attributes of the object. For example, a numeric object cannot be added to a character object: 1 + "one" won’t work. Before performing the addition, R checks the type of the objects involved. If the types are not compatible with each other, as in this case, R produces an error. You don’t have to pay too much attention to object attributes; however, knowing about them may help in fixing programming errors related to object classes.

Some examples of R object attributes are:

  • length
  • mode
  • class
  • dimension (of matrix, array, list etc.)
  • names (of columns in data set)

Attributes of an object (if any) can be accessed using the attributes( ) function. If an object has no attributes, the attributes( ) function returns NULL.
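The following sketch compares a plain vector, a named vector, and a matrix to show when attributes( ) returns NULL and when it does not. The object names are illustrative.

```r
# A plain vector has no attributes
plain <- c(1, 2, 3)
attributes(plain)    # NULL

# A named vector carries a 'names' attribute
named <- c(a = 1, b = 2)
attributes(named)    # $names: "a" "b"

# A matrix carries a 'dim' attribute
m <- matrix(1:6, nrow = 2)
attr(m, "dim")       # 2 3
dim(m)               # 2 3: shortcut for the same attribute
```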

Converting vector classes (object coercion)

Vectors can be converted from one class into another using the as.* family of functions. This is called coercion, and here the objects are explicitly coerced from one class to another. For example, some data are string representations of numbers. If we leave these strings as they are, we won’t be able to perform numeric calculations with them.

strings <- c("1","2","3")
class(strings)

[1] "character"

Strings cannot be used to do maths directly.

We can use as.numeric to convert the character vector into a numeric vector.
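To see what happens when arithmetic is attempted on strings, the sketch below wraps the operation in tryCatch() so that the error message is captured instead of stopping the script. The name result is made up for this example.

```r
strings <- c("1", "2", "3")

# Adding a number to a character vector raises an error;
# tryCatch() captures the message as a string
result <- tryCatch(
  strings + 2,
  error = function(e) conditionMessage(e)
)

# In an English locale the message reads:
# "non-numeric argument to binary operator"
result
```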

string_to_number <- as.numeric(strings)
string_to_number

[1] 1 2 3

class(string_to_number)

[1] "numeric"

Now we can do addition with the numbers:

string_to_number + 2

[1] 3 4 5

Similar to as.numeric, we can use other as.* functions:

as.logical(c(1,0,1,-3)) # converts numbers into Boolean

[1] TRUE FALSE TRUE TRUE

as.character(c(1,2,3)) # converts numbers into characters

[1] "1" "2" "3"

as.character(c(TRUE,FALSE)) # converts Boolean into characters

[1] "TRUE" "FALSE"

When converting a numeric vector to a logical vector, the rule is that only 0 corresponds to FALSE; all non-zero numbers produce TRUE.

If the conversion is not possible, a missing value is produced instead, along with a warning. NA represents a missing value.

as.numeric(c("1","2","plotly"))

[1] 1 2 NA
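After a coercion like this, it is often useful to find out which entries failed. A minimal sketch, assuming the warning can safely be silenced because the NA values are inspected afterwards; raw and converted are hypothetical names.

```r
raw <- c("1", "2", "plotly")

# Silence the coercion warning; failures are checked explicitly below
converted <- suppressWarnings(as.numeric(raw))

is.na(converted)               # FALSE FALSE TRUE
raw[is.na(converted)]          # "plotly": the entry that could not be converted
sum(converted, na.rm = TRUE)   # 3: ignore missing values when summing
```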

Vector Arithmetic

Arithmetic operations on numeric vectors follow two simple rules:

  • Computation is done element by element
  • The shorter vector is recycled

c(1, 2, 3, 4) + c(5,6,7,8) # Addition

[1] 6 8 10 12

c(1, 2, 3, 4) * c(5,6,7,8) # Multiplication

[1] 5 12 21 32

c(1, 2, 3, 4) / c(5,6,7,8) # Division

[1] 0.2000000 0.3333333 0.4285714 0.5000000

c(1, 2, 3, 4) %% 2 # Modulus

[1] 1 0 1 0

c(1, 2, 3, 4) ^ 2 # Square

[1] 1 4 9 16

R likes to operate on vectors of the same length, so if it encounters two vectors of different lengths, it replicates (recycles) the shorter vector until it is the same length as the longer vector, and then performs the operation.

print(a <- c(1:10))

[1] 1 2 3 4 5 6 7 8 9 10

print(b <- c(20:25))

[1] 20 21 22 23 24 25

print(a+b)

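[1] 21 23 25 27 29 31 27 29 31 33

Here the shorter vector b is recycled: its first four elements (20 to 23) are reused for the last four elements of a. Because the length of a (10) is not a multiple of the length of b (6), R also prints a warning.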
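Recycling can be made visible by building the recycled vector by hand with rep(). The sketch below does this for the a and b example and also shows two cases where the lengths divide evenly, so no warning is raised. The name b_recycled is an assumption for this example.

```r
# Lengths divide evenly: the shorter vector is reused without a warning
c(1, 2, 3, 4) + c(10, 20)   # 11 22 13 24
c(1, 2, 3, 4) * 2           # 2 4 6 8

# Reproduce the recycling of b by hand
a <- 1:10
b <- 20:25
b_recycled <- rep(b, length.out = length(a))
b_recycled                  # 20 21 22 23 24 25 20 21 22 23

# Same result as a + b (the warning is silenced here)
identical(a + b_recycled, suppressWarnings(a + b))   # TRUE
```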

Named Vector

When the elements of a vector have names, the vector is known as a named vector. We can give names to a vector when we create it. Naming the elements helps us write readable code and self-describing objects.

name_vector <- c(a=1,b=2,c=3)
name_vector

a b c
1 2 3

Using names, we can access a vector element with a single-valued character vector. Please note that the name of the element should be written in quotes inside square brackets.

name_vector["b"]

b
2

We can access multiple elements with a character vector.

name_vector[c("a","c")]

a c
1 3

We can get the names of a vector’s elements with the names( ) function.

names(name_vector)

[1] "a" "b" "c"

If the names are no longer needed, we can simply remove the vector’s names using NULL, a special object that represents an undefined value:

names(name_vector) <- NULL
name_vector

[1] 1 2 3
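Names can also be attached or changed after a vector has been created, using the same names( ) function on the left-hand side of an assignment. A short sketch with a hypothetical prices vector:

```r
# Create an unnamed vector, then name its elements
prices <- c(1.5, 2.25, 3)
names(prices) <- c("apple", "banana", "cherry")
prices["banana"]              # element named "banana", value 2.25

# Rename a single element by position
names(prices)[2] <- "blueberry"
names(prices)                 # "apple" "blueberry" "cherry"
```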

Extracting vector elements

Extracting vector elements is also called subsetting. Vectors can be subset using square brackets [ ]. The example below is for a numeric vector; the same concept applies to other types of vectors as well.

sub_vec <- c(10:20)
sub_vec

[1] 10 11 12 13 14 15 16 17 18 19 20

sub_vec[1] # extract the first element
[1] 10

sub_vec[5] # extract the fifth element
[1] 14

We can pass an integer sequence to the [ ] operator to subset multiple elements.

sub_vec[c(1:4)]

[1] 10 11 12 13

The sequence does not have to be in order; you can specify any arbitrary integer vector.

sub_vec[c(1,3,6,9)]

[1] 10 12 15 18

We can use conditions to subset a vector. For example, if we want the elements greater than 15, we can use the > comparison operator as below:

sub_vec[sub_vec > 15]

[1] 16 17 18 19 20
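Conditions can be combined or replaced by any expression that yields a logical vector. The sketch below shows two more conditions on the same vector and uses which() to return positions rather than values.

```r
sub_vec <- 10:20

# Combine two conditions with & (element-wise AND)
sub_vec[sub_vec > 12 & sub_vec < 16]   # 13 14 15

# Keep only the even elements
sub_vec[sub_vec %% 2 == 0]             # 10 12 14 16 18 20

# which() gives the positions where the condition is TRUE
which(sub_vec > 15)                    # 7 8 9 10 11
```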

Extracting elements from a named vector is a little tricky.

sub_vec_name <- c(a=10,b=15,c=20,d=25)
sub_vec_name

 a  b  c  d
10 15 20 25

If we use single square brackets, as we did in our last example, we will get the name of the vector element and its value.

sub_vec_name[1]

a
10

If we use double square brackets, we will get only the value of the vector element.

sub_vec_name[[1]]

[1] 10

The metaphor of candy boxes makes this easier to understand. The expression sub_vec_name[1] gives you the box of candy labeled “a” together with its contents, while sub_vec_name[[1]] gives you only the contents of the box.
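The difference between the two bracket forms can be verified by checking whether the result still carries a name. The same works with a name instead of a position, as this sketch shows.

```r
sub_vec_name <- c(a = 10, b = 15, c = 20, d = 25)

# Single brackets keep the label on the box
names(sub_vec_name["b"])      # "b"

# Double brackets return only the contents
names(sub_vec_name[["b"]])    # NULL
sub_vec_name[["b"]]           # 15

# unname() strips the labels from several elements at once
unname(sub_vec_name[c("a", "b")])   # 10 15
```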

Summary

Objects are the fundamental building blocks of the R language. In R, variables are stored as objects. Objects are created using the assignment operator (<-). The most basic type of R object is the vector. A vector is a group of primitive values of the same type. The most commonly used vectors are numeric vectors, character vectors, and logical vectors. We created these vectors and performed arithmetic operations on them.

We subset the vector elements, executed coercion operations (converting one vector class to another) with the as.* family of functions, and checked vector attributes.

In the next article, we will see a higher-dimensional object built on the vector: a matrix. Matrices are widely used in linear system modeling. We will create a matrix and learn some of its properties.
