How to import and export CSV files in R?


The first step in any kind of data analysis in R is to load the data, that is, to import a dataset into the R environment. There are a variety of data file formats you can import into R, including:

  • SAS
  • SPSS
  • MATLAB
  • XLSX
  • XLS
  • delimited text files (CSV, TSV, etc.)
  • and many more

Among all the file types used to store data, perhaps the most widely used one is CSV. In this article, we will learn to import a CSV file into the R environment and to export a data frame back to a CSV file. So let’s get started.

In a typical CSV file, the first line is the header of columns, and each subsequent line represents a data record with columns separated by commas. Here is an example.

Name,Gender,Age,Major
Ken,Male,24,Finance
Ashley,Female,25,Statistics
Jennifer,Female,23,Computer Science

The simplest built-in function for importing a CSV file is read.csv().

read.csv("C:\\...Your path...\\students_data.csv",stringsAsFactors = FALSE)


Please note that you can write the file path with either double backslashes (\\) or single forward slashes (/). The whole file path must be enclosed in quotes, and the file name must include its file extension.
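As a minimal, self-contained sketch of the call above, the following writes the sample records to a temporary file and reads them back with read.csv(). The temporary file and the names csv_path and students are assumptions made for illustration; in practice you would pass the path to your own file.

```r
# Write the sample records to a temporary CSV file
# (an illustrative stand-in for a real file on disk).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Name,Gender,Age,Major",
  "Ken,Male,24,Finance",
  "Ashley,Female,25,Statistics",
  "Jennifer,Female,23,Computer Science"
), csv_path)

# Read the file; stringsAsFactors = FALSE keeps text columns as character.
students <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect the column names and types of the imported data frame.
str(students)
```

Calling str() right after an import is a quick way to confirm that each column was read with the type you expect.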

Technically, CSV is a delimited data format that uses a comma (,) to separate columns and a new line to separate rows. More generally speaking, any character can serve as the column separator or the row separator. For such files, we use the more general read.table() function, of which read.csv() is a wrapper with defaults suited to CSV.

read.table("C:\\...Your path...\\students_data.csv",sep = ",",header=TRUE)

The sep argument specifies the field separator character, that is, the delimiter that separates the values on each line of the file. The argument header=TRUE tells read.table() that the first line of the file contains the names of the variables.
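To illustrate a separator other than the comma, here is a small sketch that reads a tab-separated version of the same records. The temporary file and the names tsv_path and students_tsv are hypothetical and used only for this example.

```r
# Write the sample records as a tab-separated file
# (illustrative temporary file).
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "Name\tGender\tAge\tMajor",
  "Ken\tMale\t24\tFinance",
  "Ashley\tFemale\t25\tStatistics",
  "Jennifer\tFemale\t23\tComputer Science"
), tsv_path)

# sep = "\t" declares the tab as the field separator;
# header = TRUE takes the variable names from the first line.
students_tsv <- read.table(tsv_path, sep = "\t", header = TRUE,
                           stringsAsFactors = FALSE)

print(students_tsv)
```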

The readr package is another good choice for importing tabular data in a fast and consistent manner. Install the readr package, load it with library(readr), and then you can use the read_* family of functions to import the data.

read_csv("C:\\...Your path...\\students_data.csv")


The read_csv() function has additional handy arguments you can use while importing the data. These arguments include:

  • n_max: The maximum number of rows to read
  • skip: Number of lines to skip before reading the data
  • na: Strings to interpret as missing values

Here is an example:

read_csv("C:\\...Your path...\\students_data.csv",skip=1,n_max=2,na="empty",col_names = FALSE)

In this example we used:

  • skip = 1, to skip the first line of the file (here, the header line)
  • n_max = 2, to read at most 2 rows from the data set
  • na = "empty", to treat the string "empty" as a missing value (NA)
  • col_names = FALSE, to tell read_csv() that the first line it reads is data rather than column names, so the columns are named X1, X2, and so on
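The effect of these arguments can be sketched on a small temporary file. This assumes the readr package is installed. The file contents, including the literal string "empty" standing in for a missing age, and the names csv_path, first_two and no_header are made up for illustration.

```r
library(readr)

# Sample file in which one Age value is recorded as the string "empty".
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Name,Gender,Age,Major",
  "Ken,Male,24,Finance",
  "Ashley,Female,empty,Statistics",
  "Jennifer,Female,23,Computer Science"
), csv_path)

# Keep the header, read at most two data rows, and treat "empty" as NA.
first_two <- read_csv(csv_path, n_max = 2, na = "empty")
print(first_two)

# Skip the header line and read the next two lines as data;
# with col_names = FALSE the columns are named X1, X2, X3, X4.
no_header <- read_csv(csv_path, skip = 1, n_max = 2,
                      na = "empty", col_names = FALSE)
print(no_header)
```

Because "empty" is declared as a missing-value string, the Age column can still be parsed as numeric instead of falling back to character.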

Sometimes the data set comes in an irregular format. The file content looks quite standard and tabular, but the number of spaces between columns varies from row to row. In this case we can use the read_table() function, which is designed for whitespace-separated files and can cope with these irregularities.
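As a rough sketch, assuming the readr package is installed, the following reads a small whitespace-separated file in which the gaps between columns differ from row to row. The file contents and the names txt_path and spaced are hypothetical. The values here contain no spaces themselves, since a space inside a value would be read as a column break.

```r
library(readr)

# Whitespace-separated file: the columns line up, but the number of
# spaces between values differs from row to row.
txt_path <- tempfile(fileext = ".txt")
writeLines(c(
  "Name      Gender  Age",
  "Ken       Male    24",
  "Ashley    Female  25",
  "Jennifer  Female  23"
), txt_path)

# read_table() splits each line on runs of whitespace.
spaced <- read_table(txt_path)
print(spaced)
```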

The functions in readr are fast and consistent, and they support the features of the built-in functions while being easier to use.

A typical procedure in data analysis is importing data from a data source, transforming the data, applying appropriate tools and models, and finally creating some new data to be stored for decision making. The interface for writing data to a file is similar to that for reading data: we use the write.* functions to export a data frame to a file.

write.csv(your_data,"C:\\...Your path...\\write1.csv")


The write.csv() function allows us to modify the writing behavior. The file written by the call above contains some unnecessary components. For example, we don’t usually want the row names to be exported, and we don’t need quotation marks around string values. We can run the following code to export the same data frame with the behavior we want.

write.csv(your_data,"C:\\...Your path...\\write1.csv",quote = FALSE,row.names = FALSE)


Now the output is a simplified CSV file.
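The difference between the two calls can be seen by writing a small data frame both ways and printing the raw lines of each file. The data frame contents and the names default_path and plain_path are assumptions made for this sketch.

```r
# A small illustrative data frame.
your_data <- data.frame(
  Name   = c("Ken", "Ashley"),
  Gender = c("Male", "Female"),
  Age    = c(24, 25),
  Major  = c("Finance", "Statistics"),
  stringsAsFactors = FALSE
)

# Default behavior: row names are written and strings are quoted.
default_path <- tempfile(fileext = ".csv")
write.csv(your_data, default_path)
default_lines <- readLines(default_path)
print(default_lines)

# Simplified output: no row names and no quotation marks.
plain_path <- tempfile(fileext = ".csv")
write.csv(your_data, plain_path, quote = FALSE, row.names = FALSE)
plain_lines <- readLines(plain_path)
print(plain_lines)
```

Note that quote = FALSE is only safe when no value contains a comma, because an unquoted comma would be read back as a column separator.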

In this article, we looked at some important functions for importing and exporting CSV files. We learned about:

  • Importing a CSV file using the read.csv() function
  • Importing a CSV file using the readr package
  • Using various options while importing a CSV file with the readr package
  • Writing a CSV file using the base write.csv() function
