Handling Date and Time data in R – Part 1

W. Avison Railway clock

A time series is a list of observations ordered successively over time. In a time series, observations are often recorded at evenly-spaced intervals, such as daily, weekly, monthly, and so on.

This is the first article in a two-part series on handling date and time data in R. In this article, we will look at how date and time objects are stored in R. In the second part, we will discuss functions for extracting and manipulating the components of date and time objects.

Let’s look at some time series data sets.

The AirPassengers data set contains monthly airline passenger numbers from 1949 to 1960. The month is the time component, and the number of passengers who traveled in that month is the measurement.

plot(AirPassengers)

Time series plot for the AirPassengers data set
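
To see how the time component is attached to this data set, the ts object can be inspected directly. The following is a minimal sketch using the base R functions start(), end() and frequency(); the variable names ap_start, ap_end and ap_freq are only illustrative.

```r
# AirPassengers ships with base R (datasets package) as a "ts" object
class(AirPassengers)                  # "ts"

# First and last observation, each as c(year, month)
ap_start <- start(AirPassengers)      # 1949 1
ap_end   <- end(AirPassengers)        # 1960 12

# Number of observations per year (12 means monthly data)
ap_freq <- frequency(AirPassengers)   # 12
```

A ts object does not store one date per observation; it stores the start, the end and the frequency, and derives the time index from them.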

Time series data is widely used in finance. The EuStockMarkets data set contains the daily closing prices of major European stock market indices from 1991 to 1998. Here the day is the time component, and the closing price on that day is the measurement.

plot(EuStockMarkets)

EuStockMarkets data

In this article, we will learn to handle dates and times in R. We will look at how R handles date/time objects internally and learn about classes such as POSIXct and POSIXlt, which are designed to handle date-time objects.

Date formats vary from country to country. For example, in India, where I live, we write the day of the month first, then the month, and then the year. In the US, the convention is month first, then day of the month, then year.

To represent dates in a unified manner, there is a global standard called ISO 8601, which specifies an unambiguous way to write dates.

As per ISO 8601, the date components are ordered from the largest unit to the smallest, that is, year, month, then day. The standard also specifies that each component has a fixed number of digits: four for the year and two each for the month and the day. If the day or month needs only one digit, you must pad it with a leading zero. A separator is optional, but if you use one in a date, it must be a dash. So the 1st of January 2024 in ISO 8601 is 2024 (dash) 01 (dash) 01, that is, 2024-01-01.

ISO-8601 Date format
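
As a small illustration of these rules, the sketch below builds ISO 8601 date strings in base R with sprintf() and format(). The variable names are only illustrative.

```r
# Build an ISO 8601 date string from numeric components,
# zero-padding the year to 4 digits and month/day to 2 digits
iso_string <- sprintf("%04d-%02d-%02d", 2024L, 1L, 1L)
iso_string                                              # "2024-01-01"

# format() on a Date object with an explicit ISO pattern
iso_from_date <- format(as.Date("2024-01-01"), "%Y-%m-%d")

# The same date written without separators
iso_basic <- format(as.Date("2024-01-01"), "%Y%m%d")    # "20240101"
```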

Let us look at the current date and time. Sys.Date() returns the current date.

Sys.Date()
[1] "2024-10-08"

The unclass() function shows how dates are stored internally. R represents every date as the number of days since 1970-01-01.

unclass(Sys.Date())
[1] 20004

Dates before the origin are stored as negative numbers.

unclass(as.Date("1947-08-15"))
[1] -8175
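
The day-count representation can be checked in both directions. The following is a minimal sketch in base R; the origin is passed explicitly to as.Date() because older R versions require it for numeric input.

```r
origin <- as.Date("1970-01-01")
as.numeric(origin)                                         # 0

# Subtracting two dates gives a difference in days
days_since <- as.numeric(as.Date("2024-10-08") - origin)   # 20004

# Going the other way: a day count back to a Date
back <- as.Date(20004, origin = "1970-01-01")              # "2024-10-08"

# Adding a number to a Date moves it by that many days
next_week <- as.Date("2024-10-08") + 7                     # "2024-10-15"
```

Because a Date is just a number underneath, date arithmetic such as adding days or taking differences works without any special functions.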

When we import time series data into R, the date variable is most often imported as a character variable. It takes a little work to get R to recognize it as a date. If you type 2024-10-06 in the console without quotes, R will interpret the dashes as subtraction and you will end up with a number.

date1 <- 2024-10-06
date1
[1] 2008

In the following example, the value certainly looks like a date, but R still treats it as a character string.

date2 <- "2024-10-06"
date2
str(date2)

[1] "2024-10-06"
 chr "2024-10-06"

For R to recognize a date as a Date object, we have to call the as.Date() function on the character string. as.Date() takes a character string and turns it into a Date object.

date3 <- as.Date(date2)
date3
str(date3)

[1] "2024-10-06"
 Date[1:1], format: "2024-10-06"

Please note that, by default, as.Date() only recognizes character dates written in year-month-day order, such as "2024-10-06" or "2024/10/06". We will see how to handle dates in other formats in later sections.
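
As a brief preview, as.Date() accepts a format argument that describes the layout of the input string. The sketch below parses the same calendar day written in the Indian and US conventions mentioned earlier; the example strings and variable names are only illustrative.

```r
# Day/month/year, as commonly written in India
d_india <- as.Date("06/10/2024", format = "%d/%m/%Y")

# Month/day/year, as commonly written in the US
d_us <- as.Date("10/06/2024", format = "%m/%d/%Y")

d_india                        # "2024-10-06"

# Both strings describe the same day once the format is stated
same_day <- d_india == d_us    # TRUE
```

Here %d is the day of the month, %m the month number and %Y the four-digit year.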

ISO 8601 also defines a standard format for datetimes. Just as with dates, a time is written from the largest unit to the smallest using a fixed number of digits, with the units optionally separated by colons. When combined with a date, the time is sometimes prefixed with the character T.

Sys.time() returns the current date and time along with the timezone.

Sys.time()

[1] "2024-10-08 14:59:28 IST"

IST (Indian Standard Time) is the local timezone here.

ISO 8601 datetime format
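
The T-separated form can be read and written in base R by spelling out the layout. The following is a minimal sketch; the timestamp is an arbitrary example, and tz = "UTC" is chosen only so that the result does not depend on the local machine.

```r
# Parse an ISO 8601 datetime that uses the "T" separator
dt <- as.POSIXct("2024-10-08T14:59:28",
                 format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# Write it back out in the same ISO 8601 layout
iso_dt <- format(dt, "%Y-%m-%dT%H:%M:%S")   # "2024-10-08T14:59:28"
```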

There are two built-in classes for datetimes in R: POSIXlt and POSIXct. POSIXlt stores a datetime as a list of vectors, with one component for each unit, whereas POSIXct stores it as the number of seconds since the beginning of 1970.

The POSIX in these class names refers to the POSIX family of standards, which aims to keep behaviour, including time handling, compatible across operating systems.

The function as.POSIXct() takes a character string and turns it into a POSIXct object. Just like as.Date(), as.POSIXct() by default expects the date in ISO order (year, month, day), optionally followed by a time.

date4 <- as.POSIXct(date2)
date4
str(date4)

[1] "2024-10-06 IST"
 POSIXct[1:1], format: "2024-10-06"

unclass(date4)

[1] 1728153000
attr(,"tzone")
[1] ""

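Because date4 is of class POSIXct, it is stored as the number of seconds since the beginning of 1970 (1970-01-01 00:00:00 UTC). The value above corresponds to midnight on 2024-10-06 in the local timezone (IST).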
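
The effect of the timezone on the stored number can be made explicit by passing tz. This is a sketch that assumes the timezone name "Asia/Kolkata" is available in the system's timezone database, which is the case on most installations.

```r
# Midnight on 2024-10-06 in UTC
utc_secs <- as.numeric(as.POSIXct("2024-10-06", tz = "UTC"))            # 1728172800

# Midnight on 2024-10-06 in India happens 5.5 hours earlier
ist_secs <- as.numeric(as.POSIXct("2024-10-06", tz = "Asia/Kolkata"))   # 1728153000

offset <- utc_secs - ist_secs    # 19800 seconds = 5 hours 30 minutes

# Converting a second count back to a datetime
epoch <- as.POSIXct(0, origin = "1970-01-01", tz = "UTC")   # "1970-01-01 UTC"
```

The same character string therefore maps to different stored values depending on the timezone in which it is interpreted.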

Because of its simpler structure, POSIXct is used more frequently than POSIXlt to store time series data in data frames.
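
A minimal sketch of a POSIXct column in a data frame follows. The data frame obs and its values are made up purely for illustration.

```r
# Hypothetical hourly readings with a POSIXct time column
obs <- data.frame(
  time  = as.POSIXct(c("2024-10-06 09:00:00", "2024-10-06 10:00:00"),
                     tz = "UTC"),
  value = c(12.5, 13.1)
)

class(obs$time)                 # "POSIXct" "POSIXt"

# Each timestamp is a single number, so differences are easy to compute
gap_secs <- as.numeric(diff(obs$time), units = "secs")   # 3600
```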

As we said earlier, POSIXlt stores date/time components in a list and these can be extracted. Let us look at the date/time components returned by POSIXlt using unclass().

date4 <- as.POSIXlt(date2)
unclass(date4)

$sec
[1] 0

$min
[1] 0

$hour
[1] 0

$mday
[1] 6

$mon
[1] 9

$year
[1] 124

$wday
[1] 0

$yday
[1] 279

$isdst
[1] 0

$zone
[1] "IST"

$gmtoff
[1] NA

attr(,"tzone")
[1] ""      "IST"   "+0630"
attr(,"balanced")
[1] TRUE
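
Some of these components use offsets that are easy to misread: mon counts months from 0, year counts years since 1900, and yday counts days of the year from 0. The sketch below extracts the components with $ and converts them to the usual calendar values; tz = "UTC" is used only to keep the example independent of the local machine.

```r
lt <- as.POSIXlt("2024-10-06", tz = "UTC")

year_full  <- lt$year + 1900   # 2024 (year is stored as years since 1900)
month_num  <- lt$mon + 1       # 10   (months are numbered 0 to 11)
day_num    <- lt$mday          # 6
weekday    <- lt$wday          # 0    (0 = Sunday)
day_of_yr  <- lt$yday + 1      # 280  (yday counts from 0)

# A POSIXlt object can be converted to POSIXct when a single number is needed
ct <- as.POSIXct(lt)
class(ct)                      # "POSIXct" "POSIXt"
```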

A time series is a list of observations ordered successively over time. In this article, we learned:

  • Dates are represented in a unified manner using a global standard called ISO 8601. As per ISO 8601, the date components are ordered from the largest unit to the smallest, that is, year, month, then day.
  • ISO 8601 also represents time in a unified manner. A time is written from the largest unit to the smallest using a fixed number of digits.
  • There are two built-in classes for datetimes in R: POSIXlt and POSIXct. POSIXlt stores a datetime as a list of vectors, with one component for each unit, whereas POSIXct stores it as the number of seconds since the beginning of 1970.
