Handling Dates and Times

Authors

Jose Storopoli

Kevin Bonham

Juan Oneto

In this tutorial, we turn our attention to how to handle and parse dates and times in Julia.

Let’s begin with the standard library module Dates.

1 📅 Standard Library Dates Module

Contrary to R, Julia treats dates as a specific type in the standard library. This allows for a much richer and powerful representation and manipulation of dates and times.

For any date or time manipulation in Julia, you should first import the Dates module from Julia’s standard library:

using Dates

R treats dates as double precision floating point numbers:

> typeof(as.Date("2021-01-01"))
[1] "double"

Compare this with Julia:

typeof(Date("2021-01-01"))
Date

1.1 Date/DateTime Types

The Dates module from Julia’s standard library has two main types:

  • Date: represents a given date within a day precision
  • DateTime: represents a given time within a millisecond

You can construct them from the Date() and DateTime() constructors by passing a string:

Date("2021-01-01")
2021-01-01
DateTime("20201-01-01T08:00:00")
20201-01-01T08:00:00

They also accept Period types, which are divided into DatePeriod and TimePeriod.

using InteractiveUtils
subtypes(Period)
2-element Vector{Any}:
 DatePeriod
 TimePeriod

DatePeriod types are related to dates:

subtypes(DatePeriod)
5-element Vector{Any}:
 Day
 Month
 Quarter
 Week
 Year

And TimePeriod types are related to time:

subtypes(TimePeriod)
6-element Vector{Any}:
 Hour
 Microsecond
 Millisecond
 Minute
 Nanosecond
 Second

So for example, we can create the Date object for the 1st of January of 2021 using Period types:

Date(Day(01), Month(01), Year(2021))
2021-01-01

Similarly, we can create the DateTime object for the same date with time as 8AM:

DateTime(Day(01), Month(01), Year(2021), Hour(8), Minute(0), Second(0))
2021-01-01T08:00:00

Here is a Nice Diagram for the Period subtypes. The ellipses are abstract types and the rectangles are concrete types:

Period types

1.2 Parsing Strings as Date/DateTime

Most often, we just want to parse strings as either Date or DateTime objects.

If you are really lucky the string will be well-behaved and parsed automatically, like in our previous examples. However, that is often not the case. So, we have to learn how to parse different string formats and standards.

This can be done with the dateformat"" string literal. Date() and DateTime() constructors accept a second argument that specifies what is the string format to parse correctly the string.

For example, the string "01/Jan/2021" can be parsed with:

Date("01/Jan/2021", dateformat"dd/u/yyyy")
2021-01-01

Or even "Jan-1-21/8AM":

DateTime("Jan-1-21/8AM", dateformat"u-d-yy/Hp")
0021-01-01T08:00:00
Tip

Julia’s documentation on DateFormat is the best resource to learn how to construct your dateformat"" string to represent a given string format.

In fact, the table below is taken from there:

Code Matches Comment
y 1996, 96 Returns year of 1996, 0096
Y 1996, 96 Returns year of 1996, 0096. Equivalent to y
m 1, 01 Matches 1 or 2-digit months
u Jan Matches abbreviated months according to the locale keyword
U January Matches full month names according to the locale keyword
d 1, 01 Matches 1 or 2-digit days
H 00 Matches hours (24-hour clock)
I 00 For outputting hours with 12-hour clock
M 00 Matches minutes
S 00 Matches seconds
s .500 Matches milliseconds
e Mon, Tues Matches abbreviated days of the week
E Monday Matches full name days of the week
p AM Matches AM/PM (case-insensitive)
yyyymmdd 19960101 Matches fixed-width year, month, and day

1.3 Extracting Information from Date/DateTime

Once we have parsed a string into Date and DateTime objects, we can also extract relevant time and date information as integers.

This can be done with the “Accessor Functions”:

  • year(), month(), and Day() for dates
  • hour(), minute(), and second() for time
jan_1 = Date("2021-01-01")
eight_am = DateTime("2021-01-01T08:00:00")
year(jan_1)
2021
month(jan_1)
1
day(jan_1)
1
hour(eight_am)
8
minute(eight_am)
0
second(eight_am)
0
Note

Remember that in Julia, by convention, anything in CamelCase is either a type, package, module, or a constructor, and anything in snake_case is a function/method or variable.

1.4 Date/DateTime Operations

We can perform two operations on Date and DateTime objects:

  1. Addition with +: this gives back another Date or DateTime object.
  2. Subtraction with -: this does not return a Date or DateTime, but a Period type to represent a given duration.

Here is an example with addition, note that we have to add using Period types:

jan_1 + Day(8)
2021-01-09
Tip

Now we are using the constructor Day() for the Period type instead of the accessor function day().

We can add any period we want, and it also works with DateTime:

jan_1 + Month(1) + Day(1)
2021-02-02
eight_am + Hour(2) + Minute(30)
2021-01-01T10:30:00
Caution

Notice that we cannot add a Period to a type that cannot represent it.

The following will error, because jan_1 (a Date object) cannot represent time periods:

jan_1 + Hour(1)
MethodError: MethodError(+, (Date("2021-01-01"), Hour(1)), 0x000000000000831a)

Now let’s show some examples with Date/DateTime subtractions.

First, let’s perform a subtraction on Date/DateTime objects:

jan_9 = Date("2021-01-09")
2021-01-09
duration = jan_1 - jan_9
-8 days

Notice that it returns a Period object:

typeof(duration)
Day

We can also use this to add/subtract Period types:

duration + Day(1)
-7 days
duration - Day(1)
-9 days

Or chain then together:

duration + Month(1) - Day(5)
1 month, -13 days

1.5 Rounding Periods

Suppose you want to convert a duration (which is a Period object) into another Period object that has less resolution. For example, we want to convert Seconds to Hours. This might be complicated because we have to make sure that we do not bump into an InexactError.

Tip

InexactError is an error thrown when Julia cannot exactly convert a certain value to a desired type.

Let’s try to convert 61 seconds into a minute:

Minute(Second(61))
InexactError: InexactError(:divexact, Int64, 1.0166666666666666)

We get that InexactError we mentioned above!

To remedy this, we need to round somehow our Second type to a desired Minute type. This can be accomplished in 3 ways:

  • round: rounding to the nearest integer.
  • floor: rounding down.
  • ceil: rounding up.
round(Second(61), Minute)
1 minute
floor(Second(61), Minute)
1 minute
ceil(Second(61), Minute)
2 minutes

1.6 Date/DateTime Intervals

One of the most amazing things we can do with the Dates modules is to use the : operator to create Date/DateTime Intervals.

This is one of the multiple uses of multiple dispatch that we’ve shown in our previous tutorial on functions.

For example, creating an interval from January 1st to January 10th is easy. We just need to sandwich in what period we want our interval to have as a “step size”:

my_date_interval = Date("2021-01-01"):Day(1):Date("2021-01-10")
Dates.Date("2021-01-01"):Dates.Day(1):Dates.Date("2021-01-10")

We can also convert our interval to a vector with collect():

my_dates = collect(my_date_interval)
first(my_dates, 5)
5-element Vector{Date}:
 2021-01-01
 2021-01-02
 2021-01-03
 2021-01-04
 2021-01-05

Also, there is nothing special with the Day(1) as our Period “step size”, we can use whatever we fancy:

Date("2021-01-01"):Day(3):Date("2021-01-10")
Dates.Date("2021-01-01"):Dates.Day(3):Dates.Date("2021-01-10")
Date("2021-01-01"):Week(2):Date("2021-07-01")
Dates.Date("2021-01-01"):Dates.Week(2):Dates.Date("2021-06-18")
Date("2021-01-01"):Month(3):Date("2021-07-01")
Dates.Date("2021-01-01"):Dates.Month(3):Dates.Date("2021-07-01")
Tip

There is also the range() function that gives you the ability to construct an interval by setting the arguments length() with an additional step size:

range(Date("2021-01-01"), length = 10, step = Day(1))

All of our accessor functions and dates/time operations work on these intervals with the caveat that we need to vectorize (or broadcast) them:

day.(my_date_interval)
10-element Vector{Int64}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
month.(my_date_interval)
10-element Vector{Int64}:
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
my_date_interval .+ Day(1)
10-element Vector{Date}:
 2021-01-02
 2021-01-03
 2021-01-04
 2021-01-05
 2021-01-06
 2021-01-07
 2021-01-08
 2021-01-09
 2021-01-10
 2021-01-11
my_date_interval .+ Month(1)
10-element Vector{Date}:
 2021-02-01
 2021-02-02
 2021-02-03
 2021-02-04
 2021-02-05
 2021-02-06
 2021-02-07
 2021-02-08
 2021-02-09
 2021-02-10
Note

For the quick-witted, there is more than meets the eye in these accessor functions. They are really helpful and powerful when grouping and summarizing time series data.

2 📁 dateformat in CSV.jl

The CSV.File constructor and CSV.read() function from CSV.jl have a keyword argument dateformat that allows a string representing the same dateformat you would include inside the dateformat"" string literal.

It will automatically detect any column that has strings with the specified dateformat and parse them as Date or DateTime objects.

For example, you can read a CSV File into a DataFrame where you would have a column of string as dates with the following format "yyyy/mm/dd":

df = CSV.read("file.csv", DataFrame; dateformat = "yyyy/mm/dd")

3 📁 Dates in XLSX.jl

XLSX.jl does not have an explicit argument for date or time parsing. Instead, if you use the keyword argument infer_eltypes=true, any date or time in the original Excel file will be parsed as a column of Date or DateTime objects:

df = DataFrame(XLSX.readtable("myfile.xlsx", "mysheet"; infer_eltypes = true))
first(df, 5)

4 💾 Using with DataFrames

How do we handle dates and times in a DataFrame?

This is simple. All you need you’ve already learned in this tutorial!

Let us show you. First, let’s create a dummy DataFrame with some fake time series data:

using DataFramesMeta
df = DataFrame(;
    Date = rand(Date("2021-01-01"):Day(1):Date("2021-12-31"), 1_000),
    Value = rand(0.0:1.0, 1_000),
)
first(df, 5)
5×2 DataFrame
Row Date Value
Date Float64
1 2021-05-12 1.0
2 2021-09-26 0.0
3 2021-12-30 0.0
4 2021-05-14 0.0
5 2021-10-19 0.0

First let’s learn how to group data by some specific period using the accessor functions.

We begin by creating a new column to store the day of the week and the month:

@rtransform! df :DayWeek = dayofweek(:Date) :Month = month(:Date)
first(df, 5)
5×4 DataFrame
Row Date Value DayWeek Month
Date Float64 Int64 Int64
1 2021-05-12 1.0 3 5
2 2021-09-26 0.0 7 9
3 2021-12-30 0.0 4 12
4 2021-05-14 0.0 5 5
5 2021-10-19 0.0 2 10

Now we can group our data by the new column :Month.

First, let’s import the mean() function from the Statistics module from Julia’s standard library:

using Statistics: mean

@chain df begin
    groupby(:Month)
    @combine :mean_value = mean(:Value)
end
12×2 DataFrame
Row Month mean_value
Int64 Float64
1 1 0.535714
2 2 0.526316
3 3 0.516854
4 4 0.439024
5 5 0.483871
6 6 0.421875
7 7 0.55814
8 8 0.529412
9 9 0.517647
10 10 0.473118
11 11 0.52439
12 12 0.54321

Here is a handy tip for you to recover only weekdays from Dates objects. Just use the following filter: dayofweek(your_date) <= 5.

my_df = @rsubset df dayofweek(:Date) <= 5
first(my_df, 5)
5×4 DataFrame
Row Date Value DayWeek Month
Date Float64 Int64 Int64
1 2021-05-12 1.0 3 5
2 2021-12-30 0.0 4 12
3 2021-05-14 0.0 5 5
4 2021-10-19 0.0 2 10
5 2021-02-15 1.0 1 2
Tip

For more complex stuff, like business days, you can use the package BusinessDays.jl.

Also, we can combine the previous two operations into a single one: we filter only the values occurring in weekdays followed by a group by, and then summarize by month.

@chain df begin
    @rsubset dayofweek(:Date) <= 5
    groupby(:Month)
    @combine :mean_value = mean(:Value)
end
12×2 DataFrame
Row Month mean_value
Int64 Float64
1 1 0.555556
2 2 0.561404
3 3 0.5
4 4 0.40625
5 5 0.460317
6 6 0.44186
7 7 0.557377
8 8 0.490566
9 9 0.523077
10 10 0.45
11 11 0.47541
12 12 0.517241

5 ✉️ Post Scriptum

We hope that you are as marveled as we are with the elegance and simplicity of the Julia Dates module and its types. It is really easy and a joy to work with Date and DateFormat either individually or inside vectors, matrices and DataFrames.