using Dates
Handling Dates and Times
In this tutorial, we turn our attention to how to handle and parse dates and times in Julia.
Let’s begin with the standard library module Dates
.
1 📅 Standard Library Dates
Module
Contrary to R, Julia treats dates as a specific type in the standard library. This allows for a much richer and powerful representation and manipulation of dates and times.
For any date or time manipulation in Julia, you should first import the Dates
module from Julia’s standard library:
R treats dates as double precision floating point numbers:
> typeof(as.Date("2021-01-01"))
1] "double" [
Compare this with Julia:
typeof(Date("2021-01-01"))
Date
1.1 Date
/DateTime
Types
The Dates
module from Julia’s standard library has two main types:
Date
: represents a given date within a day precisionDateTime
: represents a given time within a millisecond
You can construct them from the Date()
and DateTime()
constructors by passing a string:
Date("2021-01-01")
2021-01-01
DateTime("20201-01-01T08:00:00")
20201-01-01T08:00:00
They also accept Period
types, which are divided into DatePeriod
and TimePeriod
.
using InteractiveUtils
subtypes(Period)
2-element Vector{Any}:
DatePeriod
TimePeriod
DatePeriod
types are related to dates:
subtypes(DatePeriod)
5-element Vector{Any}:
Day
Month
Quarter
Week
Year
And TimePeriod
types are related to time:
subtypes(TimePeriod)
6-element Vector{Any}:
Hour
Microsecond
Millisecond
Minute
Nanosecond
Second
So for example, we can create the Date
object for the 1st of January of 2021 using Period
types:
Date(Day(01), Month(01), Year(2021))
2021-01-01
Similarly, we can create the DateTime
object for the same date with time as 8AM:
DateTime(Day(01), Month(01), Year(2021), Hour(8), Minute(0), Second(0))
2021-01-01T08:00:00
Here is a Nice Diagram for the Period
subtypes. The ellipses are abstract types and the rectangles are concrete types:
1.2 Parsing Strings as Date
/DateTime
Most often, we just want to parse strings as either Date
or DateTime
objects.
If you are really lucky the string will be well-behaved and parsed automatically, like in our previous examples. However, that is often not the case. So, we have to learn how to parse different string formats and standards.
This can be done with the dateformat""
string literal. Date()
and DateTime()
constructors accept a second argument that specifies what is the string format to parse correctly the string.
For example, the string "01/Jan/2021"
can be parsed with:
Date("01/Jan/2021", dateformat"dd/u/yyyy")
2021-01-01
Or even "Jan-1-21/8AM"
:
DateTime("Jan-1-21/8AM", dateformat"u-d-yy/Hp")
0021-01-01T08:00:00
Julia’s documentation on DateFormat
is the best resource to learn how to construct your dateformat""
string to represent a given string format.
In fact, the table below is taken from there:
Code | Matches | Comment |
---|---|---|
y |
1996, 96 | Returns year of 1996, 0096 |
Y |
1996, 96 | Returns year of 1996, 0096. Equivalent to y |
m |
1, 01 | Matches 1 or 2-digit months |
u |
Jan | Matches abbreviated months according to the locale keyword |
U |
January | Matches full month names according to the locale keyword |
d |
1, 01 | Matches 1 or 2-digit days |
H |
00 | Matches hours (24-hour clock) |
I |
00 | For outputting hours with 12-hour clock |
M |
00 | Matches minutes |
S |
00 | Matches seconds |
s |
.500 | Matches milliseconds |
e |
Mon, Tues | Matches abbreviated days of the week |
E |
Monday | Matches full name days of the week |
p |
AM | Matches AM/PM (case-insensitive) |
yyyymmdd |
19960101 | Matches fixed-width year, month, and day |
1.3 Extracting Information from Date
/DateTime
Once we have parsed a string into Date
and DateTime
objects, we can also extract relevant time and date information as integers.
This can be done with the “Accessor Functions”:
year()
,month()
, andDay()
for dateshour()
,minute()
, andsecond()
for time
= Date("2021-01-01")
jan_1 = DateTime("2021-01-01T08:00:00") eight_am
year(jan_1)
2021
month(jan_1)
1
day(jan_1)
1
hour(eight_am)
8
minute(eight_am)
0
second(eight_am)
0
Remember that in Julia, by convention, anything in CamelCase
is either a type, package, module, or a constructor, and anything in snake_case
is a function/method or variable.
1.4 Date
/DateTime
Operations
We can perform two operations on Date
and DateTime
objects:
- Addition with
+
: this gives back anotherDate
orDateTime
object. - Subtraction with
-
: this does not return aDate
orDateTime
, but aPeriod
type to represent a given duration.
Here is an example with addition, note that we have to add using Period
types:
+ Day(8) jan_1
2021-01-09
Now we are using the constructor Day()
for the Period
type instead of the accessor function day()
.
We can add any period we want, and it also works with DateTime
:
+ Month(1) + Day(1) jan_1
2021-02-02
+ Hour(2) + Minute(30) eight_am
2021-01-01T10:30:00
Notice that we cannot add a Period
to a type that cannot represent it.
The following will error, because jan_1
(a Date
object) cannot represent time periods:
+ Hour(1) jan_1
MethodError: MethodError(+, (Date("2021-01-01"), Hour(1)), 0x0000000000007af8)
MethodError: no method matching +(::Date, ::Hour)
Closest candidates are:
+(::TimeType, ::Period, !Matched::Period)
@ Dates ~/_work/_tool/julia/1.10.5/x64/share/julia/stdlib/v1.10/Dates/src/periods.jl:353
+(::TimeType, ::Period, !Matched::Period, !Matched::Period...)
@ Dates ~/_work/_tool/julia/1.10.5/x64/share/julia/stdlib/v1.10/Dates/src/periods.jl:354
+(::Any, ::Any, !Matched::Any, !Matched::Any...)
@ Base operators.jl:587
...
Stacktrace:
[1] top-level scope
@ ~/_work/PumasTutorials.jl/PumasTutorials.jl/tutorials/DataWranglingInJulia/10-dates.qmd:281
Now let’s show some examples with Date
/DateTime
subtractions.
First, let’s perform a subtraction on Date
/DateTime
objects:
= Date("2021-01-09") jan_9
2021-01-09
= jan_1 - jan_9 duration
-8 days
Notice that it returns a Period
object:
typeof(duration)
Day
We can also use this to add/subtract Period
types:
+ Day(1) duration
-7 days
- Day(1) duration
-9 days
Or chain then together:
+ Month(1) - Day(5) duration
1 month, -13 days
1.5 Rounding Period
s
Suppose you want to convert a duration (which is a Period
object) into another Period
object that has less resolution. For example, we want to convert Second
s to Hour
s. This might be complicated because we have to make sure that we do not bump into an InexactError
.
InexactError
is an error thrown when Julia cannot exactly convert a certain value to a desired type.
Let’s try to convert 61 seconds into a minute:
Minute(Second(61))
InexactError: InexactError(:divexact, Int64, 1.0166666666666666)
InexactError: divexact(Int64, 1.0166666666666666)
Stacktrace:
[1] divexact
@ ~/_work/_tool/julia/1.10.5/x64/share/julia/stdlib/v1.10/Dates/src/periods.jl:378 [inlined]
[2] convert
@ ~/_work/_tool/julia/1.10.5/x64/share/julia/stdlib/v1.10/Dates/src/periods.jl:404 [inlined]
[3] Minute(p::Second)
@ Dates ~/_work/_tool/julia/1.10.5/x64/share/julia/stdlib/v1.10/Dates/src/periods.jl:384
[4] top-level scope
@ ~/_work/PumasTutorials.jl/PumasTutorials.jl/tutorials/DataWranglingInJulia/10-dates.qmd:332
We get that InexactError
we mentioned above!
To remedy this, we need to round somehow our Second
type to a desired Minute
type. This can be accomplished in 3 ways:
round
: rounding to the nearest integer.floor
: rounding down.ceil
: rounding up.
round(Second(61), Minute)
1 minute
floor(Second(61), Minute)
1 minute
ceil(Second(61), Minute)
2 minutes
1.6 Date
/DateTime
Intervals
One of the most amazing things we can do with the Dates
modules is to use the :
operator to create Date
/DateTime
Intervals.
This is one of the multiple uses of multiple dispatch that we’ve shown in our previous tutorial on functions.
For example, creating an interval from January 1st to January 10th is easy. We just need to sandwich in what period we want our interval to have as a “step size”:
= Date("2021-01-01"):Day(1):Date("2021-01-10") my_date_interval
Dates.Date("2021-01-01"):Dates.Day(1):Dates.Date("2021-01-10")
We can also convert our interval to a vector with collect()
:
= collect(my_date_interval)
my_dates first(my_dates, 5)
5-element Vector{Date}:
2021-01-01
2021-01-02
2021-01-03
2021-01-04
2021-01-05
Also, there is nothing special with the Day(1)
as our Period
“step size”, we can use whatever we fancy:
Date("2021-01-01"):Day(3):Date("2021-01-10")
Dates.Date("2021-01-01"):Dates.Day(3):Dates.Date("2021-01-10")
Date("2021-01-01"):Week(2):Date("2021-07-01")
Dates.Date("2021-01-01"):Dates.Week(2):Dates.Date("2021-06-18")
Date("2021-01-01"):Month(3):Date("2021-07-01")
Dates.Date("2021-01-01"):Dates.Month(3):Dates.Date("2021-07-01")
There is also the range()
function that gives you the ability to construct an interval by setting the arguments length()
with an additional step
size:
range(Date("2021-01-01"), length = 10, step = Day(1))
All of our accessor functions and dates/time operations work on these intervals with the caveat that we need to vectorize (or broadcast) them:
day.(my_date_interval)
10-element Vector{Int64}:
1
2
3
4
5
6
7
8
9
10
month.(my_date_interval)
10-element Vector{Int64}:
1
1
1
1
1
1
1
1
1
1
.+ Day(1) my_date_interval
10-element Vector{Date}:
2021-01-02
2021-01-03
2021-01-04
2021-01-05
2021-01-06
2021-01-07
2021-01-08
2021-01-09
2021-01-10
2021-01-11
.+ Month(1) my_date_interval
10-element Vector{Date}:
2021-02-01
2021-02-02
2021-02-03
2021-02-04
2021-02-05
2021-02-06
2021-02-07
2021-02-08
2021-02-09
2021-02-10
For the quick-witted, there is more than meets the eye in these accessor functions. They are really helpful and powerful when grouping and summarizing time series data.
2 📁 dateformat
in CSV.jl
The CSV.File
constructor and CSV.read()
function from CSV.jl
have a keyword argument dateformat
that allows a string representing the same dateformat
you would include inside the dateformat""
string literal.
It will automatically detect any column that has strings with the specified dateformat
and parse them as Date
or DateTime
objects.
For example, you can read a CSV File into a DataFrame
where you would have a column of string as dates with the following format "yyyy/mm/dd"
:
= CSV.read("file.csv", DataFrame; dateformat = "yyyy/mm/dd") df
3 📁 Dates
in XLSX.jl
XLSX.jl
does not have an explicit argument for date or time parsing. Instead, if you use the keyword argument infer_eltypes=true
, any date or time in the original Excel file will be parsed as a column of Date
or DateTime
objects:
= DataFrame(XLSX.readtable("myfile.xlsx", "mysheet"; infer_eltypes = true))
df first(df, 5)
4 💾 Using with DataFrame
s
How do we handle dates and times in a DataFrame
?
This is simple. All you need you’ve already learned in this tutorial!
Let us show you. First, let’s create a dummy DataFrame
with some fake time series data:
using DataFramesMeta
= DataFrame(;
df Date = rand(Date("2021-01-01"):Day(1):Date("2021-12-31"), 1_000),
= rand(0.0:1.0, 1_000),
Value
)first(df, 5)
Row | Date | Value |
---|---|---|
Date | Float64 | |
1 | 2021-05-20 | 0.0 |
2 | 2021-08-06 | 1.0 |
3 | 2021-09-11 | 0.0 |
4 | 2021-08-20 | 0.0 |
5 | 2021-11-07 | 1.0 |
First let’s learn how to group data by some specific period using the accessor functions.
We begin by creating a new column to store the day of the week and the month:
@rtransform! df :DayWeek = dayofweek(:Date) :Month = month(:Date)
first(df, 5)
Row | Date | Value | DayWeek | Month |
---|---|---|---|---|
Date | Float64 | Int64 | Int64 | |
1 | 2021-05-20 | 0.0 | 4 | 5 |
2 | 2021-08-06 | 1.0 | 5 | 8 |
3 | 2021-09-11 | 0.0 | 6 | 9 |
4 | 2021-08-20 | 0.0 | 5 | 8 |
5 | 2021-11-07 | 1.0 | 7 | 11 |
Now we can group our data by the new column :Month
.
First, let’s import the mean()
function from the Statistics
module from Julia’s standard library:
using Statistics: mean
@chain df begin
groupby(:Month)
@combine :mean_value = mean(:Value)
end
Row | Month | mean_value |
---|---|---|
Int64 | Float64 | |
1 | 1 | 0.481481 |
2 | 2 | 0.475 |
3 | 3 | 0.494845 |
4 | 4 | 0.454545 |
5 | 5 | 0.5 |
6 | 6 | 0.476744 |
7 | 7 | 0.494253 |
8 | 8 | 0.383721 |
9 | 9 | 0.48 |
10 | 10 | 0.532609 |
11 | 11 | 0.383721 |
12 | 12 | 0.475 |
Here is a handy tip for you to recover only weekdays from Date
s objects. Just use the following filter: dayofweek(your_date) <= 5
.
= @rsubset df dayofweek(:Date) <= 5
my_df first(my_df, 5)
Row | Date | Value | DayWeek | Month |
---|---|---|---|---|
Date | Float64 | Int64 | Int64 | |
1 | 2021-05-20 | 0.0 | 4 | 5 |
2 | 2021-08-06 | 1.0 | 5 | 8 |
3 | 2021-08-20 | 0.0 | 5 | 8 |
4 | 2021-12-10 | 0.0 | 5 | 12 |
5 | 2021-04-23 | 1.0 | 5 | 4 |
For more complex stuff, like business days, you can use the package BusinessDays.jl
.
Also, we can combine the previous two operations into a single one: we filter only the values occurring in weekdays followed by a group by, and then summarize by month.
@chain df begin
@rsubset dayofweek(:Date) <= 5
groupby(:Month)
@combine :mean_value = mean(:Value)
end
Row | Month | mean_value |
---|---|---|
Int64 | Float64 | |
1 | 1 | 0.5 |
2 | 2 | 0.45098 |
3 | 3 | 0.430556 |
4 | 4 | 0.458333 |
5 | 5 | 0.5 |
6 | 6 | 0.460317 |
7 | 7 | 0.484375 |
8 | 8 | 0.359375 |
9 | 9 | 0.516129 |
10 | 10 | 0.5625 |
11 | 11 | 0.428571 |
12 | 12 | 0.491525 |
5 ✉️ Post Scriptum
We hope that you are as marveled as we are with the elegance and simplicity of the Julia Dates
module and its types. It is really easy and a joy to work with Date
and DateFormat
either individually or inside vectors, matrices and DataFrame
s.