using Dates
Handling Dates and Times
In this tutorial, we turn our attention to how to handle and parse dates and times in Julia.
Let’s begin with the standard library module Dates.
1 📅 Standard Library Dates Module
Contrary to R, Julia treats dates as a specific type in the standard library. This allows for a much richer and powerful representation and manipulation of dates and times.
For any date or time manipulation in Julia, you should first import the Dates module from Julia’s standard library:
R treats dates as double precision floating point numbers:
> typeof(as.Date("2021-01-01"))
[1] "double"Compare this with Julia:
typeof(Date("2021-01-01"))Date
1.1 Date/DateTime Types
The Dates module from Julia’s standard library has two main types:
Date: represents a given date within a day precisionDateTime: represents a given time within a millisecond
You can construct them from the Date() and DateTime() constructors by passing a string:
Date("2021-01-01")2021-01-01
DateTime("20201-01-01T08:00:00")20201-01-01T08:00:00
They also accept Period types, which are divided into DatePeriod and TimePeriod.
using InteractiveUtils
subtypes(Period)2-element Vector{Any}:
DatePeriod
TimePeriod
DatePeriod types are related to dates:
subtypes(DatePeriod)5-element Vector{Any}:
Day
Month
Quarter
Week
Year
And TimePeriod types are related to time:
subtypes(TimePeriod)6-element Vector{Any}:
Hour
Microsecond
Millisecond
Minute
Nanosecond
Second
So for example, we can create the Date object for the 1st of January of 2021 using Period types:
Date(Day(01), Month(01), Year(2021))2021-01-01
Similarly, we can create the DateTime object for the same date with time as 8AM:
DateTime(Day(01), Month(01), Year(2021), Hour(8), Minute(0), Second(0))2021-01-01T08:00:00
Here is a Nice Diagram for the Period subtypes. The ellipses are abstract types and the rectangles are concrete types:
Period types1.2 Parsing Strings as Date/DateTime
Most often, we just want to parse strings as either Date or DateTime objects.
If you are really lucky the string will be well-behaved and parsed automatically, like in our previous examples. However, that is often not the case. So, we have to learn how to parse different string formats and standards.
This can be done with the dateformat"" string literal. Date() and DateTime() constructors accept a second argument that specifies what is the string format to parse correctly the string.
For example, the string "01/Jan/2021" can be parsed with:
Date("01/Jan/2021", dateformat"dd/u/yyyy")2021-01-01
Or even "Jan-1-21/8AM":
DateTime("Jan-1-21/8AM", dateformat"u-d-yy/Hp")0021-01-01T08:00:00
Julia’s documentation on DateFormat is the best resource to learn how to construct your dateformat"" string to represent a given string format.
In fact, the table below is taken from there:
| Code | Matches | Comment |
|---|---|---|
y |
1996, 96 | Returns year of 1996, 0096 |
Y |
1996, 96 | Returns year of 1996, 0096. Equivalent to y |
m |
1, 01 | Matches 1 or 2-digit months |
u |
Jan | Matches abbreviated months according to the locale keyword |
U |
January | Matches full month names according to the locale keyword |
d |
1, 01 | Matches 1 or 2-digit days |
H |
00 | Matches hours (24-hour clock) |
I |
00 | For outputting hours with 12-hour clock |
M |
00 | Matches minutes |
S |
00 | Matches seconds |
s |
.500 | Matches milliseconds |
e |
Mon, Tues | Matches abbreviated days of the week |
E |
Monday | Matches full name days of the week |
p |
AM | Matches AM/PM (case-insensitive) |
yyyymmdd |
19960101 | Matches fixed-width year, month, and day |
1.3 Extracting Information from Date/DateTime
Once we have parsed a string into Date and DateTime objects, we can also extract relevant time and date information as integers.
This can be done with the “Accessor Functions”:
year(),month(), andDay()for dateshour(),minute(), andsecond()for time
jan_1 = Date("2021-01-01")
eight_am = DateTime("2021-01-01T08:00:00")year(jan_1)2021
month(jan_1)1
day(jan_1)1
hour(eight_am)8
minute(eight_am)0
second(eight_am)0
Remember that in Julia, by convention, anything in CamelCase is either a type, package, module, or a constructor, and anything in snake_case is a function/method or variable.
1.4 Date/DateTime Operations
We can perform two operations on Date and DateTime objects:
- Addition with
+: this gives back anotherDateorDateTimeobject. - Subtraction with
-: this does not return aDateorDateTime, but aPeriodtype to represent a given duration.
Here is an example with addition, note that we have to add using Period types:
jan_1 + Day(8)2021-01-09
Now we are using the constructor Day() for the Period type instead of the accessor function day().
We can add any period we want, and it also works with DateTime:
jan_1 + Month(1) + Day(1)2021-02-02
eight_am + Hour(2) + Minute(30)2021-01-01T10:30:00
Notice that we cannot add a Period to a type that cannot represent it.
The following will error, because jan_1 (a Date object) cannot represent time periods:
jan_1 + Hour(1)MethodError: MethodError(+, (Date("2021-01-01"), Hour(1)), 0x00000000000082cd)
MethodError: no method matching +(::Date, ::Hour)
Closest candidates are:
+(::TimeType, ::Period, !Matched::Period)
@ Dates ~/_work/_tool/julia/1.9.3/x64/share/julia/stdlib/v1.9/Dates/src/periods.jl:353
+(::TimeType, ::Period, !Matched::Period, !Matched::Period...)
@ Dates ~/_work/_tool/julia/1.9.3/x64/share/julia/stdlib/v1.9/Dates/src/periods.jl:354
+(::Any, ::Any, !Matched::Any, !Matched::Any...)
@ Base operators.jl:578
...
Stacktrace:
[1] top-level scope
@ ~/_work/PumasTutorials.jl/PumasTutorials.jl/tutorials/DataWranglingInJulia/10-dates.qmd:281
Now let’s show some examples with Date/DateTime subtractions.
First, let’s perform a subtraction on Date/DateTime objects:
jan_9 = Date("2021-01-09")2021-01-09
duration = jan_1 - jan_9-8 days
Notice that it returns a Period object:
typeof(duration)Day
We can also use this to add/subtract Period types:
duration + Day(1)-7 days
duration - Day(1)-9 days
Or chain then together:
duration + Month(1) - Day(5)1 month, -13 days
1.5 Rounding Periods
Suppose you want to convert a duration (which is a Period object) into another Period object that has less resolution. For example, we want to convert Seconds to Hours. This might be complicated because we have to make sure that we do not bump into an InexactError.
InexactError is an error thrown when Julia cannot exactly convert a certain value to a desired type.
Let’s try to convert 61 seconds into a minute:
Minute(Second(61))InexactError: InexactError(:divexact, Int64, 1.0166666666666666)
InexactError: divexact(Int64, 1.0166666666666666)
Stacktrace:
[1] divexact
@ ~/_work/_tool/julia/1.9.3/x64/share/julia/stdlib/v1.9/Dates/src/periods.jl:378 [inlined]
[2] convert
@ ~/_work/_tool/julia/1.9.3/x64/share/julia/stdlib/v1.9/Dates/src/periods.jl:404 [inlined]
[3] Minute(p::Second)
@ Dates ~/_work/_tool/julia/1.9.3/x64/share/julia/stdlib/v1.9/Dates/src/periods.jl:384
[4] top-level scope
@ ~/_work/PumasTutorials.jl/PumasTutorials.jl/tutorials/DataWranglingInJulia/10-dates.qmd:332
We get that InexactError we mentioned above!
To remedy this, we need to round somehow our Second type to a desired Minute type. This can be accomplished in 3 ways:
round: rounding to the nearest integer.floor: rounding down.ceil: rounding up.
round(Second(61), Minute)1 minute
floor(Second(61), Minute)1 minute
ceil(Second(61), Minute)2 minutes
1.6 Date/DateTime Intervals
One of the most amazing things we can do with the Dates modules is to use the : operator to create Date/DateTime Intervals.
This is one of the multiple uses of multiple dispatch that we’ve shown in our previous tutorial on functions.
For example, creating an interval from January 1st to January 10th is easy. We just need to sandwich in what period we want our interval to have as a “step size”:
my_date_interval = Date("2021-01-01"):Day(1):Date("2021-01-10")Dates.Date("2021-01-01"):Dates.Day(1):Dates.Date("2021-01-10")
We can also convert our interval to a vector with collect():
my_dates = collect(my_date_interval)
first(my_dates, 5)5-element Vector{Date}:
2021-01-01
2021-01-02
2021-01-03
2021-01-04
2021-01-05
Also, there is nothing special with the Day(1) as our Period “step size”, we can use whatever we fancy:
Date("2021-01-01"):Day(3):Date("2021-01-10")Dates.Date("2021-01-01"):Dates.Day(3):Dates.Date("2021-01-10")
Date("2021-01-01"):Week(2):Date("2021-07-01")Dates.Date("2021-01-01"):Dates.Week(2):Dates.Date("2021-06-18")
Date("2021-01-01"):Month(3):Date("2021-07-01")Dates.Date("2021-01-01"):Dates.Month(3):Dates.Date("2021-07-01")
There is also the range() function that gives you the ability to construct an interval by setting the arguments length() with an additional step size:
range(Date("2021-01-01"), length = 10, step = Day(1))All of our accessor functions and dates/time operations work on these intervals with the caveat that we need to vectorize (or broadcast) them:
day.(my_date_interval)10-element Vector{Int64}:
1
2
3
4
5
6
7
8
9
10
month.(my_date_interval)10-element Vector{Int64}:
1
1
1
1
1
1
1
1
1
1
my_date_interval .+ Day(1)10-element Vector{Date}:
2021-01-02
2021-01-03
2021-01-04
2021-01-05
2021-01-06
2021-01-07
2021-01-08
2021-01-09
2021-01-10
2021-01-11
my_date_interval .+ Month(1)10-element Vector{Date}:
2021-02-01
2021-02-02
2021-02-03
2021-02-04
2021-02-05
2021-02-06
2021-02-07
2021-02-08
2021-02-09
2021-02-10
For the quick-witted, there is more than meets the eye in these accessor functions. They are really helpful and powerful when grouping and summarizing time series data.
2 📁 dateformat in CSV.jl
The CSV.File constructor and CSV.read() function from CSV.jl have a keyword argument dateformat that allows a string representing the same dateformat you would include inside the dateformat"" string literal.
It will automatically detect any column that has strings with the specified dateformat and parse them as Date or DateTime objects.
For example, you can read a CSV File into a DataFrame where you would have a column of string as dates with the following format "yyyy/mm/dd":
df = CSV.read("file.csv", DataFrame; dateformat = "yyyy/mm/dd")3 📁 Dates in XLSX.jl
XLSX.jl does not have an explicit argument for date or time parsing. Instead, if you use the keyword argument infer_eltypes=true, any date or time in the original Excel file will be parsed as a column of Date or DateTime objects:
df = DataFrame(XLSX.readtable("myfile.xlsx", "mysheet"; infer_eltypes = true))
first(df, 5)4 💾 Using with DataFrames
How do we handle dates and times in a DataFrame?
This is simple. All you need you’ve already learned in this tutorial!
Let us show you. First, let’s create a dummy DataFrame with some fake time series data:
using DataFramesMetadf = DataFrame(;
Date = rand(Date("2021-01-01"):Day(1):Date("2021-12-31"), 1_000),
Value = rand(0.0:1.0, 1_000),
)
first(df, 5)| Row | Date | Value |
|---|---|---|
| Date | Float64 | |
| 1 | 2021-06-22 | 1.0 |
| 2 | 2021-01-27 | 0.0 |
| 3 | 2021-12-11 | 0.0 |
| 4 | 2021-03-15 | 1.0 |
| 5 | 2021-11-27 | 0.0 |
First let’s learn how to group data by some specific period using the accessor functions.
We begin by creating a new column to store the day of the week and the month:
@rtransform! df :DayWeek = dayofweek(:Date) :Month = month(:Date)
first(df, 5)| Row | Date | Value | DayWeek | Month |
|---|---|---|---|---|
| Date | Float64 | Int64 | Int64 | |
| 1 | 2021-06-22 | 1.0 | 2 | 6 |
| 2 | 2021-01-27 | 0.0 | 3 | 1 |
| 3 | 2021-12-11 | 0.0 | 6 | 12 |
| 4 | 2021-03-15 | 1.0 | 1 | 3 |
| 5 | 2021-11-27 | 0.0 | 6 | 11 |
Now we can group our data by the new column :Month.
First, let’s import the mean() function from the Statistics module from Julia’s standard library:
using Statistics: mean
@chain df begin
groupby(:Month)
@combine :mean_value = mean(:Value)
end| Row | Month | mean_value |
|---|---|---|
| Int64 | Float64 | |
| 1 | 1 | 0.5875 |
| 2 | 2 | 0.51087 |
| 3 | 3 | 0.505747 |
| 4 | 4 | 0.472222 |
| 5 | 5 | 0.47619 |
| 6 | 6 | 0.55 |
| 7 | 7 | 0.44898 |
| 8 | 8 | 0.464286 |
| 9 | 9 | 0.493506 |
| 10 | 10 | 0.519481 |
| 11 | 11 | 0.566265 |
| 12 | 12 | 0.465116 |
Here is a handy tip for you to recover only weekdays from Dates objects. Just use the following filter: dayofweek(your_date) <= 5.
my_df = @rsubset df dayofweek(:Date) <= 5
first(my_df, 5)| Row | Date | Value | DayWeek | Month |
|---|---|---|---|---|
| Date | Float64 | Int64 | Int64 | |
| 1 | 2021-06-22 | 1.0 | 2 | 6 |
| 2 | 2021-01-27 | 0.0 | 3 | 1 |
| 3 | 2021-03-15 | 1.0 | 1 | 3 |
| 4 | 2021-03-10 | 0.0 | 3 | 3 |
| 5 | 2021-04-26 | 1.0 | 1 | 4 |
For more complex stuff, like business days, you can use the package BusinessDays.jl.
Also, we can combine the previous two operations into a single one: we filter only the values occurring in weekdays followed by a group by, and then summarize by month.
@chain df begin
@rsubset dayofweek(:Date) <= 5
groupby(:Month)
@combine :mean_value = mean(:Value)
end| Row | Month | mean_value |
|---|---|---|
| Int64 | Float64 | |
| 1 | 1 | 0.591837 |
| 2 | 2 | 0.569231 |
| 3 | 3 | 0.492308 |
| 4 | 4 | 0.52 |
| 5 | 5 | 0.509804 |
| 6 | 6 | 0.587302 |
| 7 | 7 | 0.396825 |
| 8 | 8 | 0.474576 |
| 9 | 9 | 0.490566 |
| 10 | 10 | 0.553191 |
| 11 | 11 | 0.523077 |
| 12 | 12 | 0.474576 |
5 ✉️ Post Scriptum
We hope that you are as marveled as we are with the elegance and simplicity of the Julia Dates module and its types. It is really easy and a joy to work with Date and DateFormat either individually or inside vectors, matrices and DataFrames.