using CairoMakie
Plotting Different Geometries with AlgebraOfGraphics.jl
In this tutorial, we will explore how to make different kinds of plots (also called geometries or geoms in ggplot2) with AoG.jl. First, we’ll discuss how to navigate the AoG.jl and Makie.jl documentation. Then, we’ll show the most common plotting types in AoG.jl:
BarPlot
Lines
Errorbars
Scatter
BoxPlot
Violin
Contour
Heatmap
Some important visualizations are missing from this tutorial, namely the statistical visualizations. They are covered in Plotting Statistical Visualizations with AlgebraOfGraphics.jl. Don’t forget to check it out.
1 📋 geom_*() - AoG.jl Table
The following table maps ggplot2’s geom_*() functions to AoG.jl’s plotting functions:
ggplot2 | AoG.jl |
---|---|
geom_col() | visual(BarPlot) |
geom_point() | visual(Scatter) |
geom_line() | visual(Lines) |
geom_errorbar() | visual(Errorbars) |
geom_boxplot() | visual(BoxPlot) |
geom_violin() | visual(Violin) |
geom_label() | visual(Annotations) |
geom_text() | visual(Annotations) |
geom_contour() | visual(Contour) |
geom_tile() | visual(Heatmap) |
geom_bar() | frequency() |
geom_histogram() | histogram() |
geom_density() | density() |
geom_smooth() | smooth() |
geom_smooth(method = "lm") | linear() |
geom_area() | linesfill() |
2 🆘 How to Find Available Plotting Functions?
As you probably know, AoG.jl uses Makie.jl as the plotting engine for all visualizations. As a consequence, all possible plotting objects (geometries) in AoG.jl are actually plotting types in Makie.jl.
So, for example, to make a bar plot in AoG.jl you have to use the BarPlot type from Makie.jl.
In AoG.jl you use the visual() function and pass it the desired Makie.jl plotting type along with all desired keyword arguments. So, the bar plot would be the following call to visual():
plt = data(...) * mapping(...) * visual(BarPlot; ...)
draw(plt)
There are two main ways to browse and obtain information regarding plotting types and custom arguments:
Makie.jl Documentation: this is very useful, and even AoG.jl’s documentation redirects to it.
help_attributes() function and Docstrings: you can also see information from the Julia REPL (terminal) with help_attributes() and by reading the help information for Makie.jl’s plotting functions.
2.1 Makie Documentation
Makie.jl’s documentation has a rich description of the plotting functions. We encourage you to browse it and use the examples to learn the options available for every plotting type.
Note that Makie.jl’s plotting functions are all lowercase, since they follow the naming convention for functions. AoG.jl, in contrast, uses plotting types, which are all TitleCase, following the naming convention for types.
If you try to use visual() on the plotting functions, you’ll get an error. Instead, you need to use visual() on the plotting types.
For instance, this will error:
visual(barplot)
The correct way is:
visual(BarPlot)
Just remember that you need to convert a plotting function’s name to the corresponding plotting type when you pass it to the visual() function.
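The distinction between the two names can be checked directly in the REPL. The following is a minimal sketch, assuming CairoMakie is installed; it only inspects the two names and does not draw anything:

```julia
using CairoMakie

# `barplot` is the lowercase plotting function exported by Makie.jl
is_function = barplot isa Function

# `BarPlot` is the TitleCase plotting type, which is what visual() expects
is_type = BarPlot isa Type

println("barplot is a function: ", is_function)
println("BarPlot is a type: ", is_type)
```

The same pairing holds for the other geometries in this tutorial, for example scatter and Scatter, or lines and Lines.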
This is the online documentation for the barplot() function from Makie.jl:
barplot(x, y; kwargs...)
Plots a barplot; y defines the height. x and y should be 1 dimensional. Bar width is determined by the attribute width, shrunk by gap in the following way: width -> width * (1 - gap).
Attributes
Available attributes and their defaults for MakieCore.Combined{Makie.barplot} are:
bar_labels "nothing"
color RGBA{Float32}(0.0f0,0.0f0,0.0f0,0.6f0)
color_over_background MakieCore.Automatic()
color_over_bar MakieCore.Automatic()
colormap :viridis
colorrange MakieCore.Automatic()
cycle [:color => :patchcolor]
direction :y
dodge MakieCore.Automatic()
dodge_gap 0.03
fillto MakieCore.Automatic()
flip_labels_at Inf
gap 0.2
highclip MakieCore.Automatic()
inspectable true
label_color :black
label_font :regular
label_formatter Makie.bar_label_formatter
label_offset 5
label_rotation 0.0
label_size 20
lowclip MakieCore.Automatic()
marker GeometryBasics.HyperRectangle
n_dodge MakieCore.Automatic()
nan_color :transparent
offset 0.0
stack MakieCore.Automatic()
strokecolor :black
strokewidth 0
transparency false
visible true
width MakieCore.Automatic()
2.2 help_attributes() and Docstrings
If you do not want to browse Makie.jl’s documentation, a handy alternative is the help_attributes() function, which is available from any Makie.jl backend.
Let us show how it works, but first let’s load the default backend that we are using in these tutorials, CairoMakie.jl:
using CairoMakie
Here is an example with the barplot() plotting function:
help_attributes(barplot)
Available attributes for `MakieCore.Combined{Makie.barplot}` are:
bar_labels color color_over_background color_over_bar colormap colorrange cycle direction dodge dodge_gap fillto flip_labels_at gap highclip inspectable label_color label_font label_formatter label_offset label_rotation label_size lowclip marker n_dodge nan_color offset stack strokecolor strokewidth transparency visible width
We can see that BarPlot, when used inside visual(), has a lot of keyword arguments for us to customize our bar plots.
2.2.1 Docstrings from help or ?
We can also check the docstring of a specific plotting function by calling the help() function on it or by using the help mode of the Julia REPL:
julia> ?
help?> barplot
Also, don’t forget to check AoG.jl’s documentation. The tutorial and gallery sections showcase several use cases and possible customizations.
3 🎨 visual() function
The visual() function from AoG.jl is the function with which we assign plotting objects to the mapping()s of our data().
The most important argument to visual() is the first positional argument: the plotting type. The keyword arguments that follow are the same as the keyword arguments of the analogous Makie.jl plotting function.
For example, the barplot() plotting function from Makie.jl supports the width keyword argument. That translates to the following visual() function call in AoG.jl:
visual(BarPlot; width = ...)
Let’s show some of the plotting types (geometries) available to the visual() function. But first, we begin by loading AoG.jl, the data wrangling libraries, and the DataFrame we’ve used previously:
using PharmaDatasets
using DataFramesMeta
using AlgebraOfGraphics
df = dataset("demographics_1")
first(df, 5)
Row | ID | AGE | WEIGHT | SCR | ISMALE | eGFR |
---|---|---|---|---|---|---|
| Int64 | Float64 | Float64 | Float64 | Int64 | Float64 |
1 | 1 | 34.823 | 38.212 | 1.1129 | 0 | 42.635 |
2 | 2 | 32.765 | 74.838 | 0.8846 | 1 | 126.0 |
3 | 3 | 35.974 | 37.303 | 1.1004 | 1 | 48.981 |
4 | 4 | 38.206 | 32.969 | 1.1972 | 1 | 38.934 |
5 | 5 | 33.559 | 47.139 | 1.5924 | 0 | 37.198 |
We will also transform some columns into CategoricalArrays:
Don’t forget to check our Data Wrangling in Julia tutorial Handling Factors and Categorical Data with CategoricalArrays.jl.
using CategoricalArrays
@transform! df :SEX = categorical(:ISMALE);
@transform! df :SEX = recode(:SEX, 0 => "female", 1 => "male");
@transform! df :WEIGHT_cat = cut(:WEIGHT, 2; labels = ["light", "heavy"])
3.1 BarPlot
Let’s begin with the bar plot. Here the plotting type is BarPlot and the plotting function is barplot. So, we just call visual() and pass BarPlot as the first argument, followed by any desired keyword arguments supported by the plotting function barplot().
Here is an example with our dataset df. Notice that we first need to group the data with the @by macro and then apply the mean() function from the Statistics module of Julia’s standard library:
There is an easier way to automatically perform grouping and summarizing in AoG.jl with statistical transformation functions. We will cover this in Plotting Statistical Visualizations with AlgebraOfGraphics.jl. Make sure to check it out.
using Statistics
data(@by df :SEX :AGE_MEAN = mean(:AGE)) * mapping(:SEX, :AGE_MEAN) * visual(BarPlot) |>
draw
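To make the grouping step more concrete, here is a minimal sketch of what the @by call computes, written against a small made-up table (toy is hypothetical and stands in for df). It compares the @by macro with the equivalent groupby and combine calls from DataFrames.jl:

```julia
using DataFramesMeta  # re-exports DataFrames
using Statistics

# Hypothetical toy data standing in for the demographics dataset
toy = DataFrame(
    SEX = ["female", "male", "female", "male"],
    AGE = [30.0, 40.0, 34.0, 44.0],
)

# Group by :SEX and summarize :AGE with the macro used in this tutorial
via_macro = @by toy :SEX :AGE_MEAN = mean(:AGE)

# The same summary spelled out with plain DataFrames.jl functions
via_combine = combine(groupby(toy, :SEX), :AGE => mean => :AGE_MEAN)

println(via_macro)
```

Either result has one row per group, which is the shape BarPlot needs: one x position and one bar height per category.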
We can customize the plotting object specified in visual() by adding supported keyword arguments.
If we would like to make our bars blue and a little narrower, we can use the color and width arguments:
data(@by df :SEX :AGE_MEAN = mean(:AGE)) *
mapping(:SEX, :AGE_MEAN) *
visual(BarPlot; color = :blue, width = 0.5) |> draw
Here is a more complex example using color and dodge for the column :WEIGHT_cat inside mapping():
data(@by df [:WEIGHT_cat, :SEX] :AGE_MEAN = mean(:AGE)) *
mapping(:SEX, :AGE_MEAN; color = :WEIGHT_cat, dodge = :WEIGHT_cat) *
visual(BarPlot) |> draw
Note that the color mapping will override the color keyword argument inside a visual() call. For custom colors, which we will cover in Customization of AlgebraOfGraphics.jl Plots, it is better to use the palette argument inside the draw[!]() function.
3.2 Lines
Lines creates a line plot with the specified data() and mapping()s.
It is analogous to ggplot2’s geom_line().
For the line plot, we will use some concentration-time pharmacokinetic data after oral administration. This kind of plot is known as a spaghetti plot.
Line plots implicitly indicate that each observation depends on the previous ones. This dependence makes line plots perfect for time series data and other time-dependent visualizations. But for data that have no time dependency, or any other x-axis dependency, line plots might suggest a relationship that the visualization is not meant to convey.
pk = dataset("pumas_tutorials/po_sd_1")
first(pk, 5)
Row | id | time | cp | dv | amt | evid | cmt | rate | dosegrp |
---|---|---|---|---|---|---|---|---|---|
| Int64 | Float64 | Float64? | Float64? | Float64? | Int64 | Int64 | Float64 | Int64 |
1 | 1 | 0.0 | missing | missing | 10.0 | 1 | 1 | 0.0 | 10 |
2 | 1 | 0.25 | 20.2592 | 22.6353 | missing | 0 | 2 | 0.0 | 10 |
3 | 1 | 0.5 | 36.8068 | 16.5712 | missing | 0 | 2 | 0.0 | 10 |
4 | 1 | 0.75 | 50.2838 | 60.8928 | missing | 0 | 2 | 0.0 | 10 |
5 | 1 | 1.0 | 61.2211 | 46.8858 | missing | 0 | 2 | 0.0 | 10 |
Here is a simple plot of the PK data using the positional x-axis and y-axis arguments and the color keyword argument of mapping(). We reduce the dataset to the first 10 ids so our plot’s legend doesn’t overflow.
We are removing missing values from the pk dataset and also keeping only the first 10 subjects so that the legend does not overflow.
dropmissing!(pk, :cp);
pk_ids = @rsubset(pk, :id <= 10);
We are using nonnumeric() inside mapping() to tell AoG.jl that the column :id, despite being an integer column, should be treated as discrete/categorical, i.e. non-numeric.
This will be covered in Customization of AlgebraOfGraphics.jl Plots.
data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(Lines; alpha = 0.5) |> draw
To draw this visualization without the legend, you can call the mutating function draw!(). Whereas draw() automatically adds colorbars and legends, draw!() does not. A colorbar and a legend, should they be necessary, can be added separately to the visualization with the colorbar!() and legend!() helper functions.
We’ll cover customizations in Customization of AlgebraOfGraphics.jl Plots. Don’t forget to check it out.
Here’s what the code would look like without the legend:
fig = Figure()
plt = data(pk) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Lines)
draw!(fig, plt)
fig
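For completeness, here is a minimal sketch of how the legend could be added back by hand with the legend!() helper mentioned above. It assumes an AoG.jl version in which draw!() returns the grid of axes that legend!() accepts; the table toy is made up for illustration and stands in for the PK data:

```julia
using CairoMakie
using AlgebraOfGraphics

# Hypothetical toy data: two subjects, four time points each
toy = (
    time = repeat([0.0, 1.0, 2.0, 4.0], 2),
    cp = [0.0, 8.0, 6.0, 3.0, 0.0, 12.0, 9.0, 4.0],
    id = repeat([1, 2]; inner = 4),
)

plt = data(toy) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Lines)

fig = Figure()
grid = draw!(fig[1, 1], plt)  # draws only the axes, no legend
legend!(fig[1, 2], grid)      # places the legend in a second column
fig
```

Placing the legend in its own layout cell gives you control over where it goes, which is the main reason to prefer draw!() over draw() here.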
3.3 Errorbars
Errorbars creates vertical interval lines, commonly used to represent data variability or uncertainty.
It is analogous to ggplot2’s geom_errorbar().
Let’s use the same example as before, but this time we are interested in finding the mean concentration-time profile for each dose level. We will also use the standard deviation as our measure of variability.
We are using the @by macro again to group the data by dosegrp and time. Also, we will use the mean() function to calculate the mean concentration and the std() function to determine the standard deviation. These functions are available in the Statistics module in Julia’s standard library.
Don’t forget to check our Data Wrangling in Julia tutorial Manipulating Tables with DataFramesMeta.jl for a more in-depth explanation of DataFramesMeta.jl’s macros.
pk_error = @by pk [:dosegrp, :time] begin
    :Cmean = mean(:cp)
    :Cstd = std(:cp)
end
first(pk_error, 5)
Row | dosegrp | time | Cmean | Cstd |
---|---|---|---|---|
| Int64 | Float64 | Float64 | Float64 |
1 | 10 | 0.25 | 31.3022 | 13.3897 |
2 | 10 | 0.5 | 54.484 | 21.9751 |
3 | 10 | 0.75 | 71.6505 | 27.4859 |
4 | 10 | 1.0 | 84.3257 | 30.9806 |
5 | 10 | 2.0 | 108.478 | 34.991 |
Now we can plot the error bars. In this case, the mapping() function will take three positional arguments: x position (time), y position (mean concentration), and the error bar length (standard deviation):
data(pk_error) *
mapping(:time, :Cmean, :Cstd, color = :dosegrp => nonnumeric) *
visual(Errorbars) |> draw
Notice that Errorbars only generates the vertical interval lines. It is common practice to show error bars together with other geometries, such as bar and line plots. You can achieve this by adding two layers with the + operator, which would look like this for a line plot:
data(pk_error) *
mapping(:time, :Cmean, :Cstd, color = :dosegrp => nonnumeric) *
(visual(Errorbars) + visual(Lines)) |> draw
You can learn more about combining layers with the + operator by checking our tutorial on Grammar of Graphics with AlgebraOfGraphics.jl.
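As a variation on the layering above, the sketch below gives each layer only the columns it needs and shares the data and the color mapping through the * operator. It relies on * distributing over + in AoG.jl; the table toy_error is made up for illustration and stands in for pk_error:

```julia
using CairoMakie
using AlgebraOfGraphics

# Hypothetical toy summary table standing in for pk_error
toy_error = (
    time = [0.5, 1.0, 2.0, 0.5, 1.0, 2.0],
    Cmean = [50.0, 80.0, 100.0, 100.0, 160.0, 200.0],
    Cstd = [10.0, 15.0, 20.0, 20.0, 30.0, 40.0],
    dosegrp = [10, 10, 10, 20, 20, 20],
)

# Pieces shared by both layers: the data and the color grouping
shared = data(toy_error) * mapping(color = :dosegrp => nonnumeric)

# Each geometry receives only the positional columns it uses
bars = mapping(:time, :Cmean, :Cstd) * visual(Errorbars)
line = mapping(:time, :Cmean) * visual(Lines)

# shared * (bars + line) is the same as shared * bars + shared * line
fg = draw(shared * (bars + line))
```

Keeping the shared part in one place means a change to the data or the color grouping only has to be made once.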
3.4 Scatter
Scatter is the plotting type for scatter plots and is analogous to ggplot2’s geom_point():
data(pk_ids) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Scatter) |> draw
You can discover some interesting keyword arguments for Scatter by typing help_attributes(scatter) or ?scatter in a Julia REPL. For example, you can choose a marker type and a markersize:
data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(Scatter; marker = '+', markersize = 25, alpha = 0.5) |> draw
3.5 ScatterLines
ScatterLines is the fusion of Scatter and Lines, so every keyword argument from both of them is available.
It is similar to combining the following geometries in ggplot2: geom_line() + geom_point().
Let’s plot our previous Lines example using ScatterLines:
data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(ScatterLines; alpha = 0.5) |> draw
3.6 BoxPlot
Box plots are the statistician’s favorite plots. Here is a simple box plot using BoxPlot inside visual():
data(df) * mapping(:SEX, :AGE) * visual(BoxPlot) |> draw
You can use some keyword arguments for BoxPlot, such as:
show_notch: whether or not to draw a notch around the median.
range: the multiple of the inter-quartile range (IQR) that limits how far the whiskers extend, default 1.5.
whiskerwidth: the width of the small horizontal caps at the ends of the whiskers, relative to the box width.
show_outliers: whether or not to show outliers as points, default true.
data(df) *
mapping(:SEX, :AGE) *
visual(BoxPlot; show_notch = true, range = 1, whiskerwidth = 0.25) |> draw
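To see what the range keyword argument controls, the sketch below reproduces the usual box plot convention by hand: points farther than range times the IQR from the quartiles are treated as outliers, and the whiskers stop at the most extreme remaining points. The helper whisker_limits() and the ages vector are hypothetical, and the sketch assumes Makie.jl follows this standard convention rather than quoting its implementation:

```julia
using Statistics

# Hypothetical helper, not part of Makie.jl or AoG.jl
function whisker_limits(values; range = 1.5)
    q1, q3 = quantile(values, [0.25, 0.75])
    iqr = q3 - q1
    lower_fence = q1 - range * iqr
    upper_fence = q3 + range * iqr
    # Points inside the fences determine where the whiskers end
    inside = filter(v -> lower_fence <= v <= upper_fence, values)
    # Points outside the fences would be drawn as individual outliers
    outliers = filter(v -> v < lower_fence || v > upper_fence, values)
    return (; low = minimum(inside), high = maximum(inside), outliers)
end

# Made-up ages with one large value
ages = [20.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 38.0]

default_limits = whisker_limits(ages)             # range = 1.5
wide_limits = whisker_limits(ages; range = 3.0)   # more tolerant fences

println(default_limits)
println(wide_limits)
```

With the default range the value 38.0 falls outside the fences and counts as an outlier, whereas range = 3.0 widens the fences enough for the upper whisker to reach it.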
3.7 Violin
Violin plots are also popular and a good alternative to box plots. Instead of being based on the median, quartiles, and IQR, violin plots display the estimated probability density of the underlying values (using a kernel density estimator).
Here is the same example as the box plot, but now using Violin inside visual(). It conveys more information about the shape of the distribution than the box plot:
data(df) * mapping(:SEX, :AGE) * visual(Violin) |> draw
As with BoxPlot, Violin has some interesting keyword arguments. The most important is show_median, which tells AoG.jl whether or not to show the median inside the violins:
data(df) * mapping(:SEX, :AGE) * visual(Violin; show_median = true) |> draw
Violin also accepts a side argument inside mapping(), which breaks the violin plot into two sides: left and right.
If you pair side with color you can convey more information in your violin plots:
data(df) *
mapping(:SEX, :AGE; side = :WEIGHT_cat, color = :WEIGHT_cat) *
visual(Violin; show_median = true) |> draw
We will cover more geometries and plotting types, which are frequently paired with statistical visualizations, in Plotting Statistical Visualizations with AlgebraOfGraphics.jl.