AlgebraOfGraphics.jl is a powerful package for plotting and data visualization. Its expressive syntax is based on principles similar to the grammar of graphics of the R package ggplot2.
In this set of tutorials we will learn how to create complex and custom plots easily with AlgebraOfGraphics.jl. This tutorial introduces the fundamentals of AlgebraOfGraphics.jl and gives an overview of the Makie.jl plotting ecosystem, on which AlgebraOfGraphics.jl is built.
Tip
AlgebraOfGraphics.jl is a mouthful, so we will use the alias AoG.jl, which is less cumbersome.
1 📇 Comparison ggplot2 vs AoG.jl
| Action | ggplot2 | AoG.jl |
| --- | --- | --- |
| Input data | ggplot(df) | data(df) |
| Map aesthetics | aes(...) | mapping(...) |
| Add geometries | geom_*(...) | visual(...) |
| Combine layers | + | * |
| Facetting | facet_[wrap\|grid](~ column) | mapping(...; [row\|col\|layout]=:column) |
| Customize scales | scale_*_manual() | renamer(...) |
| Themes | theme_*(...) | set_theme!(theme_*()); draw(plt) |
| Customize axes labels | [x\|y]lab("...") | draw(plt, axis=(; [x\|y]label="...")) |
| Customize color | scale_[fill\|color]_*(...) | draw(plt, palettes=(; color=...)) or visual(..., colormap=...) |
| Save plot | ggsave("file.[png\|svg]") | save("file.[png\|svg]", draw(plt)) |
| Frequency | geom_bar() or stat_count() | frequency() |
| Histogram | geom_histogram or stat_bin() | histogram() |
| Density | geom_density or stat_density() | density() |
| Expectation/Mean | stat_summary(fun = "mean") | expectation() |
| Smooth trend | stat_smooth or geom_smooth() | (visual(...) + smooth()) |
| Linear trend | stat_smooth(method = "lm") or geom_smooth(method = "lm") | (visual(...) + linear()) |
| Log scale | scale_[x\|y]_log10() | draw(plt; axis=(; [x\|y]scale=log10)) |
2 💾 Interface with Data: data() function
The first step with AoG.jl is to specify your data source. In Julia, there is a unifying data API provided by the Tables.jl package. A lot of packages and types in the Julia ecosystem are compatible with the Tables.jl data API. For instance, the DataFrame that we have been using so far is compatible with the Tables.jl data API.
AoG.jl can use any Tables.jl compatible type as input and you specify them with the data() function.
First, let’s import the PharmaDatasets.jl and DataFramesMeta.jl packages with using statements and load our data with the dataset function.
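For readers who want to follow along without that dataset, a small stand-in table can be built by hand. This is only a sketch: the values below are invented for illustration, and only the column names (:AGE, :WEIGHT, :ISMALE, :eGFR) are taken from the examples in this tutorial.

```julia
using DataFrames

# Hypothetical stand-in data: every value here is made up for illustration.
# Only the column names match the ones used in this tutorial.
df = DataFrame(
    AGE    = [34, 45, 52, 61, 29, 48, 57, 38],
    WEIGHT = [62.5, 80.1, 71.3, 90.4, 55.0, 68.2, 84.7, 76.9],
    ISMALE = [0, 1, 0, 1, 0, 0, 1, 1],
    eGFR   = [95.2, 78.4, 66.1, 54.3, 102.7, 81.0, 59.8, 88.5],
)

# Peek at the first rows to confirm the table looks as expected
first(df, 3)
```

Because a DataFrame follows the Tables.jl data API, any such table can be handed to AoG.jl in the same way as the real dataset.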
Now, we import AoG.jl and CairoMakie.jl as our Makie backend.
using AlgebraOfGraphics # big name, AoG
using CairoMakie
Note
CairoMakie.jl is a Makie.jl backend built on the Cairo open source graphics library. It is the default backend that we will use in our tutorials on data visualization. If you want to know more about the different Makie.jl backends that are available, their advantages, and which one you should use, check the end of this tutorial.
Now, if we call data() on our DataFrame named df, we get back an AoG.jl object of type Layer.
This object is where AoG.jl stores all the specifications of our intended visualization.
Of course, for now it holds only the visualization’s data and nothing more:
data(df)
This is similar to the following in R:
df %>% ggplot()
3 🗺️ Specify Mappings: mapping function
The second step is to specify our mappings, also known as aesthetics or aes() from ggplot2.
This is done with AoG.jl’s mapping() function. It accepts 3 positional arguments and several keyword arguments, which we will cover briefly. Let’s first focus on the 3 positional arguments. They represent the x, y and z axes of the plot, in that order.
So for example, if we specify first the column :AGE followed by the column :WEIGHT, we would be asking AoG.jl to map :AGE to the x-axis and :WEIGHT to the y-axis.
Note that if we do not specify any “geometry”, AoG.jl will, by default, draw a scatter plot.
data(df) * mapping(:AGE, :WEIGHT) |> draw
Tip
I am using the draw() function but we have not yet covered it. Don’t worry for now. Just think that draw() renders our plot specifications into a backend (in our case CairoMakie.jl).
Notice that we are using the * multiplication operator. This is the primary operator to combine partially defined layers into a full visualization.
The * operator is associative, so layers can be grouped freely. Here the order of the operands does not matter either: if we specify mapping() first and then multiply it by data(), we get back the same plot:
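A minimal sketch of that comparison, using a tiny made-up table in place of the tutorial's df (the values are hypothetical, only the column names come from the text):

```julia
using AlgebraOfGraphics, CairoMakie, DataFrames

# Hypothetical stand-in data with the column names used in this tutorial
df = DataFrame(
    AGE    = [34, 45, 52, 61, 29, 48],
    WEIGHT = [62.5, 80.1, 71.3, 90.4, 55.0, 68.2],
)

# data() first, then mapping()
fig_a = data(df) * mapping(:AGE, :WEIGHT) |> draw

# mapping() first, then data(): the same scatter plot
fig_b = mapping(:AGE, :WEIGHT) * data(df) |> draw
```

Both calls combine one data layer with one mapping layer, which is why swapping them is harmless here. Note that `*` binds more tightly than `|>`, so the layers are combined before they are piped into draw.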
Besides the 3 positional arguments, mapping() has several keyword arguments:
color
marker
dodge
stack
col
row
layout
We’ll cover all of them below:
3.1.1 color
The first mapping() keyword argument we will cover is color which maps a column to a color to be displayed in the visualization.
For example, if we pass the column :ISMALE to the color argument, we get the same scatter plot as before, but now color is mapped to the :ISMALE column:
data(df) * mapping(:AGE, :WEIGHT; color = :ISMALE) |> draw
Since :ISMALE is a column of type Int64, AoG.jl will, by default, map it to a continuous color gradient. That is obviously not what we intend to display.
Let’s add a :SEX column as a CategoricalArray built from the :ISMALE column with a little bit of recode():
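One way this step could look is sketched below. It assumes that :ISMALE is coded as 1 for male and 0 for female (an assumption, since the coding is not stated here), uses made-up values, and relies on recode() and categorical() from CategoricalArrays.jl.

```julia
using CategoricalArrays, DataFrames

# Hypothetical stand-in column; the values are made up
df = DataFrame(ISMALE = [0, 1, 0, 1, 1])

# Assumption: 1 means male and 0 means female.
# recode() swaps the integer codes for labels, categorical() makes the
# result a CategoricalArray so that it is treated as discrete.
df.SEX = categorical(recode(df.ISMALE, 0 => "female", 1 => "male"))

levels(df.SEX)
```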
data(df) * mapping(:AGE, :WEIGHT; color = :SEX) |> draw
Note
In order for AoG.jl to display values as categorical/factor/discrete instead of continuous, we need to use the function nonnumeric() for the desired mapping. We could also use the renamer() function.
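As a rough sketch of both options, the snippet below applies nonnumeric and renamer() directly inside mapping(), without adding a new column. The data values are invented, and the 0/1 coding of :ISMALE (0 for female, 1 for male) is an assumption made for the example.

```julia
using AlgebraOfGraphics, CairoMakie, DataFrames

# Hypothetical stand-in data with the column names used in this tutorial
df = DataFrame(
    AGE    = [34, 45, 52, 61, 29, 48],
    WEIGHT = [62.5, 80.1, 71.3, 90.4, 55.0, 68.2],
    ISMALE = [0, 1, 0, 1, 0, 1],
)

# Option 1: keep the integer codes but treat them as discrete categories
plt_a = data(df) * mapping(:AGE, :WEIGHT; color = :ISMALE => nonnumeric)

# Option 2: rename the codes to readable labels (assumed coding) and
# give the legend a title
sex_labels = renamer([0 => "female", 1 => "male"])
plt_b = data(df) * mapping(:AGE, :WEIGHT; color = :ISMALE => sex_labels => "Sex")

fg_a = draw(plt_a)
fg_b = draw(plt_b)
```

Both variants leave df untouched, which can be convenient when the recoding is only needed for a single plot.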
3.1.3 dodge
To show the dodge keyword argument, we will do a statistical visualization with the frequency() function. dodge can be used with the following geometries:
BoxPlot
BarPlot
Violin
Let’s use the :WEIGHT column to create a CategoricalArray with 3 levels using the cut() function and assign it to the :WEIGHT_CAT column:
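A minimal sketch of this step, followed by a dodged frequency plot, is given below. It uses cut() from CategoricalArrays.jl with a group count of 3, which splits the column into three quantile-based groups; the data values are hypothetical.

```julia
using AlgebraOfGraphics, CairoMakie, CategoricalArrays, DataFrames

# Hypothetical stand-in data; the values are made up
df = DataFrame(
    WEIGHT = [55.0, 58.3, 62.5, 64.9, 68.2, 71.3, 74.6, 76.9, 80.1, 84.7, 88.0, 90.4],
    SEX    = categorical(repeat(["female", "male"], 6)),
)

# Three quantile-based weight groups stored as a CategoricalArray
df.WEIGHT_CAT = cut(df.WEIGHT, 3)

# One bar per weight group and sex, placed side by side
plt = data(df) * mapping(:WEIGHT_CAT; color = :SEX, dodge = :SEX) * frequency()
fg = draw(plt)
```

Mapping the same column to both color and dodge keeps the legend and the bar positions in agreement.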
3.1.4 stack
stack mapping is only available for bar plots. So let’s revisit the last example but instead of “dodging” the bars, we will stack them:
data(df) * mapping(:WEIGHT_CAT; color = :SEX, stack = :SEX) * frequency() |> draw
3.1.5 col
Have you ever done “facetting” in ggplot2? If you have, the next 3 keyword arguments, col, row, and layout, represent 3 different ways to do facetting on a plot.
First let’s facet our visualization using different columns. This is done with the col keyword argument inside mapping():
data(df) * mapping(:AGE, :eGFR; color = :SEX, col = :WEIGHT_CAT) |> draw
3.1.6 row
We can do the same facet as before but now using different rows with row:
data(df) * mapping(:AGE, :eGFR; color = :SEX, row = :WEIGHT_CAT) |> draw
3.1.7 layout
layout tells AoG.jl to facet with an automatic setting that best uses the available space. It is analogous to ggplot2’s facet_wrap() function. Here, we can have the previous plot, but now with a facetting that is optimized for a neutral aspect ratio:
data(df) * mapping(:AGE, :eGFR; color = :SEX, layout = :WEIGHT_CAT) |> draw
Tip
Notice that the axes are linked while facetting either with row, col or layout. We’ll explore ways to customize the axes behavior in Advanced Layouts with AlgebraOfGraphics.jl.
Be sure to check it out.
4 🖼️ draw[!]() function
The draw[!]() function in AoG.jl is responsible for passing all the plot specifications and customizations to the desired Makie.jl backend. In this notebook, the chosen backend was CairoMakie.jl.
Also, note that so far we have always called draw() by “piping” AoG.jl layers into it with Julia’s |> pipe operator. This is fine if you do not need to pass arguments to the draw() function.
The draw() function has 3 keyword arguments used to customize either the axis, figure or palettes.
For example, here is an AoG.jl plot with custom axis, figure and palette specifications inside the draw() function:
plt = data(df) * mapping(:AGE, :WEIGHT; color = :SEX);
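A sketch of what such a draw() call can look like follows. It only sets the axis and figure keyword arguments; palettes is left out because its exact form may depend on the installed AoG.jl version. The data values, the axis labels with their units, the title and the background color are all illustrative assumptions.

```julia
using AlgebraOfGraphics, CairoMakie, DataFrames

# Hypothetical stand-in data with the column names used in this tutorial
df = DataFrame(
    AGE    = [34, 45, 52, 61, 29, 48],
    WEIGHT = [62.5, 80.1, 71.3, 90.4, 55.0, 68.2],
    SEX    = ["female", "male", "female", "male", "female", "male"],
)

plt = data(df) * mapping(:AGE, :WEIGHT; color = :SEX)

# draw() is called directly (no pipe) so that keyword arguments can be passed.
# axis   -> attributes forwarded to the Makie axis (labels, title, ...)
# figure -> attributes forwarded to the Makie figure (background, ...)
fg = draw(
    plt;
    axis = (; xlabel = "Age (years)", ylabel = "Weight (kg)", title = "Weight by age"),
    figure = (; backgroundcolor = :gray95),
)
```

Each keyword takes a NamedTuple, which is why the values are written with the `(; ...)` syntax.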
In order to save plots, AoG.jl defines a new method for FileIO.jl’s save() function. The first argument is the filename with the desired extension/format, e.g. my_plot.png. The second argument is an AoG.jl plot (the one returned from draw()).
The resolution of a Makie Figure is in principle unitless until it is exported to a file, then the output depends on additional backend settings.
When you save a bitmap (.png), the resolution is converted to pixels using the px_per_unit setting of the CairoMakie backend. This is set to 1 by default, so a figure with resolution (800, 600) will be 800px wide and 600px high. If you set it to 2, you double the resolution without having to adjust font sizes, line widths, etc. This is similar to changing the dpi in other plotting packages, although it is technically different because Cairo does not actually adjust the dpi of the output image and without the dpi metadata, an image does not have a well-defined physical size. Therefore the “per inch” part of dpi settings is usually misleading.
Vector graphics, however, have a physical size by definition because the pt unit that they are specified in can be directly converted to inch or cm. The value pt_per_unit governs how the figure size is converted to pt when saving vector graphics. Its default is 0.75 (this causes png and svg files to be displayed with the same size in most browsers when saved with default settings).
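The arithmetic behind these two settings can be checked with plain Julia. The helper functions below are hypothetical (they are not part of Makie.jl or CairoMakie.jl) and only restate the conversions described above, using the (800, 600) figure size from the text and 72 pt per inch.

```julia
# Figure size in Makie's unitless figure units (the example value from the text)
figure_size = (800, 600)

# Hypothetical helper: bitmap export, pixels = figure units * px_per_unit
pixels(dims, px_per_unit) = round.(Int, dims .* px_per_unit)

# Hypothetical helpers: vector export, points = figure units * pt_per_unit,
# and 72 pt correspond to 1 inch
points(dims, pt_per_unit) = dims .* pt_per_unit
inches(dims, pt_per_unit) = points(dims, pt_per_unit) ./ 72

pixels(figure_size, 1)     # (800, 600)
pixels(figure_size, 3)     # (2400, 1800)
points(figure_size, 0.75)  # (600.0, 450.0)
inches(figure_size, 0.75)  # roughly (8.33, 6.25)
```

So tripling px_per_unit triples both pixel dimensions, while the default pt_per_unit of 0.75 turns an (800, 600) figure into a 600 pt by 450 pt vector graphic.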
For example, to save our plt image from above as a my_image.png file with 3 times the resolution of the underlying figure, we would call the following save() function:
save("my_image.png", draw(plt); px_per_unit = 3)
5.1 Supported Extensions
Different Makie.jl backends support different filetypes and extensions. Here is a complete list:
CairoMakie.jl: .svg, .pdf and .png
GLMakie.jl: .png
WGLMakie.jl: .png
6 🌎 Overview of the Plotting Makie.jl Ecosystem
Under the hood, AoG.jl uses a pure-Julia visualization backend named Makie.jl. We believe that Makie.jl is the present and future of plotting and visualizations in Julia.
Makie.jl itself uses different visualization backends under the hood. These backends are the barebones interfaces for rendering graphics. Currently (January 2022), Makie.jl supports 3 interfaces:
OpenGL
Cairo
WebGL
Let’s talk about each one of them.
6.1 OpenGL with GLMakie.jl
The first interface is OpenGL, which stands for Open Graphics Library and was first released in 1992. OpenGL can use the GPU and is managed by a non-profit technology consortium, much like the majority of open source platforms, standards and technologies that many other packages depend on.
Makie.jl has an interface to OpenGL with the package GLMakie.jl. GLMakie.jl will render your visualizations and plots in a standalone screen and allows for click, drag and zoom interactivity with the mouse. Notice that Visual Studio Code will not render the image if you use GLMakie.jl and that you won’t be able to take advantage of the interactivity in static rendered versions of Quarto documents, such as HTML or PDFs.
Note
If you have played PC video games, you have probably already used OpenGL.
To use OpenGL with GLMakie.jl you’ll need to load it with the using statement:
using GLMakie
6.2 Cairo with CairoMakie.jl
The second interface, CairoMakie.jl, is the Makie.jl interface to the Cairo open source graphics library.
Cairo was created in 2003 and is written in C. It is primarily used to render static, high-quality vector graphics visualizations.
If you use CairoMakie.jl, most of your visualizations will be rendered as static SVGs or PNGs. For example, Quarto will render the images in the output cell. Visual Studio Code will also render the images in a preview pane. However, if you use CairoMakie.jl in a Julia terminal you will not have the image rendered, but instead you’ll see an object printed in the terminal that represents the image building blocks, such as points, lines and shapes.
Note
If you have used Gnuplot, for example alongside \(\LaTeX\), it can use Cairo under the hood to render PDF and PNG files. R can also use Cairo for rendering output plots as PDF and SVG files. Finally, if you have seen some of the YouTube videos from the famous math communicator and entertainer 3Blue1Brown, his software uses Cairo under the hood as well.
To use Cairo with CairoMakie.jl you’ll need to load it with the using statement:
using CairoMakie
6.3 WebGL with WGLMakie.jl
The third interface, WGLMakie.jl, uses WebGL. WebGL is the “cousin” of OpenGL. It is a JavaScript GPU-accelerated API that renders graphics and is compatible with almost all web browsers available. It was initially released in 2011 and is managed by the same non-profit technology consortium that manages OpenGL.
WGLMakie.jl is still experimental, so beware that it might not work as intended. You can use WGLMakie.jl to get some of the interactivity that you would get in a standalone GLMakie.jl window, but it will not work in static rendered versions of Quarto documents, such as HTML or PDF versions.
Note
If you have ever played games in your browser, you have probably benefited from WebGL. There are a lot of well-known game engines that use WebGL, such as the Unreal Engine 4 and Unity.
To use WebGL with WGLMakie.jl you’ll need to load it with the using statement:
using WGLMakie
6.4 Which One to Use?
Now the question arises: which one should I use?
To make it simple, our recommendations are:
Default to CairoMakie.jl (Cairo backend). Most data communication and visualizations are still static, so prefer the Cairo backend for outstanding high-quality static images and plots.
If you need interactivity, use GLMakie.jl, but beware that it creates stand-alone plot windows that can’t be displayed inline in a Visual Studio Code session or Quarto document. Additionally, if you like to code in terminal environments GLMakie.jl might be worth using.
Avoid WGLMakie.jl and only use it if you need to do something really fancy.
7 ⏳ A note about Time To First Plot (TTFP)
Julia is a just-in-time (JIT) compiled language, which means that it generates binary code as it needs it. This is great for a lot of things, but can be a challenge for others. One such challenge is the notorious Time To First Plot (TTFP).
Since AoG.jl runs on Makie.jl, which in turn is a pure-Julia implementation, the first plot of a session will take a while to render. This is because Makie.jl has to be JIT-compiled in order to generate that first plot. After this, the following plots will be much faster to show.
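The same effect can be observed on any freshly defined Julia function, without loading a plotting package. The sketch below uses a small hypothetical function as a stand-in for a plotting call and times two consecutive calls with @elapsed; the exact timings will vary by machine and Julia version.

```julia
# Hypothetical stand-in for a plotting call: any function that has not
# been compiled yet in this session behaves the same way
summarize(xs) = sum(abs2, xs) / length(xs)

xs = rand(1_000)

t_first  = @elapsed summarize(xs)  # includes JIT compilation of summarize
t_second = @elapsed summarize(xs)  # reuses the already compiled code

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

On a typical run the first call is noticeably slower than the second, which is the TTFP effect in miniature.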
This is often disliked by users coming from R’s ggplot2, which does not have this startup cost: ggplot2 is written in R and relies on R’s precompiled, ahead-of-time (AOT) compiled internals rather than on JIT compilation to machine code.
Fortunately, the Julia community is working on reducing TTFP. You can expect TTFP to shrink with new Julia and Makie.jl versions, with the goal of eventually making it negligible.