Plotting Different Geometries with AlgebraOfGraphics.jl

Authors

Jose Storopoli

Juan Oneto

In this tutorial, we will explore how to make different kinds of plots (also called geometries or geoms in ggplot2) with AoG.jl. First, we’ll discuss how to navigate AoG.jl and Makie.jl documentation. Then, we’ll proceed to show the most common plotting functions in AoG.jl:

  1. BarPlot
  2. Lines
  3. Errorbars
  4. Scatter
  5. BoxPlot
  6. Violin
  7. Contour
  8. Heatmap

Note

Some main visualizations are missing from this tutorial. These would be the Statistical Visualizations. They are covered in Plotting Statistical Visualizations with AlgebraOfGraphics.jl. Don’t forget to check it out.

1 📋 geom_*() - AoG.jl Table

The following table is a mapping of ggplot2’s geom_*() to AoG.jl’s plotting functions:

ggplot2                       AoG.jl
geom_col()                    visual(BarPlot)
geom_point()                  visual(Scatter)
geom_line()                   visual(Lines)
geom_errorbar()               visual(Errorbars)
geom_boxplot()                visual(BoxPlot)
geom_violin()                 visual(Violin)
geom_label()                  visual(Annotations)
geom_text()                   visual(Annotations)
geom_contour()                visual(Contour)
geom_tile()                   visual(Heatmap)
geom_bar()                    frequency()
geom_histogram()              histogram()
geom_density()                density()
geom_smooth()                 smooth()
geom_smooth(method = "lm")    linear()
geom_area()                   linesfill()

2 🆘 How to Find Available Plotting Functions?

As you probably know, AoG.jl uses Makie.jl as the plotting engine for all visualizations. This has the consequence that all possible plotting objects (geometries) in AoG.jl are actually plotting types in Makie.jl.

So, for example, to plot a bar plot in AoG.jl you would have to call the BarPlot type from Makie.jl.

In AoG.jl you use the visual() function and then pass the desired Makie.jl plotting type along with all desired keyword arguments. So, the bar plot would be the following call for visual():

plt = data(...) * mapping(...) * visual(BarPlot; ...)

draw(plt)
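
As a concrete instance of this pattern, here is a minimal sketch on a small hypothetical table. The name toy and its columns group and height are made up for illustration and are not part of the tutorial's datasets. It assumes the CairoMakie.jl backend is installed.

```julia
using AlgebraOfGraphics
using CairoMakie

# Hypothetical toy table: a NamedTuple of column vectors is enough for data()
toy = (; group = ["a", "b", "c"], height = [1.0, 3.0, 2.0])

# data * mapping * visual: the plotting type goes first, keyword arguments after it
plt = data(toy) * mapping(:group, :height) * visual(BarPlot; color = :gray)

# draw() renders the specification and returns the resulting figure grid
fig = draw(plt)
```

The same three-part structure is used for every plotting type in this tutorial; only the type passed to visual() and its keyword arguments change.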

There are two main ways to browse and obtain information regarding plotting types and custom arguments:

  1. Makie.jl Documentation: this is very useful and even AoG.jl’s documentation redirects to it.
  2. help_attributes() function and Docstrings: you can also get this information from the Julia REPL (terminal) with the help_attributes() function and by reading the docstrings of Makie.jl’s plotting functions.

2.1 Makie Documentation

In Makie.jl’s documentation there is a rich description of the plotting functions. We encourage you to browse it and learn with the examples the several options available for every plotting type.

Caution

Note that Makie.jl’s plotting functions are all lowercase since they follow the naming convention for functions. AoG.jl instead uses plotting types, which are all TitleCase, following the naming convention for types.

If you try to use visual() on the plotting functions, you’ll get an error. Instead, you need to use visual() on the plotting types.

For instance, this will error:

visual(barplot)

The correct way is:

visual(BarPlot)

Just remember to convert plotting functions to their plotting types when you pass them to the visual() function.
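
If you are unsure which of the two names you are holding, a quick check in the REPL can tell them apart. This is a minimal sketch that assumes a Makie.jl backend such as CairoMakie.jl is loaded:

```julia
using CairoMakie

# The lowercase name is an ordinary Julia function
is_function = barplot isa Function

# The TitleCase name is a type, which is what visual() expects
is_type = BarPlot isa Type
```

Both checks should return true, which matches the function/type naming convention described above.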

Note

This is the online documentation for the barplot() function from Makie.jl:


barplot(positions, heights; kwargs...)

Plots a barplot.

Plot type

The plot type alias for the barplot function is BarPlot.

Attributes

alpha = 1.0 — The alpha value of the colormap or color attribute. Multiple alphas, like in plot(alpha=0.2, color=(:red, 0.5)), will get multiplied.

bar_labels = nothing — Labels added at the end of each bar.

clip_planes = automatic — Clip planes offer a way to do clipping in 3D space. You can set a Vector of up to 8 Plane3f planes here, behind which plots will be clipped (i.e. become invisible). By default clip planes are inherited from the parent plot or scene. You can remove parent clip_planes by passing Plane3f[].

color = @inherit patchcolor — No docs available.

color_over_background = automatic — No docs available.

color_over_bar = automatic — No docs available.

colormap = @inherit colormap :viridis — Sets the colormap that is sampled for numeric colors. PlotUtils.cgrad(...), Makie.Reverse(any_colormap) can be used as well, or any symbol from ColorBrewer or PlotUtils. To see all available color gradients, you can call Makie.available_gradients().

colorrange = automatic — The values representing the start and end points of colormap.

colorscale = identity — The color transform function. Can be any function, but only works well together with Colorbar for identity, log, log2, log10, sqrt, logit, Makie.pseudolog10 and Makie.Symlog10.

cycle = [:color => :patchcolor] — No docs available.

depth_shift = 0.0 — adjusts the depth value of a plot after all other transformations, i.e. in clip space, where 0 <= depth <= 1. This only applies to GLMakie and WGLMakie and can be used to adjust render order (like a tunable overdraw).

direction = :y — Controls the direction of the bars, can be :y (vertical) or :x (horizontal).

dodge = automatic — No docs available.

dodge_gap = 0.03 — No docs available.

fillto = automatic — Controls the baseline of the bars. This is zero in the default automatic case unless the barplot is in a log-scaled Axis. With a log scale, the automatic default is half the minimum value because zero is an invalid value for a log scale.

flip_labels_at = Inf — No docs available.

fxaa = true — adjusts whether the plot is rendered with fxaa (anti-aliasing, GLMakie only).

gap = 0.2 — The final width of the bars is calculated as w * (1 - gap) where w is the width of each bar as determined with the width attribute.

highclip = automatic — The color for any value above the colorrange.

inspectable = true — sets whether this plot should be seen by DataInspector.

inspector_clear = automatic — Sets a callback function (inspector, plot) -> ... for cleaning up custom indicators in DataInspector.

inspector_hover = automatic — Sets a callback function (inspector, plot, index) -> ... which replaces the default show_data methods.

inspector_label = automatic — Sets a callback function (plot, index, position) -> string which replaces the default label generated by DataInspector.

label_align = automatic — No docs available.

label_color = @inherit textcolor — No docs available.

label_font = @inherit font — The font of the bar labels.

label_formatter = bar_label_formatter — No docs available.

label_offset = 5 — The distance of the labels from the bar ends in screen units. Does not apply when label_position = :center.

label_position = :end — The position of each bar's label relative to the bar. Possible values are :end or :center.

label_rotation = 0π — No docs available.

label_size = @inherit fontsize — The font size of the bar labels.

lowclip = automatic — The color for any value below the colorrange.

model = automatic — Sets a model matrix for the plot. This overrides adjustments made with translate!, rotate! and scale!.

n_dodge = automatic — No docs available.

nan_color = :transparent — The color for NaN values.

offset = 0.0 — No docs available.

overdraw = false — Controls if the plot will draw over other plots. This specifically means ignoring depth checks in GL backends

space = :data — sets the transformation space for box encompassing the plot. See Makie.spaces() for possible inputs.

ssao = false — Adjusts whether the plot is rendered with ssao (screen space ambient occlusion). Note that this only makes sense in 3D plots and is only applicable with fxaa = true.

stack = automatic — No docs available.

strokecolor = @inherit patchstrokecolor — No docs available.

strokewidth = @inherit patchstrokewidth — No docs available.

transformation = automatic — No docs available.

transparency = false — Adjusts how the plot deals with transparency. In GLMakie transparency = true results in using Order Independent Transparency.

visible = true — Controls whether the plot will be rendered or not.

width = automatic — The gapless width of the bars. If automatic, the width w is calculated as minimum(diff(sort(unique(positions)))). The actual width of the bars is calculated as w * (1 - gap).


2.2 help_attributes() and Docstrings

If you do not want to browse Makie.jl’s documentation, a nice helping hand with plotting functions is the help_attributes() function, available from any of Makie.jl’s backends.

Let us show how it works, but first let’s load the default backend that we are using in these tutorials: CairoMakie.jl.

using CairoMakie

Here is an example with the barplot() plotting function:

help_attributes(barplot)
Available attributes for `MakieCore.Plot{Makie.barplot}` are: 

alpha bar_labels clip_planes color color_over_background color_over_bar colormap colorrange colorscale cycle depth_shift direction dodge dodge_gap fillto flip_labels_at gap highclip inspectable inspector_clear inspector_hover inspector_label label_align label_color label_font label_formatter label_offset label_position label_rotation label_size lowclip n_dodge nan_color offset overdraw space ssao stack strokecolor strokewidth transparency visible width

We can see that BarPlot, when used inside visual(), has a lot of keyword arguments for us to customize our bar plots.

2.2.1 Docstrings from help or ?

We can also check the docstrings from a specific plotting function by calling either the help() function on it or by using the help mode of the Julia REPL:

julia> ?

help?> barplot

Note

Also don’t forget to check AoG.jl’s Documentation. The tutorial and gallery are nice sections that showcase several use cases and possible customizations.

3 🎨 visual() function

The visual() function from AoG.jl is how we attach a plotting object to the mapping()s of our data().

The most important argument to visual() is the first positional argument: the plotting type. The keyword arguments that follow are the same ones accepted by the analogous Makie.jl plotting function.

For example, the barplot() plotting function from Makie.jl supports the width keyword argument. That would translate to the following visual() function call in AoG.jl:

visual(BarPlot; width = ...)

Let’s show some of the available plotting types (geometries) to the visual() function. But first, we begin by loading AoG.jl, data wrangling libraries and the DataFrame we’ve used previously:

using PharmaDatasets
using DataFramesMeta
using AlgebraOfGraphics

df = dataset("demographics_1")
first(df, 5)
5×6 DataFrame
 Row  ID     AGE      WEIGHT   SCR      ISMALE  eGFR
      Int64  Float64  Float64  Float64  Int64   Float64
   1  1      34.823   38.212   1.1129   0       42.635
   2  2      32.765   74.838   0.8846   1       126.0
   3  3      35.974   37.303   1.1004   1       48.981
   4  4      38.206   32.969   1.1972   1       38.934
   5  5      33.559   47.139   1.5924   0       37.198

We will also transform some columns to CategoricalArrays:

Note

Don’t forget to check our Data Wrangling in Julia tutorial Handling Factors and Categorical Data with CategoricalArrays.jl.

using CategoricalArrays
@transform! df :SEX = categorical(:ISMALE);
@transform! df :SEX = recode(:SEX, 0 => "female", 1 => "male");
@transform! df :WEIGHT_cat = cut(:WEIGHT, 2; labels = ["light", "heavy"])

3.1 BarPlot

Let’s begin with the bar plot. Here the plotting type is BarPlot and the plotting function is barplot. So, we just call visual() and pass BarPlot as the first argument followed by any desired keyword arguments supported by the plotting function barplot().

Here is an example with our dataset df. Notice that we need to first group the data with the @by macro and then apply the mean() function from Julia’s standard library Statistics module:

Note

There is an easier way to automatically perform grouping and summarizing in AoG.jl with statistical transformation functions. We will cover this in Plotting Statistical Visualizations with AlgebraOfGraphics.jl. Make sure to check it out.

using Statistics
data(@by df :SEX :AGE_MEAN = mean(:AGE)) * mapping(:SEX, :AGE_MEAN) * visual(BarPlot) |>
draw
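
To see what the @by call hands to data(), it can help to run the same grouping on a tiny table first. The sketch below uses a hypothetical four-row DataFrame (toy is made up for illustration and is not the demographics dataset):

```julia
using DataFramesMeta
using Statistics

# Hypothetical toy data: two subjects per sex
toy = DataFrame(SEX = ["female", "male", "female", "male"], AGE = [30.0, 40.0, 34.0, 44.0])

# One row per group, with the group mean in a new column
summary_df = @by toy :SEX :AGE_MEAN = mean(:AGE)
# 2 rows: female => 32.0, male => 42.0
```

The result has one row per level of :SEX, which is exactly the shape BarPlot needs: one x position and one bar height per group.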

We can customize the specified plotting object in visual() by adding supported keyword arguments.

If we would like to make our bars blue and a little bit less wide we can use the color and width arguments:

data(@by df :SEX :AGE_MEAN = mean(:AGE)) *
mapping(:SEX, :AGE_MEAN) *
visual(BarPlot; color = :blue, width = 0.5) |> draw

Here is a more complex example using color and dodge for the column :WEIGHT_cat inside mapping():

data(@by df [:WEIGHT_cat, :SEX] :AGE_MEAN = mean(:AGE)) *
mapping(:SEX, :AGE_MEAN; color = :WEIGHT_cat, dodge = :WEIGHT_cat) *
visual(BarPlot) |> draw

Tip

Note that the color mapping will override the color keyword argument inside a visual() call. For custom colors, which we will cover in Customization of AlgebraOfGraphics.jl Plots, it is better to use the palette argument of the draw[!]() function.

3.2 Lines

Lines creates a line plot with the specified data() and mapping()s.

It is analogous to ggplot2’s geom_line().

For the line plot, we will use some concentration-time pharmacokinetic data after oral administration. This plot is known as a spaghetti plot.

Tip

Line plots implicitly indicate a dependence of an observation with previous ones. This dependence makes line plots perfect for time series data and other time-dependent visualizations. But for data that do not have a time-dependency, or any other x-axis dependency, line plots might convey an intuition that is not the objective of the visualization.

pk = dataset("pumas_tutorials/po_sd_1")
first(pk, 5)
5×9 DataFrame
 Row  id     time     cp        dv        amt       evid   cmt    rate     dosegrp
      Int64  Float64  Float64?  Float64?  Float64?  Int64  Int64  Float64  Int64
   1  1      0.0      missing   missing   10.0      1      1      0.0      10
   2  1      0.25     20.2592   22.6353   missing   0      2      0.0      10
   3  1      0.5      36.8068   16.5712   missing   0      2      0.0      10
   4  1      0.75     50.2838   60.8928   missing   0      2      0.0      10
   5  1      1.0      61.2211   46.8858   missing   0      2      0.0      10

Here is a simple plot of the PK data using the positional x-axis and y-axis arguments and the color keyword argument from mapping(). We reduce the dataset to the first 10 ids so our plot’s legend doesn’t overflow.

Note

We are removing missing values from the pk dataset and keeping only the first 10 subjects (ids) so that the legend does not overflow.

dropmissing!(pk, :cp);
pk_ids = @rsubset(pk, :id <= 10);

Note

We are using nonnumeric() inside mapping() to tell AoG.jl that the column :id, despite being an integer column, should be treated as discrete/categorical, i.e. non-numeric.

This will be covered in Customization of AlgebraOfGraphics.jl Plots.

data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(Lines; alpha = 0.5) |> draw

Tip

To draw this visualization without the legend, you can call the mutating function draw!(). Whereas draw() automatically adds colorbars and legends, draw!() does not. Colorbar and legend, should they be necessary, can be added separately to the visualization with the colorbar!() and legend!() helper functions.

We’ll cover customizations in Customization of AlgebraOfGraphics.jl Plots. Don’t forget to check it out.

Here’s what the code would look like without the legend:

fig = Figure()
plt = data(pk) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Lines)
draw!(fig, plt)
fig
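
If the legend turns out to be needed after all, it can be placed by hand next to the axes. The sketch below is a hedged illustration on hypothetical toy data (the table toy is made up and stands in for the PK dataset). It keeps the value returned by draw!() and passes it to legend!():

```julia
using AlgebraOfGraphics
using CairoMakie

# Hypothetical toy data: two subjects with three time points each
toy = (;
    time = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0],
    cp = [0.0, 5.0, 3.0, 0.0, 8.0, 4.0],
    id = [1, 1, 1, 2, 2, 2],
)

plt = data(toy) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Lines)

fig = Figure()
grid = draw!(fig[1, 1], plt)  # draws the axes only, without a legend
legend!(fig[1, 2], grid)      # adds the legend in a layout position of our choosing
fig
```

Drawing into fig[1, 1] and placing the legend in fig[1, 2] keeps the two in separate layout cells, so the legend position can be changed without touching the plot specification.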

3.3 Errorbars

Errorbars creates vertical interval lines, commonly used to represent data variability or uncertainty.

It is analogous to ggplot2’s geom_errorbar().

Let’s use the same example as before, but this time we are interested in finding the mean concentration-time profile for each dose level. We will also use the standard deviation as our measure of variability.

Note

We are using the @by macro again to group the data by dosegrp and time. Also, we will use the mean() function to calculate the mean concentration and the std() function to determine the standard deviation. These functions are available in the Statistics module in Julia’s standard library.

Don’t forget to check our Data Wrangling in Julia tutorial Manipulating Tables with DataFramesMeta.jl for a more in-depth explanation of DataFramesMeta.jl’s macros.

pk_error = @by pk [:dosegrp, :time] begin
    :Cmean = mean(:cp)
    :Cstd = std(:cp)
end
first(pk_error, 5)
5×4 DataFrame
 Row  dosegrp  time     Cmean    Cstd
      Int64    Float64  Float64  Float64
   1  10       0.25     31.3022  13.3897
   2  10       0.5      54.484   21.9751
   3  10       0.75     71.6505  27.4859
   4  10       1.0      84.3257  30.9806
   5  10       2.0      108.478  34.991

Now we can plot the error bars. In this case, the mapping function will take three positional arguments: x position (time), y position (mean concentration), and the error bar length (standard deviation):

data(pk_error) *
mapping(:time, :Cmean, :Cstd, color = :dosegrp => nonnumeric) *
visual(Errorbars) |> draw

Tip

Notice that Errorbars only generates the vertical interval lines. It is a common practice to show error bars together with other data visualization methods, such as bar and line plots. You can achieve this by adding two layers with the + operator, which would look like this for a line plot:

data(pk_error) *
mapping(:time, :Cmean, :Cstd, color = :dosegrp => nonnumeric) *
(visual(Errorbars) + visual(Lines)) |> draw

You can learn more about combining layers with the + operator by checking our tutorial on Grammar of Graphics with AlgebraOfGraphics.jl.

3.4 Scatter

Scatter is the plotting type for scatter plots and is analogous to ggplot2’s geom_point():

data(pk_ids) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Scatter) |> draw

Scatter has some interesting keyword arguments, which you can list by typing help_attributes(scatter) or ?scatter in a Julia REPL. For example, you can choose a marker type and markersize:

data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(Scatter; marker = '+', markersize = 25, alpha = 0.5) |> draw

3.5 ScatterLines

ScatterLines is the fusion of Scatter and Lines. So every keyword argument from both of them will be available.

It is similar to combining the following geometries in ggplot2: geom_line() + geom_point().

Let’s plot our previous Lines example using ScatterLines:

data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(ScatterLines; alpha = 0.5) |> draw

3.6 BoxPlot

Box plots are the statistician’s favorite plots. Here is a simple box plot using BoxPlot inside visual():

data(df) * mapping(:SEX, :AGE) * visual(BoxPlot) |> draw

You can use some keyword arguments for BoxPlot, such as:

  • show_notch: whether or not to have a notch near the median.
  • range: the multiple of the inter-quartile range (IQR) that controls how far the whiskers extend, default 1.5.
  • whiskerwidth: the width of the small horizontal caps at the whisker ends, relative to the box width.
  • show_outliers: whether or not to show outliers as points, default true.

data(df) *
mapping(:SEX, :AGE) *
visual(BoxPlot; show_notch = true, range = 1, whiskerwidth = 0.25) |> draw
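
To make the role of range more concrete, here is a small worked sketch in plain Julia of how a multiple of the IQR determines which points count as outliers. It follows the usual Tukey convention. The sample ages and the variable names are hypothetical, and whether Makie.jl uses exactly this quantile definition is an assumption.

```julia
using Statistics

# Hypothetical sample with one extreme value
ages = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 20.0]

# First and third quartiles, and the inter-quartile range
q1, q3 = quantile(ages, [0.25, 0.75])
iqr_width = q3 - q1

# Same role as the range keyword of BoxPlot (default 1.5)
whisker_range = 1.5

# Points beyond these fences are treated as outliers
lower_fence = q1 - whisker_range * iqr_width
upper_fence = q3 + whisker_range * iqr_width
outliers = filter(a -> a < lower_fence || a > upper_fence, ages)
# fences are -3.0 and 13.0, so only 20.0 is an outlier
```

A smaller value, such as the range = 1 used above, pulls the fences closer to the box and therefore flags more points as outliers.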

3.7 Violin

Violin plots are also popular and a good alternative to box plots. Instead of being based on the median, quartiles and IQR, violin plots display an estimate of the probability density of the underlying values (using a kernel density estimator).
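
To give an intuition for what a kernel density estimator does, here is a minimal hand-rolled sketch in plain Julia. It is only an illustration: the helper gaussian_kde, the sample and the bandwidth are hypothetical, and the estimator used internally by Makie.jl may differ in its kernel and bandwidth choices.

```julia
using Statistics

# Hypothetical helper: Gaussian kernel density estimate at point x with bandwidth h.
# Each observation contributes a small normal "bump" centered on itself.
gaussian_kde(x, obs, h) = mean(exp(-0.5 * ((x - o) / h)^2) / (h * sqrt(2π)) for o in obs)

# Hypothetical sample of ages and a hand-picked bandwidth
age_sample = [30.0, 31.5, 32.0, 33.0, 35.5, 36.0, 38.0]
bandwidth = 1.5

# Evaluate the estimate on a grid; this curve is what a violin mirrors on both sides
age_grid = 20.0:0.1:50.0
density_curve = [gaussian_kde(x, age_sample, bandwidth) for x in age_grid]

# Sanity check: a density should integrate to approximately 1
area = sum(density_curve) * step(age_grid)
```

The width of a violin at a given y value is proportional to this estimated density, which is why violins reveal features such as multiple modes that a box plot hides.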

Here is the same box plot example, but now using a Violin inside visual(). It shows much more visual information than the box plot:

data(df) * mapping(:SEX, :AGE) * visual(Violin) |> draw

As with BoxPlot, Violin has some interesting keyword arguments. The most important is show_median, which tells AoG.jl whether or not to show the median inside the violins:

data(df) * mapping(:SEX, :AGE) * visual(Violin; show_median = true) |> draw

Violin also accepts a side inside mapping() which breaks the violin plot into two sides: left and right.

If you pair side with color you can convey more information in your violin plots:

data(df) *
mapping(:SEX, :AGE; side = :WEIGHT_cat, color = :WEIGHT_cat) *
visual(Violin; show_median = true) |> draw

Note

We will cover more geometries and plotting types in Plotting Statistical Visualizations with AlgebraOfGraphics.jl which are frequently paired with statistical visualizations.