Plotting Different Geometries with AlgebraOfGraphics.jl

Authors

Jose Storopoli

Juan Oneto

In this tutorial, we will explore how to make different kinds of plots (also called geometries or geoms in ggplot2) with AoG.jl. First, we’ll discuss how to navigate AoG.jl and Makie.jl documentation. Then, we’ll proceed to show the most common plotting functions in AoG.jl:

  1. BarPlot
  2. Lines
  3. Errorbars
  4. Scatter
  5. ScatterLines
  6. BoxPlot
  7. Violin
  8. Contour
  9. Heatmap

Note

Some important visualizations, namely the statistical visualizations, are missing from this tutorial. They are covered in Plotting Statistical Visualizations with AlgebraOfGraphics.jl. Don’t forget to check it out.

1 📋 geom_*() - AoG.jl Table

The following table is a mapping of ggplot2’s geom_*() to AoG.jl’s plotting functions:

ggplot2                       AoG.jl
geom_col()                    visual(BarPlot)
geom_point()                  visual(Scatter)
geom_line()                   visual(Lines)
geom_errorbar()               visual(Errorbars)
geom_boxplot()                visual(BoxPlot)
geom_violin()                 visual(Violin)
geom_label()                  visual(Annotations)
geom_text()                   visual(Annotations)
geom_contour()                visual(Contour)
geom_tile()                   visual(Heatmap)
geom_bar()                    frequency()
geom_histogram()              histogram()
geom_density()                density()
geom_smooth()                 smooth()
geom_smooth(method = "lm")    linear()
geom_area()                   linesfill()

2 🆘 How to Find Available Plotting Functions?

As you probably know, AoG.jl uses Makie.jl as the plotting engine for all visualizations. This has the consequence that all possible plotting objects (geometries) in AoG.jl are actually plotting types in Makie.jl.

So, for example, to plot a bar plot in AoG.jl you would have to call the BarPlot type from Makie.jl.

In AoG.jl you use the visual() function and then pass the desired Makie.jl plotting type along with all desired keyword arguments. So, the bar plot would be the following call for visual():

plt = data(...) * mapping(...) * visual(BarPlot; ...)

draw(plt)

There are two main ways to browse and obtain information regarding plotting types and custom arguments:

  1. Makie.jl Documentation: this is very useful and even AoG.jl’s documentation redirects to it.
  2. help_attributes() function and Docstrings: you can also get information from the Julia REPL (terminal) with the help_attributes() function and by reading the help text for Makie.jl’s plotting functions.

2.1 Makie Documentation

Makie.jl’s documentation has a rich description of the plotting functions. We encourage you to browse it and to learn from its examples the options available for every plotting type.

Caution

Note that Makie.jl’s plotting functions are all lowercase, following the naming convention for functions. AoG.jl, in contrast, uses plotting types, which are all TitleCase, following the naming convention for types.

If you try to use visual() on the plotting functions, you’ll get an error. Instead, you need to use visual() on the plotting types.

For instance, this will error:

visual(barplot)

The correct way is:

visual(BarPlot)

Just remember to replace a plotting function with its corresponding plotting type when you pass it to the visual() function.
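
The distinction can be checked directly in the REPL. The sketch below assumes CairoMakie.jl is installed; it only inspects the two names and draws nothing:

```julia
using CairoMakie

# barplot is an ordinary (lowercase) plotting function ...
println(barplot isa Function)

# ... while BarPlot is the plot type that visual() expects
println(BarPlot isa Type)
```

Both checks should report true, which is why visual(BarPlot) works while visual(barplot) does not.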

Note

This is the online documentation for the barplot() function from Makie.jl:


barplot(positions, heights; kwargs...)

Plots bars of the given heights at the given (scalar) positions.

Plot type

The plot type alias for the barplot function is BarPlot.

Attributes

alpha = 1.0 — The alpha value of the colormap or color attribute. Multiple alphas like in plot(alpha=0.2, color=(:red, 0.5)) will get multiplied.

bar_labels = nothing — Labels added at the end of each bar.

clip_planes = @inherit clip_planes automatic — Clip planes offer a way to do clipping in 3D space. You can set a Vector of up to 8 Plane3f planes here, behind which plots will be clipped (i.e. become invisible). By default clip planes are inherited from the parent plot or scene. You can remove parent clip_planes by passing Plane3f[].

color = @inherit patchcolor — Sets the color of bars.

color_over_background = automatic — Sets the color of labels that are drawn outside of bars. Defaults to label_color

color_over_bar = automatic — Sets the color of labels that are drawn inside of/over bars. Defaults to label_color

colormap = @inherit colormap :viridis — Sets the colormap that is sampled for numeric colors. PlotUtils.cgrad(...), Makie.Reverse(any_colormap) can be used as well, or any symbol from ColorBrewer or PlotUtils. To see all available color gradients, you can call Makie.available_gradients().

colorrange = automatic — The values representing the start and end points of colormap.

colorscale = identity — The color transform function. Can be any function, but only works well together with Colorbar for identity, log, log2, log10, sqrt, logit, Makie.pseudolog10, Makie.Symlog10, Makie.AsinhScale, Makie.SinhScale, Makie.LogScale, Makie.LuptonAsinhScale, and Makie.PowerScale.

cycle = [:color => :patchcolor] — Sets which attributes to cycle when creating multiple plots. The values to cycle through are defined by the parent Theme. Multiple cycled attributes can be set by passing a vector. Elements can

  • directly refer to a cycled attribute, e.g. :color

  • map a cycled attribute to a palette attribute, e.g. :linecolor => :color

  • map multiple cycled attributes to a palette attribute, e.g. [:linecolor, :markercolor] => :color

depth_shift = 0.0 — Adjusts the depth value of a plot after all other transformations, i.e. in clip space, where -1 <= depth <= 1. This only applies to GLMakie and WGLMakie and can be used to adjust render order (like a tunable overdraw).

direction = :y — Controls the direction of the bars. can be :y (height is vertical) or :x (height is horizontal).

dodge = automatic — Dodge can be used to separate bars drawn at the same position. For this each bar is given an integer value corresponding to its position relative to the given positions. E.g. with positions = [1, 1, 1, 2, 2, 2] we have 3 bars at each position which can be separated by dodge = [1, 2, 3, 1, 2, 3].

dodge_gap = 0.03 — Sets the gap between dodged bars relative to the size of the dodged bars.

fillto = automatic — Controls the baseline of the bars. This is zero in the default automatic case unless the barplot is in a log-scaled Axis. With a log scale, the automatic default is half the minimum value because zero is an invalid value for a log scale.

flip_labels_at = Inf — Sets a height value beyond which labels are drawn inside the bar instead of outside.

fxaa = true — Adjusts whether the plot is rendered with fxaa (fast approximate anti-aliasing, GLMakie only). Note that some plots implement a better native anti-aliasing solution (scatter, text, lines). For them fxaa = true generally lowers quality. Plots that show smoothly interpolated data (e.g. image, surface) may also degrade in quality as fxaa = true can cause blurring.

gap = 0.2 — The final width of the bars is calculated as w * (1 - gap) where w is the width of each bar as determined with the width attribute. When dodge is used the w corresponds to the width of undodged bars, making this control the gap between groups.

highclip = automatic — The color for any value above the colorrange.

inspectable = @inherit inspectable — Sets whether this plot should be seen by DataInspector. The default depends on the theme of the parent scene.

inspector_clear = automatic — Sets a callback function (inspector, plot) -> ... for cleaning up custom indicators in DataInspector.

inspector_hover = automatic — Sets a callback function (inspector, plot, index) -> ... which replaces the default show_data methods.

inspector_label = automatic — Sets a callback function (plot, index, position) -> string which replaces the default label generated by DataInspector.

label_align = automatic — Sets the text alignment of labels.

label_color = @inherit textcolor — Sets the color of labels.

label_font = @inherit font — The font of the bar labels.

label_formatter = bar_label_formatter — Formatting function which is applied to bar labels before they are passed on text()

label_offset = 5 — The distance of the labels from the bar ends in screen units. Does not apply when label_position = :center.

label_position = :end — The position of each bar's label relative to the bar. Possible values are :end or :center.

label_rotation = 0π — Sets the text rotation of labels in radians.

label_size = @inherit fontsize — The font size of the bar labels.

lowclip = automatic — The color for any value below the colorrange.

model = automatic — Sets a model matrix for the plot. This overrides adjustments made with translate!, rotate! and scale!.

n_dodge = automatic — Sets the maximum integer for dodge. This sets how many bars can be placed at a given position, controlling their width.

nan_color = :transparent — The color for NaN values.

offset = 0.0 — Offsets all bars by the given real value. Can also be set per-bar.

overdraw = false — Controls if the plot will draw over other plots. This specifically means ignoring depth checks in GL backends

space = :data — Sets the transformation space for box encompassing the plot. See Makie.spaces() for possible inputs.

ssao = false — Adjusts whether the plot is rendered with ssao (screen space ambient occlusion). Note that this only makes sense in 3D plots and is only applicable with fxaa = true.

stack = automatic — Similar to dodge, this allows bars at the same positions to be stacked by identifying their stack position with integers. E.g. with positions = [1, 1, 1, 2, 2, 2] each group of 3 bars can be stacked with stack = [1, 2, 3, 1, 2, 3].

strokecolor = @inherit patchstrokecolor — Sets the outline color of bars.

strokewidth = @inherit patchstrokewidth — Sets the outline linewidth of bars.

transformation = :automatic — Controls the inheritance or directly sets the transformations of a plot. Transformations include the transform function and model matrix as generated by translate!(...), scale!(...) and rotate!(...). They can be set directly by passing a Transformation() object or inherited from the parent plot or scene. Inheritance options include:

  • :automatic: Inherit transformations if the parent and child space is compatible

  • :inherit: Inherit transformations

  • :inherit_model: Inherit only model transformations

  • :inherit_transform_func: Inherit only the transform function

  • :nothing: Inherit neither, fully disconnecting the child's transformations from the parent

Another option is to pass arguments to the transform!() function which then get applied to the plot. For example transformation = (:xz, 1.0) which rotates the xy plane to the xz plane and translates by 1.0. For this inheritance defaults to :automatic but can also be set through e.g. (:nothing, (:xz, 1.0)).

transparency = false — Adjusts how the plot deals with transparency. In GLMakie transparency = true results in using Order Independent Transparency.

visible = true — Controls whether the plot gets rendered or not.

width = automatic — The gapless width of the bars. If automatic, the width w is calculated as minimum(diff(sort(unique(positions)))). The actual width of the bars is calculated as w * (1 - gap).


2.2 help_attributes() and Docstrings

If you do not want to browse Makie.jl’s documentation, a handy alternative is the help_attributes() function, which is available from any Makie.jl backend.

Let us show how it works, but first let’s load the default backend that we are using in these tutorials: CairoMakie.jl.

using CairoMakie

Here is an example with the barplot() plotting function:

help_attributes(barplot)
Available attributes for `Makie.BarPlot` are: 

alpha bar_labels clip_planes color color_over_background color_over_bar colormap colorrange colorscale cycle depth_shift direction dodge dodge_gap fillto flip_labels_at gap highclip inspectable inspector_clear inspector_hover inspector_label label_align label_color label_font label_formatter label_offset label_position label_rotation label_size lowclip n_dodge nan_color offset overdraw space ssao stack strokecolor strokewidth transparency visible width

We can see that BarPlot, when used inside visual(), has a lot of keyword arguments for us to customize our bar plots.

2.2.1 Docstrings from help or ?

We can also check the docstring of a specific plotting function, either by calling the help() function on it or by using the help mode of the Julia REPL:

julia> ?

help?> barplot

Note

Also don’t forget to check AoG.jl’s Documentation. The tutorial and gallery are nice sections that showcase several use cases and possible customizations.

3 🎨 visual() function

The visual() function from AoG.jl is the function with which we attach plotting objects to the mapping()s of our data().

The most important argument to visual() is the first positional argument: the plotting type. The keyword arguments that follow are the same as those accepted by the analogous Makie.jl plotting function.

For example, the barplot() plotting function from Makie.jl supports the width keyword argument. That translates to the following visual() function call in AoG.jl:

visual(BarPlot; width = ...)

Let’s show some of the available plotting types (geometries) to the visual() function. But first, we begin by loading AoG.jl, data wrangling libraries and the DataFrame we’ve used previously:

using PharmaDatasets
using DataFramesMeta
using AlgebraOfGraphics

df = dataset("demographics_1")
first(df, 5)
5×6 DataFrame
 Row  ID     AGE      WEIGHT   SCR      ISMALE  eGFR
      Int64  Float64  Float64  Float64  Int64   Float64
   1  1      34.823   38.212   1.1129   0       42.635
   2  2      32.765   74.838   0.8846   1       126.0
   3  3      35.974   37.303   1.1004   1       48.981
   4  4      38.206   32.969   1.1972   1       38.934
   5  5      33.559   47.139   1.5924   0       37.198

We will also convert some columns to categorical arrays with CategoricalArrays.jl:

Note

Don’t forget to check our Data Wrangling in Julia tutorial, Handling Factors and Categorical Data with CategoricalArrays.jl.

using CategoricalArrays
@transform! df :SEX = categorical(:ISMALE);
@transform! df :SEX = recode(:SEX, 0 => "female", 1 => "male");
@transform! df :WEIGHT_cat = cut(:WEIGHT, 2; labels = ["light", "heavy"])

3.1 BarPlot

Let’s begin with the bar plot. Here the plotting type is BarPlot and the plotting function is barplot. So, we just call visual() and pass BarPlot as the first argument followed by any desired keyword arguments supported by the plotting function barplot().

Here is an example with our dataset df. Notice that we need to first group the data with the @by macro and then apply the mean() function from Julia’s standard library Statistics module:

Note

There is an easier way to automatically perform grouping and summarizing in AoG.jl with statistical transformation functions. We will cover this in Plotting Statistical Visualizations with AlgebraOfGraphics.jl. Make sure to check it out.

using Statistics
data(@by df :SEX :AGE_MEAN = mean(:AGE)) * mapping(:SEX, :AGE_MEAN) * visual(BarPlot) |>
draw

We can customize the specified plotting object in visual() by adding supported keyword arguments.

If we would like to make our bars blue and a little narrower, we can use the color and width arguments:

data(@by df :SEX :AGE_MEAN = mean(:AGE)) *
mapping(:SEX, :AGE_MEAN) *
visual(BarPlot; color = :blue, width = 0.5) |> draw

Here is a more complex example using color and dodge for the column :WEIGHT_cat inside mapping():

data(@by df [:WEIGHT_cat, :SEX] :AGE_MEAN = mean(:AGE)) *
mapping(:SEX, :AGE_MEAN; color = :WEIGHT_cat, dodge = :WEIGHT_cat) *
visual(BarPlot) |> draw

Tip

Note that the color mapping will override the color keyword argument inside a visual() call. For custom colors, which we will cover in Customization of AlgebraOfGraphics.jl Plots, it is better to use the palette argument inside the draw[!]() function.

3.2 Lines

Lines creates a line plot with the specified data() and mapping()s.

It is analogous to ggplot2’s geom_line().

For the line plot, we will use some concentration-time pharmacokinetic data after oral administration. This kind of plot is known as a spaghetti plot.

Tip

Line plots implicitly indicate that each observation depends on the previous ones. This dependence makes line plots perfect for time series data and other time-dependent visualizations. But for data that have no time dependency, or any other x-axis dependency, line plots might suggest a relationship that the visualization is not meant to convey.

pk = dataset("pumas_tutorials/po_sd_1")
first(pk, 5)
5×9 DataFrame
 Row  id     time     cp        dv        amt       evid   cmt    rate     dosegrp
      Int64  Float64  Float64?  Float64?  Float64?  Int64  Int64  Float64  Int64
   1  1      0.0      missing   missing   10.0      1      1      0.0      10
   2  1      0.25     20.2592   22.6353   missing   0      2      0.0      10
   3  1      0.5      36.8068   16.5712   missing   0      2      0.0      10
   4  1      0.75     50.2838   60.8928   missing   0      2      0.0      10
   5  1      1.0      61.2211   46.8858   missing   0      2      0.0      10

Here is a simple plot of the PK data for the first 10 subjects, using the positional x-axis and y-axis arguments and the color keyword argument of mapping().

Note

We are removing missing values from the pk dataset and keeping only the first 10 subjects (ids) so that the legend does not overflow.

dropmissing!(pk, :cp);
pk_ids = @rsubset(pk, :id <= 10);

Note

We are using nonnumeric() inside mapping() to tell AoG.jl that the column :id, despite being an integer column, should be treated as discrete/categorical, i.e. non-numeric.

This will be covered in Customization of AlgebraOfGraphics.jl Plots.

data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(Lines; alpha = 0.5) |> draw

Tip

To draw this visualization without the legend, you can call the mutating function draw!(). Whereas draw() automatically adds colorbars and legends, draw!() does not. Colorbar and legend, should they be necessary, can be added separately to the visualization with the colorbar!() and legend!() helper functions.

We’ll cover customizations in Customization of AlgebraOfGraphics.jl Plots. Don’t forget to check it out.

Here’s what the code would look like without the legend:

fig = Figure()
plt = data(pk) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Lines)
draw!(fig, plt)
fig
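
If the legend is wanted after all, it can be placed by hand. The following is a minimal sketch of that pattern with a small made-up table (toy and its values are hypothetical, not the pk data), assuming AlgebraOfGraphics.jl and CairoMakie.jl are installed:

```julia
using AlgebraOfGraphics
using CairoMakie

# Toy concentration-time table for two hypothetical subjects
toy = (;
    time = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0],
    cp = [0.0, 8.0, 5.0, 0.0, 6.0, 4.0],
    id = [1, 1, 1, 2, 2, 2],
)

plt = data(toy) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Lines)

fig = Figure()
grid = draw!(fig[1, 1], plt)   # draws the layers only, without a legend
legend!(fig[1, 2], grid)       # adds the legend in a separate grid position
fig
```

Placing the legend in its own grid cell keeps its position under your control, which is the main reason to prefer draw!() over draw() in composite figures.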

3.3 Errorbars

Errorbars creates vertical interval lines, commonly used to represent data variability or uncertainty.

It is analogous to ggplot2’s geom_errorbar().

Let’s use the same example as before, but this time we are interested in finding the mean concentration-time profile for each dose level. We will also use the standard deviation as our measure of variability.

Note

We are using the @by macro again to group the data by dosegrp and time. Also, we will use the mean() function to calculate the mean concentration and the std() function to determine the standard deviation. These functions are available in the Statistics module in Julia’s standard library.

Don’t forget to check our Data Wrangling in Julia tutorial, Manipulating Tables with DataFramesMeta.jl, for a more in-depth explanation of DataFramesMeta.jl’s macros.

pk_error = @by pk [:dosegrp, :time] begin
    :Cmean = mean(:cp)
    :Cstd = std(:cp)
end
first(pk_error, 5)
5×4 DataFrame
 Row  dosegrp  time     Cmean    Cstd
      Int64    Float64  Float64  Float64
   1  10       0.25     31.3022  13.3897
   2  10       0.5      54.484   21.9751
   3  10       0.75     71.6505  27.4859
   4  10       1.0      84.3257  30.9806
   5  10       2.0      108.478  34.991

Now we can plot the error bars. In this case, the mapping function will take three positional arguments: x position (time), y position (mean concentration), and the error bar length (standard deviation):

data(pk_error) *
mapping(:time, :Cmean, :Cstd, color = :dosegrp => nonnumeric) *
visual(Errorbars) |> draw
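
To make the third positional argument concrete: with a single error value, the bar is drawn symmetrically around the y position, so the standard deviation is the length on each side rather than the total length. A minimal sketch with made-up concentrations at one time point (the values are hypothetical), using only the Statistics standard library:

```julia
using Statistics

# Hypothetical concentrations observed at a single time point (toy values)
cp = [20.0, 30.0, 40.0, 50.0]

cmean = mean(cp)   # centre of the error bar
cstd = std(cp)     # length drawn on each side of the centre

# End points of the interval that a symmetric error bar would span
lower = cmean - cstd
upper = cmean + cstd
println((; cmean, cstd, lower, upper))
```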

Tip

Notice that Errorbars only generates the vertical interval lines. It is a common practice to show error bars together with other data visualization methods, such as bar and line plots. You can achieve this by adding two layers with the + operator, which would look like this for a line plot:

data(pk_error) *
mapping(:time, :Cmean, :Cstd, color = :dosegrp => nonnumeric) *
(visual(Errorbars) + visual(Lines)) |> draw

You can learn more about combining layers with the + operator by checking our tutorial on Grammar of Graphics with AlgebraOfGraphics.jl.
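
An equivalent, more explicit formulation gives each layer only the positional arguments it needs: the error bars receive the standard deviation, while the line receives just x and y. This is a sketch with a small made-up table (toy and its values are hypothetical stand-ins for pk_error), assuming AlgebraOfGraphics.jl and CairoMakie.jl are installed:

```julia
using AlgebraOfGraphics
using CairoMakie

# Toy summary table shaped like pk_error (hypothetical values)
toy = (;
    time = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    Cmean = [10.0, 14.0, 9.0, 20.0, 27.0, 18.0],
    Cstd = [2.0, 3.0, 1.5, 4.0, 5.0, 3.0],
    dosegrp = [10, 10, 10, 20, 20, 20],
)

# Each layer declares its own positional mappings
bars = mapping(:time, :Cmean, :Cstd) * visual(Errorbars)
line = mapping(:time, :Cmean) * visual(Lines)

# The shared data and color mapping are distributed over both layers
plt = data(toy) * mapping(color = :dosegrp => nonnumeric) * (bars + line)
fig = draw(plt)
```

Because * distributes over +, the data and the color mapping are written once and apply to both layers.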

3.4 Scatter

Scatter is the plotting type for scatter plots and is analogous to ggplot2’s geom_point():

data(pk_ids) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Scatter) |> draw

Scatter has some interesting keyword arguments, which you can list by typing help_attributes(scatter) or ?scatter in a Julia REPL. For example, you can choose a marker type and markersize:

data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(Scatter; marker = '+', markersize = 25, alpha = 0.5) |> draw

3.5 ScatterLines

ScatterLines is the fusion of Scatter and Lines. So every keyword argument from both of them will be available.

It is similar to combining the following geometries in ggplot2: geom_line() + geom_point().

Let’s plot our previous Lines example using ScatterLines:

data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(ScatterLines; alpha = 0.5) |> draw

3.6 BoxPlot

Box plots are the statistician’s favorite plots. Here is a simple box plot using BoxPlot inside visual():

data(df) * mapping(:SEX, :AGE) * visual(BoxPlot) |> draw

You can use some keyword arguments for BoxPlot, such as:

  • show_notch: whether or not to have a notch near the median.
  • range: the multiple of the inter-quartile range (IQR) that controls the whisker length, default 1.5.
  • whiskerwidth: if you want to have a small horizontal end in the whiskers relative to the box width.
  • show_outliers: whether or not to show outliers as points, default true.

data(df) *
mapping(:SEX, :AGE) *
visual(BoxPlot; show_notch = true, range = 1, whiskerwidth = 0.25) |> draw
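
To see what range controls, the whisker limits can be computed by hand. This is a minimal sketch using only the Statistics standard library and assuming the usual Tukey rule (points farther from the quartiles than range × IQR are flagged as outliers). The data vector and the helper whisker_limits() are hypothetical, and Makie.jl’s internals may differ in detail:

```julia
using Statistics

# Hypothetical helper: limits beyond which points count as outliers
function whisker_limits(x; multiple = 1.5)
    q1, q3 = quantile(x, [0.25, 0.75])
    iqr = q3 - q1
    return (lower = q1 - multiple * iqr, upper = q3 + multiple * iqr)
end

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]   # toy data with one extreme value

limits = whisker_limits(values)                # default multiple of 1.5
narrow = whisker_limits(values; multiple = 1)  # corresponds to range = 1

# Points outside the default limits would be drawn as outliers
outliers = filter(v -> v < limits.lower || v > limits.upper, values)
println(limits, " ", narrow, " ", outliers)
```

A smaller multiple tightens the limits, so more points can fall outside the whiskers and be drawn as outliers.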

3.7 Violin

Violin plots are also popular and a good alternative to box plots. Instead of being based on the median, quartiles and IQR, violin plots display the estimated probability density of the underlying values (using a kernel density estimator).

Here is the same box plot example, but now using a Violin inside visual(). It shows much more visual information than the box plot:

data(df) * mapping(:SEX, :AGE) * visual(Violin) |> draw

As with BoxPlot, Violin has some interesting keyword arguments. The most important is show_median, which controls whether or not the median is shown inside the violins:

data(df) * mapping(:SEX, :AGE) * visual(Violin; show_median = true) |> draw

Violin also accepts a side argument inside mapping(), which splits each violin into two halves: left and right.

If you pair side with color you can convey more information in your violin plots:

data(df) *
mapping(:SEX, :AGE; side = :WEIGHT_cat, color = :WEIGHT_cat) *
visual(Violin; show_median = true) |> draw

Note

We will cover more geometries and plotting types, which are frequently paired with statistical transformations, in Plotting Statistical Visualizations with AlgebraOfGraphics.jl.