Plotting Different Geometries with `AlgebraOfGraphics.jl`

Authors

Jose Storopoli

Juan Oneto

In this tutorial, we will explore how to make different kinds of plots (also called geometries or geoms in `ggplot2`) with `AoG.jl`. First, we’ll discuss how to navigate the `AoG.jl` and `Makie.jl` documentation. Then, we’ll show the most common plotting types in `AoG.jl`:

1. `BarPlot`
2. `Lines`
3. `Errorbars`
4. `Scatter`
5. `BoxPlot`
6. `Violin`
7. `Contour`
8. `Heatmap`
Note

Some important visualizations are missing from this tutorial, namely the statistical visualizations. They are covered in Plotting Statistical Visualizations with `AlgebraOfGraphics.jl`. Don’t forget to check it out.

1 📋 `geom_*()` - `AoG.jl` Table

The following table maps `ggplot2`’s `geom_*()` functions to `AoG.jl`’s plotting functions:

| `ggplot2` | `AoG.jl` |
| --- | --- |
| `geom_col()` | `visual(BarPlot)` |
| `geom_point()` | `visual(Scatter)` |
| `geom_line()` | `visual(Lines)` |
| `geom_errorbar()` | `visual(Errorbars)` |
| `geom_boxplot()` | `visual(BoxPlot)` |
| `geom_violin()` | `visual(Violin)` |
| `geom_label()` | `visual(Annotations)` |
| `geom_text()` | `visual(Annotations)` |
| `geom_contour()` | `visual(Contour)` |
| `geom_tile()` | `visual(Heatmap)` |
| `geom_bar()` | `frequency()` |
| `geom_histogram()` | `histogram()` |
| `geom_density()` | `density()` |
| `geom_smooth()` | `smooth()` |
| `geom_smooth(method = "lm")` | `linear()` |
| `geom_area()` | `linesfill()` |

2 🆘 How to Find Available Plotting Functions?

As you probably know, `AoG.jl` uses `Makie.jl` as the plotting engine for all visualizations. This has the consequence that all possible plotting objects (geometries) in `AoG.jl` are actually plotting types in `Makie.jl`.

So, for example, to plot a bar plot in `AoG.jl` you would have to call the `BarPlot` type from `Makie.jl`.

In `AoG.jl` you use the `visual()` function and then pass the desired `Makie.jl` plotting type along with all desired keyword arguments. So, the bar plot would be the following call for `visual()`:

``````plt = data(...) * mapping(...) * visual(BarPlot; ...)

draw(plt)``````
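
As a minimal sketch of this pattern with the placeholders filled in, the example below draws a bar plot from a small in-memory table. The table `toy` and its column names `group` and `value` are hypothetical and only serve as an illustration; `data()` should accept any Tables.jl-compatible table, such as a named tuple of vectors.

```julia
using AlgebraOfGraphics
using CairoMakie

# Hypothetical toy table (not part of the tutorial's datasets)
toy = (; group = ["a", "b", "c"], value = [1.0, 3.0, 2.0])

# data * mapping * visual: x-axis is `group`, y-axis (bar height) is `value`
plt = data(toy) * mapping(:group, :value) * visual(BarPlot)

# `draw()` renders the specification into a figure
fg = draw(plt)
```

Keyword arguments supported by the plotting type go after the semicolon, for example `visual(BarPlot; width = 0.5)`.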

There are two main ways to browse and obtain information regarding plotting types and custom arguments:

1. `Makie.jl` Documentation: this is very useful, and even `AoG.jl`’s documentation redirects to it.
2. `help_attributes()` function and Docstrings: you can also get information from the Julia REPL (terminal) with the `help_attributes()` function and by viewing the help information for `Makie.jl`’s plotting functions.

2.1 Makie Documentation

`Makie.jl`’s documentation contains a rich description of the plotting functions. We encourage you to browse it and use the examples to learn the options available for every plotting type.

Caution

Note that `Makie.jl`’s plotting functions are all lowercase, following Julia’s naming convention for functions. `AoG.jl`, by contrast, uses plotting types, which are all written in TitleCase, following the naming convention for types.

If you try to pass a plotting function to `visual()`, you’ll get an error. Instead, you need to pass the plotting type.

For instance, this will error:

``visual(barplot)``

The correct way is:

``visual(BarPlot)``

Just remember to convert plotting functions to their corresponding plotting types when you pass them to the `visual()` function.
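
The distinction between plotting functions and plotting types can be checked directly in Julia. The snippet below is a small sketch that only inspects the two names exported by `CairoMakie.jl`; the variable names are hypothetical.

```julia
using CairoMakie

# `barplot` is the lowercase plotting function
barplot_is_function = barplot isa Function

# `BarPlot` is the TitleCase plotting type that `visual()` expects
BarPlot_is_type = BarPlot isa Type

# A type is not a function, which is why the two are not interchangeable
BarPlot_is_function = BarPlot isa Function

println((barplot_is_function, BarPlot_is_type, BarPlot_is_function))
```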

Note

This is the online documentation for the `barplot()` function from `Makie.jl`:

``barplot(x, y; kwargs...)``

Plots a barplot; `y` defines the height. `x` and `y` should be 1 dimensional. Bar width is determined by the attribute `width`, shrunk by `gap` in the following way: `width -> width * (1 - gap)`.

Attributes

Available attributes and their defaults for `MakieCore.Combined{Makie.barplot}` are:

``````  bar_labels             "nothing"
color                  RGBA{Float32}(0.0f0,0.0f0,0.0f0,0.6f0)
color_over_background  MakieCore.Automatic()
color_over_bar         MakieCore.Automatic()
colormap               :viridis
colorrange             MakieCore.Automatic()
cycle                  [:color => :patchcolor]
direction              :y
dodge                  MakieCore.Automatic()
dodge_gap              0.03
fillto                 MakieCore.Automatic()
flip_labels_at         Inf
gap                    0.2
highclip               MakieCore.Automatic()
inspectable            true
label_color            :black
label_font             :regular
label_formatter        Makie.bar_label_formatter
label_offset           5
label_rotation         0.0
label_size             20
lowclip                MakieCore.Automatic()
marker                 GeometryBasics.HyperRectangle
n_dodge                MakieCore.Automatic()
nan_color              :transparent
offset                 0.0
stack                  MakieCore.Automatic()
strokecolor            :black
strokewidth            0
transparency           false
visible                true
width                  MakieCore.Automatic()``````

2.2 `help_attributes()` and Docstrings

If you do not want to browse `Makie.jl`’s documentation, a handy alternative is the `help_attributes()` function, which is available from any `Makie.jl` backend.

Let us show how it works, but first let’s load the default backend that we are using in these tutorials: `CairoMakie.jl`.

``using CairoMakie``

Here is an example with the `barplot()` plotting function:

``help_attributes(barplot)``
``````Available attributes for `MakieCore.Combined{Makie.barplot}` are:

bar_labels color color_over_background color_over_bar colormap colorrange cycle direction dodge dodge_gap fillto flip_labels_at gap highclip inspectable label_color label_font label_formatter label_offset label_rotation label_size lowclip marker n_dodge nan_color offset stack strokecolor strokewidth transparency visible width``````

We can see that `BarPlot`, when used inside `visual()`, has a lot of keyword arguments for us to customize our bar plots.

2.2.1 Docstrings from `help` or `?`

We can also check the docstrings from a specific plotting function by calling either the `help()` function on it or by using the help mode of the Julia REPL:

``````julia> ?

help?> barplot``````
Note

Also, don’t forget to check `AoG.jl`’s Documentation. The tutorial and gallery are nice sections that showcase several use cases and possible customizations.

3 🎨 `visual()` function

The `visual()` function from `AoG.jl` is how we assign plotting objects to the `mapping()`s of our `data()`.

The most important argument to `visual()` is the first positional argument: the plotting type. The keyword arguments that follow are the same ones accepted by the analogous `Makie.jl` plotting function.

For example, the `barplot()` plotting function from `Makie.jl` supports the `width` keyword argument. That translates to the following `visual()` call in `AoG.jl`:

``visual(BarPlot; width = ...)``

Let’s show some of the plotting types (geometries) available to the `visual()` function. But first, we begin by loading `AoG.jl`, the data wrangling libraries, and the `DataFrame` we’ve used previously:

``````using PharmaDatasets
using DataFramesMeta
using AlgebraOfGraphics

df = dataset("demographics_1")
first(df, 5)``````
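5×6 DataFrame

| Row | ID | AGE | WEIGHT | SCR | ISMALE | eGFR |
| --- | --- | --- | --- | --- | --- | --- |
| | Int64 | Float64 | Float64 | Float64 | Int64 | Float64 |
| 1 | 1 | 34.823 | 38.212 | 1.1129 | 0 | 42.635 |
| 2 | 2 | 32.765 | 74.838 | 0.8846 | 1 | 126.0 |
| 3 | 3 | 35.974 | 37.303 | 1.1004 | 1 | 48.981 |
| 4 | 4 | 38.206 | 32.969 | 1.1972 | 1 | 38.934 |
| 5 | 5 | 33.559 | 47.139 | 1.5924 | 0 | 37.198 |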

We will also transform some columns into `CategoricalArray`s:

Note

Don’t forget to check our Data Wrangling in Julia tutorial Handling Factors and Categorical Data with `CategoricalArrays.jl`.

``````using CategoricalArrays
@transform! df :SEX = categorical(:ISMALE);
@transform! df :SEX = recode(:SEX, 0 => "female", 1 => "male");
@transform! df :WEIGHT_cat = cut(:WEIGHT, 2; labels = ["light", "heavy"])``````
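
To make the last transformation more concrete, here is a small sketch of `cut()` on a plain vector, outside of any `DataFrame`. The vector `weights` is hypothetical. Passing `2` as the second argument asks `cut()` to split the values into two quantile-based groups, which are then named by `labels`.

```julia
using CategoricalArrays

# Hypothetical weights (not taken from the tutorial's dataset)
weights = [38.2, 74.8, 37.3, 33.0, 47.1]

# Split into two quantile-based groups and name them
weight_cat = cut(weights, 2; labels = ["light", "heavy"])

# The result is a categorical array with ordered levels
println(levels(weight_cat))
```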

3.1 `BarPlot`

Let’s begin with the bar plot. Here the plotting type is `BarPlot` and the plotting function is `barplot`. So, we just call `visual()` and pass `BarPlot` as the first argument, followed by any desired keyword arguments supported by the plotting function `barplot()`.

Here is an example with our dataset `df`. Notice that we first need to group the data with the `@by` macro and then apply the `mean()` function from the `Statistics` module in Julia’s standard library:

Note

There is an easier way to automatically perform grouping and summarizing in `AoG.jl` with statistical transformation functions. We will cover this in Plotting Statistical Visualizations with `AlgebraOfGraphics.jl`. Make sure to check it out.

``using Statistics``
``````data(@by df :SEX :AGE_MEAN = mean(:AGE)) * mapping(:SEX, :AGE_MEAN) * visual(BarPlot) |>
draw``````
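
For readers who want to see what the grouping step produces on its own, the sketch below performs the same kind of group-and-summarize operation with plain `DataFrames.jl` functions instead of the `@by` macro. The `toy` table and the name `age_by_sex` are hypothetical and only serve as an illustration.

```julia
using DataFrames
using Statistics

# Hypothetical toy data with the same column names as `df`
toy = DataFrame(SEX = ["female", "male", "female", "male"], AGE = [30.0, 40.0, 50.0, 20.0])

# Group by :SEX and compute the mean of :AGE in each group,
# storing the result in a new column called :AGE_MEAN
age_by_sex = combine(groupby(toy, :SEX), :AGE => mean => :AGE_MEAN)
```

The result has one row per group, which is the shape `BarPlot` needs: one x position and one bar height per row.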

We can customize the specified plotting object in `visual()` by adding supported keyword arguments.

If we would like to make our bars blue and a bit narrower, we can use the `color` and `width` arguments:

``````data(@by df :SEX :AGE_MEAN = mean(:AGE)) *
mapping(:SEX, :AGE_MEAN) *
visual(BarPlot; color = :blue, width = 0.5) |> draw``````

Here is a more complex example using `color` and `dodge` for the column `:WEIGHT_cat` inside `mapping()`:

``````data(@by df [:WEIGHT_cat, :SEX] :AGE_MEAN = mean(:AGE)) *
mapping(:SEX, :AGE_MEAN; color = :WEIGHT_cat, dodge = :WEIGHT_cat) *
visual(BarPlot) |> draw``````
Tip

Note that the `color` mapping will override the `color` keyword argument inside a `visual()` call. For custom colors, which we will cover in Customization of `AlgebraOfGraphics.jl` Plots, it is better to use the `palette` argument inside `draw[!]()` function.

3.2 `Lines`

`Lines` creates a line plot with the specified `data()` and `mapping()`s.

It is analogous to `ggplot2`’s `geom_line()`.

For the line plot, we will use some concentration-time pharmacokinetic data after oral administration. This kind of plot is known as a spaghetti plot.

Tip

Line plots implicitly indicate a dependence of an observation with previous ones. This dependence makes line plots perfect for time series data and other time-dependent visualizations. But for data that do not have a time-dependency, or any other x-axis dependency, line plots might convey an intuition that is not the objective of the visualization.

``````pk = dataset("pumas_tutorials/po_sd_1")
first(pk, 5)``````
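5×9 DataFrame

| Row | id | time | cp | dv | amt | evid | cmt | rate | dosegrp |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | Int64 | Float64 | Float64? | Float64? | Float64? | Int64 | Int64 | Float64 | Int64 |
| 1 | 1 | 0.0 | missing | missing | 10.0 | 1 | 1 | 0.0 | 10 |
| 2 | 1 | 0.25 | 20.2592 | 22.6353 | missing | 0 | 2 | 0.0 | 10 |
| 3 | 1 | 0.5 | 36.8068 | 16.5712 | missing | 0 | 2 | 0.0 | 10 |
| 4 | 1 | 0.75 | 50.2838 | 60.8928 | missing | 0 | 2 | 0.0 | 10 |
| 5 | 1 | 1.0 | 61.2211 | 46.8858 | missing | 0 | 2 | 0.0 | 10 |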

Here is a simple plot of the PK data using the positional x-axis and y-axis arguments and the `color` keyword argument of `mapping()`. We reduce the dataset to the first 10 ids so our plot’s legend doesn’t overflow.

Note

We are removing rows with `missing` values in the `cp` column from the `pk` dataset and then keeping only the first 10 subjects (ids) so that the legend does not overflow.

``````dropmissing!(pk, :cp);
pk_ids = @rsubset(pk, :id <= 10);``````
Note

We are using `nonnumeric()` inside `mapping()` to tell `AoG.jl` that the column `:id`, despite being an integer column, should be treated as discrete/categorical, i.e. non-numeric.

This will be covered in Customization of `AlgebraOfGraphics.jl` Plots.

``````data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(Lines; alpha = 0.5) |> draw``````
Tip

To draw this visualization without the legend, you can call the mutating function `draw!()`. Whereas `draw()` automatically adds colorbars and legends, `draw!()` does not. Colorbar and legend, should they be necessary, can be added separately to the visualization with the `colorbar!()` and `legend!()` helper functions.

We’ll cover customizations in Customization of `AlgebraOfGraphics.jl` Plots. Don’t forget to check it out.

Here’s what the code would look like without the legend:

``````fig = Figure()
plt = data(pk) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Lines)
draw!(fig, plt)
fig``````
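
If the legend is wanted after all, it can be added by hand, as the tip above mentions. The following is a sketch under a few assumptions: the `toy` table is hypothetical, the plot is drawn into the layout cell `fig[1, 1]`, and the value returned by `draw!()` is passed on to `legend!()`. Exact signatures may differ between `AoG.jl` versions, so check the documentation of your installed version.

```julia
using AlgebraOfGraphics
using CairoMakie

# Hypothetical toy concentration-time data for two subjects
toy = (;
    time = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0],
    cp = [0.0, 5.0, 3.0, 0.0, 8.0, 4.0],
    id = [1, 1, 1, 2, 2, 2],
)

plt = data(toy) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Lines)

fig = Figure()
grid = draw!(fig[1, 1], plt)  # draws the axes only, without a legend
legend!(fig[1, 2], grid)      # places the legend in its own layout cell
fig
```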

3.3 `Errorbars`

`Errorbars` creates vertical interval lines, commonly used to represent data variability or uncertainty.

It is analogous to `ggplot2`’s `geom_errorbar()`.

Let’s use the same example as before, but this time we are interested in finding the mean concentration-time profile for each dose level. We will also use the standard deviation as our measure of variability.

Note

We are using the `@by` macro again to group the data by `dosegrp` and `time`. We also use the `mean()` function to calculate the mean concentration and the `std()` function to calculate the standard deviation. These functions are available in the `Statistics` module in Julia’s standard library.

Don’t forget to check our Data Wrangling in Julia tutorial Manipulating Tables with `DataFramesMeta.jl` for a more in-depth explanation of `DataFramesMeta.jl`’s macros.

``````pk_error = @by pk [:dosegrp, :time] begin
:Cmean = mean(:cp)
:Cstd = std(:cp)
end
first(pk_error, 5)``````
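5×4 DataFrame

| Row | dosegrp | time | Cmean | Cstd |
| --- | --- | --- | --- | --- |
| | Int64 | Float64 | Float64 | Float64 |
| 1 | 10 | 0.25 | 31.3022 | 13.3897 |
| 2 | 10 | 0.5 | 54.484 | 21.9751 |
| 3 | 10 | 0.75 | 71.6505 | 27.4859 |
| 4 | 10 | 1.0 | 84.3257 | 30.9806 |
| 5 | 10 | 2.0 | 108.478 | 34.991 |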

Now we can plot the error bars. In this case, the `mapping` function will take three positional arguments: `x` position (time), `y` position (mean concentration), and the error bar length (standard deviation):

``````data(pk_error) *
mapping(:time, :Cmean, :Cstd, color = :dosegrp => nonnumeric) *
visual(Errorbars) |> draw``````
Tip

Notice that `Errorbars` only generates the vertical interval lines. It is a common practice to show error bars together with other data visualization methods, such as bar and line plots. You can achieve this by adding two layers with the `+` operator, which would look like this for a line plot:

``````data(pk_error) *
mapping(:time, :Cmean, :Cstd, color = :dosegrp => nonnumeric) *
(visual(Errorbars) + visual(Lines)) |> draw``````

You can learn more about combining layers with the `+` operator in our tutorial on Grammar of Graphics with AlgebraOfGraphics.jl.
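
A possible variant of the layered plot gives each layer its own `mapping()`, so that `Lines` only receives the x and y positions while `Errorbars` additionally receives the error length. This is a sketch with a hypothetical `toy` table and hypothetical layer names, not the approach used in the tutorial above.

```julia
using AlgebraOfGraphics
using CairoMakie

# Hypothetical summary table: mean and standard deviation per time point
toy = (;
    time = [1.0, 2.0, 3.0, 4.0],
    cmean = [10.0, 20.0, 15.0, 8.0],
    cstd = [2.0, 4.0, 3.0, 1.5],
)

# Each layer carries only the positional arguments its plotting type needs
error_layer = mapping(:time, :cmean, :cstd) * visual(Errorbars)  # x, y, error length
line_layer = mapping(:time, :cmean) * visual(Lines)              # x, y

# `data(toy)` is shared by both layers through the `*` operator
plt = data(toy) * (error_layer + line_layer)
fg = draw(plt)
```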

3.4 `Scatter`

`Scatter` is the plotting type for scatter plots and is analogous to `ggplot2`’s `geom_point()`:

``data(pk_ids) * mapping(:time, :cp; color = :id => nonnumeric) * visual(Scatter) |> draw``

`Scatter` has some interesting keyword arguments, which you can list by typing `help_attributes(scatter)` or `?scatter` in a Julia REPL. For example, you can choose a `marker` type and a `markersize`:

``````data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(Scatter; marker = '+', markersize = 25, alpha = 0.5) |> draw``````

3.5 `ScatterLines`

`ScatterLines` is the fusion of `Scatter` and `Lines`. So every keyword argument from both of them will be available.

It is similar to applying the following geometries in `ggplot2`: `geom_line() + geom_point()`.

Let’s plot our previous `Lines` example using `ScatterLines`:

``````data(pk_ids) *
mapping(:time, :cp; color = :id => nonnumeric) *
visual(ScatterLines; alpha = 0.5) |> draw``````
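
For comparison, a similar picture can be built by layering `Lines` and `Scatter` with the `+` operator, which mirrors `ggplot2`’s `geom_line() + geom_point()` more literally. The sketch below uses a hypothetical `toy` table and hypothetical variable names to show both specifications side by side.

```julia
using AlgebraOfGraphics
using CairoMakie

# Hypothetical toy concentration-time data for a single subject
toy = (; time = [0.0, 1.0, 2.0, 3.0], cp = [0.0, 6.0, 4.0, 2.0])

shared = data(toy) * mapping(:time, :cp)

# One plotting type that draws both markers and lines
fused = shared * visual(ScatterLines)

# Two separate layers combined with `+`
layered = shared * (visual(Lines) + visual(Scatter))

fg_fused = draw(fused)
fg_layered = draw(layered)
```

The layered version lets each layer take its own keyword arguments, while `ScatterLines` keeps the specification to a single `visual()` call.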

3.6 `BoxPlot`

Box plots are the statistician’s favorite plots. Here is a simple box plot using `BoxPlot` inside `visual()`:

``data(df) * mapping(:SEX, :AGE) * visual(BoxPlot) |> draw``

You can use some keyword arguments for `BoxPlot`, such as:

• `show_notch`: whether or not to have a notch near the median.
• `range`: the multiple of the inter-quartile range (IQR) that determines how far the whiskers extend, default `1.5`.
• `whiskerwidth`: the width of the small horizontal caps at the ends of the whiskers, relative to the box width.
• `show_outliers`: whether or not to show outliers as points, default `true`.
``````data(df) *
mapping(:SEX, :AGE) *
visual(BoxPlot; show_notch = true, range = 1, whiskerwidth = 0.25) |> draw``````
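
To build some intuition for the `range` keyword, the sketch below computes the usual box plot fences by hand with the `Statistics` standard library. The `ages` vector and the helper `fences()` are hypothetical, and the calculation follows the common "quartile plus or minus a multiple of the IQR" rule; the exact computation inside `Makie.jl` may differ in its details.

```julia
using Statistics

# Hypothetical ages with one large and one extreme value
ages = [21.0, 25.0, 30.0, 32.0, 35.0, 38.0, 41.0, 55.0, 90.0]

q1, q3 = quantile(ages, [0.25, 0.75])
iqr = q3 - q1

# Lower and upper fences for a given IQR multiplier `k`
fences(k) = (q1 - k * iqr, q3 + k * iqr)

# Values outside the fences are treated as outliers
outliers(k) = filter(a -> a < fences(k)[1] || a > fences(k)[2], ages)

lo, hi = fences(1.5)
outliers_default = outliers(1.5)  # multiplier 1.5, the default mentioned above
outliers_narrow = outliers(1.0)   # multiplier 1, as in the example above
```

A smaller multiplier narrows the fences, so more points fall outside them and are drawn as outliers.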

3.7 `Violin`

Violin plots are also popular and a good alternative to box plots. Instead of being based on the median, quartiles, and IQR, violin plots display an estimate of the probability density of the underlying values (using a kernel density estimator).

Here is the same box plot example, but now using `Violin` inside `visual()`. It shows much more visual information than the box plot:

``data(df) * mapping(:SEX, :AGE) * visual(Violin) |> draw``

As with `BoxPlot`, `Violin` has some interesting keyword arguments. The most important is `show_median`, which controls whether or not the median is shown inside the violins:

``data(df) * mapping(:SEX, :AGE) * visual(Violin; show_median = true) |> draw``

`Violin` also accepts a `side` inside `mapping()` which breaks the violin plot into two sides: left and right.

If you pair `side` with `color` you can convey more information in your violin plots:

``````data(df) *
mapping(:SEX, :AGE; side = :WEIGHT_cat, color = :WEIGHT_cat) *
visual(Violin; show_median = true) |> draw``````
Note

We will cover more geometries and plotting types in Plotting Statistical Visualizations with `AlgebraOfGraphics.jl` which are frequently paired with statistical visualizations.