using NCA
using NCAUtilities
using PumasUtilities
Plot customization for Noncompartmental analysis - Multiple Ascending Dose
1 Libraries
We will load the NCA
, NCAUtilities
, and PumasUtilities
packages in order to have access to Pumas’ core and utilities functions for Noncompartmental analysis:
We will also use Makie.jl
as the main plotting library and also AlgebraOfGraphics.jl
.
CairoMakie.jl
will act as our plotting backend and will also make possible the various customization options that we will go over. On the other hand, AlgebraOfGraphics.jl
will provide a syntax inspired by the grammar of graphics and similar to R’s ggplot2
to create complex plots.
using CairoMakie
using AlgebraOfGraphics
Lastly, we will use the CSV.jl
package to read external data and DataFramesMeta.jl
for data wrangling.
using CSV
using DataFrames
using DataFramesMeta
2 Introduction
We will continue with our series on plot customization for noncompartmental analysis, where we aim to show you how to customize some of the plots available in the Pumas NCA workflow and create unique plots and layouts using Pumas
and Makie.jl
.
2.1 🔙 Quick reminder
Make sure to check our tutorial on Plot customization for Noncompartmental analysis - Single Ascending Dose for a more detailed discussion on the motivation of this series of tutorials
We are trying to apply a consistent color palette across the plots generated in our NCA workflow. Our color palette is comprised of five colors, which we divide into two main colors and three complementary colors:
2.1.0.1 Main colors
2.1.0.1.1 Complementary colors
These will be used for highlighting and more complex plots
Here’s a summary of our palette containing the names and hex codes for each color:
- Navy Blue:
#0B1E3F
- Emerald Green:
#50C878
- Coral:
#FF7F50
- Mustard yellow:
#FFDB58
- Deep purple:
#6A5ACD
Makie
uses Colors.jl
to define colors. This means that you can pass a string representing any [CSS color] (https://developer.mozilla.org/en-US/docs/Web/CSS/color) to Pumas
and Makie
’s plotting functions.
For instance, you could define the color red with "rgb(255,0,0)"
, "#FF0000"
, or "red"
. You can also use one of the named colors from Colors.jl
by passing them as a Symbol
(e.g :navyblue
)
We will store the colors in a Vector
so that we can access them later more easily. Let’s begin by defining the main and complementary colors. Notice that we are creating strings with the hex codes from above:
= ["#0B1E3F", "#50C878"];
main_colors = ["#FF7F50", "#FFDB58", "#6A5ACD"]; complementary
We can join both Vector
s of colors using vcat
, which will concatenate main_colors
and complementary_colors
to create a single color palette Vector
.
= vcat(complementary, main_colors); palette
We would also like to use a shade of gray for the background color instead of the default white color
= "#FAF9F6"; background
3 Load Data
We will be using simulated data for an orally administered drug through multiple doses ranging from 100 to 1,600 mg.
This tutorial is based on Alison Margolskee’s PK - Multiple Ascending Dose workflow. You can get the dataset used in this tutorial here.
Read the data using CSV.jl
and convert it to a DataFrame
= CSV.read("Multiple_Ascending_Dose_Dataset2.csv", DataFrame; missingstring = ["NA"])
df first(df, 5)
Row | ID | TIME | NOMTIME | TIMEUNIT | AMT | LIDV | CMT | NAME | EVENTU | CENS | EVID | WEIGHTB | SEX | TRTACT | DOSE | PROFDAY | PROFTIME | CYCLE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Int64 | Float64 | Float64 | String7 | Float64 | Float64? | Int64 | String31 | String15 | Int64 | Int64 | Float64 | String7 | String7 | Int64 | Int64 | Float64 | Int64 | |
1 | 1 | -23.957 | -24.0 | Hours | 0.0 | 3.0 | 5 | PD - Ordinal | severity | 0 | 0 | 77.9 | Male | Placebo | 0 | 0 | 0.0 | 0 |
2 | 1 | -0.005 | -0.1 | Hours | 0.0 | 3.29 | 3 | PD - Continuous | IU/L | 0 | 0 | 77.9 | Male | Placebo | 0 | 0 | 23.9 | 0 |
3 | 1 | -0.005 | -0.1 | Hours | 0.0 | 3.0 | 4 | PD - Count | count | 0 | 0 | 77.9 | Male | Placebo | 0 | 0 | 23.9 | 0 |
4 | 1 | -0.005 | -0.1 | Hours | 0.0 | 0.0 | 6 | PD - Binary | response | 0 | 0 | 77.9 | Male | Placebo | 0 | 0 | 23.9 | 0 |
5 | 1 | 0.0 | 0.0 | Hours | 1.0e-12 | missing | 1 | Dosing | mg | 0 | 1 | 77.9 | Male | Placebo | 0 | 1 | 0.0 | 1 |
3.1 Data wrangling
We have a series of tutorials on Data Wrangling in Julia, where you can learn more about reading and manipulating data.
We will start by eliminating the rows that contain a negative time (NOMTIME < 0), placebo observations (dose = 0), and only keep dosing and PK concentration events. We can do this using the @rsubset
macro from DataFramesMeta.jl
@rsubset! df begin
:NOMTIME >= 0
:DOSE > 0
:NAME ∈ ["Dosing", "PK Concentration"]
end;
We will also switch the time units to days (currently in hours) and create a column for the route of administration as extravascular ("ev"
).
@rtransform! df begin
:NOMTIME = :NOMTIME / 24
:ROUTE = "ev"
end;
Finally, we have to define the units for the time (days), concentration (ng/mL), and dose amount (mg)
= u"d";
timeu = u"ng/mL";
concu = u"mg"; amtu
Notice that we have defined units using the u
string literal (as in u"..."
). This syntax corresponds to Unitful.jl
, which is the Julia package from which Pumas obtains its unit support.
Now we are ready to run read_nca
. Notice that we are setting the keyword argument group
as [:DOSE]
because we want to be able to account for the different dosing levels in our analysis.
= read_nca(
nca_data
df;= :ID,
id = :NOMTIME,
time = :LIDV,
observations = :AMT,
amt = :ROUTE,
route = [:DOSE],
group = timeu,
timeu = concu,
concu = amtu,
amtu );
4 Exploring concentration time profiles for each dose level
To explore the effect of dosing on drug concentration we will generate the average concentration time profile for each dose level. This can be accomplished by using the function summary_observations_vs_time
.
= summary_observations_vs_time(
f1, ax1, p1
nca_data;= false,
separate = [:DOSE],
group = (; backgroundcolor = background, size = (900, 500), fontsize = 16),
figure = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color
)
axislegend(ax1; backgroundcolor = background)
f1
Don’t forget to check our tutorials on Plot customization for Noncompartmental analysis - Single Ascending Dose and Customization of AlgebraOfGraphics.jl Plots for a more detailed discussion on the extensive support for customization achievable with the axis
and figure
keyword arguments
You probably noticed that we assigned Set1_5
to the color
keyword argument. Currently, summary_observations_vs_time
only accepts a single color or a predefined colormap from ColorSchemes.jl
. We are currently working to add support for a custom colormap
and expect this feature to be available soon.
We recommend using a colormap
if you are setting separate
to false
since this will allow you to get a different color for each line.
This plot allows us to clearly see that this is indeed a multiple dose scenario. We can see a sharp increase in concentration at times 0 and 10 days. This leads us to our next visualization: the concentration time profile after each dosing event. We will visualize the concentration profile for the first day and the tenth day.
= Figure(; size = (1000, 600), backgroundcolor = background); fig_days
Now we can add the plot for the first day. We will use summary_observations_vs_time
again, but this time we will use the limits
argument inside of axis
to customize the range of the x axis
summary_observations_vs_time(
1, 1],
fig_days[
nca_data;= false,
separate = [:DOSE],
group = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = log10,
yscale = (0, 1, nothing, nothing),
limits = "Day 1",
title
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars );
The limits
argument accepts a tuple in the form (xlow, xhigh, ylow, yhigh)
. In this case, we used (0, 1, nothing, nothing)
because we only want to show the first day, but we passed nothing
to the y axis limits to let Makie.jl
know that we don’t want to manually set the limit for this axis.
There are other ways to manually specify the limits of an axis in Makie.jl
. You can learn more in our tutorial on Customization of AlgebraOfGraphics.jl Plots and Makie.jl
’s documentation on the attributes available for Axis
You can’t use the figure
keyword argument if you are passing an Axis
as the first positional argument to summary_observations_vs_time
(fig_days[1, 1]
in this case) or to any other of the Pumas’ plotting functions.
Let’s now add the plot for the tenth day. We’ll just change the limits
and title
arguments for this:
summary_observations_vs_time(
1, 2],
fig_days[
nca_data;= false,
separate = [:DOSE],
group = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = log10,
yscale = (10, 11, nothing, nothing),
limits = "Day 10",
title
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars );
Lastly, let’s add a legend using figurelegend
:
figurelegend(
fig_days;= :b,
position = Outside(),
alignmode = :horizontal,
orientation = background,
backgroundcolor );
You can customize the looks of your legend by passing any of the supported attributes for Legend
to figurelegend
Now we can check the results by inspecting the variable holding our Figure
(fig_days
in this case):
fig_days
This plot allows us to see more clearly the concentration response to each dosing event. We can also use it to check if there is any accumulation of the first administered dose and the steady state response.
4.1 Subject variability
One way in which we might visualize subject variability is by plotting the average concentration profile overlaid on a spaghetti plot for all subjects in the population.
But before we do anything, we must eliminate the rows where the concentration (LIDV
) has a value of missing
, as this will cause errors when plotting:
= @rsubset(df, !ismissing(:LIDV)); df_clean
Now we can calculate the mean concentration profile for each dose level, we will need to import the function mean
from Julia’s standard library package Statistics.jl
using Statistics: mean
= @by df_clean [:DOSE, :NOMTIME] begin
df_sp :Cmean = mean(:LIDV)
end;
And that’s it for the wrangling. Let’s create our plot:
We are going to be using AlgebraOfGraphics.jl
for plotting. If you are unfamiliar with its syntax and use, feel free to refer to our series of tutorials on data visualization in Julia, starting with Introduction to AlgebraOfGraphics.jl.
We start by creating our plots, which we’ll pass to draw
later. But let us do just one more thing before we start: we will define a renamer
for the dose column. This will allow us to get a more readable legend.
= renamer(map(i -> i => "$(i) mg", unique(df.DOSE))); dose_renamer
You can pass a Vector
of Pairs
of the form (value in dataset => value to be displayed) to the renamer
function provided by AoG.jl
.
Make sure to check our tutorial on Customization of AlgebraOfGraphics.jl
Plots to learn more about renamer
and other utility functions provided by AoG.jl
=
plt_all data(df_clean) *
mapping(:NOMTIME, :LIDV; layout = :DOSE => dose_renamer, group = :ID => nonnumeric) *
visual(ScatterLines; color = (:grey, 0.2));
=
plt_mean data(df_sp) *
mapping(:NOMTIME, :Cmean; layout = :DOSE => dose_renamer) *
visual(ScatterLines);
draw(
+ plt_all;
plt_mean = (; size = (2000, 1000), fontsize = 25),
figure = (; ylabel = "Concentration (ng/mL)", xlabel = "Time (days)", yscale = log10),
axis )
Another useful way to examine subject variability in the concentration response is to see the concentration profile for each subject. We can do this using observations_vs_time
:
= observations_vs_time(
observations
nca_data;= true,
paginate = main_colors[1],
markercolor = main_colors[2],
color = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel
),= (; backgroundcolor = background, size = (1000, 800), fontsize = 18),
figure = (; combinelabels = true),
facet
);1] observations[
We recommend setting the keyword argument paginate
to true
for datasets containing many subjects (more than 9). By doing this, observations_vs_time
will return a vector of plots instead of trying to cram all of them into the limited space of a single figure.
Also, make to check our tutorial on Plot customization for Noncompartmental analysis - Single Ascending Dose for a detailed explanation of the keyword arguments that you can use for observations_vs_time
5 Dose normalized concentration
In addition to the concentration time profile, it could also be interesting to analyze the dose-normalized concentration profile. Normalizing the concentration will allow us to assess the dose linearity of exposure and we will use the normalized concentration for the subsequent analysis.
We begin by creating a new column, NLIDV
, which will be the concentration (LIDV
) divided by the corresponding dose (DOSE
):
@rtransform! df begin
:NLIDV = :LIDV / :DOSE
end;
Now we create another NCAPopulation
, but this time the observations will be the normalized concentrations:
= read_nca(
nca_data_norm
df;= :ID,
id = :NOMTIME,
time = :NLIDV,
observations = :AMT,
amt = :ROUTE,
route = [:DOSE],
group );
In order to get the normalized concentration profile for each dose, we use the same code as before, just changing the NCAPopulation
for the one that we just computed:
= summary_observations_vs_time(
f2, ax2, p2
nca_data_norm;= false,
separate = [:DOSE],
group = (; backgroundcolor = background, size = (900, 500), fontsize = 16),
figure = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Normalized Concentration (ng/mL)/mg",
ylabel = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color
)axislegend(ax2; backgroundcolor = background)
f2
It is interesting to note that the normalized concentration is similar for every dose level. This suggests a linear effect of dose on the concentration profile.
Let’s perform a similar analysis as before and plot the normalized concentration profile for the first and the tenth day:
= Figure(; size = (1000, 600), backgroundcolor = background)
fig_days_norm
summary_observations_vs_time(
1, 1],
fig_days_norm[
nca_data_norm;= false,
separate = [:DOSE],
group = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = log10,
yscale = (0, 1, nothing, nothing),
limits = "Day 1",
title
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars
)
summary_observations_vs_time(
1, 2],
fig_days_norm[
nca_data_norm;= false,
separate = [:DOSE],
group = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = log10,
yscale = (10, 11, nothing, nothing),
limits = "Day 10",
title
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars
)
figurelegend(
fig_days_norm;= :b,
position = Outside(),
alignmode = :horizontal,
orientation = background,
backgroundcolor
)
fig_days_norm
Notice that this is essentially the same code as before. We just had to change our NCAPopulation
(nca_data
to nca_data_norm
) and our Figure
(fig_days
to fig_days_norm
).
Also see that we included all the code in a single cell using the begin..end
syntax. This was done to save some space on the notebook since we already went over how to create a custom figure layout
5.1 AUC
The dose normalized AUC is also an important means to assess whether the dose is producing a linear effect on the concentration profile. We will be calculating the AUC values for all dose levels and for the first and tenth day.
The NCA
package offers a convenient function called auc
, which will return to us the AUC value for all dosing events. Let’s calculate the AUC values for the first subject in our population as an example:
auc(nca_data_norm[1], interval = (0, 1)) NCA.
[ Info: ID 11 errored: lambdaz calculation needs at least three data points between Cmax and the last positive concentration
[ Info: ID 11 errored: lambdaz calculation needs at least three data points between Cmax and the last positive concentration
[ Info: ID 11 errored: lambdaz calculation needs at least three data points between Cmax and the last positive concentration
[ Info: ID 11 errored: lambdaz calculation needs at least three data points between Cmax and the last positive concentration
6-element Vector{Union{Missing, Float64}}:
0.0032036328730891727
missing
missing
missing
missing
0.003485850895316806
Notice that auc
returned a Vector
instead of a single value. This is because auc
calculates the AUC for all dosing events, which in our case are six. However, the ones in the middle (positions 2 to 5) are consecutive doses and have no AUC value, which is why we get missing
in those positions.
In our case, we are interested in the first and last values, since those represent the two increases in concentration observed on days 1 and 10.
It should also be noted that we used the interval
keyword argument. This is used to tell auc
the time after dose that it should consider when calculating the AUC. Since we are interested in calculating the AUC for a day, we passed (0, 1)
to it.
That was a mouthful, but now we understand how auc
works and we are ready to calculate the AUC values for our entire population:
= NCA.auc(nca_data_norm, interval = (0, 1)); AUC_values
first(AUC_values, 5)
Row | id | DOSE | auc0_1 |
---|---|---|---|
String | Int64 | Float64? | |
1 | 11 | 100 | 0.00320363 |
2 | 11 | 100 | missing |
3 | 11 | 100 | missing |
4 | 11 | 100 | missing |
5 | 11 | 100 | missing |
Notice that we can just pass the population to auc
and it will return a DataFrame
containing the calculated values for each subject. Let’s get rid of the missing
values using dropmissing!
:
dropmissing!(AUC_values)
first(AUC_values, 5)
Row | id | DOSE | auc0_1 |
---|---|---|---|
String | Int64 | Float64 | |
1 | 11 | 100 | 0.00320363 |
2 | 11 | 100 | 0.00348585 |
3 | 12 | 100 | 0.00341586 |
4 | 12 | 100 | 0.00572918 |
5 | 13 | 100 | 0.00308788 |
Now we have two AUC values for each subject: the first one represents the first day and the second represents the tenth day. Let’s add a new column that will indicate this to us in order to have a more convenient way to create our plots:
= [1, 10]; days
We will use the function repeat
to create an array that has the same length as the number of rows of our DataFrame
(AUC_values
), alternating the values in days
([1, 10, 1, 10, ... , 1, 10]
).
First, we calculate the number of rows in our DataFrame
with the nrow
function and then we divide it by two (since we are repeating two values: 1 and 10):
= Int(nrow(AUC_values) / 2);
repeat_number = repeat(days, repeat_number);
day_indicator
@transform! AUC_values :day = day_indicator;
That was a lot of work 😅, but we are finally ready to create our violin plot and visualize the dose normalized AUC:
But let us do just one more thing before we start: we will define a renamer
for the day columns. We use the same procedure as we did for the dose renamer:
= renamer([1 => "Day 1", 10 => "Day 10"]);
day_renamer
=
plt_AUC data(AUC_values) *
mapping(
:DOSE => nonnumeric,
:auc0_1;
= :day => day_renamer,
col = :DOSE => dose_renamer => "Dose",
color *
) visual(Violin; show_median = true, mediancolor = background);
draw(
plt_AUC,scales(Color = (; palette));
= (;
axis = "Dose (mg)",
xlabel = "Dose normalized AUC (ng/mL)*day*mg^-1",
ylabel = background,
backgroundcolor
),= (; backgroundcolor = background, size = (1000, 500)),
figure = (; backgroundcolor = background),
legend )
Notice that we have included the keyword arguments show_median
and mediancolor
in this plot (inside of the visual
call of plt_ratio
), which tells AoG.jl
to show the median inside of the violin plots and to give it a custom color (the background
color in this case).
We encourage you to check our tutorial on Plotting Different Geometries with AlgebraOfGraphics.jl to learn more about the keyword arguments available for Violin
the other geometries that Makie.jl
and AoG.jl
offer.
We have used a lot of customization options from AoG.jl
to generate the last plot. Make sure to check our tutorial on Customization of AlgebraOfGraphics.jl Plots for a detailed explanation of how to use the nonnumeric
and renamer
helper functions, and also to learn more about the available keyword arguments for draw
This plot allows us to see that there isn’t a very large difference between the AUC from the first and the tenth day. However, there is a better way to examine this relationship, which is by using the ratio between the tenth and the first day’s AUC.
We are using the @by
macro from DataFramesMeta.jl
to group the data by id
and then calculate the ratio.
Don’t forget to check our Data Wrangling in Julia tutorial Manipulating Tables with DataFramesMeta.jl
for a more in-depth explanation on the use of DataFramesMeta.jl
’s macros
= @by AUC_values :id begin
AUC_ratios :ratio = :auc0_1[2] / :auc0_1[1]
:DOSE = :DOSE
end;
Let’s use a violin plot again to visualize these results:
=
plt_ratio data(AUC_ratios) *
mapping(:DOSE => nonnumeric, :ratio; color = :DOSE => dose_renamer => "Dose") *
visual(Violin; show_median = true, mediancolor = background);
draw(
plt_ratio,scales(Color = (; palette));
= (;
axis = "Dose (mg)",
xlabel = "Fold accumulation in AUC",
ylabel = background,
backgroundcolor
),= (; backgroundcolor = background, size = (1000, 500)),
figure = (; backgroundcolor = background),
legend )
We can see here that the median of the ratio is slightly greater than 1 for all dose levels (about 1.25). This could be interpreted as an indication of accumulation.
5.2 Cₘₐₓ
Let’s now perform a similar analysis to the one we just presented for the AUC. This time, however, we will go over some of the details more quickly, so make sure to review the AUC section if anything feels unfamiliar.
Similarly to auc
, the NCA
package contains a convenient function called cmax
to calculate maximum concentration. Let’s use it to determine Cₘₐₓ for our population:
There are many more convenient functions for NCA workflows. Make sure to check our documentation on NCA Functions for more information.
= NCA.cmax(nca_data_norm, interval = (0, 1))
cmax_values_all first(cmax_values_all, 5)
Row | id | DOSE | cmax0_1 |
---|---|---|---|
String | Int64 | Float64 | |
1 | 11 | 100 | 0.0118 |
2 | 11 | 100 | 0.000691 |
3 | 11 | 100 | 0.00137 |
4 | 11 | 100 | 0.0026 |
5 | 11 | 100 | 0.00125 |
Again, we’ll only keep the first and the last values. This time it won’t be as easy as calling dropmissing!
, but we’ll manage using the @by
macro from DataFramesMeta.jl
:
= @by cmax_values_all :id begin
cmax_values :cmax0_1 = [:cmax0_1[begin], :cmax0_1[end]]
:DOSE = [:DOSE[begin], :DOSE[end]]
end;
Let’s also create our day indicator (day 1 and day 10). We can reuse the one we created when doing the AUC analysis:
@transform! cmax_values :day = day_indicator;
Now we are ready for the violin plots. Let’s begin with visualizing Cₘₐₓ for each dose level and the first and tenth day:
=
plt_cmax data(cmax_values) *
mapping(
:DOSE => nonnumeric,
:cmax0_1;
= :day => day_renamer,
col = :DOSE => dose_renamer => "Dose",
color *
) visual(Violin, show_median = true, mediancolor = background);
draw(
plt_cmax,scales(Color = (; palette));
= (;
axis = "Dose (mg)",
xlabel = "Dose normalized Cₘₐₓ (ng/mL)*mg^-1",
ylabel = background,
backgroundcolor
),= (; backgroundcolor = background, size = (1000, 500)),
figure = (; backgroundcolor = background),
legend )
We see that the medians are similar for both days and across all dose levels, but there is high variability for some dose levels (e.g 400 and 1600 mg on the tenth day)
Let’s now examine the ratio between Cₘₐₓ on the tenth and the first day to help us assess if there is accumulation. Again, we begin by calculating the ratios using the @by
macro:
= @by cmax_values :id begin
cmax_ratios :ratio = :cmax0_1[2] / :cmax0_1[1]
:DOSE = :DOSE
end;
And that’s it! We can now generate our violin plot:
=
plt_cmax_ratio data(cmax_ratios) *
mapping(:DOSE => nonnumeric, :ratio; color = :DOSE => dose_renamer => "Dose") *
visual(Violin; show_median = true, mediancolor = background);
draw(
plt_cmax_ratio,scales(Color = (; palette));
= (;
axis = "Dose (mg)",
xlabel = "Fold accumulation in Cₘₐₓ",
ylabel = background,
backgroundcolor
),= (; backgroundcolor = background, size = (1000, 500)),
figure = (; backgroundcolor = background),
legend )
This allows us to see that the median of the ratios is close to 1, which could be an indication that there is no accumulation, but we should also notice that there is high variability, especially for the 400 mg dose level.
6 Exploring the effect of covariates
As you may have noticed, the dataset that we are using includes information about the subject’s sex and weight. We will explore the effect of these two covariates on the concentration profiles, AUC and Cₘₐₓ
6.1 Sex
We will generate the mean concentration profile for each sex. In order to do this, we will have to create a new NCAPopulation
with read_nca
. However, this time we’ll specify SEX
as the group variable
= read_nca(
nca_data_sex
df;= :ID,
id = :NOMTIME,
time = :NLIDV,
observations = :AMT,
amt = :ROUTE,
route = [:SEX],
group );
We could add any number of columns to the group
keyword argument. For instance, we could have done group=[:SEX, :DOSE]
to let Pumas know that we want to group subjects both by sex and by dose level. However, this will result in 10 groups (cartesian product for a set of 2 categories of sex and 5 dose levels), and we want to limit this analysis to only the sex covariate.
Now let’s generate the concentration profile on a logarithmic scale following the same procedure as described above:
= summary_observations_vs_time(
f3, ax3, p3
nca_data_sex;= false,
separate = "SEX",
group = (; backgroundcolor = background, size = (900, 500), fontsize = 16),
figure = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Normalized Concentration (ng/mL)/mg",
ylabel = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color
)
axislegend(ax3; backgroundcolor = background)
f3
This allows us to see that females exhibit a higher concentration profile on average. Let’s also compute the values for AUC and Cₘₐₓ in order to examine this difference:
= run_nca(nca_data_sex); nca_results_sex
This time we just used run_nca
instead of using auc
and cmax
because we won’t be distinguishing between the first and the tenth day.
If we just use parameters_vs_group
like we did on our Plot customization for Noncompartmental analysis - Single Ascending Dose tutorial, we will get an error which is caused by the fact that there will be missing
values on the AUC calculations.
Because of this, we will need to remove those rows. Let’s start by converting our results to a DataFrame
by accessing the reportdf
field of nca_results_sex
:
= nca_results_sex.reportdf; nca_results_sex_df
Now let’s only keep the columns which we are interested in. That is id
, SEX
, cmax
, and aucinf_obs
:
@select! nca_results_sex_df :id :SEX :cmax :aucinf_obs;
Lastly, we can drop the missing
values with dropmissing!
:
dropmissing!(nca_results_sex_df);
Once again, we’ll visualize the results using violin plots. However, this time we would like to see the AUC and Cₘₐₓ results next to each other.
We will follow the same method as the one we used to create the concentration time profiles for each day: create an empty figure and place the plots on each axis.
= Figure(; backgroundcolor = background, size = (1000, 500)); fAUC_Cmax_sex
Let’s now create the plot for the AUC and add it using draw!
=
plt_auc_sex data(nca_results_sex_df) * mapping(:SEX, :aucinf_obs, color = :SEX) * visual(Violin);
draw!(
1, 1],
fAUC_Cmax_sex[
plt_auc_sex,scales(Color = (; palette = main_colors));
= (;
axis = "Sex",
xlabel = "Dose normalized AUC (ng/mL)*day*mg^-1",
ylabel = "AUC",
title = background,
backgroundcolor
), );
Now we can do the same, but for Cₘₐₓ:
=
plt_cmax_sex data(nca_results_sex_df) * mapping(:SEX, :cmax, color = :SEX) * visual(Violin);
draw!(
1, 2],
fAUC_Cmax_sex[
plt_cmax_sex,scales(Color = (; palette = main_colors));
= (;
axis = "Sex",
xlabel = "Dose normalized Cₘₐₓ (ng/mL)*mg^-1",
ylabel = "Cₘₐₓ",
title = background,
backgroundcolor
), );
As always, we can check our final results by inspecting the figure (fAUC_Cmax_sex
):
fAUC_Cmax_sex
6.2 Weight
We’ll perform the same analysis as we did for the sex covariate. In order to do that, we will divide our population into those under 70 kg and those over:
@rtransform! df begin
:WEIGHT = :WEIGHTB > 70 ? "Over 70 kg" : "Under 70 kg"
end;
Now we can create our NCAPopulation
. This time we set the group
keyword argument to :WEIGHT
, our newly created variable:
= read_nca(
nca_data_weight
df;= :ID,
id = :NOMTIME,
time = :NLIDV,
observations = :AMT,
amt = :ROUTE,
route = [:WEIGHT],
group );
Let’s visualize the concentration time profile for each weight group:
= summary_observations_vs_time(
f4, ax4, p4
nca_data_weight;= false,
separate = [:WEIGHT],
group = (; backgroundcolor = background, size = (900, 500), fontsize = 16),
figure = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Normalized Concentration (ng/mL)/mg",
ylabel = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color
)
axislegend(ax4; backgroundcolor = background)
f4
We can see here that individuals with a lower body weight tend to exhibit higher concentration profiles. We could generate the violin plots for the AUC and Cₘₐₓ by following the same procedure as before, but we’ll leave it here since we already went over that.
7 Exploring both covariates simultaneously
For our last visualization, let’s combine both covariates in a single analysis. To do this, we will plot AUC and Cₘₐₓ against body weight, and we will also examine if there are differences related to sex.
Let’s start by creating an NCAPopulation
specifying the two groups we are interested in: SEX
and WEIGHTB
.
= read_nca(
nca_data_sex_weight
df;= :ID,
id = :NOMTIME,
time = :NLIDV,
observations = :AMT,
amt = :ROUTE,
route = [:SEX, :WEIGHTB],
group );
Again, we calculate the AUC and Cₘₐₓ using read_nca
. We’ll follow the same procedure as in the AUC section: convert the results to a DataFrame
, select the appropriate columns, remove missing
values and plot the results next to each other.
= run_nca(nca_data_sex_weight).reportdf; nca_results_sex_weight
@select! nca_results_sex_weight :id :SEX :WEIGHT = :WEIGHTB :cmax :aucinf_obs;
dropmissing!(nca_results_sex_weight);
Now we are ready to generate the plots. Let’s start by creating the Figure
:
= Figure(; backgroundcolor = background, size = (1000, 600)); fig_two_cov
Now we create the AUC plot and add it to the Figure
:
=
auc_plot data(nca_results_sex_weight) *
mapping(:WEIGHT, :aucinf_obs; color = :SEX => "Sex") *
visual(Scatter) + AlgebraOfGraphics.linear());
(
= draw!(
auc 1, 1],
fig_two_cov[
auc_plot,scales(Color = (; palette = main_colors));
= (;
axis = background,
backgroundcolor = "Weight (kg)",
xlabel = "Dose normalized AUC (ng/mL)*day*mg^-1",
ylabel = "AUC",
title
), );
The plot for Cₘₐₓ can be created with a very similar procedure:
=
cmax_plot data(nca_results_sex_weight) *
mapping(:WEIGHT, :cmax; color = :SEX => "Sex") *
visual(Scatter) + AlgebraOfGraphics.linear());
(
= draw!(
cmax 1, 2],
fig_two_cov[
cmax_plot,scales(Color = (; palette = main_colors));
= (;
axis = background,
backgroundcolor = "Weight (kg)",
xlabel = "Dose normalized Cₘₐₓ (ng/mL)*mg^-1",
ylabel = "Cₘₐₓ",
title
), );
Let’s add a legend for a finishing touch. This will also allow us to differentiate the data for each sex.
legend!(
end+1, 1:2],
fig_two_cov[
cmax;= :horizontal,
orientation = true,
tellheight = background,
backgroundcolor );
Following what he have done before, we inspect the Figure
(fig_two_cov
) to check our results:
fig_two_cov
This plot lets us see that, on average, AUC and Cₘₐₓ decrease as the person’s weight increases, and that these metrics generally have a higher value for female subjects than for male subjects of the same weight. However, we can also see that there is some variability, especially for the AUC values.
8 Real world data
In this second section of the tutorial, we will also be performing an NCA, but this time we will be using real-world data from a multiple dose project carried out by Novartis.
You’ll find that the process is very similar to what we have done before, we’ll just need to be more careful during our data wrangling section.
This tutorial section of the tutorial is based on Andy Stein’s PK - Multiple Ascending Dose - Using Novartis dzz file workflow. You can get the dataset used in this tutorial here.
8.1 Load Data
Again, we use CSV.jl
to read the data and convert it to a DataFrame
. This time, however, we included the empty string (""
) in the missingstring
keyword argument:
= CSV.read("dzz_PKConc.csv", DataFrame; missingstring = ["NA", ""])
df_novartis first(df_novartis, 5)
Row | SUBJID | ARM | VISIT | PCDTC | TMTPT | RESN | RESU |
---|---|---|---|---|---|---|---|
Int64 | String | String15 | DateTime? | Int64? | Float64? | String7 | |
1 | 1000001 | Phase I ABC123 240mg Q2W | CYCLE 4 DAY 1 | missing | 0 | 80.8 | ug/mL |
2 | 1000001 | Phase I ABC123 240mg Q2W | CYCLE 1 DAY 1 | 2016-03-10T10:40:00 | 0 | 0.0 | ug/mL |
3 | 1000001 | Phase I ABC123 240mg Q2W | CYCLE 1 DAY 1 | 2016-03-10T12:45:00 | 1 | 103.0 | ug/mL |
4 | 1000001 | Phase I ABC123 240mg Q2W | CYCLE 1 DAY 2 | 2016-03-11T12:00:00 | 24 | 84.1 | ug/mL |
5 | 1000001 | Phase I ABC123 240mg Q2W | CYCLE 1 DAY 8 | 2016-03-18T11:15:00 | 168 | 39.8 | ug/mL |
8.1.1 Data wrangling
Since this time we are not using simulated data, we will need to do some additional work to get the dataset ready for our analysis and to use read_nca
.
In order to save some space, we will be using the @chain
macro to perform all of our data wrangling operations in one go. Here’s a broad overview of what we will be doing:
- Extract dose and regimen information from
ARM
column. - Extract cycle information from
VISIT
column. - Remove
missing
values from date column (PCDTC
). - Compute the time column from the dates (convert milliseconds to days).
- Identify rows where the time is repeated.
- Eliminate rows where there are repeated times.
= @chain df_novartis begin
df_novartis_clean
@rtransform @astable begin
= split(:ARM, " ")
arm_information :regimen = arm_information[end] # extract regimen
= chop(arm_information[end-1], tail = 2) # extract dose
dose :dose = parse(Int, dose)
end
@rtransform @astable begin
= split(:VISIT, " ")
visit_information :cycle = visit_information[2]
end
@rsubset begin
ismissing(:PCDTC) # remove missing dates
!end
@by :SUBJID @astable begin
= :PCDTC .- first(:PCDTC) # calculate time
TIME :time = [i.value / (1000 * 60 * 60 * 24) for i in TIME] # convert milliseconds to days
:dose = :dose
:conc = :RESN
:regimen = :regimen
:cycle = :cycle
end
@by [:SUBJID, :time] begin
:unique = allunique(:time) # identify repeated times
:dose = first(:dose)
:conc = first(:conc)
:regimen = first(:regimen)
:cycle = first(:cycle)
end
@rsubset begin
:unique == true # eliminate repeated times
end
end
first(df_novartis_clean, 5)
Row | SUBJID | time | unique | dose | conc | regimen | cycle |
---|---|---|---|---|---|---|---|
Int64 | Float64 | Bool | Int64 | Float64? | SubStrin… | SubStrin… | |
1 | 1000001 | 0.0 | true | 240 | 0.0 | Q2W | 1 |
2 | 1000001 | 0.0868056 | true | 240 | 103.0 | Q2W | 1 |
3 | 1000001 | 1.05556 | true | 240 | 84.1 | Q2W | 1 |
4 | 1000001 | 8.02431 | true | 240 | 39.8 | Q2W | 1 |
5 | 1000001 | 11.066 | true | 240 | 38.1 | Q2W | 1 |
8.2 Concentration time profiles
8.2.1 Dose and regimen
We will examine the concentration time profiles for each regimen (Q2W and Q4W). To do this, we must start by creating each population, so we’ll have to divide it according to the regimen
column:
= @rsubset df_novartis_clean :regimen == "Q2W";
q2w_data = @rsubset df_novartis_clean :regimen == "Q4W"; q4w_data
Now we use read_nca
to create the NCAPopulation
required for the plotting functions:
= read_nca(q2w_data; id = :SUBJID, group = [:dose]);
q2w_population = read_nca(q4w_data; id = :SUBJID, group = [:dose]); q4w_population
As before, we use summary_observations_vs_time
to visualize the mean concentration for each dose level:
= Figure(; backgroundcolor = background, size = (1200, 600)); fig_dose_regimen
summary_observations_vs_time(
1, 1],
fig_dose_regimen[
q2w_population;= false,
separate = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = "Q2W",
title = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars );
summary_observations_vs_time(
1, 2],
fig_dose_regimen[
q4w_population;= false,
separate = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = "Q4W",
title = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars );
figurelegend(
fig_dose_regimen;= :b,
position = Outside(),
alignmode = :horizontal,
orientation = background,
backgroundcolor );
fig_dose_regimen
We can see that this time the concentration profiles aren’t as good looking as before, as we are using real measurements instead of simulated data.
This time we went over the process of creating each plot and adding it to the figure really fast. If you feel unsure about anything, please refer to the earlier section on exploring concentration time profiles for each dose level for a more detailed explanation.
8.2.2 Dose, regimen and cycle
Here, we will include the cycle in the analysis. We’ll take advantage of the cycle
column we created before to create an NCAPopulation
for both regimens and cycles 1 and 3
We start by the Q2W regimen:
= @rsubset q2w_data :cycle == "1";
q2w_1 = read_nca(q2w_1; id = :SUBJID, group = [:dose]);
q2w_1_population = @rsubset q2w_data :cycle == "3";
q2w_3 = read_nca(q2w_3; id = :SUBJID, group = [:dose]); q2w_3_population
Now we do the same for the Q4W regimen:
= @rsubset q4w_data :cycle == "1";
q4w_1 = read_nca(q4w_1; id = :SUBJID, group = [:dose]);
q4w_1_population = @rsubset q4w_data :cycle == "3";
q4w_3 = read_nca(q4w_3; id = :SUBJID, group = [:dose]); q4w_3_population
We can use the same procedure as before to create our plot colored by dose and faceted by both regimen and cycle:
= Figure(; backgroundcolor = background, size = (1200, 800)); fig_dose_regimen_cycle
summary_observations_vs_time(
1, 1],
fig_dose_regimen_cycle[
q2w_1_population;= false,
separate = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = "Q2W - Cycle 1",
title = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars );
summary_observations_vs_time(
1, 2],
fig_dose_regimen_cycle[
q4w_1_population;= false,
separate = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = "Q4W - Cycle 1",
title = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars );
summary_observations_vs_time(
2, 1],
fig_dose_regimen_cycle[
q2w_3_population;= false,
separate = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = "Q2W - Cycle 3",
title = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars );
summary_observations_vs_time(
2, 2],
fig_dose_regimen_cycle[
q4w_1_population;= false,
separate = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = "Q4W - Cycle 3",
title = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars );
figurelegend(
fig_dose_regimen_cycle;= :r,
position = Outside(),
alignmode = :vertical,
orientation = background,
backgroundcolor );
Here’s the final result:
fig_dose_regimen_cycle
8.3 Subject variability
Similarly to what we did before, we will generate the concentration time profile for each subject in the population in order to investigate subject variability.
However, this time we would like to be able to tell the dose level and regimen for each subject. We will achieve this by creating a population and grouping by subject id (:SUBJID
), dose level (:dose
), and regimen (:regimen
):
= read_nca(
population_indicator
df_novartis_clean;= :SUBJID,
id = :conc,
observations = [:dose, :regimen, :SUBJID],
group );
Now we just use summary_observations_vs_time
to visualize our results:
= summary_observations_vs_time(
grouped_observations
population_indicator;= true,
paginate = 9,
limit = false,
errorbars = main_colors[1],
markercolor = main_colors[2],
color = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)",
ylabel = log10,
yscale
),= (; backgroundcolor = background, size = (1800, 1200), fontsize = 20),
figure = (; combinelabels = true),
facet
);
1] grouped_observations[
9 Dose normalized concentration
As before, we need to begin our analysis by determining the dose normalized concentration. Once again, we will consider the differences between regimens, so we must compute the dose normalized concentration for each case
@rtransform! q2w_data begin
:nconc = :conc / :dose
end;
@rtransform! q4w_data begin
:nconc = :conc / :dose
end;
Now we create the new NCAPopulation
s, but this time we specify the observations to be the normalized concentration:
= read_nca(q2w_data; id = :SUBJID, observations = :nconc, group = [:dose]);
nq2w = read_nca(q4w_data; id = :SUBJID, observations = :nconc, group = [:dose]); nq4w
We’ll start by observing the dose normalized concentration profiles for each regimen. You’ll notice that the code will be virtually the same as the one we used above, we just change the NCAPopulation
passed to summary_observations_vs_time
:
= Figure(; backgroundcolor = background, size = (1200, 600)); fig_dose_regimen_norm
summary_observations_vs_time(
1, 1],
fig_dose_regimen_norm[
nq2w;= false,
separate = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Normalized Concentration (ng/mL)/mg",
ylabel = "Q2W",
title = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars );
summary_observations_vs_time(
1, 2],
fig_dose_regimen_norm[
nq4w;= false,
separate = (;
axis = background,
backgroundcolor = "Time (days)",
xlabel = "Concentration (ng/mL)/mg",
ylabel = "Q4W",
title = log10,
yscale
),= Makie.to_colormap(:Set1_5),
color = false,
errorbars );
figurelegend(
fig_dose_regimen_norm;= :b,
position = Outside(),
alignmode = :horizontal,
orientation = background,
backgroundcolor );
fig_dose_regimen_norm
9.1 AUC and Cₘₐₓ
Our last plot will be a visualization for the AUC and Cₘₐₓ. We will create violin plots for different types of AUC and for the Cₘₐₓ.
In this case, it will be more convenient to have a population with both mixed in (consider all subjects at once) since we will be generating the plot manually using AoG.jl
. Let’s start by creating the normalized concentration column:
@rtransform! df_novartis_clean begin
:nconc = :conc / :dose
end;
Now we can create the population with read_nca
. Note that we specified both :dose
and :regimen
as our grouping variables:
= read_nca(
population
df_novartis_clean;= :SUBJID,
id = :nconc,
observations = [:dose, :regimen],
group );
Having created the population, we can calculate our parameters using run_nca
:
= run_nca(population; parameters = ["aucinf_obs", "auclast", "cmax"]).reportdf; report
You can use the parameters
keyword argument to specify which Puma’s very extensive list of NCA parameters you would like to include in your report.
Of course, you could omit this argument and run_nca
will provide all relevant parameters, which is what we have been doing so far.
dropmissing!(report);
Notice that run_nca(...).reportdf
produces a DataFrame
in a long format. However, it would be more convenient to have it in a wide format since we want to produce a plot faceted by the type of parameter calculated (aucinf_obs
, auclast
, and cmax
).
We could definitely use DataFramesMeta.jl
’s stack
function to create a wide version of the data as follows:
= stack(report) report_wide
But AoG.jl
has the ability to deal with long and wide formats without the need for data wrangling, so we are going to take advantage of that here. Make sure to check our tutorials on Reshaping DataFrame
s and Wide Data with AlgebraOfGraphics.jl
to learn more about this.
= renamer(map(i -> i => "$(i) mg", unique(report.dose))); dose_renamer_novartis
=
plt_metrics data(report) *
mapping(:dose => nonnumeric => "Dose (mg)", 4:6 .=> "Parameter") *
mapping(;
= :dose => dose_renamer_novartis => "Dose",
color = :regimen,
col = dims(1) => renamer(["AUCinf", "AUClast", "Cₘₐₓ"]),
row *
) visual(Violin; show_median = true, mediancolor = background);
draw(
plt_metrics,scales(Color = (; palette));
= (; backgroundcolor = background, size = (1200, 1000), fontsize = 20),
figure = (;
axis = background,
backgroundcolor = Makie.pseudolog10,
yscale = map(i -> 10.0^i, 0:2),
yticks
),= (; backgroundcolor = background),
legend = (; linkxaxes = :none, linkyaxes = :none),
facet )
You can use linkxaxes
and linkyaxes
inside of the facet
keyword argument to tell AoG.jl
that the axis limits should be different for each plot.
This can be useful when plotting variables with very different ranges, such as AUC and Cₘₐₓ in this case.
10 Conclusions
We are continuing with our tutorials on Plot Customization. This time we analyzed more complex situations than the one we previously discussed in Plot customization for Noncompartmental analysis - Single Ascending Dose, which gave us the opportunity to consider new visualization options such as comparing two different periods of time in the concentration time profile. Lastly, we were also able to explore some of the nice functions that the NCA
package offers, such as auc
for calculating the AUC and cmax
for the maximum concentration.
As always, we encourage you to check our documentation on plotting. You’ll find that this is the most complete source of information for Pumas’ plotting capabilities. Don’t forget to check the documentation for Makie.jl
and our tutorials on data visualization in Julia to get more information on the available customization options for things like figures and axes.