DAPA01: Creating an ADPPK dataset with `ADaM.jl` (without covariates)

Author

Ragav Rajan

This tutorial presents a template for ADPPK dataset preparation using ADaM.jl. Since we do not have any other datasets for deriving covariates, the ADPPK dataset primarily contains information related to concentration, dose, and time. This template should be customized according to the specific study and data being used.

This tutorial uses numbered code annotations to highlight certain code sections. The numbered notes that follow a code snippet (1:, 2:, and so on) give additional information about the corresponding part of the code.

1 Load Packages

In this section we load every Julia package used in the tutorial. DataFramesMeta drives the data wrangling, Dates handles the date/time arithmetic, ReadStatTables and PharmaDatasets pull in the source SDTM domains, and ADaM provides the helpers — make_dtm, set_exclusion, join_columns, round_columns — that power each transformation step below.

using CSV
using DataFramesMeta
using Dates
using ReadStatTables
using StatsBase
using PharmaDatasets
using ADaM

2 Read Data

The datasets are read from the SDTM/DAPA01 datasets folder of the PharmaDatasets.jl package.

pc = @chain dataset("SDTM/DAPA01/pc") convert_to_missing(["", nothing])
ex = @chain dataset("SDTM/DAPA01/ex") convert_to_missing(["", nothing])

first(pc, 5)

5×19 DataFrame

Row	STUDYID	DOMAIN	USUBJID	PCSEQ	PCTESTCD	PCTEST	PCORRES	PCORRESU	PCSTRESC	PCSTRESN	PCSTRESU	PCSPEC	PCLLOQ	VISIT	VISITNUM	PCDTC	PCDY	PCTPT	PCTPTNUM
	String7	String3	String15	Float64	String7	String15	String15	String7	String15	Float64	String7	String7	String15	String15	Float64	String31	Float64	String31	Float64
1	DAPA01	PC	DAPA01-001	1.0	DAPA	Dapagliflozin	157.021	ng/mL	157.021	157.021	ng/mL	plasma	0.1 ng/mL	Period 1 Day 1	2.0	2022-06-10 09:30:00	1.0	0-HR POSTDOSE	0.0
2	DAPA01	PC	DAPA01-001	2.0	DAPA	Dapagliflozin	141.892	ng/mL	141.892	141.892	ng/mL	plasma	0.1 ng/mL	Period 1 Day 1	2.0	2022-06-10 09:33:00	1.0	0.05-HR POSTDOSE	0.05
3	DAPA01	PC	DAPA01-001	3.0	DAPA	Dapagliflozin	116.228	ng/mL	116.228	116.228	ng/mL	plasma	0.1 ng/mL	Period 1 Day 1	2.0	2022-06-10 09:51:00	1.0	0.35-HR POSTDOSE	0.35
4	DAPA01	PC	DAPA01-001	4.0	DAPA	Dapagliflozin	109.353	ng/mL	109.353	109.353	ng/mL	plasma	0.1 ng/mL	Period 1 Day 1	2.0	2022-06-10 10:00:00	1.0	0.5-HR POSTDOSE	0.5
5	DAPA01	PC	DAPA01-001	5.0	DAPA	Dapagliflozin	66.4814	ng/mL	66.4814	66.4814	ng/mL	plasma	0.1 ng/mL	Period 1 Day 1	2.0	2022-06-10 10:15:00	1.0	0.75-HR POSTDOSE	0.75

2.1 VISITDY Lookup

The pc dataset does not contain the VISITDY column, which is needed to derive nominal times, but this column is available in the ex dataset. Therefore, the ex dataset is used to create a lookup table mapping VISITNUM to VISITDY, which can then be merged with the pc dataset to derive the nominal time variables.

visitdy_lookup = @chain ex begin
    @select :VISITNUM :VISITDY
    unique
end

first(visitdy_lookup, 5)

4×2 DataFrame

Row	VISITNUM	VISITDY
	Float64	Float64
1	2.0	1.0
2	3.0	8.0
3	4.0	15.0
4	5.0	22.0
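
As a rough illustration of what the lookup and the later left join accomplish, here is a minimal sketch in plain Julia (no DataFrames). The rows and the names ex_rows, visitdy_by_visitnum, and pc_visitnums are hypothetical stand-ins, not part of the tutorial's data or of ADaM.jl.

```julia
# Hypothetical EX rows; only the two columns used by the lookup are kept.
ex_rows = [
    (VISITNUM = 2.0, VISITDY = 1.0),
    (VISITNUM = 2.0, VISITDY = 1.0),   # another subject, same visit
    (VISITNUM = 3.0, VISITDY = 8.0),
    (VISITNUM = 4.0, VISITDY = 15.0),
    (VISITNUM = 5.0, VISITDY = 22.0),
]

# unique drops the repeated (VISITNUM, VISITDY) pairs, as in visitdy_lookup.
lookup_rows = unique(ex_rows)
visitdy_by_visitnum = Dict(r.VISITNUM => r.VISITDY for r in lookup_rows)

# A left join keeps every PC row; a visit without a match gets missing.
pc_visitnums = [2.0, 3.0, 9.0]          # 9.0 is deliberately absent from EX
pc_visitdy = [get(visitdy_by_visitnum, v, missing) for v in pc_visitnums]
```

The missing produced for an unmatched visit is the reason VISITDY has the element type Float64? after the join in the next section.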

3 PC Data Preparation

In this section of the tutorial we shape the PC (Pharmacokinetics Concentrations) domain into an analysis-ready form. We derive the analysis date/time variables (ADTM, ADT, ATM), join the VISITDY lookup to compute the nominal time variables (NFRLT, AVISIT, AVISITN), set the event identifier EVID = 0 for observations, and convert PCLLOQ from its unit-suffixed string form into a numeric lower limit of quantification.

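pc_prep = @chain pc begin

    # Replace the space with "T" so the string can be converted to a DateTime
    @rtransform :PCDTC = replace(:PCDTC, " " => "T")
    make_dtm(:PCDTC, prefix = "A")      # Output : ADTM

    make_dtm_to_dt(:ADTM, prefix = "A") # Output : ADT
    make_dtm_to_tm(:ADTM, prefix = "A") # Output : ATM

    # Bring in VISITDY, which the PC domain lacks
    leftjoin(visitdy_lookup, on = :VISITNUM)

    @rtransform @astable begin # (1)
        :EVID = 0
        :DRUG = :PCTEST

        # Nominal time in hours: start of the visit day plus the planned time point
        :NFRLT = 24 * (:VISITDY - 1) + :PCTPTNUM
        :AVISITN = :VISITNUM
        # AVISITN is Float64, so this yields e.g. "Visit 2.0"
        :AVISIT = "Visit " * string(:AVISITN)

        :PCLLOQ = parse(Float64, replace(:PCLLOQ, " ng/mL" => "")) # (2)
    end

    @orderby :USUBJID :DRUG :ADTM :EVID
end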

first(pc_prep, 5)

1: @rtransform @astable begin lets you create several columns inside a single block. @astable makes a column derived earlier in the block (e.g. AVISITN) available when computing a later column (e.g. AVISIT).
2: PCLLOQ arrives as a String with the unit suffix (e.g. "0.1 ng/mL"). Pumas expects numeric values for the lower limit of quantification, so the unit is stripped with replace and the remaining text is parsed to Float64.

5×28 DataFrame

Row	STUDYID	DOMAIN	USUBJID	PCSEQ	PCTESTCD	PCTEST	PCORRES	PCORRESU	PCSTRESC	PCSTRESN	PCSTRESU	PCSPEC	PCLLOQ	VISIT	VISITNUM	PCDTC	PCDY	PCTPT	PCTPTNUM	ADTM	ADT	ATM	VISITDY	EVID	DRUG	NFRLT	AVISITN	AVISIT
	String7	String3	String15	Float64	String7	String15	String15	String7	String15	Float64	String7	String7	Float64	String15	Float64	String	Float64	String31	Float64	DateTime	Date	Time	Float64?	Int64	String15	Float64	Float64	String
1	DAPA01	PC	DAPA01-001	1.0	DAPA	Dapagliflozin	157.021	ng/mL	157.021	157.021	ng/mL	plasma	0.1	Period 1 Day 1	2.0	2022-06-10T09:30:00	1.0	0-HR POSTDOSE	0.0	2022-06-10T09:30:00	2022-06-10	09:30:00	1.0	0	Dapagliflozin	0.0	2.0	Visit 2.0
2	DAPA01	PC	DAPA01-001	2.0	DAPA	Dapagliflozin	141.892	ng/mL	141.892	141.892	ng/mL	plasma	0.1	Period 1 Day 1	2.0	2022-06-10T09:33:00	1.0	0.05-HR POSTDOSE	0.05	2022-06-10T09:33:00	2022-06-10	09:33:00	1.0	0	Dapagliflozin	0.05	2.0	Visit 2.0
3	DAPA01	PC	DAPA01-001	3.0	DAPA	Dapagliflozin	116.228	ng/mL	116.228	116.228	ng/mL	plasma	0.1	Period 1 Day 1	2.0	2022-06-10T09:51:00	1.0	0.35-HR POSTDOSE	0.35	2022-06-10T09:51:00	2022-06-10	09:51:00	1.0	0	Dapagliflozin	0.35	2.0	Visit 2.0
4	DAPA01	PC	DAPA01-001	4.0	DAPA	Dapagliflozin	109.353	ng/mL	109.353	109.353	ng/mL	plasma	0.1	Period 1 Day 1	2.0	2022-06-10T10:00:00	1.0	0.5-HR POSTDOSE	0.5	2022-06-10T10:00:00	2022-06-10	10:00:00	1.0	0	Dapagliflozin	0.5	2.0	Visit 2.0
5	DAPA01	PC	DAPA01-001	5.0	DAPA	Dapagliflozin	66.4814	ng/mL	66.4814	66.4814	ng/mL	plasma	0.1	Period 1 Day 1	2.0	2022-06-10T10:15:00	1.0	0.75-HR POSTDOSE	0.75	2022-06-10T10:15:00	2022-06-10	10:15:00	1.0	0	Dapagliflozin	0.75	2.0	Visit 2.0

make_dtm creates the DateTime column ADTM (Analysis DateTime) from a String column, here PCDTC. The prefix A determines the name of the resulting column, ADTM.

make_dtm_to_dt creates the Date column ADT (Analysis Date) from the DateTime column ADTM (Analysis DateTime); the prefix A determines the resulting name.

make_dtm_to_tm creates the Time column ATM (Analysis Time) from the DateTime column ADTM (Analysis DateTime), again named by the prefix A.
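
For intuition, the values these three helpers produce can be reproduced on a single string with Julia's Dates standard library. This is only a sketch of the resulting values under that assumption, not ADaM.jl's implementation, and the variable names are illustrative.

```julia
using Dates

raw = "2022-06-10 09:30:00"        # PCDTC as it appears in the source data
iso = replace(raw, " " => "T")     # "2022-06-10T09:30:00"

adtm = DateTime(iso)   # analysis date-time, like ADTM
adt = Date(adtm)       # analysis date, like ADT
atm = Time(adtm)       # analysis time, like ATM
```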

After deriving the date variables, the nominal time variables NFRLT (Nominal Relative Time from First Dose), AVISIT (Analysis Visit) and AVISITN are derived.
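
The nominal time formula and the PCLLOQ conversion used in pc_prep can be checked by hand. The sketch below applies the same expressions to single values; the function name nominal_time is hypothetical and is used only for illustration.

```julia
# NFRLT in hours: 24 h for each full day before the visit day, plus the
# planned time point within the visit (PCTPTNUM).
nominal_time(visitdy, tptnum) = 24 * (visitdy - 1) + tptnum

nfrlt_period1 = nominal_time(1.0, 0.5)   # visit day 1, 0.5 h post-dose
nfrlt_period2 = nominal_time(8.0, 0.5)   # visit day 8, 0.5 h post-dose

# PCLLOQ: strip the unit suffix, then parse the remaining text as a number.
lloq = parse(Float64, replace("0.1 ng/mL", " ng/mL" => ""))
```

With these inputs the sample taken 0.5 h after the day 8 dose gets a nominal time of 168.5 h from the first dose.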

4 EX Data Preparation

Here we walk through the parallel preparation of the EX (Exposure) domain. We derive ASTDTM, AENDTM, and ADTM from the dosing start time, generate the corresponding Date and Time columns, set EVID = 1 for dosing records, and use groupby + transform to carry the first dose DateTime (FANLDTM) and the first administered dose (EXDOSE_first) across each subject’s records.

ex_prep = @chain ex begin

    @rtransform :EXSTDTC = replace(:EXSTDTC, " " => "T")
    make_dtm(:EXSTDTC, prefix = "AST")
    @rtransform begin
        :AENDTM = :ASTDTM
        :ADTM = :ASTDTM
    end

    make_dtm_to_dt(:ASTDTM, prefix = "AST") # Output : ASTDT
    make_dtm_to_dt(:AENDTM, prefix = "AEN") # Output : AENDT
    make_dtm_to_dt(:ADTM, prefix = "A")     # Output : ADT
    make_dtm_to_tm(:ADTM, prefix = "A")     # Output : ATM

    @rtransform @astable begin
        :NFRLT = 24 * (:VISITDY - 1)
        :AVISITN = :VISITNUM
        :AVISIT = "Visit " * string(Int(:AVISITN))
        :EVID = 1
        :DRUG = :EXTRT
    end

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :EXDOSE => (x -> minimum(skipmissing(x))) => :EXDOSE_first,
        :ADTM => (x -> minimum(skipmissing(x))) => :FANLDTM, # (1)
    )

    @orderby :USUBJID :DRUG :ADTM :EVID

end

first(ex_prep, 5)

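1: minimum(skipmissing(x)) finds the earliest ADTM within each USUBJID, DRUG group, skipping missing values (like na.rm = TRUE in R). The resulting column FANLDTM holds the first dose DateTime on every row of the group. EXDOSE_first is computed the same way, as the group minimum of EXDOSE; this equals the first administered dose only when no later dose is smaller than the first one.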

5×29 DataFrame

Row	STUDYID	DOMAIN	USUBJID	EXSEQ	EXTRT	EXDOSE	EXDOSU	EXDOSFRM	EXDOSFRQ	EXROUTE	VISITNUM	VISIT	VISITDY	EXSTDTC	EXSTDY	ASTDTM	AENDTM	ADTM	ASTDT	AENDT	ADT	ATM	NFRLT	AVISITN	AVISIT	EVID	DRUG	EXDOSE_first	FANLDTM
	String7	String3	String15	Float64	String15	Float64	String3	String15	String7	String15	Float64	String15	Float64	String	Float64	DateTime	DateTime	DateTime	Date	Date	Date	Time	Float64	Float64	String	Int64	String15	Float64	DateTime
1	DAPA01	EX	DAPA01-001	1.0	Dapagliflozin	5.0	mg	Injection	ONCE	Intravenous	2.0	Period 1 Day 1	1.0	2022-06-10T09:30:00	1.0	2022-06-10T09:30:00	2022-06-10T09:30:00	2022-06-10T09:30:00	2022-06-10	2022-06-10	2022-06-10	09:30:00	0.0	2.0	Visit 2	1	Dapagliflozin	5.0	2022-06-10T09:30:00
2	DAPA01	EX	DAPA01-001	2.0	Dapagliflozin	5.0	mg	Capsule	ONCE	Oral	3.0	Period 2 Day 1	8.0	2022-06-17T09:30:00	8.0	2022-06-17T09:30:00	2022-06-17T09:30:00	2022-06-17T09:30:00	2022-06-17	2022-06-17	2022-06-17	09:30:00	168.0	3.0	Visit 3	1	Dapagliflozin	5.0	2022-06-10T09:30:00
3	DAPA01	EX	DAPA01-001	3.0	Dapagliflozin	10.0	mg	Capsule	ONCE	Oral	4.0	Period 3 Day 1	15.0	2022-06-25T09:30:00	15.0	2022-06-25T09:30:00	2022-06-25T09:30:00	2022-06-25T09:30:00	2022-06-25	2022-06-25	2022-06-25	09:30:00	336.0	4.0	Visit 4	1	Dapagliflozin	5.0	2022-06-10T09:30:00
4	DAPA01	EX	DAPA01-001	4.0	Dapagliflozin	25.0	mg	Capsule	ONCE	Oral	5.0	Period 4 Day 1	22.0	2022-07-02T11:30:00	22.0	2022-07-02T11:30:00	2022-07-02T11:30:00	2022-07-02T11:30:00	2022-07-02	2022-07-02	2022-07-02	11:30:00	504.0	5.0	Visit 5	1	Dapagliflozin	5.0	2022-06-10T09:30:00
5	DAPA01	EX	DAPA01-002	1.0	Dapagliflozin	5.0	mg	Injection	ONCE	Intravenous	2.0	Period 1 Day 1	1.0	2022-03-12T09:08:00	1.0	2022-03-12T09:08:00	2022-03-12T09:08:00	2022-03-12T09:08:00	2022-03-12	2022-03-12	2022-03-12	09:08:00	0.0	2.0	Visit 2	1	Dapagliflozin	5.0	2022-03-12T09:08:00

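First, we derive the date-time variables ASTDTM, AENDTM, and ADTM: make_dtm creates ASTDTM from EXSTDTC, and AENDTM and ADTM are copies of it. The corresponding dates ASTDT, AENDT, and ADT come from make_dtm_to_dt. Beforehand, the space (" ") in each DateTime string is replaced with T so that the string can be converted to a DateTime.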

Second, we derive the nominal time variables such as NFRLT, AVISITN and AVISIT.

Lastly, we derive FANLDTM (First Analyte Dose DateTime) and EXDOSE_first (first dose administered).
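
The group-wise fill performed by groupby + transform can be pictured without DataFrames. The sketch below is a minimal plain-Julia version under stated assumptions: the rows are illustrative (the missing ADTM is hypothetical) and first_dose_by_subject is a made-up helper, not an ADaM.jl function.

```julia
using Dates

# Illustrative dosing rows; the missing ADTM is hypothetical.
doses = [
    (USUBJID = "DAPA01-001", ADTM = DateTime(2022, 6, 10, 9, 30)),
    (USUBJID = "DAPA01-001", ADTM = DateTime(2022, 6, 17, 9, 30)),
    (USUBJID = "DAPA01-002", ADTM = DateTime(2022, 3, 12, 9, 8)),
    (USUBJID = "DAPA01-002", ADTM = missing),
]

# Earliest non-missing ADTM per subject.
function first_dose_by_subject(rows)
    earliest = Dict{String,DateTime}()
    for r in rows
        ismissing(r.ADTM) && continue            # same effect as skipmissing
        earliest[r.USUBJID] = min(get(earliest, r.USUBJID, r.ADTM), r.ADTM)
    end
    return earliest
end

fanldtm = first_dose_by_subject(doses)

# Carry the group value onto every row, as transform does for FANLDTM.
filled = [fanldtm[r.USUBJID] for r in doses]

# The one-liner used inside transform, applied to a single group.
smallest = minimum(skipmissing([10.0, missing, 5.0]))
```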

5 EX Expansion

Dose expansion is the step where interval-style dosing records (e.g. “BID for 14 days”) are expanded into one row per administered dose. For DAPA01 no expansion is needed: the dosing frequency is ONCE and each administration is already a separate EX record, so ex_prep is used directly downstream.

Note

If your study records dosing as intervals (start and end DateTime with a frequency), this is where you would expand the EX table into individual dose rows before combining it with PC.
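
For a study that does need expansion, the idea can be sketched with a DateTime range from the Dates standard library. Everything in this example is hypothetical (the subject, the 12-hour interval, the field names); it shows the general technique, not a function provided by ADaM.jl.

```julia
using Dates

# Hypothetical interval-style record: 10 mg every 12 hours over two days.
interval = (
    USUBJID = "SUBJ-001",
    start = DateTime(2022, 1, 1, 8),
    stop = DateTime(2022, 1, 3, 8),
    every = Hour(12),
    dose = 10.0,
)

# One DateTime per administration, from start to stop inclusive.
dose_times = collect(interval.start:interval.every:interval.stop)

# One dosing row (EVID = 1) per administration.
expanded = [
    (USUBJID = interval.USUBJID, ADTM = t, EXDOSE = interval.dose, EVID = 1)
    for t in dose_times
]
```

Each expanded row can then be treated like the single-dose EX records used in this tutorial.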

6 Combine PC and EX

In this section of the tutorial we bring pc_prep and ex_prep together into a single combined dataset. We first stack the two domains with vcat, apply exclusion flags for problematic subjects, then perform a self-join against the dosing records to derive reference variables (ADTM_prev, EXDOSE_prev, NFRLT_prev) that downstream relative-time calculations rely on.

6.1 Flag the combined dataset

We begin by row-binding pc_prep and ex_prep, then use set_exclusion to mark subjects that should not enter the analysis — those with missing concentrations, no dosing records, or no concentration records. Each call adds (or extends) the EXCLF (Exclusion Flag) and EXCLFCOM (Exclusion Flag Comment) columns, so excluded subjects remain in the dataset but are traceable.

adppk_flag = @chain pc_prep begin
    vcat(ex_prep, cols = :union)

    # Exclusion 1: Subjects with missing conc. data
    set_exclusion(
        "Subjects with missing conc.",
        excl_func = group -> all(ismissing, group.PCSTRESN),
        group = [:USUBJID, :DRUG],
    )

    # Exclusion 2: Subjects with no dosing data
    set_exclusion(
        "Subjects with no dose records",
        excl_func = group -> all((==)(0), group.EVID),
        group = [:USUBJID, :DRUG],
    )

    # Exclusion 3: Subjects with no conc. data
    set_exclusion(
        "Subjects with no conc. records",
        excl_func = group -> all((==)(1), group.EVID),
        group = [:USUBJID, :DRUG],
    )

    @orderby :USUBJID :DRUG :ADTM :EVID
end;

5×6 DataFrame

Row	USUBJID	DRUG	ADTM	EVID	EXCLF	EXCLFCOM
	String15	String15	DateTime	Int64	Int64	Missing
1	DAPA01-001	Dapagliflozin	2022-06-10T09:30:00	0	0	missing
2	DAPA01-001	Dapagliflozin	2022-06-10T09:30:00	1	0	missing
3	DAPA01-001	Dapagliflozin	2022-06-10T09:33:00	0	0	missing
4	DAPA01-001	Dapagliflozin	2022-06-10T09:51:00	0	0	missing
5	DAPA01-001	Dapagliflozin	2022-06-10T10:00:00	0	0	missing

set_exclusion flags rows by group, adding EXCLF (Exclusion Flag) and EXCLFCOM (Exclusion Flag Comment). It takes a reason string, an excl_func = group -> condition predicate over a grouped sub-DataFrame, and a group key (e.g. [:USUBJID, :DRUG]).

The predicates above use point-free shorthand — all(ismissing, group.PCSTRESN) instead of all(x -> ismissing(x), group.PCSTRESN), and (==)(0) instead of x -> x == 0.
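
To see how such group-level predicates behave, the sketch below applies them to a few hypothetical records in plain Julia. The helper flagged_subjects is invented for illustration and only mimics the idea of evaluating a predicate per group; it is not how set_exclusion is implemented.

```julia
# Hypothetical combined records: S1 has a dose and a sample, S2 has samples only.
records = [
    (USUBJID = "S1", EVID = 0, PCSTRESN = 12.5),
    (USUBJID = "S1", EVID = 1, PCSTRESN = missing),
    (USUBJID = "S2", EVID = 0, PCSTRESN = 3.1),
    (USUBJID = "S2", EVID = 0, PCSTRESN = 2.4),
]

# Hypothetical helper: return the subjects whose group satisfies the predicate.
function flagged_subjects(pred, rows)
    ids = unique(r.USUBJID for r in rows)
    return [id for id in ids if pred([r for r in rows if r.USUBJID == id])]
end

# (==)(0) is a one-argument function equivalent to x -> x == 0.
no_dose = flagged_subjects(g -> all((==)(0), (r.EVID for r in g)), records)
no_conc = flagged_subjects(g -> all((==)(1), (r.EVID for r in g)), records)
all_missing = flagged_subjects(g -> all(ismissing, (r.PCSTRESN for r in g)), records)
```

Here only S2 is caught by the "no dose records" predicate, because every one of its rows has EVID equal to 0.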

6.2 Derive Reference data

Next, we look up reference values from the dosing records for every row of the combined dataset. Using join_columns against ex_prep, we pull in the most recent prior dose DateTime (ADTM_prev), the most recent prior dose amount (EXDOSE_prev), and the most recent prior nominal time (NFRLT_prev). These “previous-dose” columns become the references used in the next section to compute relative time variables.

adppk_nom_prev = @chain adppk_flag begin
    join_columns( # (1)
        ex_prep,
        on = [:USUBJID, :DRUG],
        order = [:ADTM],
        keep = [:ADTM => :ADTM_prev, :EXDOSE => :EXDOSE_prev],
        filter_join = (t, r) -> t.ADTM > r.ADTM,
        mode = "last",
    )

    join_columns(
        ex_prep,
        on = [:USUBJID, :DRUG],
        order = [:NFRLT],
        keep = [:NFRLT => :NFRLT_prev],
        filter_join = (t, r) -> t.NFRLT > r.NFRLT,
        mode = "last",
    )

    @orderby :USUBJID :DRUG :ADTM :EVID
end;

1: join_columns looks up the most recent prior dosing record (from ex_prep) for each row, using filter_join to keep only references strictly earlier than the current row, and mode = "last" to pick the latest qualifying match.

5×6 DataFrame

Row	USUBJID	EVID	ADTM	ADTM_prev	NFRLT	NFRLT_prev
	String15	Int64	DateTime	DateTime?	Float64	Float64?
1	DAPA01-001	0	2022-06-10T09:30:00	missing	0.0	missing
2	DAPA01-001	1	2022-06-10T09:30:00	missing	0.0	missing
3	DAPA01-001	0	2022-06-10T09:33:00	2022-06-10T09:30:00	0.05	0.0
4	DAPA01-001	0	2022-06-10T09:51:00	2022-06-10T09:30:00	0.35	0.0
5	DAPA01-001	0	2022-06-10T10:00:00	2022-06-10T09:30:00	0.5	0.0

Deriving reference variables is an important step toward the time and analysis variables that follow.

join_columns joins each row of the combined dataset against the dosing dataset (ex_prep) within its USUBJID and DRUG group to derive the prev variables.
For each row, the filter_join condition t.<col> > r.<col> restricts matches to dose records strictly earlier than the current row, and mode = "last" selects the most recent qualifying dose, giving the previous dose’s ADTM, EXDOSE, and NFRLT.
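
The "strictly earlier, then take the latest" rule can be sketched for a single subject in plain Julia. The function previous_dose is a hypothetical stand-in that mirrors the filter_join condition and mode = "last"; it is not the join_columns implementation.

```julia
using Dates

# Dose times for one subject, taken from the EX output shown earlier.
dose_adtm = [DateTime(2022, 6, 10, 9, 30), DateTime(2022, 6, 17, 9, 30)]

# Latest dose strictly before t, or missing when there is none.
function previous_dose(t, doses)
    earlier = filter(d -> d < t, doses)      # like t.ADTM > r.ADTM
    return isempty(earlier) ? missing : maximum(earlier)   # like mode = "last"
end

at_first_dose = previous_dose(DateTime(2022, 6, 10, 9, 30), dose_adtm)
after_first = previous_dose(DateTime(2022, 6, 10, 9, 33), dose_adtm)
after_second = previous_dose(DateTime(2022, 6, 17, 10, 0), dose_adtm)
```

A record taken at exactly the first dose time has no strictly earlier dose, which is why ADTM_prev is missing in rows 1 and 2 of the output above.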

7 Derive Relative Time Variables

In this section of the tutorial we use the reference columns derived above to compute the relative time variables that any population PK analysis depends on. We fill FANLDTM and the per-subject minimum NFRLT across each USUBJID + DRUG group, then derive AFRLT (Actual Time Relative to First Dose), APRLT (Actual Time Relative to Previous Dose), and NPRLT (Nominal Time Relative to Previous Dose), handling missing values with @passmissing and clamping any negative pre-dose times to zero.

adppk_aprlt = @chain adppk_nom_prev begin
    # Fill the missing values with minimum value based on groupby
    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :FANLDTM => (x -> minimum(skipmissing(x))) => :FANLDTM,
        :NFRLT => (x -> minimum(skipmissing(x))) => :min_NFRLT,
        :EXDOSE_first => (x -> minimum(skipmissing(x))) => :EXDOSE_first,
    )

    @rtransform @passmissing begin
        :AFRLT = (:ADTM - DateTime(:FANLDTM)) / Hour(1)
        :APRLT = (:ADTM - DateTime(:ADTM_prev)) / Hour(1)
    end

    @rtransform :NPRLT =
        (:EVID == 1) ? 0 :
        ismissing(:NFRLT_prev) ? (:NFRLT - :min_NFRLT) : (:NFRLT - :NFRLT_prev)

    # Handling negative times
    @rtransform @passmissing :AFRLT = :AFRLT < 0 ? 0 : :AFRLT
    @rtransform @passmissing :APRLT = :APRLT < 0 ? 0 : :APRLT

    @orderby :USUBJID :DRUG :ADTM :EVID
end;

5×9 DataFrame

Row	USUBJID	ADTM	EVID	FANLDTM	AFRLT	ADTM_prev	APRLT	NFRLT_prev	NPRLT
	String15	DateTime	Int64	DateTime	Float64	DateTime?	Float64?	Float64?	Real
1	DAPA01-001	2022-06-10T09:30:00	0	2022-06-10T09:30:00	0.0	missing	missing	missing	0.0
2	DAPA01-001	2022-06-10T09:30:00	1	2022-06-10T09:30:00	0.0	missing	missing	missing	0
3	DAPA01-001	2022-06-10T09:33:00	0	2022-06-10T09:30:00	0.05	2022-06-10T09:30:00	0.05	0.0	0.05
4	DAPA01-001	2022-06-10T09:51:00	0	2022-06-10T09:30:00	0.35	2022-06-10T09:30:00	0.35	0.0	0.35
5	DAPA01-001	2022-06-10T10:00:00	0	2022-06-10T09:30:00	0.5	2022-06-10T09:30:00	0.5	0.0	0.5

First, FANLDTM is filled across each USUBJID + DRUG group with its minimum value, and the group minimum of NFRLT is stored in a new column, min_NFRLT. The relative time variables are then derived, with missing inputs propagated as missing and negative values set to zero.

Relative Time Variables	Column Label	Reference Data
`AFRLT`	Actual Time Relative to First Dose	`FANLDTM`
`APRLT`	Actual Time Relative to Previous Dose	`ADTM_prev`
`NPRLT`	Nominal Time Relative to Previous Dose	`NFRLT_prev`

Note

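The @passmissing macro propagates missing values: when any input to the expression is missing, the expression is not evaluated and the result is missing. This applies both when creating a new column and when modifying an existing one.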
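
The per-row logic of this section can be sketched with the Dates standard library, with the missing propagation written out by hand rather than through @passmissing. The function names hours_since and nominal_prev_time are hypothetical and exist only for this illustration.

```julia
using Dates

# Elapsed hours from ref to t; missing when there is no reference,
# and clamped to zero for records that precede the reference.
hours_since(t, ref) = ismissing(ref) ? missing : max((t - ref) / Hour(1), 0.0)

# NPRLT: zero on dose records, otherwise nominal time since the previous
# dose, falling back to the group minimum when no previous dose exists.
function nominal_prev_time(evid, nfrlt, nfrlt_prev, min_nfrlt)
    evid == 1 && return 0.0
    return ismissing(nfrlt_prev) ? nfrlt - min_nfrlt : nfrlt - nfrlt_prev
end

first_dose = DateTime(2022, 6, 10, 9, 30)
afrlt = hours_since(DateTime(2022, 6, 10, 9, 33), first_dose)    # 3 minutes later
aprlt = hours_since(DateTime(2022, 6, 10, 9, 33), missing)       # no previous dose
predose = hours_since(DateTime(2022, 6, 10, 9, 0), first_dose)   # before the dose
nprlt = nominal_prev_time(0, 0.05, 0.0, 0.0)
```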

8 Derive Analysis Variables

Here we walk through the derivation of the core analysis and flag variables that the modelling step will consume. We compute the actual dose carried on each record (DOSEA), the analysis lower limit of quantification (ALLOQ), the compartment and amount columns (CMT, AMT), the below-quantification flags (BLQFL, BLQFN), the dependent variable and its log transform (DV, DVL), the missing-dependent-variable flag (MDV), and the analysis unit (AVALU).

adppk_aval = @chain adppk_aprlt begin

    @orderby :USUBJID :ADTM

    @rtransform @astable begin
        # Derive Actual Dose
        :DOSEA =
            (:EVID == 1) ? :EXDOSE : ismissing(:EXDOSE_prev) ? :EXDOSE_first : :EXDOSE_prev

        # Analysis Lower Limit of Quantification
        :ALLOQ = :PCLLOQ

        # Compartment and Amount
        :CMT = (:EVID == 1) ? 1 : 2
        :AMT = (:EVID == 1) ? :EXDOSE : missing
    end

    # Below Lower Limit of Quantification Flag
    @rtransform @passmissing :BLQFL = (:PCSTRESN <= :ALLOQ) ? "Y" : "N"
    @rtransform @passmissing :BLQFN = (:PCSTRESN <= :ALLOQ) ? 1 : 0

    # Dependent Variable
    @rtransform :DV = (:EVID == 1) ? missing : :PCSTRESN

    # Log Transformed Dependent Variable
    @rtransform @passmissing :DVL = (:DV > 0) ? log(:DV) : missing

    # Missing Dependent Variable Result
    @rtransform :MDV = (:EVID == 1) ? 1 : ismissing(:DV) ? 1 : 0

    # Analysis Variable Unit
    @rtransform :AVALU = (:EVID == 1) ? :EXDOSU : :PCSTRESU

    @orderby :USUBJID :DRUG :ADTM :EVID
end;

5×13 DataFrame

Row	USUBJID	ADTM	EVID	DOSEA	ALLOQ	CMT	AMT	BLQFL	BLQFN	DV	DVL	MDV	AVALU
	String15	DateTime	Int64	Float64	Float64?	Int64	Float64?	String?	Int64?	Float64?	Float64?	Int64	InlineSt…
1	DAPA01-001	2022-06-10T09:30:00	0	5.0	0.1	2	missing	N	0	157.021	5.05638	0	ng/mL
2	DAPA01-001	2022-06-10T09:30:00	1	5.0	missing	1	5.0	missing	missing	missing	missing	1	mg
3	DAPA01-001	2022-06-10T09:33:00	0	5.0	0.1	2	missing	N	0	141.892	4.95507	0	ng/mL
4	DAPA01-001	2022-06-10T09:51:00	0	5.0	0.1	2	missing	N	0	116.228	4.75555	0	ng/mL
5	DAPA01-001	2022-06-10T10:00:00	0	5.0	0.1	2	missing	N	0	109.353	4.69458	0	ng/mL

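This step derives the analysis and flag variables. On dose records (EVID = 1) the concentration-based columns (ALLOQ, BLQFL, BLQFN, DV, DVL) are missing and MDV is 1, as row 2 of the output above shows.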

Warning

The compartment CMT depends on the concentration specimen and the route of administration, both of which can vary from study to study. Its derivation must be adapted accordingly.
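
The flag and dependent-variable rules from this section can be tried on single values. The sketch below restates them as small plain-Julia functions with explicit missing checks in place of @passmissing; the function names are hypothetical.

```julia
# Below-LLOQ flag: missing when either input is missing (as on dose records).
function blq_flag(conc, lloq)
    (ismissing(conc) || ismissing(lloq)) && return missing
    return conc <= lloq ? "Y" : "N"
end

# Log-transformed DV: defined only for positive, non-missing values.
log_dv(dv) = (ismissing(dv) || dv <= 0) ? missing : log(dv)

# MDV: 1 for dose records and for observations without a value.
missing_dv(evid, dv) = (evid == 1 || ismissing(dv)) ? 1 : 0

obs_flag = blq_flag(157.021, 0.1)    # first observation in the output above
dose_flag = blq_flag(missing, missing)
obs_dvl = log_dv(157.021)
```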

9 ADPPK Dataset

In the final assembly step of the tutorial we produce the analysis-ready ADPPK dataset. We add the record and analysis sequence columns (RECSEQ, ASEQ), fill the constant dose covariates within each subject-drug group, rename columns to match the ADPPK standard, select and order the columns in a regulatory-style layout, and round the Float64 columns to three decimal places using round_columns.

adppk = @chain adppk_aval begin

    @orderby :USUBJID :DRUG :ADTM :EVID

    transform(eachindex => :RECSEQ)                              # (1)
    transform(groupby(_, [:USUBJID, :DRUG]), eachindex => :ASEQ) # (1)

    transform( # (2)
        groupby(_, [:USUBJID, :DRUG]),
        :EXROUTE => (x -> first(skipmissing(x))) => :EXROUTE,
        :EXDOSFRM => (x -> first(skipmissing(x))) => :EXDOSFRM,
        :EXDOSFRQ => (x -> first(skipmissing(x))) => :EXDOSFRQ,
    )

    # Rename (3)
    rename(:DRUG => :PROJID, :EXDOSFRQ => :DOSEFRQ, :EXROUTE => :ROUTE, :EXDOSFRM => :FORM)

    # Select Columns (4)
    select(
        # exclusion flags
        :EXCLF,
        :EXCLFCOM,
        # subject level
        :STUDYID,
        :USUBJID,
        :ASEQ,
        :PROJID,
        # dose details
        :DOSEFRQ,
        :ROUTE,
        :FORM,
        :DOSEA,
        :AMT,
        :CMT,
        :EVID,
        # time details
        :AVISITN,
        :AFRLT,
        :APRLT,
        :NFRLT,
        :NPRLT,
        :ADTM,
        :ATM,
        :FANLDTM,
        # conc details
        :DV,
        :DVL,
        :MDV,
        :ALLOQ,
        :BLQFL,
        :BLQFN,
    )

    round_columns(3) # (5)
end

first(adppk, 5)

1: Sequence columns are derived: RECSEQ numbers the rows of the whole dataset and ASEQ numbers the rows within each USUBJID + DRUG group. Only ASEQ is kept by the select step.
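2: Dose covariates (EXROUTE, EXDOSFRM, EXDOSFRQ) are filled within each USUBJID + DRUG group using the group's first non-missing value. This treats them as constant within the group; note that the EX records shown earlier contain both Intravenous and Oral doses for the same subject, so a study with a changing route or form needs a per-dose fill instead.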
3: Several columns are renamed according to ADPPK standards.
4: The required columns are selected and positioned in a specific order.
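5: Float64 columns are rounded across the DataFrame using the round_columns function. The number of digits to retain after the decimal point is specified (3 in this case). In the output below, the integer columns (for example EVID, CMT, and MDV) are also displayed as Float64.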

5×27 DataFrame

Row	EXCLF	EXCLFCOM	STUDYID	USUBJID	ASEQ	PROJID	DOSEFRQ	ROUTE	FORM	DOSEA	AMT	CMT	EVID	AVISITN	AFRLT	APRLT	NFRLT	NPRLT	ADTM	ATM	FANLDTM	DV	DVL	MDV	ALLOQ	BLQFL	BLQFN
	Float64	Missing	String7	String15	Float64	String15	String7	String15	String15	Float64	Float64?	Float64	Float64	Float64	Float64	Float64?	Float64	Float64	DateTime	Time	DateTime	Float64?	Float64?	Float64	Float64?	String?	Float64?
1	0.0	missing	DAPA01	DAPA01-001	1.0	Dapagliflozin	ONCE	Intravenous	Injection	5.0	missing	2.0	0.0	2.0	0.0	missing	0.0	0.0	2022-06-10T09:30:00	09:30:00	2022-06-10T09:30:00	157.021	5.056	0.0	0.1	N	0.0
2	0.0	missing	DAPA01	DAPA01-001	2.0	Dapagliflozin	ONCE	Intravenous	Injection	5.0	5.0	1.0	1.0	2.0	0.0	missing	0.0	0.0	2022-06-10T09:30:00	09:30:00	2022-06-10T09:30:00	missing	missing	1.0	missing	missing	missing
3	0.0	missing	DAPA01	DAPA01-001	3.0	Dapagliflozin	ONCE	Intravenous	Injection	5.0	missing	2.0	0.0	2.0	0.05	0.05	0.05	0.05	2022-06-10T09:33:00	09:33:00	2022-06-10T09:30:00	141.892	4.955	0.0	0.1	N	0.0
4	0.0	missing	DAPA01	DAPA01-001	4.0	Dapagliflozin	ONCE	Intravenous	Injection	5.0	missing	2.0	0.0	2.0	0.35	0.35	0.35	0.35	2022-06-10T09:51:00	09:51:00	2022-06-10T09:30:00	116.228	4.756	0.0	0.1	N	0.0
5	0.0	missing	DAPA01	DAPA01-001	5.0	Dapagliflozin	ONCE	Intravenous	Injection	5.0	missing	2.0	0.0	2.0	0.5	0.5	0.5	0.5	2022-06-10T10:00:00	10:00:00	2022-06-10T09:30:00	109.353	4.695	0.0	0.1	N	0.0
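
The two sequence columns and the rounding step can be illustrated on plain vectors. This sketch assumes the rows are already sorted as desired; group_keys and seq_within_group are hypothetical names, and Base's round stands in for round_columns only to show the effect on one value.

```julia
# Hypothetical group key for each row of an already sorted dataset.
group_keys = ["S1", "S1", "S2", "S1", "S2"]

# Running row number within each group, like ASEQ.
function seq_within_group(groups)
    counts = Dict{eltype(groups),Int}()
    seq = Int[]
    for g in groups
        counts[g] = get(counts, g, 0) + 1
        push!(seq, counts[g])
    end
    return seq
end

recseq = collect(eachindex(group_keys))   # row number across the dataset
aseq = seq_within_group(group_keys)       # row number within each group

# Rounding one value to three decimal places.
dvl_rounded = round(5.05638; digits = 3)
```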

10 Conclusion

This tutorial walked through a minimal end-to-end ADPPK build with ADaM.jl:

Reading PC and EX from the DAPA01 study and pulling VISITDY from EX into PC via a lookup table.
Deriving analysis date/time variables (ADTM, ADT, ATM) and nominal time variables (NFRLT, AVISIT, AVISITN).
Combining PC and EX, then flagging records with set_exclusion for missing concentrations, missing dosing, and missing samples.
Looking up previous-dose references with join_columns to derive AFRLT, APRLT, and NPRLT.
Deriving analysis variables (DOSEA, CMT, AMT, DV, DVL, MDV, BLQFL, BLQFN) and assembling the final ADPPK dataset with sequence columns and a regulatory-style column order.

Because this study has no additional source datasets, no baseline covariates were added. In a typical analysis you would extend this template by merging DM, VS, LB, and other SDTM domains to derive demographic and time-varying covariates, then re-running the downstream exclusion and analysis-variable steps.

Reuse

CC BY-SA 4.0