DAPA01: Creating an ADPPK dataset with ADaM.jl (without covariates)

Author

Ragav Rajan

This tutorial presents a template for ADPPK dataset preparation using ADaM.jl. Since we do not have any other datasets for deriving covariates, the ADPPK dataset primarily contains information related to concentration, dose, and time. This template should be customized according to the specific study and data being used.

Some code snippets in this tutorial are followed by numbered annotations that give additional information about specific parts of the code.

1 Load Packages

In this section we load every Julia package used in the tutorial. DataFramesMeta drives the data wrangling, Dates handles the date/time arithmetic, ReadStatTables and PharmaDatasets pull in the source SDTM domains, and ADaM provides the helpers — make_dtm, set_exclusion, join_columns, round_columns — that power each transformation step below.

using CSV
using DataFramesMeta
using Dates
using ReadStatTables
using StatsBase
using PharmaDatasets
using ADaM

2 Read Data

The datasets are read from the SDTM/DAPA01 datasets folder of the PharmaDatasets.jl package.

pc = @chain dataset("SDTM/DAPA01/pc") convert_to_missing(["", nothing])
ex = @chain dataset("SDTM/DAPA01/ex") convert_to_missing(["", nothing])

first(pc, 5)
5×19 DataFrame
Row STUDYID DOMAIN USUBJID PCSEQ PCTESTCD PCTEST PCORRES PCORRESU PCSTRESC PCSTRESN PCSTRESU PCSPEC PCLLOQ VISIT VISITNUM PCDTC PCDY PCTPT PCTPTNUM
String7 String3 String15 Float64 String7 String15 String15 String7 String15 Float64 String7 String7 String15 String15 Float64 String31 Float64 String31 Float64
1 DAPA01 PC DAPA01-001 1.0 DAPA Dapagliflozin 157.021 ng/mL 157.021 157.021 ng/mL plasma 0.1 ng/mL Period 1 Day 1 2.0 2022-06-10 09:30:00 1.0 0-HR POSTDOSE 0.0
2 DAPA01 PC DAPA01-001 2.0 DAPA Dapagliflozin 141.892 ng/mL 141.892 141.892 ng/mL plasma 0.1 ng/mL Period 1 Day 1 2.0 2022-06-10 09:33:00 1.0 0.05-HR POSTDOSE 0.05
3 DAPA01 PC DAPA01-001 3.0 DAPA Dapagliflozin 116.228 ng/mL 116.228 116.228 ng/mL plasma 0.1 ng/mL Period 1 Day 1 2.0 2022-06-10 09:51:00 1.0 0.35-HR POSTDOSE 0.35
4 DAPA01 PC DAPA01-001 4.0 DAPA Dapagliflozin 109.353 ng/mL 109.353 109.353 ng/mL plasma 0.1 ng/mL Period 1 Day 1 2.0 2022-06-10 10:00:00 1.0 0.5-HR POSTDOSE 0.5
5 DAPA01 PC DAPA01-001 5.0 DAPA Dapagliflozin 66.4814 ng/mL 66.4814 66.4814 ng/mL plasma 0.1 ng/mL Period 1 Day 1 2.0 2022-06-10 10:15:00 1.0 0.75-HR POSTDOSE 0.75

2.1 VISITDY Lookup

The pc dataset does not contain the VISITDY column, which is needed to derive nominal times, but this column is available in the ex dataset. The ex dataset is therefore used to build a lookup table that maps VISITNUM to VISITDY, which is later joined to the pc dataset to derive the nominal time variables.

visitdy_lookup = @chain ex begin
    @select :VISITNUM :VISITDY
    unique
end

first(visitdy_lookup, 5)
4×2 DataFrame
Row VISITNUM VISITDY
Float64 Float64
1 2.0 1.0
2 3.0 8.0
3 4.0 15.0
4 5.0 22.0

3 PC Data Preparation

In this section of the tutorial we shape the PC (Pharmacokinetic Concentrations) domain into an analysis-ready form. We derive the analysis date/time variables (ADTM, ADT, ATM), join the VISITDY lookup to compute the nominal time variables (NFRLT, AVISIT, AVISITN), set the event identifier EVID = 0 for observations, and convert PCLLOQ from its unit-suffixed string form into a numeric lower limit of quantification.

pc_prep = @chain pc begin

    @rtransform :PCDTC = replace(:PCDTC, " " => "T")
    make_dtm(:PCDTC, prefix = "A")

    make_dtm_to_dt(:ADTM, prefix = "A")
    make_dtm_to_tm(:ADTM, prefix = "A")

    leftjoin(visitdy_lookup, on = :VISITNUM)

    @rtransform @astable begin
        :EVID = 0
        :DRUG = :PCTEST

        :NFRLT = 24 * (:VISITDY - 1) + :PCTPTNUM
        :AVISITN = :VISITNUM
        :AVISIT = "Visit " * string(:AVISITN)

        :PCLLOQ = parse(Float64, replace(:PCLLOQ, " ng/mL" => ""))
    end

    @orderby :USUBJID :DRUG :ADTM :EVID
end

first(pc_prep, 5)
1
@rtransform @astable begin lets you create several columns inside a single block. @astable makes a column derived earlier in the block (e.g. AVISITN) available when computing a later column (e.g. AVISIT).
2
PCLLOQ arrives as a String with the unit suffix (e.g. "0.1 ng/mL"). Pumas expects numeric values for the lower limit of quantification, so the unit is stripped with replace and the remaining text is parsed to Float64.
5×28 DataFrame
Row STUDYID DOMAIN USUBJID PCSEQ PCTESTCD PCTEST PCORRES PCORRESU PCSTRESC PCSTRESN PCSTRESU PCSPEC PCLLOQ VISIT VISITNUM PCDTC PCDY PCTPT PCTPTNUM ADTM ADT ATM VISITDY EVID DRUG NFRLT AVISITN AVISIT
String7 String3 String15 Float64 String7 String15 String15 String7 String15 Float64 String7 String7 Float64 String15 Float64 String Float64 String31 Float64 DateTime Date Time Float64? Int64 String15 Float64 Float64 String
1 DAPA01 PC DAPA01-001 1.0 DAPA Dapagliflozin 157.021 ng/mL 157.021 157.021 ng/mL plasma 0.1 Period 1 Day 1 2.0 2022-06-10T09:30:00 1.0 0-HR POSTDOSE 0.0 2022-06-10T09:30:00 2022-06-10 09:30:00 1.0 0 Dapagliflozin 0.0 2.0 Visit 2.0
2 DAPA01 PC DAPA01-001 2.0 DAPA Dapagliflozin 141.892 ng/mL 141.892 141.892 ng/mL plasma 0.1 Period 1 Day 1 2.0 2022-06-10T09:33:00 1.0 0.05-HR POSTDOSE 0.05 2022-06-10T09:33:00 2022-06-10 09:33:00 1.0 0 Dapagliflozin 0.05 2.0 Visit 2.0
3 DAPA01 PC DAPA01-001 3.0 DAPA Dapagliflozin 116.228 ng/mL 116.228 116.228 ng/mL plasma 0.1 Period 1 Day 1 2.0 2022-06-10T09:51:00 1.0 0.35-HR POSTDOSE 0.35 2022-06-10T09:51:00 2022-06-10 09:51:00 1.0 0 Dapagliflozin 0.35 2.0 Visit 2.0
4 DAPA01 PC DAPA01-001 4.0 DAPA Dapagliflozin 109.353 ng/mL 109.353 109.353 ng/mL plasma 0.1 Period 1 Day 1 2.0 2022-06-10T10:00:00 1.0 0.5-HR POSTDOSE 0.5 2022-06-10T10:00:00 2022-06-10 10:00:00 1.0 0 Dapagliflozin 0.5 2.0 Visit 2.0
5 DAPA01 PC DAPA01-001 5.0 DAPA Dapagliflozin 66.4814 ng/mL 66.4814 66.4814 ng/mL plasma 0.1 Period 1 Day 1 2.0 2022-06-10T10:15:00 1.0 0.75-HR POSTDOSE 0.75 2022-06-10T10:15:00 2022-06-10 10:15:00 1.0 0 Dapagliflozin 0.75 2.0 Visit 2.0
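The PCLLOQ conversion described in the second annotation can be tried on its own. The sketch below uses only base Julia; parse_lloq is a hypothetical helper written for illustration and is not part of ADaM.jl.

```julia
# Hypothetical helper (not an ADaM.jl function): strip a unit suffix and parse the number.
function parse_lloq(raw::AbstractString; unit::AbstractString = " ng/mL")
    return parse(Float64, replace(raw, unit => ""))
end

lloq = parse_lloq("0.1 ng/mL")
println(lloq)

# tryparse returns nothing instead of throwing when the text is not a plain number,
# for example when a record carries a unit other than the one being stripped.
unexpected = tryparse(Float64, replace("0.1 ug/mL", " ng/mL" => ""))
println(unexpected === nothing)
```

If a study reports the limit in more than one unit, a check like the tryparse call may be a safer starting point than parse, which throws on the first value it cannot read.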

make_dtm creates the DateTime column ADTM (Analysis DateTime) from a String column (here PCDTC). The prefix A must be specified so that the resulting column is named ADTM.

make_dtm_to_dt creates the Date column ADT (Analysis Date) from the DateTime column ADTM, with the output name built from the prefix A.

make_dtm_to_tm creates the Time column ATM (Analysis Time) from the DateTime column ADTM, with the output name built from the prefix A.
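For readers who want to see the underlying conversions, here is a minimal sketch using only the Dates standard library. It mirrors what the three helpers produce for a single value and is not the ADaM.jl implementation; the variable names are chosen for illustration.

```julia
using Dates

raw = "2022-06-10 09:33:00"        # PCDTC as it appears in the source data
iso = replace(raw, " " => "T")     # "2022-06-10T09:33:00", the ISO 8601 form

adtm = DateTime(iso)   # analysis DateTime (ADTM)
adt  = Date(adtm)      # analysis Date (ADT)
atm  = Time(adtm)      # analysis Time (ATM)

println((adtm, adt, atm))
```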

After deriving the date variables, the nominal time variables NFRLT (Nominal Relative Time from First Dose), AVISIT (Analysis Visit) and AVISITN are derived.
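As a worked example of the nominal time formula, the sketch below reuses the VISITNUM and VISITDY pairs from the lookup table in section 2.1. The Dict and the nfrlt helper are illustrative stand-ins for the leftjoin and the @rtransform expression, not part of ADaM.jl.

```julia
# VISITNUM => VISITDY pairs copied from the lookup table in section 2.1
visitdy = Dict(2.0 => 1.0, 3.0 => 8.0, 4.0 => 15.0, 5.0 => 22.0)

# Hypothetical helper mirroring the expression 24 * (VISITDY - 1) + PCTPTNUM
nfrlt(visitnum, pctptnum) = 24 * (visitdy[visitnum] - 1) + pctptnum

day1_sample = nfrlt(2.0, 0.5)   # 24 * (1 - 1) + 0.5 = 0.5 hours
day8_sample = nfrlt(3.0, 0.5)   # 24 * (8 - 1) + 0.5 = 168.5 hours

println((day1_sample, day8_sample))
```

A sample taken 0.5 hours after the Day 8 dose therefore has a nominal time of 168.5 hours relative to the first dose.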

4 EX Data Preparation

Here we walk through the parallel preparation of the EX (Exposure) domain. We derive ASTDTM, AENDTM, and ADTM from the dosing start time, generate the corresponding Date and Time columns, set EVID = 1 for dosing records, and use groupby + transform to carry the first dose DateTime (FANLDTM) and the first administered dose (EXDOSE_first) across each subject’s records.

ex_prep = @chain ex begin

    @rtransform :EXSTDTC = replace(:EXSTDTC, " " => "T")
    make_dtm(:EXSTDTC, prefix = "AST")
    @rtransform begin
        :AENDTM = :ASTDTM
        :ADTM = :ASTDTM
    end

    make_dtm_to_dt(:ASTDTM, prefix = "AST") # Output : ASTDT
    make_dtm_to_dt(:AENDTM, prefix = "AEN") # Output : AENDT
    make_dtm_to_dt(:ADTM, prefix = "A")     # Output : ADT
    make_dtm_to_tm(:ADTM, prefix = "A")     # Output : ATM

    @rtransform @astable begin
        :NFRLT = 24 * (:VISITDY - 1)
        :AVISITN = :VISITNUM
        :AVISIT = "Visit " * string(Int(:AVISITN))
        :EVID = 1
        :DRUG = :EXTRT
    end

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :EXDOSE => (x -> minimum(skipmissing(x))) => :EXDOSE_first,
        :ADTM => (x -> minimum(skipmissing(x))) => :FANLDTM,
    )

    @orderby :USUBJID :DRUG :ADTM :EVID

end

first(ex_prep, 5)
1
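minimum(skipmissing(x)) finds the minimum ADTM within each USUBJID, DRUG group while skipping missing values (like na.rm = TRUE in R). The resulting column FANLDTM holds the earliest dose DateTime, repeated across the group. EXDOSE_first is computed the same way, so it is the smallest dose in the group; in the rows shown this equals the first administered dose because the doses escalate over time.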
5×29 DataFrame
Row STUDYID DOMAIN USUBJID EXSEQ EXTRT EXDOSE EXDOSU EXDOSFRM EXDOSFRQ EXROUTE VISITNUM VISIT VISITDY EXSTDTC EXSTDY ASTDTM AENDTM ADTM ASTDT AENDT ADT ATM NFRLT AVISITN AVISIT EVID DRUG EXDOSE_first FANLDTM
String7 String3 String15 Float64 String15 Float64 String3 String15 String7 String15 Float64 String15 Float64 String Float64 DateTime DateTime DateTime Date Date Date Time Float64 Float64 String Int64 String15 Float64 DateTime
1 DAPA01 EX DAPA01-001 1.0 Dapagliflozin 5.0 mg Injection ONCE Intravenous 2.0 Period 1 Day 1 1.0 2022-06-10T09:30:00 1.0 2022-06-10T09:30:00 2022-06-10T09:30:00 2022-06-10T09:30:00 2022-06-10 2022-06-10 2022-06-10 09:30:00 0.0 2.0 Visit 2 1 Dapagliflozin 5.0 2022-06-10T09:30:00
2 DAPA01 EX DAPA01-001 2.0 Dapagliflozin 5.0 mg Capsule ONCE Oral 3.0 Period 2 Day 1 8.0 2022-06-17T09:30:00 8.0 2022-06-17T09:30:00 2022-06-17T09:30:00 2022-06-17T09:30:00 2022-06-17 2022-06-17 2022-06-17 09:30:00 168.0 3.0 Visit 3 1 Dapagliflozin 5.0 2022-06-10T09:30:00
3 DAPA01 EX DAPA01-001 3.0 Dapagliflozin 10.0 mg Capsule ONCE Oral 4.0 Period 3 Day 1 15.0 2022-06-25T09:30:00 15.0 2022-06-25T09:30:00 2022-06-25T09:30:00 2022-06-25T09:30:00 2022-06-25 2022-06-25 2022-06-25 09:30:00 336.0 4.0 Visit 4 1 Dapagliflozin 5.0 2022-06-10T09:30:00
4 DAPA01 EX DAPA01-001 4.0 Dapagliflozin 25.0 mg Capsule ONCE Oral 5.0 Period 4 Day 1 22.0 2022-07-02T11:30:00 22.0 2022-07-02T11:30:00 2022-07-02T11:30:00 2022-07-02T11:30:00 2022-07-02 2022-07-02 2022-07-02 11:30:00 504.0 5.0 Visit 5 1 Dapagliflozin 5.0 2022-06-10T09:30:00
5 DAPA01 EX DAPA01-002 1.0 Dapagliflozin 5.0 mg Injection ONCE Intravenous 2.0 Period 1 Day 1 1.0 2022-03-12T09:08:00 1.0 2022-03-12T09:08:00 2022-03-12T09:08:00 2022-03-12T09:08:00 2022-03-12 2022-03-12 2022-03-12 09:08:00 0.0 2.0 Visit 2 1 Dapagliflozin 5.0 2022-03-12T09:08:00

First, we derive the date/time variables: ASTDTM is created with make_dtm and copied to AENDTM and ADTM, and ASTDT, AENDT, and ADT are created with make_dtm_to_dt. Beforehand, the space (" ") in each DateTime string is replaced with T so that the string can be converted to a DateTime.

Second, we derive the nominal time variables such as NFRLT, AVISITN and AVISIT.

Lastly, we derive FANLDTM (First Analyte Dose DateTime) and EXDOSE_first (first dose administered).
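The group-wise minimum can be illustrated for a single subject with base Julia. The values in the first part are taken from the ex_prep output above; the de-escalating regimen in the second part is hypothetical and is included only to show where a minimum and a chronologically first value would differ.

```julia
using Dates

# Dose records for DAPA01-001, copied from the ex_prep output
dose_times = [DateTime("2022-06-10T09:30:00"), DateTime("2022-06-17T09:30:00"),
              DateTime("2022-06-25T09:30:00"), DateTime("2022-07-02T11:30:00")]
doses = Union{Missing,Float64}[5.0, 5.0, 10.0, 25.0]

fanldtm    = minimum(skipmissing(dose_times))   # earliest dose DateTime
min_dose   = minimum(skipmissing(doses))        # smallest dose
first_dose = doses[argmin(dose_times)]          # dose given at the earliest time

# Hypothetical de-escalating regimen: smallest and first dose no longer agree
deesc_times = [DateTime("2022-06-10T09:30:00"), DateTime("2022-06-17T09:30:00")]
deesc_doses = [25.0, 10.0]
deesc_min   = minimum(deesc_doses)
deesc_first = deesc_doses[argmin(deesc_times)]

println((fanldtm, min_dose, first_dose, deesc_min, deesc_first))
```

For studies where doses do not increase over time, selecting the dose at the earliest DateTime may be closer to what EXDOSE_first is meant to hold.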

5 EX Expansion

Dose expansion is the step where interval-style dosing records (e.g. “BID for 14 days”) are expanded into one row per administered dose. For DAPA01 no expansion is needed: the dosing frequency (EXDOSFRQ) is ONCE and each administration is already a separate EX record, so ex_prep is used directly downstream.

Note

If your study records dosing as intervals (start and end DateTime with a frequency), this is where you would expand the EX table into individual dose rows before combining it with PC.
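As a rough illustration of what such an expansion involves, the sketch below turns one hypothetical interval record (a dose every 12 hours over three days) into individual dose times using only the Dates standard library. The record values, the 5.0 amount, and the field names of the generated rows are assumptions for illustration; this is not an ADaM.jl function.

```julia
using Dates

# Hypothetical interval-style record: first dose, last dose, and dosing interval
start_dtm = DateTime("2022-06-10T08:00:00")
end_dtm   = DateTime("2022-06-12T20:00:00")
interval  = Hour(12)

# A DateTime range steps from the first to the last dose time
dose_times = collect(start_dtm:interval:end_dtm)

# One dosing row per administration (amount of 5.0 is assumed)
dose_rows = [(ADTM = t, EVID = 1, AMT = 5.0) for t in dose_times]

println(length(dose_rows))
```

A real expansion would also need to carry the subject identifier and handle interruptions or missed doses, which depend on how the study recorded them.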

6 Combine PC and EX

In this section of the tutorial we bring pc_prep and ex_prep together into a single combined dataset. We first stack the two domains with vcat, apply exclusion flags for problematic subjects, then perform a self-join against the dosing records to derive reference variables (ADTM_prev, EXDOSE_prev, NFRLT_prev) that downstream relative-time calculations rely on.

6.1 Flag the combined dataset

We begin by row-binding pc_prep and ex_prep, then use set_exclusion to mark subjects that should not enter the analysis — those with missing concentrations, no dosing records, or no concentration records. Each call adds (or extends) the EXCLF (Exclusion Flag) and EXCLFCOM (Exclusion Flag Comment) columns, so excluded subjects remain in the dataset but are traceable.

adppk_flag = @chain pc_prep begin
    vcat(ex_prep, cols = :union)

    # Exclusion 1: Subjects with missing conc. data
    set_exclusion(
        "Subjects with missing conc.",
        excl_func = group -> all(ismissing, group.PCSTRESN),
        group = [:USUBJID, :DRUG],
    )

    # Exclusion 2: Subjects with no dosing data
    set_exclusion(
        "Subjects with no dose records",
        excl_func = group -> all((==)(0), group.EVID),
        group = [:USUBJID, :DRUG],
    )

    # Exclusion 3: Subjects with no conc. data
    set_exclusion(
        "Subjects with no conc. records",
        excl_func = group -> all((==)(1), group.EVID),
        group = [:USUBJID, :DRUG],
    )

    @orderby :USUBJID :DRUG :ADTM :EVID
end;
5×6 DataFrame
Row USUBJID DRUG ADTM EVID EXCLF EXCLFCOM
String15 String15 DateTime Int64 Int64 Missing
1 DAPA01-001 Dapagliflozin 2022-06-10T09:30:00 0 0 missing
2 DAPA01-001 Dapagliflozin 2022-06-10T09:30:00 1 0 missing
3 DAPA01-001 Dapagliflozin 2022-06-10T09:33:00 0 0 missing
4 DAPA01-001 Dapagliflozin 2022-06-10T09:51:00 0 0 missing
5 DAPA01-001 Dapagliflozin 2022-06-10T10:00:00 0 0 missing

set_exclusion flags rows by group, adding EXCLF (Exclusion Flag) and EXCLFCOM (Exclusion Flag Comment). It takes a reason string, an excl_func = group -> condition predicate over a grouped sub-DataFrame, and a group key (e.g. [:USUBJID, :DRUG]).

The predicates above use point-free shorthand — all(ismissing, group.PCSTRESN) instead of all(x -> ismissing(x), group.PCSTRESN), and (==)(0) instead of x -> x == 0.
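The equivalence of the two spellings can be checked directly in base Julia. The vectors below are small made-up stand-ins for the PCSTRESN and EVID columns of one group.

```julia
# Made-up group columns for illustration
pcstresn_all_missing = [missing, missing, missing]
pcstresn_some_values = [157.021, missing, 116.228]
evid_obs_only        = [0, 0, 0]
evid_mixed           = [0, 1, 0]

# Point-free and explicit forms give the same answer
same_missing = all(ismissing, pcstresn_all_missing) == all(x -> ismissing(x), pcstresn_all_missing)

is_zero = (==)(0)   # a one-argument function, equivalent to x -> x == 0
no_dose_records  = all(is_zero, evid_obs_only)   # group would be flagged: no dosing rows
has_dose_records = !all(is_zero, evid_mixed)     # group would not be flagged

println((same_missing, no_dose_records, has_dose_records))
```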

6.2 Derive Reference data

Next, we look up reference values from the dosing records for every row of the combined dataset. Using join_columns against ex_prep, we pull in the most recent prior dose DateTime (ADTM_prev), the most recent prior dose amount (EXDOSE_prev), and the most recent prior nominal time (NFRLT_prev). These “previous-dose” columns become the references used in the next section to compute relative time variables.

adppk_nom_prev = @chain adppk_flag begin
    join_columns(
        ex_prep,
        on = [:USUBJID, :DRUG],
        order = [:ADTM],
        keep = [:ADTM => :ADTM_prev, :EXDOSE => :EXDOSE_prev],
        filter_join = (t, r) -> t.ADTM > r.ADTM,
        mode = "last",
    )

    join_columns(
        ex_prep,
        on = [:USUBJID, :DRUG],
        order = [:NFRLT],
        keep = [:NFRLT => :NFRLT_prev],
        filter_join = (t, r) -> t.NFRLT > r.NFRLT,
        mode = "last",
    )

    @orderby :USUBJID :DRUG :ADTM :EVID
end;
1
join_columns looks up the most recent prior dosing record (from ex_prep) for each row, using filter_join to keep only references strictly earlier than the current row, and mode = "last" to pick the latest qualifying match.
5×6 DataFrame
Row USUBJID EVID ADTM ADTM_prev NFRLT NFRLT_prev
String15 Int64 DateTime DateTime? Float64 Float64?
1 DAPA01-001 0 2022-06-10T09:30:00 missing 0.0 missing
2 DAPA01-001 1 2022-06-10T09:30:00 missing 0.0 missing
3 DAPA01-001 0 2022-06-10T09:33:00 2022-06-10T09:30:00 0.05 0.0
4 DAPA01-001 0 2022-06-10T09:51:00 2022-06-10T09:30:00 0.35 0.0
5 DAPA01-001 0 2022-06-10T10:00:00 2022-06-10T09:30:00 0.5 0.0

Deriving reference variables is an important step, because the relative time and analysis variables are computed from them.

  • join_columns performs a self-join against the dosing dataset (ex_prep) based on USUBJID and DRUG grouping to derive the prev variables.

  • For each row, the filter_join condition t.<col> > r.<col> restricts matches to dose records strictly earlier than the current row, and mode = "last" selects the most recent qualifying dose, giving the previous dose’s ADTM, EXDOSE, and NFRLT.
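The lookup logic described in these two points can be sketched for one subject in base Julia. previous_dose is a hypothetical helper that imitates the combination of a strict "earlier than" filter with picking the last match; it is not how join_columns is implemented, and it assumes the dose times are sorted in ascending order.

```julia
using Dates

# Hypothetical helper: latest dose time strictly before `t`, or missing if none exists.
function previous_dose(t::DateTime, dose_times::Vector{DateTime})
    idx = findlast(d -> d < t, dose_times)   # assumes dose_times is sorted ascending
    return idx === nothing ? missing : dose_times[idx]
end

doses = [DateTime("2022-06-10T09:30:00"), DateTime("2022-06-17T09:30:00")]

at_dose_time = previous_dose(DateTime("2022-06-10T09:30:00"), doses)   # nothing strictly earlier
after_first  = previous_dose(DateTime("2022-06-10T09:33:00"), doses)
after_second = previous_dose(DateTime("2022-06-17T10:00:00"), doses)

println((at_dose_time, after_first, after_second))
```

Because the comparison is strict, a record taken at exactly the dose time has no previous dose, which matches the missing ADTM_prev values in the first two rows of the output above.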

7 Derive Relative Time Variables

In this section of the tutorial we use the reference columns derived above to compute the relative time variables that any population PK analysis depends on. We fill FANLDTM and the per-subject minimum NFRLT across each USUBJID + DRUG group, then derive AFRLT (Actual Time Relative to First Dose), APRLT (Actual Time Relative to Previous Dose), and NPRLT (Nominal Time Relative to Previous Dose), handling missing values with @passmissing and clamping any negative pre-dose times to zero.

adppk_aprlt = @chain adppk_nom_prev begin
    # Fill the missing values with minimum value based on groupby
    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :FANLDTM => (x -> minimum(skipmissing(x))) => :FANLDTM,
        :NFRLT => (x -> minimum(skipmissing(x))) => :min_NFRLT,
        :EXDOSE_first => (x -> minimum(skipmissing(x))) => :EXDOSE_first,
    )

    @rtransform @passmissing begin
        :AFRLT = (:ADTM - DateTime(:FANLDTM)) / Hour(1)
        :APRLT = (:ADTM - DateTime(:ADTM_prev)) / Hour(1)
    end

    @rtransform :NPRLT =
        (:EVID == 1) ? 0 :
        ismissing(:NFRLT_prev) ? (:NFRLT - :min_NFRLT) : (:NFRLT - :NFRLT_prev)

    # Handling negative times
    @rtransform @passmissing :AFRLT = :AFRLT < 0 ? 0 : :AFRLT
    @rtransform @passmissing :APRLT = :APRLT < 0 ? 0 : :APRLT

    @orderby :USUBJID :DRUG :ADTM :EVID
end;
5×9 DataFrame
Row USUBJID ADTM EVID FANLDTM AFRLT ADTM_prev APRLT NFRLT_prev NPRLT
String15 DateTime Int64 DateTime Float64 DateTime? Float64? Float64? Real
1 DAPA01-001 2022-06-10T09:30:00 0 2022-06-10T09:30:00 0.0 missing missing missing 0.0
2 DAPA01-001 2022-06-10T09:30:00 1 2022-06-10T09:30:00 0.0 missing missing missing 0
3 DAPA01-001 2022-06-10T09:33:00 0 2022-06-10T09:30:00 0.05 2022-06-10T09:30:00 0.05 0.0 0.05
4 DAPA01-001 2022-06-10T09:51:00 0 2022-06-10T09:30:00 0.35 2022-06-10T09:30:00 0.35 0.0 0.35
5 DAPA01-001 2022-06-10T10:00:00 0 2022-06-10T09:30:00 0.5 2022-06-10T09:30:00 0.5 0.0 0.5

First, FANLDTM is filled across each USUBJID, DRUG group of the combined dataset using the group minimum, and the group minimum of NFRLT is stored in a new column, min_NFRLT. The relative time variables are then derived, with missing values propagated and negative values set to zero.
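The NPRLT column in the output above has element type Real rather than Float64. This is consistent with the expression returning the integer 0 for dosing rows and Float64 values elsewhere. The base Julia sketch below reproduces the effect on three made-up rows and shows that returning 0.0 instead keeps the element type concrete; it is an illustration, not a change to the tutorial's pipeline.

```julia
# Made-up rows: observation, dose, observation
evid       = [0, 1, 0]
nfrlt      = [0.0, 0.0, 0.05]
nfrlt_prev = [missing, missing, 0.0]
min_nfrlt  = 0.0

# Integer 0 on dosing rows mixes Int and Float64 values
nprlt_mixed = [e == 1 ? 0 : ismissing(p) ? n - min_nfrlt : n - p
               for (e, n, p) in zip(evid, nfrlt, nfrlt_prev)]

# Returning 0.0 keeps every value a Float64
nprlt_float = [e == 1 ? 0.0 : ismissing(p) ? n - min_nfrlt : n - p
               for (e, n, p) in zip(evid, nfrlt, nfrlt_prev)]

println((eltype(nprlt_mixed), eltype(nprlt_float)))
```

In the tutorial the difference disappears in the final dataset, where the displayed NPRLT column is Float64 after round_columns.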

Relative Time Variable | Column Label                           | Reference Data
AFRLT                  | Actual Time Relative to First Dose     | FANLDTM
APRLT                  | Actual Time Relative to Previous Dose  | ADTM_prev
NPRLT                  | Nominal Time Relative to Previous Dose | NFRLT_prev

Note

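The @passmissing macro propagates missing values: if any input to the expression is missing, the result for that row is missing instead of an error. It can be used both when creating a new column and when modifying an existing one.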
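To make the combined behaviour concrete (elapsed hours, missing propagation, and clamping of negative times), here is a minimal base Julia sketch. rel_hours is a hypothetical helper; where the tutorial divides the DateTime difference by Hour(1), the sketch converts the millisecond count explicitly.

```julia
using Dates

# Hypothetical helper: hours elapsed from `ref` to `adtm`.
# Returns missing if either input is missing and clamps negative values to zero.
function rel_hours(adtm, ref)
    (ismissing(adtm) || ismissing(ref)) && return missing
    hours = Dates.value(adtm - ref) / 3_600_000   # milliseconds converted to hours
    return max(hours, 0.0)
end

fanldtm = DateTime("2022-06-10T09:30:00")

post_dose = rel_hours(DateTime("2022-06-10T09:33:00"), fanldtm)   # 3 minutes after the dose
pre_dose  = rel_hours(DateTime("2022-06-10T09:00:00"), fanldtm)   # before the dose, clamped
no_ref    = rel_hours(DateTime("2022-06-10T09:33:00"), missing)   # no reference available

println((post_dose, pre_dose, no_ref))
```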

8 Derive Analysis Variables

Here we walk through the derivation of the core analysis and flag variables that the modelling step will consume. We compute the actual dose carried on each record (DOSEA), the analysis lower limit of quantification (ALLOQ), the compartment and amount columns (CMT, AMT), the below-quantification flags (BLQFL, BLQFN), the dependent variable and its log transform (DV, DVL), the missing-dependent-variable flag (MDV), and the analysis unit (AVALU).

adppk_aval = @chain adppk_aprlt begin

    @orderby :USUBJID :ADTM

    @rtransform @astable begin
        # Derive Actual Dose
        :DOSEA =
            (:EVID == 1) ? :EXDOSE : ismissing(:EXDOSE_prev) ? :EXDOSE_first : :EXDOSE_prev

        # Analysis Lower Limit of Quantification
        :ALLOQ = :PCLLOQ

        # Compartment and Amount
        :CMT = (:EVID == 1) ? 1 : 2
        :AMT = (:EVID == 1) ? :EXDOSE : missing
    end

    # Below Lower Limit of Quantification Flag
    @rtransform @passmissing :BLQFL = (:PCSTRESN <= :ALLOQ) ? "Y" : "N"
    @rtransform @passmissing :BLQFN = (:PCSTRESN <= :ALLOQ) ? 1 : 0

    # Dependent Variable
    @rtransform :DV = (:EVID == 1) ? missing : :PCSTRESN

    # Log Transformed Dependent Variable
    @rtransform @passmissing :DVL = (:DV > 0) ? log(:DV) : missing

    # Missing Dependent Variable Result
    @rtransform :MDV = (:EVID == 1) ? 1 : ismissing(:DV) ? 1 : 0

    # Analysis Variable Unit
    @rtransform :AVALU = (:EVID == 1) ? :EXDOSU : :PCSTRESU

    @orderby :USUBJID :DRUG :ADTM :EVID
end;
5×13 DataFrame
Row USUBJID ADTM EVID DOSEA ALLOQ CMT AMT BLQFL BLQFN DV DVL MDV AVALU
String15 DateTime Int64 Float64 Float64? Int64 Float64? String? Int64? Float64? Float64? Int64 InlineSt…
1 DAPA01-001 2022-06-10T09:30:00 0 5.0 0.1 2 missing N 0 157.021 5.05638 0 ng/mL
2 DAPA01-001 2022-06-10T09:30:00 1 5.0 missing 1 5.0 missing missing missing missing 1 mg
3 DAPA01-001 2022-06-10T09:33:00 0 5.0 0.1 2 missing N 0 141.892 4.95507 0 ng/mL
4 DAPA01-001 2022-06-10T09:51:00 0 5.0 0.1 2 missing N 0 116.228 4.75555 0 ng/mL
5 DAPA01-001 2022-06-10T10:00:00 0 5.0 0.1 2 missing N 0 109.353 4.69458 0 ng/mL

Important analysis and flag variables are derived under this step.
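The record-level logic for DV, DVL, MDV, and BLQFN can be traced with a small base Julia sketch. analysis_vars is a hypothetical helper that restates the conditions in the block above for one record, with explicit missing checks in place of @passmissing; the input values come from the first two rows of the output.

```julia
# Hypothetical helper restating the per-record conditions; not an ADaM.jl function.
function analysis_vars(evid, conc, lloq)
    dv    = evid == 1 ? missing : conc
    dvl   = (ismissing(dv) || dv <= 0) ? missing : log(dv)
    mdv   = (evid == 1 || ismissing(dv)) ? 1 : 0
    blqfn = (ismissing(conc) || ismissing(lloq)) ? missing : (conc <= lloq ? 1 : 0)
    return (DV = dv, DVL = dvl, MDV = mdv, BLQFN = blqfn)
end

obs  = analysis_vars(0, 157.021, 0.1)        # observation row
dose = analysis_vars(1, missing, missing)    # dosing row
blq  = analysis_vars(0, 0.05, 0.1)           # made-up observation below the limit

println(obs)
println(dose)
println(blq)
```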

Warning

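Compartment CMT depends on the concentration specimen and the route of administration, which can vary from study to study, so it must be derived accordingly. The code above assigns CMT = 1 to every dose and CMT = 2 to every observation regardless of route, even though the EX records shown earlier include both intravenous and oral administrations.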

9 ADPPK Dataset

In the final assembly step of the tutorial we produce the analysis-ready ADPPK dataset. We add the record and analysis sequence columns (RECSEQ, ASEQ), fill the constant dose covariates within each subject-drug group, rename columns to match the ADPPK standard, select and order the columns in a regulatory-style layout, and round the Float64 columns to three decimal places using round_columns.

adppk = @chain adppk_aval begin

    @orderby :USUBJID :DRUG :ADTM :EVID

    transform(eachindex => :RECSEQ)
    transform(groupby(_, [:USUBJID, :DRUG]), eachindex => :ASEQ)

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :EXROUTE => (x -> first(skipmissing(x))) => :EXROUTE,
        :EXDOSFRM => (x -> first(skipmissing(x))) => :EXDOSFRM,
        :EXDOSFRQ => (x -> first(skipmissing(x))) => :EXDOSFRQ,
    )

    # Rename
    rename(:DRUG => :PROJID, :EXDOSFRQ => :DOSEFRQ, :EXROUTE => :ROUTE, :EXDOSFRM => :FORM)

    # Select Columns
    select(
        # exclusion flags
        :EXCLF,
        :EXCLFCOM,
        # subject level
        :STUDYID,
        :USUBJID,
        :ASEQ,
        :PROJID,
        # dose details
        :DOSEFRQ,
        :ROUTE,
        :FORM,
        :DOSEA,
        :AMT,
        :CMT,
        :EVID,
        # time details
        :AVISITN,
        :AFRLT,
        :APRLT,
        :NFRLT,
        :NPRLT,
        :ADTM,
        :ATM,
        :FANLDTM,
        # conc details
        :DV,
        :DVL,
        :MDV,
        :ALLOQ,
        :BLQFL,
        :BLQFN,
    )

    round_columns(3)
end

first(adppk, 5)
1
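Sequence columns are derived with eachindex: RECSEQ numbers the rows of the whole dataset and ASEQ numbers the rows within each USUBJID, DRUG group. As written, every row is numbered, and only ASEQ is kept by the later select.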
2
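The dose covariates EXROUTE, EXDOSFRM, and EXDOSFRQ are filled within each USUBJID, DRUG group using the group's first non-missing value. For the subject shown in the ex_prep output the route and form change between periods, so all of that subject's rows carry the values of the first dosing record.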
3
Several columns are renamed according to ADPPK standards.
4
The required columns are selected and positioned in a specific order.
5
Float64 columns are rounded across the DataFrame using the round_columns function. The number of digits to retain after the decimal point is specified (3 in this case).
5×27 DataFrame
Row EXCLF EXCLFCOM STUDYID USUBJID ASEQ PROJID DOSEFRQ ROUTE FORM DOSEA AMT CMT EVID AVISITN AFRLT APRLT NFRLT NPRLT ADTM ATM FANLDTM DV DVL MDV ALLOQ BLQFL BLQFN
Float64 Missing String7 String15 Float64 String15 String7 String15 String15 Float64 Float64? Float64 Float64 Float64 Float64 Float64? Float64 Float64 DateTime Time DateTime Float64? Float64? Float64 Float64? String? Float64?
1 0.0 missing DAPA01 DAPA01-001 1.0 Dapagliflozin ONCE Intravenous Injection 5.0 missing 2.0 0.0 2.0 0.0 missing 0.0 0.0 2022-06-10T09:30:00 09:30:00 2022-06-10T09:30:00 157.021 5.056 0.0 0.1 N 0.0
2 0.0 missing DAPA01 DAPA01-001 2.0 Dapagliflozin ONCE Intravenous Injection 5.0 5.0 1.0 1.0 2.0 0.0 missing 0.0 0.0 2022-06-10T09:30:00 09:30:00 2022-06-10T09:30:00 missing missing 1.0 missing missing missing
3 0.0 missing DAPA01 DAPA01-001 3.0 Dapagliflozin ONCE Intravenous Injection 5.0 missing 2.0 0.0 2.0 0.05 0.05 0.05 0.05 2022-06-10T09:33:00 09:33:00 2022-06-10T09:30:00 141.892 4.955 0.0 0.1 N 0.0
4 0.0 missing DAPA01 DAPA01-001 4.0 Dapagliflozin ONCE Intravenous Injection 5.0 missing 2.0 0.0 2.0 0.35 0.35 0.35 0.35 2022-06-10T09:51:00 09:51:00 2022-06-10T09:30:00 116.228 4.756 0.0 0.1 N 0.0
5 0.0 missing DAPA01 DAPA01-001 5.0 Dapagliflozin ONCE Intravenous Injection 5.0 missing 2.0 0.0 2.0 0.5 0.5 0.5 0.5 2022-06-10T10:00:00 10:00:00 2022-06-10T09:30:00 109.353 4.695 0.0 0.1 N 0.0
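The two sequence columns and the rounding step can be mimicked in base Julia. The subject identifiers below are a made-up, already sorted stand-in for the dataset rows, and the counter loop is an illustrative alternative to the grouped eachindex transform rather than what ADaM.jl or DataFrames.jl do internally.

```julia
# Made-up, already sorted rows: one subject identifier per row
usubjid = ["DAPA01-001", "DAPA01-001", "DAPA01-001", "DAPA01-002", "DAPA01-002"]

# RECSEQ: running index over all rows
recseq = collect(eachindex(usubjid))

# ASEQ: running index that restarts for each subject
counts = Dict{String,Int}()
aseq = Int[]
for id in usubjid
    counts[id] = get(counts, id, 0) + 1
    push!(aseq, counts[id])
end

# Rounding to three decimal places, as round_columns(3) does for Float64 columns
dvl_rounded = round(5.05638; digits = 3)

println((recseq, aseq, dvl_rounded))
```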

10 Conclusion

This tutorial walked through a minimal end-to-end ADPPK build with ADaM.jl:

  • Reading PC and EX from the DAPA01 study and pulling VISITDY from EX into PC via a lookup table.
  • Deriving analysis date/time variables (ADTM, ADT, ATM) and nominal time variables (NFRLT, AVISIT, AVISITN).
  • Combining PC and EX, then flagging records with set_exclusion for missing concentrations, missing dosing, and missing samples.
  • Looking up previous-dose references with join_columns to derive AFRLT, APRLT, and NPRLT.
  • Deriving analysis variables (DOSEA, CMT, AMT, DV, DVL, MDV, BLQFL, BLQFN) and assembling the final ADPPK dataset with sequence columns and a regulatory-style column order.

Because this study has no additional source datasets, no baseline covariates were added. In a typical analysis you would extend this template by merging DM, VS, LB, and other SDTM domains to derive demographic and time-varying covariates, then re-running the downstream exclusion and analysis-variable steps.