`using Pumas`

# Poisson Regression

This lesson is really comprehensive and has a wide audience in mind. We recommend the reader to briefly navigate the Table of Contents and choose which topics to dive in. If you are in doubt feel free to read it from top to bottom.

In this lesson we’ll dive into **Poisson regression**. We’ll start by defining *what it is* and, more importantly, *when to use it*.

## 1 What is Poisson Regression?

Poisson regression is a **generalized linear model** (GLM). GLMs uses a linear regression under the hood for inference but combines that with a **non-linear transformation of the dependent variable** (also called *response*).

This allows us to extend the interface of linear regression to more use cases that it would not be possible to apply it directly. This happens because of the nature of some dependent variables, being *not* linear and, subsequently, not *continuous*. Such instances occurs in binary data, discrete data, count data, or ordinal data.

**Poisson Regression** is a GLM that can handle **count data**. That is, when our response variable takes positive discrete values only, without an upper limit and with a natural lower limit of 0. It handles that by using the **Poisson distrisbution** to condition the values of our **dependent variable**.

### 1.1 Poisson Distribution

The Poisson distribution is named after mathematician Siméon Denis Poisson, whom had as advisor no less than Laplace. Poisson made important contributions not only to probability and statistics but also partial differential equations, fluid dynamics and more.

The Poisson distribution is used to **model a random variable that takes positive discrete values**. It accepts only a single parameter named \(\lambda\) which represents the constant mean rate of events in a single unit of time. In other words, \(\lambda\) represents the expected value of the distribution.

The parameter \(\lambda\) from Poisson distribution controls *both* the **mean** (sometimes referred as location parameter) and **variance** (sometimes referred as scale parameter). This makes the Poisson distribution not as flexible to model **overdispersion** in count data, since it has only one parameter to model both mean and variance. We can deal with overdispersion by introducing an extra parameter and, thus, using a different distribution. This distribution is named **negative binomial** and we have a separated tutorial for that. Don’t forget to also check it.

Take a look at some Poisson distribution plots for different values of \(\lambda\):

```
using Distributions
using CairoMakie
using AlgebraOfGraphics
```

```
= 0:0.1:10.0
x = [1, 3, 5]
λs = [Poisson(λ) for λ in λs]
dists = mapreduce(d -> pdf.(d, x), vcat, dists)
pdfs =
plt data((; λs = repeat(λs; inner = length(x)), x = repeat(x; outer = length(λs)), pdfs)) *
mapping(
:x => "outcome",
:pdfs => "PMF";
= :λs => nonnumeric => L"\mathrm{parameter}~\lambda",
color = :λs => nonnumeric,
col *
) visual(BarPlot)
draw(plt; axis = (; xticks = 0:10), legend = (; position = :bottom, titleposition = :left))
```

As you can see, the higher the value of \(\lambda\), higher the mean but also higher the variance of the outcome value.

### 1.2 Link Function \(e^x\)

The Poisson distribution has one caveat: **\(\lambda\) has to be positive real number higher than 0; i.e. \(\lambda > 0\)**. This means that any parameter (or combinations of parameters) that we condition our Poisson distribution have to be higher than 0. We can accomplish this restriction with a non-linear transformation.

The exponential function is an ideal candidate to accomplish that. This is the **exponential function**:

\[\mathrm{exp}(x) = e^x\]

Let’s take a look at the graph of our exponential function:

```
data((; x = 0:0.1:10, y = exp.(0:0.1:10))) *
mapping(:x, :y => L"e^x") *
visual(Lines; color = :steelblue, linewidth = 3) |> draw
```

Let’s define our dependent variable as \(y\) which follows a Poisson Distribution. We can use the **exponential function as a link function** for the parameters of the model, such that \(y\) is parameterized by the exponential transformation of the parameter values:

\[y \sim \mathrm{Poisson}(e^\alpha)\]

That allows us to **parametrize \(y\) with a real-valued parameters**, while **restraining \(y\) to being positive**.

In this case, \(\alpha\) takes the role of an **intercept**, i.e. it is the basal rate of the values of \(y\).

### 1.3 Covariate \(x\)

Now with the dependent variable and intercept \(\alpha\) explained, let’s focus on **covariate(s) \(x\)**.

Generally in regression problems, we attribute coefficients to act as *weights* to our covariate \(x\). These coefficients measure the strength of association between \(x\) and \(y\), where high *positive* coefficients means high *positive* association while high *negative* coefficients indicates high *negative* association.

Coefficients are usually denominated as the greek letter \(\beta\). Thus, we can parametrize \(y\) as:

\[y \sim \mathrm{Poisson}\left( e^\left(\alpha + \beta \cdot x \right) \right)\]

Almost everything that we want to infer or guess in statistics has greek letter such as \(\beta\) and \(\alpha\), while what is observed has roman letters such as \(x\) and \(y\).

We can express it in matrix form:

\[\mathbf{y} = \mathrm{Poisson}\left( e^\left(\boldsymbol{\alpha} + \mathbf{X} \cdot \boldsymbol{\beta} \right) \right)\]

where \(\mathbf{y}\), \(\boldsymbol{\alpha}\) and \(\boldsymbol{\beta}\) are vectors and \(\mathbf{X}\) is the data matrix where each row is an observation and each column a covariate.

Since \(\mathbf{y}\) and \(\mathbf{X}\) are known, all we have to do is to **find the best value of our coefficient \(\beta\)**. That process is called **model fit** and we will deal with it in details further in this lesson.

#### 1.3.1 How to Interpret Coefficient \(\beta\)?

Now, suppose we have found our ideal value for \(\beta\). **How we would interpret our \(\beta\) estimated value?**

First, to recap, \(\beta\) measures the strength of association between the covariate \(x\) and dependent variable \(y\). But, \(\beta\) is nested inside a non-linear transformation called exponential function:

\[\mathrm{linear~predictor} = \mathrm{exponential}(\alpha + \beta \cdot x) = e^{\left( \alpha + \beta \cdot x \right)}\]

where the linear predictor is the rate parameter (\(\lambda\)) of the Poisson distribution that we condition our dependent variable \(y\).

You can see that if \(\beta\) takes *negative* values, the association between covariate \(x\) with the linear predictor is negative, i.e. higher values of \(x\) means lower values of \(y\). Also, the converse is true. If \(\beta\) takes *positive* values, the association between covariate \(x\) with the linear predictor is positive, i.e. higher values of \(x\) means higher values of \(y\).

Thus, if you want just to analyze the **raw directional effect** of covariate \(x\) you can leave \(\beta\) in its original scale. However, if you want to have a deeper interpretation of the effect of covariate \(x\) you would need to transform \(\beta\).

First, we need to make \(\beta\) comparable to our dependent variable \(y\). Since \(y\) is modeled as the exponential of a linear combination of parameters and covariates, we need to also apply the exponential function to \(\beta\).

Second, notice that the exponential function has the following property:

\[e^{(a + b)} = e^a \cdot e^b\]

Hence, we can substitute \(a\) for our intercept \(\alpha\) and \(b\) for our covariate \(x\) multiplied by our coefficient \(\beta\):

\[e^{(\alpha + \beta \cdot x)} = e^\alpha \cdot e^{\beta \cdot x}\]

which means that the \(\beta\) has a ** compounding effect** into the basal value (intercept \(\alpha\)), instead of a

*linear*effect as we saw in other discrete models.

Now, in order to interpret \(\beta\) we need to first exponentiate all other parameters in the linear predictor term, such as the intercept \(\alpha\). \(\alpha\) is the basal value take \(y\) takes if all of the covariates \(x\) are set to zero. Then, for each increase of 1 unit of covariate \(x\) we expect to see a compounding effect of \(\beta\) into the basal value of \(y\), i.e. the intercept \(\alpha\).

Here’s an example. Suppose that \(\alpha = 1\) and \(\beta = 0.05\). This makes the basal value of \(y\):

\[e^\alpha = e^1 = e \approx 2.7182\]

The effect of the covariate \(x\) on \(y\) is:

\[e^{\beta \cdot x} = e^{0.05x} \approx 1.05x\]

Thus every increase of 1 unit in \(x\) increases \(y\) by 1.05, i.e. 5% increase.

## 2 When to use Poisson Regression?

Now that we know what Poisson regression is, let’s cover *when* to use it. The simple answer is **whenever your dependent variable is discrete positive, i.e. have discrete positive outcome as values**. Then you can use a **Poisson likelihood** inside your regression model.

## 3 `seizurecount`

Dataset

Let’s use the `seizurecount`

dataset from `PharmaDatasets.jl`

to run our Poisson regression example in this lesson:

```
using DataFrames
using DataFramesMeta
using PharmaDatasets
```

```
= dataset("seizurecount")
seizurecount first(seizurecount, 5)
```

Row | ID | Age | Gender | isF | TRT | NSeizure |
---|---|---|---|---|---|---|

Int64 | Int64 | String7 | Int64 | String7 | Int64 | |

1 | 1 | 43 | Female | 1 | Placebo | 6 |

2 | 2 | 55 | Male | 0 | Placebo | 2 |

3 | 3 | 76 | Female | 1 | Placebo | 5 |

4 | 4 | 93 | Male | 0 | Placebo | 0 |

5 | 5 | 95 | Female | 1 | Placebo | 2 |

We can see that we have 200 observations and also some key variables:

(`:NSeizure`

**dependent variable**): discrete positive outcome values are in the range (0, 16).: Age in years.`:Age`

: patient gender with 2 levels – “Female, Male”.`:Gender`

: discrete binary if patient`:isF`

**is F**emale – “1, 0”.: treatment variable with 2 levels – “Placebo, DrugA”.`:TRT`

### 3.1 Exploratory Data Analysis

Let’s see some summary statistics with `describe()`

:

`describe(seizurecount)`

Row | variable | mean | min | median | max | nmissing | eltype |
---|---|---|---|---|---|---|---|

Symbol | Union… | Any | Union… | Any | Int64 | DataType | |

1 | ID | 75.5 | 1 | 75.5 | 150 | 0 | Int64 |

2 | Age | 48.4733 | 18 | 45.0 | 112 | 0 | Int64 |

3 | Gender | Female | Male | 0 | String7 | ||

4 | isF | 0.54 | 0 | 1.0 | 1 | 0 | Int64 |

5 | TRT | DrugA | Placebo | 0 | String7 | ||

6 | NSeizure | 3.39333 | 0 | 2.0 | 16 | 0 | Int64 |

As we can see `:Age`

has mean 48.47 and median 45.0 which signals for a bell-shaped (normally-distributed) distribution. We have weight values ranging from 18 and 112.

To conclude our summary statistics, `:NSeizure`

, our dependent variable, has mean 3.39 and median 2.0 which implies a long-tailed distribution (or assimetric shifted to the right).

It is also important to check the categorical columns to see what are the number of unique values (`:nunique`

) along with first and last values:

`describe(seizurecount, :first, :last, :nunique; cols = [:Gender, :TRT])`

Row | variable | first | last | nunique |
---|---|---|---|---|

Symbol | String7 | String7 | Int64 | |

1 | Gender | Female | Female | 2 |

2 | TRT | Placebo | DrugA | 2 |

Ok, no surprises in our categorical covariates.

`@by seizurecount [:TRT, :isF] :count = length(:isF)`

Row | TRT | isF | count |
---|---|---|---|

String7 | Int64 | Int64 | |

1 | Placebo | 0 | 24 |

2 | Placebo | 1 | 26 |

3 | DrugA | 0 | 45 |

4 | DrugA | 1 | 55 |

### 3.2 Visualizations

Now we turn to some visualizations to see what is going on beyond the summary statistics in our `seizurecount`

dataset.

First, we’ll explore our dependent variable: `:NSeizure`

.

```
data(seizurecount) *
mapping(:Gender, :NSeizure; color = :TRT, dodge = :TRT) *
expectation() |> draw AlgebraOfGraphics.
```

Female patients have more seizures than male patients. And male patients seems to not benefit from out treament `"DrugA"`

.

```
data(seizurecount) *
mapping(:NSeizure; color = :TRT, layout = :Gender) *
density() |> draw AlgebraOfGraphics.
```

We can definitely see a difference between the densities of our dependent variable `:NSeizures`

if we stratify by `:TRT`

and `:Gender`

.

Now let’s turn our attention to our covariates `:Age`

and `:TRT`

.

We can see clearily a negative relationship between `:Age`

and our dependent variable `:NSeizure`

:

```
data(seizurecount) *
mapping(:Age, :NSeizure; color = :TRT) *
visual(Scatter; alpha = 0.7) + AlgebraOfGraphics.linear() * visual(; linewidth = 4)) |>
( draw
```

Also, this negative relationship does not appear to have any interaction effect with `:TRT`

since the slopes of the different `:TRT`

arms in the scatter plot above do not differ.

However, if we stratify by `:Gender`

, we can see that the interaction exists in males. The slopes have different values for different `:TRT`

arms in this specific stratum:

```
data(seizurecount) *
mapping(:Age, :NSeizure; color = :TRT, layout = :Gender) *
visual(Scatter; alpha = 0.7) + AlgebraOfGraphics.linear() * visual(; linewidth = 4)) |>
( draw
```

## 4 Pumas modeling

Finally, we’ll **create a Pumas model for Poisson regression**.

### 4.1 Pumas’ Workflow

Pumas’ workflow

**Define a model**. It can be a`PumasModel`

or a`PumasEMModel`

.**Define a**.`Subject`

and`Population`

**Fit your model**.**Perform inference on the model’s population-level parameters**(also known as*fixed*effects).**Predict from a fitted model**using the empirical Bayes estimate or**Simulate random observations from a fitted model**using the sampling distribution (also known as*likelihood*).**Model diagnostics**and**visual predictive checks**.

**Pumas modeling is a highly iterative process**. Eventually you’ll probably go back and forth in the stages of workflow. This is natural and expected in Pumas’ modeling. Just remember to have a good code organization and version control for all of the workflow iterations.

Before we define the model, we need to convert the `:TRT`

column into a dummy variable. Currently `:TRT`

is a string:

`unique(seizurecount.TRT)`

```
2-element Vector{InlineStrings.String7}:
"Placebo"
"DrugA"
```

`@rtransform! seizurecount :DrugA = ifelse(:TRT == "DrugA", 1, 0)`

#### 4.1.1 Defining a model in Pumas

To define a model in Pumas, you can use either the **macros** or the **function** interfaces. We will cover the macros interface in this lesson.

To define a model using the macro interface, you start with a `begin .. end`

block of code with `@model`

:

`= @model begin end my_model `

```
PumasModel
Parameters:
Random effects:
Covariates:
Dynamical variables:
Derived:
Observed:
```

This creates an **empty model**. Now we need to populate it with model components. These are additional macros that we can include inside the `@model`

definition. We won’t be covering all the possible macros that we can use in this lesson, but here is the full list:

, fixed effects specifications.`@param`

, random effects specifications.`@random`

, covariate names.`@covariates`

, pre-processing variables for the dynamic system and statistical specification.`@pre`

, specification of any dose control parameters present in the model.`@dosecontrol`

, shorthand notation.`@vars`

, initial conditions for the dynamic system.`@init`

, dynamics of the model.`@dynamics`

, statistical modeling of dependent variables.`@derived`

, model information to be stored in the model solution.`@observed`

##### 4.1.1.1 Model parameter with `@param`

Ok, let’s start with our `poisson_model`

using the macro `@model`

interface.

First, we’ll define our **parameters with the @parameters macro**:

```
@param begin
∈ Domain(...)
ParameterName ...
end
```

To use the “in” (`∈`

) operator in Julia, you can either replace `∈`

for `in`

or type `\in <TAB>`

for the Unicode character.

By using the `begin ... end`

block we can specify one parameter per line. We will just name the parameters the same name as it’s respective covariate, while also adding the `β`

prefix to make clear that this is a coefficient.

Regarding the `Domain(...)`

, Pumas has several types of domains for you to specify in your `@param`

block. Here is a list:

for scalar parameters`RealDomain`

for vectors`VectorDomain`

for positive definite matrices with diagonal structure`PDiagDomain`

for general positive semi-definite matrices`PSDDomain`

We will only use scalar parameters in our Poisson regression model, so we just need the `RealDomain`

.

If you don’t specify any arguments inside the domain constructor it will either error (for some domains that have *required* arguments) or will use the defaults. In the case of the `RealDomain()`

without arguments it just uses the following arguments:

`RealDomain(; lower = -∞, upper = ∞, init = 0)`

```
@param begin
∈ RealDomain()
α ∈ RealDomain()
βAge ∈ RealDomain()
βisF ∈ RealDomain()
βDrugA end
```

##### 4.1.1.2 Model covariates with @covariates

We also have to specify our **model covariates using the macro @covariates**. This can be done using a one-liner syntax, without

`begin ... end`

block:`@covariates Age isF DrugA... # and so on`

Or using the `begin ... end`

block and specifying one covariate per line. This is what we will do, since our model has several covariates, which makes easy to visualize them and also to comment out a covariate to perform iterative model exploration:

```
@covariates begin
Age
isF
DrugAend
```

##### 4.1.1.3 Pre-processing variables with `@pre`

We can specify all the **necessary variable and statistical pre-processing with the @pre macro**.

The `@pre`

block is traditionally used to specify the inputs of the **O**rdinary **D** ifferential **E**quations (ODE) system used for **n**on-**l**inear **m**ixed-**e**ffects PK/PD models (NLME).

But we will use `@pre`

to specify our variable transformations instead.

Here, we are defining all the coefficients-covariates multiplication operation here. Note that we are adding the prefix `_`

to the resulting operation. Finally, we are also defining the `linear_pred`

as the linear predictors that should be added to the intercept and passed to the non-linear exponential function transformation:

```
@pre begin
= Age * βAge
_Age = isF * βisF
_isF = DrugA * βDrugA
_DrugA = _Age + _isF + _DrugA
linear_pred end
```

##### 4.1.1.4 Statistical modeling of dependent variables with `@derived`

Our final block, ** @derived**, is used to

**specify all the assumed distributions of observed variables that are derived from the blocks above**. This is where we include our

**dependent variable:**.

`NSeizures`

Note that `NSeizures`

is being declared as following a `Poisson`

distribution. This is why we use the tilde notation `~`

. It means (much like the mathematical model notation) that `NSeizures`

follows a `Poisson`

distribution. Since `NSeizures`

is a vector of values, we need to broadcast, i.e. vectorize, the operation with the dot `.`

operator:

`~ Poisson.(exp.(λ)) NSeizures `

where `λ`

is the single rate parameter that parametrizes the Poisson distribution. In some cases we would need to do a lot of broadcasting in the righthand side of the `~`

. For example, since `NSeizures`

depends on the exponential function of all the linear predictors, i.e. the exponential function of `α + linear_pred`

, we would need to vectorize everything:

`~ Poisson.(exp.(α .+ linear_pred)) NSeizures `

This is cumbersome, so we can use the `@.`

macro which tells Julia to apply the `.`

in every operator and function call after it:

`~ @. Poisson(exp(α + linear_pred)) NSeizures `

We are using the `@.`

macro which tells Julia to vectorize (add the “dot syntax”) to all operations and function calls to the right of it.

Much better!

In this block we can use all variables defined in the previous blocks, in our example the `@param`

, `@covariates`

and `@pre`

blocks.

This is our `@derived`

block:

```
@derived begin
~ @. Poisson(exp(α + linear_pred))
NSeizure end
```

Here is our **full model**, once we combined all the macro blocks we just covered:

```
= @model begin
poisson_model @param begin
∈ RealDomain()
α ∈ RealDomain()
βAge ∈ RealDomain()
βisF ∈ RealDomain()
βDrugA end
@covariates begin
Age
isF
DrugAend
@pre begin
= Age * βAge
_Age = isF * βisF
_isF = DrugA * βDrugA
_DrugA = _Age + _isF + _DrugA
linear_pred end
@derived begin
~ @. Poisson(exp(α + linear_pred))
NSeizure end
end
```

```
PumasModel
Parameters: α, βAge, βisF, βDrugA
Random effects:
Covariates: Age, isF, DrugA
Dynamical variables:
Derived: NSeizure
Observed: NSeizure
```

#### 4.1.2 Define a `Subject`

and `Population`

Once we have our model defined we have to specify a `Subject`

and a `Population`

.

In Pumas, subjects (i.e. observations) are represented by the `Subject`

type and collections of subjects are represented as vectors of `Subject`

s are defined as `Population`

.

** Subjects** can be constructed with the

`Subject()`

constructor, for example:`Subject()`

```
Subject
ID: 1
```

We just constructed a `Subject`

that has `ID`

equal to `1`

and no extra information. In order to provide extra information, you would need to pass **keyword arguments** to the `Subject()`

constructor.

You can use the following keyword arguments inside `Subject()`

:

: identifier.`id`

: a named tuple of the dependent variables.`observations`

: a named tuple containing the covariates, or`covariates`

`nothing`

.: a vector of`events`

`Event`

s.: a vector of time stamps for the observations.`time`

For this lesson, we will just use the `covariates`

keyword argument.

This is an example of a `Subject`

with `ID`

equal to `42`

and with some `covariates`

specified as a named tuple:

`Subject(id = 42, covariates = (; Age = 35, isF = 1, DrugA = 1))`

```
Subject
ID: 42
Covariates: Age, isF, DrugA
```

**Since a Population is just a vector of Subjects**, we can use a simple

`map()`

function over the `ID`

s present at our `seizurecount`

dataset:```
= map(
pop -> Subject(
i = i,
id = (;
covariates = seizurecount.Age,
Age = seizurecount.isF,
isF = seizurecount.DrugA,
DrugA
),
),1:maximum(seizurecount.ID),
)
```

```
Population
Subjects: 150
Covariates: Age, isF, DrugA
Observations:
```

##### 4.1.2.1 Reading `Subject`

s directly from a `DataFrame`

Another option is to use the ** read_pumas()** function to read

`Subject`

s (and `Population`

) directly from a `DataFrame`

. This is more convenient than the `map()`

with the `Subject()`

constructor inside.The `read_pumas()`

function accepts as first argument a `DataFrame`

followed by the following keyword options (which are very similar to the `Subject()`

constructor):

: dependent variables specified by a vector of column names, i.e.`observations`

`[:NSeizure]`

.: covariates specified by a vector of column names, i.e.`covariates`

`[:Age, :isF, :DrugA]`

.: specifies the`id`

`ID`

column of the`DataFrame`

.: specifies the`time`

`time`

column of the`DataFrame`

.: toggles assertions applicable to event data, either`event_data`

`true`

or `false.

Our `seizurecount`

dataset do not have any event data, so we should set `event_data = false`

. Also, we don’t have a `:time`

column, but Pumas need us to specify a column for this keyword argument. So let’s create a dummy column `:time`

where all values are `0.0`

:

`@rtransform! seizurecount :time = 0.0`

Now, we can create our population with `read_pumas()`

:

```
= read_pumas(
seizurecount_pop
seizurecount;= [:NSeizure],
observations = [:Age, :isF, :DrugA],
covariates = :ID,
id = :time,
time = false,
event_data )
```

```
Population
Subjects: 150
Covariates: Age, isF, DrugA
Observations: NSeizure
```

#### 4.1.3 Fit your model

Now we are ready to fit our model! We already have a model specified, `poisson_model`

, along with a `Population`

: `seizurecount_pop`

. We can proceed with **model fitting**.

Model fiting in Pumas has the purpose of **estimate parameters** and is done by calling the ** fit() function** with the following positional arguments:

- Pumas
**model**. - a
.`Population`

- a named tuple of the
**initial parameter’s values**. - an
**inference algorithm**.

##### 4.1.3.1 Initial Parameter’s Values

We already covered model and `Population`

, now let’s talk about **initial parameter’s values**. It is the **3rd positional argument inside fit()**.

You can specify you initial parameters as a named tuple. For instance, if you want to have a certain parameter, `β1`

, as having an initial value as `3.14`

, you can do so by passing it inside a named tuple in the 3rd positional argument of `fit()`

:

`fit(model, population, (; β1 = 3.14))`

You can also use the helper function ** init_params()** which will recover all the initial parameters we specified inside the model’s

`@param`

block.Since we used the `RealDomain()`

defaults, all of our example’s initial parameters are set to `0.0`

.

##### 4.1.3.2 Inference Algorithms

Finally, our **last (4th) positional argument is the choice of inference algorithm**.

Pumas has the following available inference algorithms:

**Marginal Likelihood Estimation**:: first order approximation without random-effects.`NaivePooled()`

: first-order approximation.`FO()`

: first-order conditional estimation with automatic interaction detection.`FOCE()`

: second-order Laplace approximation.`LaplaceI()`

**Bayesian with Markov Chain Monte Carlo (MCMC)**:: MCMC using`BayesMCMC()`

**N**o-**U**-**T**urn**S**ampler (NUTS).

We can also use a **M**aximum **A** **P**osteriori (MAP) estimation procedure for any marginal likelihood estimation algorithm. You just need to call the `MAP()`

constructor with the desired marginal likelihood algorithm inside, for instance:

`fit(model, population, init_params(model), MAP(FOCE()))`

Ok, we are ready to **fit our model**. Let’s use the `NaivePooled()`

since we are not using random-effects in this lesson:

```
=
poisson_fit fit(poisson_model, seizurecount_pop, init_params(poisson_model), Pumas.NaivePooled())
```

```
[ Info: Checking the initial parameter values.
[ Info: The initial negative log likelihood and its gradient are finite. Check passed.
Iter Function value Gradient norm
0 7.084862e+02 1.520100e+04
* time: 0.019721031188964844
1 5.542122e+02 9.418562e+03
* time: 0.35043787956237793
2 5.389460e+02 4.901604e+03
* time: 0.35077500343322754
3 5.282896e+02 1.137961e+03
* time: 0.35105204582214355
4 5.263390e+02 4.599655e+02
* time: 0.35129690170288086
5 5.239327e+02 1.328781e+03
* time: 0.3515899181365967
6 5.112082e+02 4.376380e+03
* time: 0.3519399166107178
7 4.946020e+02 6.231816e+03
* time: 0.3522498607635498
8 4.738243e+02 5.619874e+03
* time: 0.35251283645629883
9 4.608481e+02 2.532336e+03
* time: 0.3528120517730713
10 4.576253e+02 3.761918e+02
* time: 0.35317492485046387
11 4.571941e+02 1.579012e+02
* time: 0.35350799560546875
12 4.566451e+02 6.052517e+02
* time: 0.35382795333862305
13 4.551647e+02 1.328673e+03
* time: 0.3540949821472168
14 4.518011e+02 2.255583e+03
* time: 0.35434985160827637
15 4.447513e+02 3.202994e+03
* time: 0.35460400581359863
16 4.344023e+02 3.431150e+03
* time: 0.35486698150634766
17 4.262636e+02 2.411983e+03
* time: 0.35511302947998047
18 4.223312e+02 9.614849e+02
* time: 0.35535597801208496
19 4.213184e+02 2.425184e+01
* time: 0.3555939197540283
20 4.212106e+02 6.211133e+01
* time: 0.35585498809814453
21 4.210555e+02 1.521072e+02
* time: 0.3561129570007324
22 4.206892e+02 2.831647e+02
* time: 0.3563649654388428
23 4.198621e+02 4.494193e+02
* time: 0.3566398620605469
24 4.184215e+02 5.611914e+02
* time: 0.35695385932922363
25 4.168899e+02 4.531058e+02
* time: 0.3572218418121338
26 4.161840e+02 1.660627e+02
* time: 0.35749292373657227
27 4.160731e+02 9.336645e+00
* time: 0.3577589988708496
28 4.160686e+02 3.322691e+00
* time: 0.35805392265319824
29 4.160685e+02 4.550751e-01
* time: 0.3583669662475586
30 4.160685e+02 1.900354e-02
* time: 0.3586900234222412
31 4.160685e+02 2.946794e-04
* time: 0.3590219020843506
```

```
FittedPumasModel
Successful minimization: true
Likelihood approximation: NaivePooled
Log-likelihood value: -416.06855
Number of subjects: 150
Number of parameters: Fixed Optimized
0 4
Observation records: Active Missing
NSeizure: 150 0
Total: 150 0
---------------------
Estimate
---------------------
α 2.1503
βAge -0.018032
βisF 0.22675
βDrugA -0.35713
---------------------
```

#### 4.1.4 Perform inference on the model’s population-level parameters

Once the model is fitted, we can analyze our inference and estimates. But first, let’s discuss what the is the meaning of the estimated coefficients in a Poisson regression model.

##### 4.1.4.1 Interpreting the Coefficients

We can see that after the model is fitted, it prints a result with a summary and a table of the parameter estimates.

We can also recover the estimates as a named tuple with ** coef()**:

`coef(poisson_fit)`

```
(α = 2.150255882229772,
βAge = -0.018031843808920766,
βisF = 0.22675059376714593,
βDrugA = -0.35713447425647227,)
```

Or as a `DataFrame`

with ** coeftable()**:

`coeftable(poisson_fit)`

Row | parameter | estimate |
---|---|---|

String | Float64 | |

1 | α | 2.15026 |

2 | βAge | -0.0180318 |

3 | βisF | 0.226751 |

4 | βDrugA | -0.357134 |

Remember that the Poisson regression’s coefficients are the **log scale**, i.e. the **natural logarithm of the effect size**.

Log scales provide the facility that anything that is **negative** pulls down the base values of our dependent value \(y\). Anything **positive** pulls it up; and zero is a neutral effect. But since it is in a log scale it is not easily interpretable. This is why, in order for interpret it, often we undo the log transformation by **exponentiating the effects**, i.e. exponentiating the coefficients.

We can do that easily with the `DataFrame`

returned by `coeftable()`

:

`@rtransform coeftable(poisson_fit) :estimate_exp = exp(:estimate)`

Row | parameter | estimate | estimate_exp |
---|---|---|---|

String | Float64 | Float64 | |

1 | α | 2.15026 | 8.58706 |

2 | βAge | -0.0180318 | 0.98213 |

3 | βisF | 0.226751 | 1.25452 |

4 | βDrugA | -0.357134 | 0.699678 |

Now let us interpret the `:estimate_exp`

column.

Our basal value `α`

is the estimated number of seizures when all the other covariates are zero.

Let us now interpret the `βAge`

, the covariate `Age`

’s coefficient. It’s value is approximate 0.98, which means that for every increase in a single unit of `Age`

we can expect a decrease in 2% in the number of seizures. Numerically this is: \(\alpha \cdot 0.98 \cdot \mathrm{Age}\). For instance, a 30 increase in `Age`

would reduce the basal value \(8.58\) to \((8.58 - ( 8.58 \cdot 0.98)) \cdot 30 = 5.15\). Note that this association has the assumption that all other covariates are fixed/controlled.

The `βisF`

, which represents the effect for the dummy variable `isF`

. The exponentiated coefficient value is 1.25, which means that for every unit of increase in the covariate `isF`

we can expected on average an increase of 25% in the number of seizures. Since, `isF`

is a dummy variable taking either 0 or 1 as values, this means that female observations have, on average, 25% more seizures events than male observations.

Our `DrugA`

covariate, which is a binary representation of the observation being either on the placebo or treatment for drug A arm, has an exponentiated coefficient of 0.70. This means that observations treated with drug A, when all other covariates are controlled, have 30% less seizures than when treated with placebo.

We can also inspect our estimated coefficients’ **standard errors ( SE)** along with the

**95% confidence intervals**with the

**function:**

`infer()`

`infer(poisson_fit)`

```
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
```

```
Asymptotic inference results using sandwich estimator
Successful minimization: true
Likelihood approximation: NaivePooled
Log-likelihood value: -416.06855
Number of subjects: 150
Number of parameters: Fixed Optimized
0 4
Observation records: Active Missing
NSeizure: 150 0
Total: 150 0
------------------------------------------------------------------------
Estimate SE 95.0% C.I.
------------------------------------------------------------------------
α 2.1503 0.26682 [ 1.6273 ; 2.6732 ]
βAge -0.018032 0.0049478 [-0.027729; -0.0083343]
βisF 0.22675 0.16322 [-0.093163; 0.54666 ]
βDrugA -0.35713 0.16018 [-0.67108 ; -0.043187 ]
------------------------------------------------------------------------
```

Also if you prefer other confidence interval band, you can choose with the **keyword argument level inside infer()**.

For instance, one common band for the confidence intervals are 89%:

`infer(poisson_fit; level = 0.89)`

```
[ Info: Calculating: variance-covariance matrix.
[ Info: Done.
```

```
Asymptotic inference results using sandwich estimator
Successful minimization: true
Likelihood approximation: NaivePooled
Log-likelihood value: -416.06855
Number of subjects: 150
Number of parameters: Fixed Optimized
0 4
Observation records: Active Missing
NSeizure: 150 0
Total: 150 0
----------------------------------------------------------------------
Estimate SE 89.0% C.I.
----------------------------------------------------------------------
α 2.1503 0.26682 [ 1.7238 ; 2.5767 ]
βAge -0.018032 0.0049478 [-0.025939; -0.010124]
βisF 0.22675 0.16322 [-0.034113; 0.48761 ]
βDrugA -0.35713 0.16018 [-0.61313 ; -0.10114 ]
----------------------------------------------------------------------
```

#### 4.1.5 Predict from a fitted model or Simulate random observations from a fitted model

Now that we understand our model estimates, we can use it to make predictions or simulate random observations.

##### 4.1.5.1 Predictions with `predict()`

In order to calculate prediction from a fitted model, we need to use the `predict()`

function and passing **two positional arguments**:

- a fitted Pumas model.
- either a single
`Subject`

or a collection of them with a`Population`

.

For example to make predictions onto a single `Subject`

, we can:

`= Subject(id = 42, covariates = (; Age = 35, isF = 1, DrugA = 1)) my_subject `

```
Subject
ID: 42
Covariates: Age, isF, DrugA
```

`predict()`

also needs a **keyword argument obstimes** which represents the observation times that we want to predict the dependent variable. The value must be a vector. Since we used a dummy

`0.0`

value as `:time`

variable in our `read_pumas`

function, we can pass the same `0.0`

but as a vector to the `predict()`

function.`predict(poisson_fit, my_subject; obstimes = [0.0])`

```
SubjectPrediction
ID: 42
Predictions: NSeizure: (n=1)
Covariates: Age, isF, DrugA
Time: [0.0]
```

We can also generate prediction for a `Population`

:

`predict(poisson_fit, seizurecount_pop; obstimes = [0.0])`

```
Prediction
Subjects: 150
Predictions: NSeizure
Covariates: Age, isF, DrugA
```

##### 4.1.5.2 Simulations with `simobs()`

We can generate random observations from our fitted Pumas model with the `simobs()`

function. You need to supply at least **three positional arguments**:

- a
*non-fitted*Pumas model, in our case`poisson_model`

. - either a single
`Subject`

or a collection of them with a`Population`

. - a named tuple of parameter values.

The 1st and 2nd positional arguments we already covered in our previous examples. Let’s focus on the **3rd positional argument: param**. It can be a named tuple with the parameter values. For instance, we can use our estimated parameters from the fitted model with

`coef(poisson_fit)`

:`coef(poisson_fit)`

```
(α = 2.150255882229772,
βAge = -0.018031843808920766,
βisF = 0.22675059376714593,
βDrugA = -0.35713447425647227,)
```

`simobs(poisson_model, seizurecount_pop, coef(poisson_fit));`

You can also pass a single parameter as a Pumas *fitted* model to `simobs()`

.

`= simobs(poisson_fit); seizurecont_sim `

#### 4.1.6 Model diagnostics

Finally, our last step is to **assess model diagnostics**.

##### 4.1.6.1 Assessing model diagnostics

To assess model diagnostics we can use the `inspect()`

function in our fitted Pumas model:

```
= DataFrame(inspect(poisson_fit))
my_inspect first(my_inspect, 5)
```

```
[ Info: Calculating predictions.
[ Info: Calculating empirical bayes.
[ Info: Evaluating individual parameters.
[ Info: Evaluating dose control parameters.
[ Info: Done.
```

Row | id | time | evid | NSeizure | Age | isF | DrugA | NSeizure_pred | NSeizure_ipred | _Age | _isF | _DrugA | linear_pred |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|

String | Float64 | Int64 | Float64? | Int64? | Int64? | Int64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | |

1 | 1 | 0.0 | 0 | 6.0 | 43 | 1 | 0 | 4.96115 | 4.96115 | -0.775369 | 0.226751 | -0.0 | -0.548619 |

2 | 10 | 0.0 | 0 | 5.0 | 50 | 1 | 0 | 4.37285 | 4.37285 | -0.901592 | 0.226751 | -0.0 | -0.674842 |

3 | 100 | 0.0 | 0 | 6.0 | 42 | 1 | 1 | 3.53437 | 3.53437 | -0.757337 | 0.226751 | -0.357134 | -0.887721 |

4 | 101 | 0.0 | 0 | 2.0 | 43 | 0 | 1 | 2.76697 | 2.76697 | -0.775369 | 0.0 | -0.357134 | -1.1325 |

5 | 102 | 0.0 | 0 | 0.0 | 55 | 0 | 1 | 2.22859 | 2.22859 | -0.991751 | 0.0 | -0.357134 | -1.34889 |

###### 4.1.6.1.1 AIC

We can also check for **A**kaike **I**nformation **C**riterion (AIC):

`aic(poisson_fit)`

`840.1370972852749`

###### 4.1.6.1.2 BIC

Besides AIC, there is also the **B**ayesian **I**nformation **C**riterion (BIC):

`bic(poisson_fit)`

`852.1796384616599`

##### 4.1.6.2 Visual predictive checks

To conclude, we can inspect visual predictive checks with the `vpc_plot()`

function. But first, we need to generate a `VPC`

object with the `vpc()`

function.

By default, `vpc`

will use `:time`

as the independent variable for VPC. We need to change that, because or `:time`

variable is just a vector of `0`

s in the Poisson regression.

Let’s use `:Age`

:

`= vpc(poisson_fit; covariates = [:Age], bandwidth = 85); vpc_poisson `

Now, we need to use the `vpc_plot`

function into our newly created `VPC`

object. For this, we need to import the package `PumasUtilities`

:

`using PumasUtilities`

Since our interval is not on \([0,1]\), we need to tell `vpc_plot`

to *not* use it with `unit_yaxis=false`

. Additionally, we change our `ylabel`

to represent a count instead of a proportion.

`vpc_plot(vpc_poisson; unit_yaxis = false, axis = (; ylabel = "Count NSeizure"))`

## 5 Conclusion

I hope you enjoyed this tutorial on Poisson regression. Whenever you have count data, this should be your first approach. We’ll explore a more robust and complex model named **negative binomial regression** next. If you are experiencing problems fitting your count data with Poisson regression you would also want to try it.