Probability of the response \(y\) according to the model given specific values of \(\theta\), \(\eta\) and \(x\).
\[ p_c(y \mid \theta, \eta, x) \]
Fit the model by simply finding the values of \(\theta\) and \(\eta\) that maximize the conditional probability?
This can horribly overfit the data: too many parameters!
Instead, integrate out the random effects \(\eta\) to get the marginal probability:
\[ p_m (y \mid \theta, x) = \int \overbrace{p_c (y \mid \theta, \eta, x) \cdot p_{\text{prior}} (\eta \mid \theta)}^{\text{joint probability}} \, d\eta \]
Average conditional probability weighted by a prior.
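The integral above can be checked numerically on a toy model. The sketch below is only an illustration under assumptions that are not in the notes: a single subject with no covariates \(x\), responses \(y_i \sim \mathcal{N}(\theta + \eta, \sigma^2)\), and a prior \(\eta \sim \mathcal{N}(0, \omega^2)\). The function names `conditional_prob` and `marginal_prob` and the parameters `omega`, `sigma`, `n_grid` and `width` are hypothetical, and the trapezoidal rule stands in for whatever integration scheme a real fitting tool would use.

```python
import math


def normal_pdf(z, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def conditional_prob(y, theta, eta, sigma=1.0):
    """p_c(y | theta, eta): product of normal densities centred at theta + eta (assumed toy model)."""
    p = 1.0
    for yi in y:
        p *= normal_pdf(yi, theta + eta, sigma)
    return p


def marginal_prob(y, theta, omega, sigma=1.0, n_grid=2001, width=8.0):
    """p_m(y | theta): trapezoidal approximation of the integral of p_c * p_prior over eta."""
    lo, hi = -width * omega, width * omega  # truncate the integral far into the prior's tails
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        eta = lo + k * h
        joint = conditional_prob(y, theta, eta, sigma) * normal_pdf(eta, 0.0, omega)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0  # end points get half weight
        total += weight * joint
    return total * h


# Made-up responses, used only to exercise the functions.
y = [1.2, 0.8, 1.5]
p_m = marginal_prob(y, theta=1.0, omega=0.5)

# The eta that maximizes the conditional probability in this toy model.
eta_hat = sum(y) / len(y) - 1.0
p_c_max = conditional_prob(y, theta=1.0, eta=eta_hat)
print(p_m, p_c_max)
```

Because the prior integrates to one, the marginal probability can never exceed the maximized conditional probability. Maximizing over \(\eta\) directly always looks at least as good on the data being fitted, which is the overfitting risk noted earlier.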
Which model has the largest area under the curve?
A good model should have: