Extending the Standard Joint Model
In a previous post we have briefly introduced the framework of joint models for longitudinal and time-to-event data, and we have shown how models of this type can be fitted in R using package JMbayes. In that post we have presented what can be called the 'standard' joint model, in which we have one continuous longitudinal outcome and one survival endpoint. Starting from this standard joint model different extensions have been proposed in the literature to accommodate different types of longitudinal and survival data. The aim of this post is to present one of these extensions, namely joint models with categorical longitudinal responses, and how such models can be fitted using JMbayes.Generalized Linear Mixed Effects Models
Before discussing joint models with categorical longitudinal outcomes we need first to introduce the framework of Generalized Linear Mixed Models (GLMMs). GLMMs are an extension of linear mixed models to allow response variables from different distributions, such as binary responses. Alternatively, you could think of GLMMs as an extension of generalized linear models (e.g., logistic regression) to include both fixed and random effects (hence mixed models). Contrary to linear mixed models, fitting GLMMs under maximum likelihood is computationally a much more challenging task. This is due to the fact that in the definition of the observed data log-likelihood there is an integral over the (normally distributed) random effects, which does not have, in general, a closed-form solution (for linear mixed models this integral does have a closed-form solution which facilitates computations). Under the Bayesian approach there is essentially no difference between linear mixed models and GLMMs because in both the random effects are considered parameters for which we wish to derive their posterior distribution. In R there are several packages that can fit GLMMs under either maximum likelihood or the Bayesian approach, such as lme4 and MCMCglmm, among others. However, in order to fit joint models with categorical longitudinal response using package JMbayes the user needs first to fit the GLMMs using function glmmPQL() from package MASS. Even though the PQL algorithm has been shown not to work that optimally, especially for dichotomous longitudinal outcomes, the purpose of fitting the GLMMs with glmmPQL() is to extract from the resulting model object the response vector and the design matrices for the fixed and random effects, and initial values for the parameters. The following code illustrates the use of glmmPQL() for fitting a mixed effects logistic regression using the Primary Biliary Cirrhosis (PBC) data set. Since in the PBC data there was no binary biomarker recorded during follow-up, we artificially create one by dichotomizing serum bilirubin at the threshold value of 1.8 mg/dL.The syntax of glmmPQL() is similar to the one of function lme() from package nlme (the PQL algorithm actually requires fitting linear mixed effects model for a transformed response variable).
A Joint Model with a Binary Longitudinal Biomarker
To fit a joint model with a binary longitudinal outcome we need to appropriately define argument densLong of the jointModelBayes() function. This argument accepts a function that calculates the probability density function (and its natural logarithm) of the longitudinal outcome. The arguments of this function are y the vector of longitudinal responses, eta.y the subject-specific linear predictor (i.e., including both the fixed and random effects), scale a potential scale parameter (e.g., the measurement error standard deviation in the case of linear mixed models), log a logical argument denoting whether the pdf or the logarithm of the pdf is returned by the function, and data a data.frame that contains variables potentially used in the definition of densLong. For our example, the function we need to define is:Following we fit a Cox model that represents the survival submodel of the joint model and we put all the above as arguments to jointModelBayes(), i.e.,
In the output the association parameter Assoct denotes the log hazard ratio, at any time point t, for a unit increase in the log odds of having serum bilirubin above the threshold value of 1.8 mg/dL, at the same time point.