Introduction

Multi-state models are commonly used to analyze complex longitudinal survival data involving multiple events of interest ((P. K. Andersen et al. 1993); (Hougaard 2000); (H. Putter, Fiocco, and Geskus 2007); (Meira-Machado et al. 2009); (Meira-Machado and Sestelo 2019)). These models can be considered as a generalization of the survival process where survival is the ultimate outcome of interest, but where information is available about intermediate events that individuals may experience during the study period. For instance, in some biomedical applications, besides the ‘healthy’ initial state and the absorbing ‘dead’ state, one may observe intermediate (transient) states based on health conditions (e.g., heart or lung disease), disease stages (e.g., stages of cancer, HIV infection or Alzheimer’s), clinical symptoms (e.g., bleeding episodes), biological markers (e.g., CD4 T-lymphocyte cell counts; serum immunoglobulin levels) or they can represent a non-fatal complication in the course of the illness (e.g., cancer recurrence, transplantation, etc.). The number of states and transitions among them defines the complexity of the model, which is illustrated by a directed graph with rectangular boxes representing possible states and arrows representing the allowed transitions. Among the multi-state models, the simplest is the mortality model for survival data (with only two states). The competing risks model ((P. K. Andersen and Keiding 2002); (H. Putter, Fiocco, and Geskus 2007)) can be seen as an extension of the simple mortality model for survival data in which each individual may ‘die’ due to any of several causes. The most common is the illness-death model, also known as the disability model, which comprises three states. In the irreversible version of this model (Figure 1), individuals start in the ‘alive and disease-free’ state and subsequently move either to the ‘diseased’ state or to the ‘dead’ state. Individuals in the ‘diseased’ state will eventually move to the ‘dead’ state, without any possibility of recovery. Data that arises from the model depicted in Figure 1 has been termed as semi-competing risk data and can be seen as an emerging challenge for time-to-event data where two events of interest are studied: non-terminal (e.g., disease diagnosis) and terminal (e.g., death) events. The non-terminal event is observed only if it precedes the terminal event, which may occur before or after the non-terminal event. In this context, the terminal event is a competing risk for the intermediate state but not the contrary ((Xu, Kalbfleisch, and Tai 2010); (Li and Peng 2017); (Nevo and Gorfine 2020)).

Multi-state models can be used to assess the effects of covariates on the transition intensities, but they can also be used to obtain conditional prediction probabilities of future events. From the transition intensities, it is possible to evaluate the probability of one event ahead, but the transition probabilities are the relevant predictive quantities since they consider several transitions ahead. The occupation probabilities, often used in practice, are particular examples of transition probabilities in which the time origin is taken as the current time. When the underlying stochastic process is Markov, an elegant theory can be used to link the transition intensities to the transition probabilities, leading to the Aalen-Johansen estimator (Aalen and Johansen 1978). This assumption claims that given the present state, the future evolution of the process is independent of the states previously visited and the transition times among them; in other words, the history of the process is summarized by the current state. The Aalen-Johansen estimator is adapted to censoring, and benefits from the assumption of Markovianity to get consistent estimates. However, when the multi-state model is non-Markov, this is no longer the case. In fact, there are many applications for which the Markov assumption may not be met; for example, the arrival time at the current state of the process often influences the transition intensities, leading to non-Markov structures ((PK. Andersen, Esbjerg, and Sorensen 2000), (P. K. Andersen and Keiding 2002)). Although the Aalen–Johansen estimator may be used to consistently estimate occupation probabilities for non-Markov multi-state models (Datta and Satten 2001), in general it provides biased estimators if the Markov assumption does not hold. To tackle this issue, (Meira-Machado, de Uña-Álvarez, and Cadarso-Suárez 2006) introduced, for the first time, estimators that do not rely on the Markov assumption. These authors showed the practical superiority of their estimators relative to the Aalen-Johansen estimator in situations in which the Markov condition is strongly violated. However, this approach has the drawback of requiring that the support of the censoring distribution contains the support of the lifetime distribution, otherwise they only report valid estimators for truncated transition probabilities. Recently, alternative estimators that are consistent regardless of the Markov condition and the referred assumption on the censoring support were proposed by (Uña-Álvarez and Meira-Machado 2015). The idea behind the proposed estimators is to use a procedure based on differences between Kaplan-Meier estimators derived from a subset of the data, consisting of all subjects observed to be in a given state at a given time. The same idea of subsampling, combined with the Aalen-Johansen estimator, was later used by H. Putter and Spitoni (2018) to introduce new estimators which were termed landmark Aalen-Johansen.

In the multi-state setting, it is of practical interest to determine whether the Markov property holds within a particular data set. This information can be used to assess the validity of the transition probability estimates as well as the best option for performing inference on the transition intensities when one aims to relate the individual characteristics to the intensity rates through a multi-state regression model. This assumption is usually checked by including covariates depending on the history ((Kay 1986); (PK. Andersen, Esbjerg, and Sorensen 2000); (P. K. Andersen and Keiding 2002)). For the progressive illness-death model, for example, the Markov assumption is particularly relevant for modeling the death transition after disease and, consequently, assessing whether this transition rate is affected by the time in the previous state. Alternative methods, based on a local Kendall’s tau measuring the future-past association along time, were proposed by Rodriguez-Girondo and Uña-Álvarez (2012) and Rodriguez-Girondo and Uña-Álvarez (2016). The method proposed by these authors can be used for three-state progressive and illness-death models, but the extension of this test to general multi-state models is not straightforward, and thus flexible methods that may be used in general models are required. A very recent work by Titman and Putter (2020) considers new approaches for checking this assumption. In one of these approaches, local and general tests have been developed by considering summaries from families of log-rank statistics where individuals are grouped by the state occupied at varying initial time \(s\). Chiou et al. (2018) also considered an equivalent problem to testing Markovianity (in the progressive illness-model) but involving tests for dependent truncation. Making use of the landmark approach, G. Soutinho and Meira-Machado (2021) introduces new methods for checking the Markov assumption through a local and global test given by measuring the discrepancy between the Aalen-Johansen estimator and the landmark estimators ((Uña-Álvarez and Meira-Machado 2015), (H. Putter and Spitoni 2018)) that are free of the Markov condition.

There are some software packages available for multi-state survival analysis. A comprehensive list of the available packages in the Comprehensive R Archive Network (CRAN) can be seen in the CRAN task view ‘Survival Analysis’ (Allignol and Latouche 2022). In particular, in the context of multi-state models, mstate package provides functions for estimating hazards and probabilities, possibly depending on covariates. This package also permits checking the Markov assumption from families of log-rank statistics where individuals are grouped by the state occupied at different times (Titman and Putter 2020). The msm package can be used to fit a Markov model with any number of states and any pattern of transitions to panel data. It includes several extensions, such as hidden Markov models and models whose transition intensities vary with individual-specific or time-varying covariates (Jackson 2011). survidm package provides estimates of predictive probabilities, such as the transition probabilities, occupation probabilities, cumulative incidence functions, and waiting (sojourn) time distributions. It also allows one to perform multi-state regression for transition intensities as well as to implement global tests for the Markov assumption in illness-death models (Gustavo Soutinho, Sestelo, and Meira-Machado 2021). Other examples are the SemiMarkov package that can be used to fit semi-Markov multistate models to longitudinal data proposed by (Król and Saint-Pierre 2015) and the frailtyEM which contains functions for fitting shared frailty models with a semi-parametric baseline hazard with the Expectation-Maximization algorithm (Balan and Putter 2019). Multi-state regression can be performed using several of the aforementioned packages by decoupling the whole process into various survival models, by fitting separate intensities to all permitted transitions using, for example, semi-parametric Cox proportional hazard regression models, while making appropriate adjustments to the risk set. To this end, the JM and joineR packages, proposed by (Rizopoulos 2010) and (Philipson et al. 2018), can be useful to estimate the parameters for the joint modeling of longitudinal and time-to-event data. In this context, we can also refer to the joineRML which extends the inference of the joint model to the case of multiple continuous longitudinal measures (Hickey et al. 2018).

The organization of this paper is as follows: In Section 2, we present a brief introduction of the notation and the mathematical background of existing methods for testing the Markov assumption. In Section 3, we introduce the markovMSM package, a software application for R that performs these local and global tests in the multi-state models described in (G. Soutinho and Meira-Machado 2021) using three real data examples. Finally, the main conclusions are reported in Section 4.

The progressive illness-death model with three states and three irreversible transitions. ‘Disease-free’ correspond to the initial state, ‘Diseased’ the intermediate state and the absorbing state ‘dead’.

Background and notation

This section contains a brief description of the mathematical background underlying the markovMSM package. Further details on the proposed methods can be found in (G. Soutinho and Meira-Machado 2021).

Multi-state models and transition probabilities

A multi-state model \((X(t),t\in [0,\infty))\) is a model for a time continuous stochastic process. These models provide a relevant modeling framework for dealing with complex longitudinal survival data, since they allow the representation of the movement of the individuals among a set of finite states space \(\mathcal{S}={1,\ldots ,K}\). They can be fully characterized through the transition probabilities, for two states \(h\), \(j\) and two time points \(s < t\), that are given by the conditional probability \(p_{hj}(s,t)=P(X(t)=j|X(s)=h, \mathcal{H}_{s-})\), where \(\mathcal{H}_{s-}\) denotes the history of the process up to time \(s\). This history can represent information about the process, such as the states previously visited, transition times, etc. Under the Markov assumption, \(P(X(t)=j|X(s)=h, X(u)=y)=P(X(t)=j|X(s)=h)\), for any \(0\leq u<s\) and \(y \in \mathcal{S}\), and thus, the future of the process after time \(s\) depends only on the state occupied at time \(s\), not on the arrival time to that state or on the states previously visited.

Without loss of generality and for the purpose of simplicity, let us consider the progressive illness-death model, with state space \(\{1,2,3\}\), given by the initial State (e.g., ‘alive and disease-free’; State 1), the intermediate state (e.g., ‘diseased’; State 2), and the absorbing state (e.g., ‘dead’; State 3). For this type of model, there are five different transition probabilities to be considered that can be obtained as follows (Meira-Machado and Sestelo 2019):

\(p_{11}(s,t) = \exp\left(-\int_s^t \left(\lambda_{12}(u)+\lambda_{13}(u)\right)du\right)\)

\(p_{22}(s,t\mid t_{12})= \exp\left(-\int_s^t \lambda_{23}(u, t_{12})du\right)\)

\(p_{12}(s,t) =\int_s^t p_{11}(s,u-)\lambda_{12}(u)p_{22}(u,t\mid u)du\).

Here, \(p_{22}(s, t\mid t_{12})\) denotes the transition probability \(p_{22}\) conditional on a particular entry time \(t_{12}\), and \(\lambda_{hj}(.)\) correspond to the transition intensities that are the instantaneous hazards for movement from one state (\(h\)) to another (\(j\)). If the process is Markov, \(h_{23}(t, t_{12})=h_{23}(t)\) and \(p_{22}(s, t\mid t_{12})=p_{22}(s, t)\). The two other transition probabilities, \(p_{13}(s,t)\) and \(p_{23}(s,t)\), can be estimated from the two obvious relations that exist in the progressive illness-death model: \(p_{11}(s,t)+p_{12}(s,t)+p_{13}(s,t)=1\) and \(p_{22}(s,t)+p_{23}(s,t)=1\). Expressions for general models are not possible. Since the prognosis for an individual in the intermediate state may be influenced by the subject’s specific arrival time, the illness-death model is not necessarily Markovian.

The estimation of the transition probabilities is a major goal in multi-state models since they allow for long-term predictions. A common nonparametric method to estimate these quantities is the Aalen-Johansen ( AJ) estimator, (Aalen and Johansen 1978) which can be obtained as the product–integral of the Nelson-Aalen estimators for the cumulative transition intensities (P. K. Andersen et al. 1993). Explicit formulae of the Aalen-Johansen estimator for the illness-death model can be found in (Borgan 2005). When the multi-state process is Markovian, the AJ estimator provides consistent estimates of the transition probabilities, but this is no longer the case if the process is non-Markov. To avoid this problem, (Uña-Álvarez and Meira-Machado 2015) use the idea of subsampling, also referred to as landmarking (Van Houwelingen 2007), which is based on (differences between) Kaplan-Meier estimators derived from a subset of the data consisting of all subjects observed to be in the given state at the given time. For the specific case of the illness-death model, given the time point \(s\), to estimate \(p_{1j}(s,t)\) for \(j=1,2,3\) the landmark analysis is restricted to the individuals observed in State 1 at time \(s\); whereas, to estimate \(p_{2j}(s,t)\), \(j=2,3\), the landmark analysis proceeds from the sample restricted to the individuals observed in State 2 at time \(s\). Further details on the formulae and proofs can be seen in (Uña-Álvarez and Meira-Machado 2015). The subsampling approach was later used by H. Putter and Spitoni (2018) to derive a landmark Aalen-Johansen estimator ( LMAJ) of the transition probabilities. The idea behind the proposed estimator is to use the Aalen-Johansen estimator of the state occupation probabilities derived from those subsets (consisting of subjects occupying a given state at a particular time) for which consistency has already been proved in multi-state models that are not necessarily Markov (Datta and Satten 2001). It is worth mentioning that, in the illness-death model, the landmark Aalen-Johansen estimators ( LMAJ) and the landmark estimators ( LM) of the transition probabilities have shown similar performances. However, the former methods ( LMAJ) can be used in general multi-state models, which can be considered an advantage.

From now on, we will use the abbreviation AJ for the Aalen-Johansen estimator, LM for the LandMark estimator proposed by (Uña-Álvarez and Meira-Machado 2015), and LMAJ for the LandMark Aalen-Johansen estimator (H. Putter and Spitoni 2018).

Tests for the Markov assumption

Modeling particular transition intensities

Traditionally, the Markov condition is verified by modeling particular transition intensities on aspects of the history of the process using a proportional hazard model ((Kay 1986); (PK. Andersen, Esbjerg, and Sorensen 2000; P. K. Andersen and Keiding 2002)). In the progressive illness-death model, the Markov condition is particularly relevant for the transition from the intermediate state ‘disease’ (State 2) to ‘death’ (State 3). Under this model, we can examine whether the time spent in the initial state (State 1) is important in the transition from the disease state to death or not. For doing that, let \(\lambda_{23}(t)\) denote the hazard function of \(T\) for those individuals going from State 2 to State 3. Fitting a Cox model \(\lambda_{23}(t\mid Z)=\lambda_{23,0}(t)\exp(\beta Z)\), where \(\lambda_{23,0}\) is the baseline hazard, \(Z\) the ‘time spent in the healthy (initial) state’ and \(\beta\) a regression parameter, we now need to test the null hypothesis, if the transition rate from the disease state into death is unaffected by the time spent in the healthy state, \(H_0:\beta=0\), against the general alternative, \(H_1:\beta\neq 0\). This ‘global’ test, based on the semiparametric Cox proportional hazards model, assumes proportional hazards and a linear effect on the hazard for the covariate. When either of these two assumptions is not fulfilled, the test may not be able to detect the lack of Markovianity.

Since the landmark methods ( LM) for estimating the transition probabilities proposed by (Uña-Álvarez and Meira-Machado 2015), and ( LMAJ) by (H. Putter and Spitoni 2018) are free of the Markov assumption, they can be used as a basis for the construction of tests for Markovianity. These ideas were used by (G. Soutinho and Meira-Machado 2021) to introduce local and global tests for Markovianity based on the discrepancy of the landmark estimators to Markovian Aalen-Johansen estimators ( AJ). Subsampling, also known as landmarking, has been also used by (Titman and Putter 2020) to introduce a general test based on summaries from families of log-rank statistics where individuals are grouped by the state they occupied at a given (landmark) time.
AUC Local Test

(G. Soutinho and Meira-Machado 2021) introduce a local test based on the area under curve, AUC, given by the estimated transition probabilities, for a fixed time \(s>0\), that can be used for a general multi-state model for checking the Markovianity of the process. They propose the following test statistic that is based on the difference between the area under the estimated transition probability curve for the non-Markov LMAJ estimator and the AJ estimator, \(U=\int_{s}^{\tau} \left(\widehat p_{hj}^{\texttt{ LMAJ}}(s,u) - \widehat p_{hj}^{\texttt{ AJ}}(s,u)\right)du\), where \(\tau\) is the upper bound of the support of \(T\). When the process is Markovian, it is expected that the test statistic will be close to zero. The Markov assumption becomes less likely as the test statistic gets further away from zero in either direction. Because of censoring, both estimators ( LMAJ and AJ) may reveal high variability in the right tail, which may inflate the test statistic. In addition to this issue, since landmarking is based on reduced data, the maximum point for which the LMAJ transition probability estimate is strictly defined may be lower than the maximum point for AJ. To address these issues, we propose that when computing \(U\), the minimum between the upper bound for which LMAJ is defined and the 90th percentile of the total time for the upper limit in the integral that defines the test statistic be used.

In the progressive illness-death model, besides the transition probability \(\widehat p_{23}(s,t)\), \(\widehat p_{12}(s,t)\) can also be used to test the Markov assumption. For general multi-state models, one can use transitions depending on history. In fact, if the goal is to decide which estimator is the most appropriate to use to estimate a specific transition probability \(p_{hj}(s,t)\), then the test statistic should be the one based on that same transition probability.

To approximate the distributions of the test statistic, bootstrap methods with a large number of resamples, \(M\), are used. We generate \(M\) bootstrap samples, and for each sample, we calculate the test statistic \(U^{\star}\) which corresponds to the difference between the areas of the AJ and the LMAJ estimators. Finally, the variance \(\sigma^{\star}_{(U^{\star})}\) is obtained from the \(M\) values of \(U^{\star}\). Then, according to large sample asymptotic distribution theory, when \(M\), the number of replicates, goes to infinity, we have the following statistic distributed approximately as a standard normal distribution with a mean of 0 and variance of 1: \(V=(U-0)/\sigma^{\star}_{(U^{\star})}\sim N(0,1)\). The null hypothesis will be rejected if \(V>v_{(1-\alpha/2)}\) or \(V<v_{(\alpha/2)}\), where \(v_{(\alpha/2)}\) and \(v_{(1-\alpha/2)}\) denote the \(\alpha/2\) and \(1-\alpha/2\) percentiles, respectively, of a normal distribution with a mean of 0 and variance of 1.
AUC Global Test

(G. Soutinho and Meira-Machado 2021) also introduce a global test, which can be achieved by combining the results obtained from local tests over different times. The testing procedure used here involves the following steps:

  1. Using the original sample of the illness-death model, obtain the percentiles 5, 10, 20, 30, and 40 of the times spent by all individuals in State 1, the so-called waiting (soujorn) time. For general multi-state models, we recommend the use of the same percentiles of the subject’s specific arrival time at the corresponding state.

  2. For each of the values \(s\) obtained in Step 1, obtain the probability values for the local method as explained before.

  3. Obtain the mean of the probability values for each closest pair; i.e., the mean of the probability values of the following pairs of percentiles: \((5,10)\), \((10,20)\), \((20,30)\) and \((30,40)\).

  4. Finally, the p-value for the global test is equal to the minimum between the four probability values obtained in Step 3. In the case of this probability value being below a particular significance value, for example \(0.05\), we reject the null hypothesis of the process being Markovian.

Step 1 considers a global test based on local tests computed at low percentiles of subject-specific arrival times at the corresponding state. This is based on our experience that the failure of Markovianity often occurs at small transition times. Besides the hypothesis tests proposed above, we also suggest graphical local tests that can be used to check the Markov assumption in the illness-death model as well as for more complex multi-state models, possibly with reversible transitions between states. These graphical tests can be used to validate the default values proposed in Step 1 or to propose alternative values for which a discrepancy between the two methods ( LMAJ and AJ) is more evident.

The procedure described in Step 3 can be used to ensure that there is a discrepancy between the two estimated curves over a large range of time values.
Log-rank test

The construction of the landmark Aalen-Johansen estimators ( LMAJ) also motivated the development of new tests of the Markov assumption (Titman and Putter 2020). Assume initially that interest lies in assessing the validity of transition probability estimates from a specific start point time \(s\) and a given state \(h\), i.e. \(p_{hj}(s,t)\). Under the landmark approach, the estimators of the transition probabilities should be derived from the subset of the data consisting of all subjects observed to be in the state \(h\) at time \(s\), \(S=\{i:X_i(s)=j, Y_i(s)=1\}\) where \(Y_i(s)\) is the at risk indicator of the process ((Uña-Álvarez and Meira-Machado 2015); (H. Putter and Spitoni 2018)). On the other hand, the Aalen-Johansen estimator would use also the set of subjects in \(S^c=\{i:X_i(s)\neq j, Y_i(s)=1\}\). Besides, under a Markov process, the transition intensities of the process for \(t>s\) will be the same in the two groups. This motivated (Titman and Putter 2020) to propose a local test for the Markov assumption that lies in testing \(H_0: \lambda(t|X(s)=h)= \lambda(t|X(s)\neq h)\) for \(t\geq s\) versus a general alternative. The proposed test is constructed using several log-rank statistics grouped by membership in \(S\). (Titman and Putter 2020) also propose a global test via a grid of times in which, for each \(s\) time, local tests are computed. In this paper, the grid of values is given by the percentile times.

The same authors pointed out that the use of the individual supremum or integrated statistics may be combined to provide an overall test of the dependence of the transition intensities on occupation in state \(j\) at previous times.

Application of the markovMSM {#sec: package}

This section offers the guidelines for the use of the markovMSM package with the R statistical program (R Core Team 2021). To this end, a description of the functions of the package is illustrated using three real data sets. The first one involves data from a clinical trial on colon cancer modeled using the progressive illness-death model (C. Moertel et al. 1995). Extensions to progressive processes beyond the three-state illness-death model are demonstrated using data from the European Group for Blood and Marrow Transplantation (EBMT) (H. Putter, Fiocco, and Geskus 2007). Finally, we use data from a study with liver cirrhosis patients subjected to prednisone treatment (P. K. Andersen et al. 1993). The package comprises seven main functions, which are briefly summarized in Table 1.

Data manipulation

In this section, we reanalyze data from a large clinical trial on Duke’s stage III patients affected by colon cancer who underwent curative surgery for colorectal cancer (C. G. Moertel et al. 1990). Of the 929 patients, 468 developed a recurrence, and among these, 414 died; 38 died without a recurrence. The remaining 423 patients were alive and disease-free up to the end of the follow-up. Besides the two event times (time to recurrence and time to death) and the corresponding indicator statuses, a vector of covariates including age, sex, number of lymph nodes, and extent of local spread is also available. Below is an excerpt of the data set with one row per individual. Individuals were chosen in order to represent all possible combinations of movements among the three states.

> library(markovMSM) > data("colonMSM") > db_wide <- colonMSM > head(db_wide\[c(1:2,16,21),1:11\])

time1 event1 Stime event rx sex age obstruct perfor adhere nodes 1 968 1 1521 1 Lev+5FU 1 43 0 0 0 5 2 3087 0 3087 0 Lev+5FU 1 63 0 0 0 1 16 1323 1 3214 0 Obs 1 68 0 0 0 1 21 2789 0 2789 1 Obs 1 64 1 0 0 1

Summary of functions in the markovMSM package.
Function Description
PHM.test Performs a global test for the Markov assumption based on the Cox PH model.
AUC.test Performs global and local tests for the Markov assumption using the Area Under Curves (AUC) method.
LR.test Log-rank based test for the validity of the Markov assumption.
plot.markovMSM Plot for an object of class markovMSM.
eventsMSM Counts the number of observed transitions in the multi-state model.
prepMSM Prepares the data set for multi-state modeling in a long format from a data set in wide format.
transMatMSM Define transition matrix for a multi-state model.
print.markovMSM Print for an object of class markovMSM.
summary.markovMSM Summary for an object of class markovMSM.

The four initial variables describe the movement of the patients among the three states of the illness-death model: time1 denote the time measured in days from surgery to recurrence, whereas Stime is the total time or the time to death or censoring; event1 and event denote the corresponding status/censoring indicator (1 for an event and 0 for censoring). Patient 1 had a recurrence after 968 days (i.e., observed a transition from the initial state to the intermediate state) and then died after 1521 days in the study. Patient 2 remained alive and without recurrence at the end of follow-up ( event1 = 0 and event = 0). The two event times are equal in these cases. The patient represented in the third line had a recurrence after 1323 days but remained alive at the end of the follow-up (i.e., in State 2). Finally, the patient represented in the last line died after 2789 days in the study without experiencing a recurrence.

As the original data set is in the wide format, the next step to implementing the proposed methods will be to convert the data into a long format, which is given by one line for each transition for which a subject is at risk. This can be done using functions transMatMSM and prepMSM. transMatMSM function defines the transition matrices, revealing which transitions are possible, whereas prepMSM provides a new dataset in a long format for which each row will correspond to a transition for which a patient is at risk. For the progressive illness-death model, these two functions are used as follows:

> positions <- list(c(2, 3), c(3), c()) > state.names <- c("Alive", "Rec", "Death") > tmat <- transMatMSM(positions, state.names) > tmat to from Alive Rec Death Alive NA 1 2 Rec NA NA 3 Death NA NA NA

> timesNames <- c(NA, "time1", "Stime") > status <- c(NA, "event1", "event") > trans <- tmat > db_long <- prepMSM(data = db_wide, trans, timesNames, status) > db_long\[1:10,\]

id from to trans Tstart Tstop time status 1 1 1 2 1 0 968 968 1 2 1 1 3 2 0 968 968 0 3 1 2 3 3 968 1521 553 1 4 2 1 2 1 0 3087 3087 0 5 2 1 3 2 0 3087 3087 0 6 3 1 2 1 0 542 542 1 7 3 1 3 2 0 542 542 0 8 3 2 3 3 542 963 421 1 9 4 1 2 1 0 245 245 1 10 4 1 3 2 0 245 245 0

Finally, in terms of manipulation of data, a useful function is eventsMSM since it allows one to summarise the number of transitions among states and their percentages:

> eventsMSM(db_long)

\(Frequencies to from Alive Rec Death no event total entering Alive 0 468 38 423 929 Rec 0 0 414 54 468 Death 0 0 0 452 452\)Proportions to from Alive Rec Death no event Alive 0.0000000 0.5037675 0.0409042 0.4553283 Rec 0.0000000 0.0000000 0.8846154 0.1153846 Death 0.0000000 0.0000000 0.0000000 1.0000000

Methods for testing the Markov condition in the illness-death model {#sec:mkv tests1}

The illness-death model is a special multi-state model with several applications in the biomedical literature. Many time-to-event data sets with multiple end points can be reduced to this generic structure. Unlike the survival or competing risk models, this model is not necessarily Markovian since the prognosis for an individual in the intermediate state may be influenced by the subject’s specific arrival time. Traditionally, the Markov assumption is checked by including covariates depending on the history. In the particular case of the colon cancer data set, we are interested in assessing if the transition rate from the recurrence state into death is unaffected by the time spent in the previous state. This global test for the Markov assumption can be done using the function PHM.test. A brief description of the arguments of this function is shown in Table \[Tab-PHM.test\]. Results for this global test of our data indicated that the effect of the time spent in State 1 is not significant (p-value of 0.154), revealing no evidence against the Markov model for the colon data. The corresponding input codes are the following:

> res <- PHM.test(data = db_long, from = 2, to = 3) \[1\] 0.1543195 > res \(p.value [1] 0.1543195\)from \[1\] 2 $to [1] 3 \end{example} %\end{CodeInput} %\end{CodeChunk}

In the package the local test proposed in Section \(\ref{sec2.3}\) is performed using function , through argument . A summary of the arguments of this function is presented in Table~\(\ref{Tab3}\).

The input commands to perform the AUC local test, for a fixed time\(s=180\)and transitions 1\[2 and 1\]3 are the following:

%\begin{CodeChunk} %\begin{CodeInput}

%\end{CodeInput} %\end{CodeChunk}

Results reveal a possible failure of the Markov assumption with low probability values for transitions 1\[2 and 2\]3 for\(s=180\)(less than 5%). These findings agree with the results depicted in Figure~\(\ref{fig4}\), which reports the estimated transition probabilities. In fact, we can observe departures between the two Markov-free estimators ( and ) and the Aalen-Johansen estimator () revealing a possible failure of the Markov assumption. The input commands to obtain the plots shown in Figure~\(\ref{fig4}\) are the following:

%\begin{CodeChunk} %\begin{CodeInput}

%\end{CodeInput} %\end{CodeChunk}

Putting the parameter , we can also obtain the discrepancy between the Aalen-Johansen estimator (Markovian) and the landmark non-Markovian estimator (), for\(p\_12(s,t)\)and\(p\_23(s,t)\), for\(s=180\), measured through\(D\_hj=p\_hj\^`AJ`(s,t)-p\_hj\^` LMAJ`(s,t)\),\(h=1,2\),\(j=h+1\). The 95% pointwise confidence limits were obtained using a simple bootstrap (Figure~\(\ref{fig6}\)). The corresponding input commands in this case are the following:

%\begin{CodeChunk} %\begin{CodeInput}

%\end{CodeInput} %\end{CodeChunk}

%PUT plot about here

The plots in Figure \(\ref{fig6}\) can be thought of as graphical local tests for the Markov assumption. As expected, in both cases they reveal differences between the two methods for\(s=180\)since they show a clear deviation with respect to the horizontal line\(y=0\). From these, one gets some (graphical) evidence of the lack of Markovianity of the underlying process beyond half a year after surgery. Therefore, for this specific time, the application of the Aalen-Johansen method may not be recommended here, due to possible biases. %They also reveal a possible failure of the Markov assumption.

In Section 2, a global test was also introduced that combines the probability values of the local test over different times (given by the percentiles of the sojourn time in State 1). In the package, this can be obtained using the function . The arguments for this function are described in Table~\(\ref{Tab3}\). Some examples of how to perform the proposed AUC global test are shown in the following input commands, in which we consider the default percentiles in the argument . Depending on the number of replicas, the implementation of the function can be computationally time-consuming.

%{ o nome da função pode levar a confusão pois temos também o teste global pelo modelo de cox e pelo LR. Sugestão: AUC.test esta função pode proporcionar os resultados do teste local e global pelo método AUC.}

%{ tal como está para os testes locais poderia colocar no output dos testes globais as respetivas transições em cima.}

%\begin{CodeInput}

%\end{CodeInput}

Results reported by the first command lines provide the probability values for the global test based on the AUC for the three transitions leaving State 1 (i.e.,\(`<!-- -->`{=html}1\),\(`<!-- -->`{=html}1\)and\(`<!-- -->`{=html}1\)). As expected, a higher probability value was obtained for transition 1\[1 than the two remaining transitions, revealing evidence against the Markov condition. Results reported in the second set of input commands agree with previous findings, reporting a probability value of 0.006 for 2\]3. Among the objects saved by this function, displays the probability values for each percentile time (default to 5, 10, 20, 30, and 40) through the following codes:

%\begin{CodeInput}

%\end{CodeInput}

These outputs show, for instance, that the probability value of the AUC local test reveal the fail of the Markovian assumption for the third percentile time (\(p\)-value < 0.001). Plots with the graphical local tests for the respective quantiles can be easily obtained using the following input commands:

%\begin{CodeInput}

%\end{CodeInput}

The plots shown in Figure \(\ref{fig8}\) display the differences between the and estimates for the third quantile () of the sojourn time in the initial state being in accordance with the -values obtained in the local tests.

%{ tal como apontado atras na funcao para o teste pelo modelo de cox, nao me parece boa ideia o argumento transition tal como esta pois o utilizador vai ter dificuldade em saber o que é a transição 3. Sugestão: transition=“23” ou from=2, to=3.}

Often, multi-state models include covariates, and it may be the omission of covariate effects that induces apparent non-Markovianity. The methods proposed in this paper can also deal with this problem since discrete covariates can be included in the estimation of the transition probabilities\(p\_hj(s, t)\)by splitting the sample for each level of the covariate and repeating the described procedures for each subsample. As an example, the following input codes show how to obtain the AUC global for individuals with the treatment Obs’ of covariaterx’. To this end, we start to filter the wide data set and run the function to obtain the subgroups of the new data in the format long. At least, we just need to use the function for the specific transitions as follows:

\begin{example} > db_wide_obs <- db_wide[db_wide\(rx == 'Obs',\] \> db_long_obs \<- prepMSM(data = db_wide_obs, trans, timesNames, status) \> set.seed(12345) \> res4.obs \<- AUC.test(data = db_long_obs, times = 365, from = 1, to = 3, type='local', replicas = 100, tmat = tmat) \> res4.obs\)localTest

s      1->1         1->2         1->3

1 365 0.6295861 0.0003563306 0.0004307342

set.seed(12345) res5.obs <- AUC.test(data = db_long_obs, times = 365, from = 2, to = 3, type = ‘local’, replicas=100, tmat = tmat) res5.obs$localTest

s 2->2 2->3 1 365 5.935795e-05 0.0001894722

The markovMSM package can also be used to compute the results of the global and local tests proposed by (Titman and Putter 2020) which are based on log-rank statistics. A summary of the arguments of the LR.test function is shown in Table 2. The following input commands illustrate the usage of LR.test function to implement these tests:

As we can see, the probability value of the global tests based on the log-rank statistics confirms the possible failure of Markovianity for lower values of \(s\) (P=0.047 < 0.05).

Summary of the arguments of the function LR.test.
Argument Description
data Multi-state data in msdata format, which is created from the mstate package. It should also contain (dummy coding of) the relevant covariates; no factors allowed.
times Grid of time points at which to compute the statistic.
from The starting state of the transition to check the Markov condition.
to The last state of the considered transition to check the Markov condition.
replicas The number of wild bootstrap replications. This method, proposed by (Wu 1986) is commonly applied to solve problems related to heterocedasticity. (Beyersmann and Allignol 2012) found better small sample results using centered Poisson random variables rather than standard normal weights. For further information on the usage of the wild bootstrap, see (Titman and Putter 2020).
formula Right-hand side of the formula. If NULL will fit with no covariates (formula="1" will also work), offset terms can also be specified.
fn A list of summary functions to be applied to the individual zbar traces (or a list of lists).
fn2 A list of summary functions to be applied to the overall chi-squared trace.
dist Distribution of wild bootstrap random weights, either poisson for centred Poisson (default), or normal for standard normal.
min_time The minimum time for calculating optimal weights.
other_weights Other (than optimal) weights can be specified here.

Tests for Markov assumption for more complex multi-state models {#sec:mkv tests1}

In this section, we use two data sets to illustrate the extension of the previous functions to more complex multi-state models. As a first example, we consider the data of 2279 patients transplanted by the European Society for Blood and Marrow Transplantation (EBMT) and, as a second example, data from liver cirrhosis patients subjected to prednisone treatment. Further details on the description of the data can be found in (H. Putter, Fiocco, and Geskus 2007) and (P. K. Andersen et al. 1993), respectively.

The steps taken in the analysis are quite similar to those introduced for the illness-death model. To extend the proposed local and global tests to more complex models, we make use of the LMAJ estimator that produces consistent estimates of the transition probabilities in the case of non-Markovianity of the process.

We start by considering the data set comprising 2279 patients who suffered from blood cancers and who were treated at the EBMT between 1985 and 1998 after a transplant. The movement of the patients among the six states can be modeled through the multi-state model with the following six states: ‘Alive and in remission, no recovery or adverse event’ (state 1); ‘Alive in remission, recovered from the treatment’ (state 2); ‘Alive in remission, occurrence of the adverse event’ (state 3); ‘Alive, both recovered and adverse event’ (state 4); ‘Alive, in relapse’ (treatment failure) (state 5) and ‘Dead (treatment failure)’ (state 6). In total, there are 12 transitions: three intermediate events given by recovery (Rec), adverse event (AE) and a combination of the two (AE and Rec), and two absorbing states: relapse and death (Figure 2).

A six-state model given by the Blood and Marrow Transplantation (EBMT) data set for leukemia patients after bone marrow transplantation.
A six-state model given by the Blood and Marrow Transplantation (EBMT) data set for leukemia patients after bone marrow transplantation.

Since the original data ebmt4 is in the wide format, before implementing a global test we need to convert it into the long format using functions transMatMSM, prepMSM before using function AUC.test with the argument type=‘global’:

> data("ebmt4") > db_wide <- ebmt4 > positions <- list(c(2, 3, 5, 6), c(4, 5, 6), c(4, 5, 6), c(5, 6), c(), c()) > state.names <- c("Tx", "Rec", "AE", "Rec+AE", "Rel", "Death") > tmat <- transMatMSM(positions, state.names) > timesNames <- c(NA, "rec", "ae","recae", "rel", "srv") > status <- c(NA, "rec.s", "ae.s", "recae.s","rel.s", "srv.s") > trans <- tmat > db_long <- prepMSM(data = db_wide, trans, timesNames, status) > db_long\[1:10,\]

Data: id from to trans Tstart Tstop time status 1 1 1 2 1 0 22 22 1 2 1 1 3 2 0 22 22 0 3 1 1 5 3 0 22 22 0 4 1 1 6 4 0 22 22 0 5 1 2 4 5 22 995 973 0 6 1 2 5 6 22 995 973 0 7 1 2 6 7 22 995 973 0 8 2 1 2 1 0 12 12 0 9 2 1 3 2 0 12 12 1 10 2 1 5 3 0 12 12 0

> set.seed(1234) > res7 <- AUC.test(data = db_long, from = 1, to = 5, type = ‘global’, quantiles = c(0.05, 0.10, 0.20, 0.30, 0.40), tmat = tmat, replicas = 100, positions = positions, namesStates = state.names, timesNames = timesNames, status = status)

> round(res7$globalTest, 4) 1->1 1->2 1->3 1->4 1->5 1 0.1417 0.0094 0.0231 0.099 0.0068

round(res7$localTests,4) s 1->1 1->2 1->3 1->4 1->5 1 9.9 0.2822 0.6030 0.0176 0.0848 0.0137 2 12.0 0.2833 0.5230 0.0285 0.4643 0.0000 3 15.0 0.0001 0.2106 0.0398 0.0191 0.1193 4 18.0 0.4584 0.0041 0.1763 0.1789 0.0240 5 21.0 0.0060 0.0147 0.4673 0.2363 0.4796

The interpretation of the output for the global test and for local tests of the five percentile times is similar to that shown for the illness-death model. Thus, object res7 is used to get a list with components giving the probability values for the Markov test for all transitions leaving State 1 (i.e., for 1\(\rightarrow\) j with \(j\in \{1,2,3,4,5\}\)). For instance, the probability value of 0.0094 corresponds to the result of the AUC global test for transition 1\(\rightarrow\)2, while 0.0147 is the probability value for the local test, for \(s=21\) for the same transition.

The proposed methods can also be used in reversible multi-state models such as those applied to the data set of liver cirrhosis patients who were included in a randomized clinical trial at several hospitals in Copenhagen between 1962 and 1974. The study aimed to evaluate whether a treatment based on prednisone prolongs survival for patients with cirrhosis (P. K. Andersen et al. 1993). State 1 corresponds to ‘normal prothrombin level’, State 2 to ‘low (or abnormal) prothrombin level’, and State 3 to ‘dead’. The movement of the patients among these three states can be modeled using the reversible illness-death model shown in Figure 3. The input commands for the local and global tests of the AUC and Log-rank tests are presented as a vignette in the markovMSM package.

A reversible illness-death model from the data set of patients with liver cirrhosis subjected to prednisone treatment. Reversible transitions are possible in transient states defined by prothrombin levels.

Conclusions

This paper discusses the implementation in R of new methods for checking the Markov assumption in multi-state models proposed by (G. Soutinho and Meira-Machado 2021). Instead of testing, as usual, the Markovianity of the multi-state process by including covariates depending on the history, (G. Soutinho and Meira-Machado 2021) propose global and local tests based on the comparison between estimated transition probabilities. This can be done by measuring the discrepancy between the Aalen-Johansen estimator, which gives consistent estimators in Markov processes, and recent approaches that do not rely on this assumption, given by the Landmark and the Landmark Aalen-Johansen estimators.

The use of local tests is recommended whenever the goal is to estimate the transition probabilities and, in particular, decide which estimator is the most appropriate to use: the Aalen-Johansen estimator or a robust estimator. A global test, such as the test proposed here, might be preferable for regression purposes. To this end, a common simplifying strategy is to decouple the whole process into various survival models. Then, for all permitted transitions, the transition intensities may be modeled using separate Cox models, assuming the process to be Markovian (also known as the clock forward modeling approach). When the test rejects the Markov assumption, one alternative approach is to use a semi-Markov Cox model in which the future of the process does not depend on the current time but rather on the duration of the current state.

A brief summary of the theory underlying these methods has been provided, along with an extensive review of the current literature. A detailed usage of the three main functions that compose the markovMSM package, AUC.test, LR.test and PHM.test are illustrated using three real data sets: colonIDM, ebmt4 and prothr. Whereas the first data enables the application of the Markov tests to an illness-death model, the last two cases show how to extend the proposed methods to more complex multi-state models (with more than three states or with reversible transitions). The comparison of the proposed tests to the new methods given by the log-rank test described by (Titman and Putter 2020) is also shown. We also present approaches to validate the suspicion of the failure of the Markovianity of the process using graphical tests. The markovMSM package is available at the first author’s GitHub repository as well as the CRAN repository at https://cran.r-project.org/web/packages/markovMSM. Further details on the usage of its functions can be obtained from the corresponding help pages.

Acknowledgments

This research was financed by Portuguese Funds through FCT - "Fundação para a Ciência e a Tecnologia", within Projects projects UIDB/00013/2020, UIDP/00013/2020 and the research grant PD/BD/142887/2018.

Aalen, O., and S. Johansen. 1978. “An Empirical Transition Matrix for Non Homogeneous Markov and Chains Based on Censored Observations.” Scandinavian Journal of Statistics 5: 141–50.
Allignol, A., and A. Latouche. 2022. “CRAN Task View: Survival Analysis.” Version 2022-03-07, URL Http://CRAN.R-Project.org/View=Survival.
Andersen, P. K., Ø. Borgan, R. D. Gill, and N. Keiding. 1993. Statistical Models Based on Counting Processes. New York: Springer-Verlag.
Andersen, P. K., and N. Keiding. 2002. “Multi-State Models for Event History Analysis.” Statistical Methods in Medical Research 11 (2): 91–115.
Andersen, PK., S Esbjerg, and TIA. Sorensen. 2000. “Multistate Models for Bleeding Episodes and Mortality in Liver Cirrhosis.” Statistics in Medicine 19 (4): 587–99.
Balan, Theodor Adrian, and Hein Putter. 2019. frailtyEM: An R Package for Estimating Semiparametric Shared Frailty Models.” Journal of Statistical Software 90 (7): 1–29. https://doi.org/10.18637/jss.v090.i07.
Beyersmann, M., J. Schumacher, and A. Allignol. 2012. Competing Risks and Multistate Models with r. New York: Springer.
Borgan, O. 2005. Encyclopedia of Biostatistics: Aalen-Johansen Estimator. John Wiley & Sons.
Chiou, S. H., J. Qian, E. Mormino, and R. A. Betensky. 2018. “Permutation Tests for General Dependent Truncation.” Computational Statistics & Data Analysis 318: 308–24. https://doi.org/10.1016/j.csda.2018.07.012.
Datta, S., and G. A. Satten. 2001. “Validity of the Aalen-Johansen Estimators of Stage Occupation Probabilities and Nelson Aalen Integrated Transition Hazards for Non-Markov Models.” Statistics & Probability Letters 55: 403–11.
Hickey, Graeme L., Pete Philipson, Andrea Jorgensen, and Ruwanthi Kolamunnage-Dona. 2018. joineRML: A Joint Model and Software Package for Time-to-Event and Multivariate Longitudinal Outcomes. BMC Medical Research Methodology. https://doi.org/10.1186/s12874-018-0502-1.
Hougaard, P. 2000. Analysis of Multivariate Survival Data. Statistics for Biology and Health. New York: Springer-Verlag.
Jackson, C. H. 2011. “Multi-State Models for Panel Data: The Msm Package for r.” Journal of Statistical Software 38:8: 1–28.
Kay, R. 1986. “A Markov Model for Analyzing Cancer Markers and Disease States in Survival Studies.” Biometrics 42 (4): 457–81.
Król, Agnieszka, and Philippe Saint-Pierre. 2015. SemiMarkov: An R Package for Parametric Estimation in Multi-State Semi-Markov Models.” Journal of Statistical Software 66 (6): 1–16. http://www.jstatsoft.org/v66/i06/.
Li, Ruosha, and Limin Peng. 2017. Handbook of Quantile Regression. London: Chapman; Hall/CRC.
Meira-Machado, L., J. de Uña-Álvarez, and C. Cadarso-Suárez. 2006. “Nonparametric Estimation of Transition Probabilities in a Non-Markov Illness-Death Model.” Lifetime Data Analysis 12: 325–44.
Meira-Machado, L., and M. Sestelo. 2019. “Estimation in the Progressive Illness-Death Model: A Nonexhaustive Review.” Biometrical Journal 61 (2): 245–63. https://doi.org/10.1002/bimj.201500038.
Meira-Machado, L., J. de Uña-Álvarez, C. Cadarso-Suárez, and P. K. Andersen. 2009. “Multi-State Models for the Analysis of Time to Event Data.” Statistical Methods in Medical Research 18: 195–222.
Moertel, CG, TR Fleming, JS Macdonald, DG Haller, JA Laurie, CM Tangen, JS Ungerleider, et al. 1995. Fluorouracil Plus Levamisole as Effective Adjuvant Therapy After Resection of Stage III. Colon Carcinoma: A Final Report.” The Annals of Internal Medicine 122 (5): 321–26.
Moertel, Charles G., Thomas R. Fleming, John S. Macdonald, Daniel G. Haller, John A. Laurie, Phyllis J. Goodman, James S. Ungerleider, et al. 1990. “Levamisole and Fluorouracil for Adjuvant Therapy of Resected Colon Carcinoma.” New England Journal of Medicine 322 (6): 352–58.
Nevo, Daniel, and Malka Gorfine. 2020. “Causal Inference for Semi-Competing Risks Data.” arXiv. https://doi.org/arXiv:2010.04485.
Philipson, Pete, Ines Sousa, Peter J. Diggle, Paula Williamson, Ruwanthi Kolamunnage-Dona, Robin Henderson, and Graeme L. Hickey. 2018. joineR: Joint Modelling of Repeated Measurements and Time-to-Event Data. https://github.com/graemeleehickey/joineR/.
Putter, H., M. Fiocco, and R. B. Geskus. 2007. “Tutorial in Biostatistics: Competing Risks and Multi-State Models.” Statistics in Medicine 26 (11): 2389–2430.
Putter, H, and C Spitoni. 2018. “Non-Parametric Estimation of Transition Probabilities in Non-Markov Multi-State Models: The Landmark Aalen-Johansen Estimator.” Statistical Methods in Medical Research 27: 2081–92.
R Core Team. 2021. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org/.
Rizopoulos, Dimitris. 2010. JM: An R Package for the Joint Modelling of Longitudinal and Time-to-Event Data.” Journal of Statistical Software 35 (9): 1–33. https://doi.org/10.18637/jss.v035.i09.
Rodriguez-Girondo, M., and J. de Uña-Álvarez. 2012. “A Nonparametric Test for Markovianity in the Illness-Death Model.” Statistics in Medicine 31 (30): 4416–27. https://doi.org/10.1002/sim.5619.
———. 2016. “Methods for Testing the Markov Condition in the Illness-Death Model: A Comparative Study.” Statistics in Medicine 35 (20): 3549–62. https://doi.org/10.1002/sim.6940.
Soutinho, G., and L. Meira-Machado. 2021. “Methods for Checking the Markov Condition in Multi-State Survival Data.” Computational Statistics. https://doi.org/10.1007/s00180-021- 01139-7.
Soutinho, Gustavo, Marta Sestelo, and Luís Meira-Machado. 2021. survidm: An R package for Inference and Prediction in an Illness-Death Model.” The R Journal 13 (2): 70–89. https://doi.org/10.32614/RJ-2021-070.
Titman, A. C., and H. Putter. 2020. “General Tests of the Markov Property in Multi-State Models.” Bio-Statistics. https://doi.org/10.1093/biostatistics/kxaa030.
Uña-Álvarez, Jacobo de, and Luis Meira-Machado. 2015. “Nonparametric Estimation of Transition Probabilities in the Non-Markov Illness-Death Model: A Comparative Study.” Biometrics 71 (2): 364–75.
Van Houwelingen, H. C. 2007. Dynamic Prediction by Landmarking in Event History Analysis.” Scandinavian Journal of Statistics 34 (1): 70–85.
Wu, C. F. J. 1986. Jackknife, Bootstrap, and Other Resampling Methods in Regression Analysis.” Annals of Statistics 14 (4): 1261–95.
Xu, J., JD. Kalbfleisch, and B. Tai. 2010. “Statistical Analysis of Illness-Death Processes and Semicompeting Risks Data.” Biometrics 66: 716–27. https://doi.org/10.1111/j.1541-0420.2009.01340.x.