Generate a sample time-to-event dataset with dependent right censoring under an Archimedean copula

Generate a sample time-to-event dataset with dependent right censoring, based on one of the Archimedean copulas, for a given Kendall's tau, sample size n and covariate matrix Z.

surv_data_dc(n, a, Z, lambda, betas, phis, cons7, cons9, tau, copula, distr.ev, distr.ce)

Arguments

n	the sample size, or the number of the subjects in a sample.
a	the shape parameter of baseline hazard for the event time $T$.
Z	the covariate matrix with dimension of $n$ by $p$, where $p$ is the number of covariates.
lambda	the scale parameter of baseline hazard for event time $T$.
betas	the regression coefficient vector of the proportional hazards model for the event time $T$, with dimension $p$ by $1$.
phis	the regression coefficient vector of the proportional hazards model for the dependent censoring time $C$, with dimension $p$ by $1$.
cons7	the parameter of baseline hazard for the dependent censoring time $C$ if assuming an exponential distribution.
cons9	the upper limit parameter of uniform distribution for the independent censoring time $A$, i.e. $A$~U(0, cons9).
tau	the Kendall's correlation coefficient between $T$ and $C$.
copula	the Archimedean copula that captures the dependence between $T$ and $C$, a character value, i.e. 'independent', 'clayton', 'gumbel' or 'frank'.
distr.ev	the distribution of the event time, a character value, i.e. 'weibull' or 'log logit'.
distr.ce	the distribution of the dependent censoring time, a character value, i.e. 'exponential' or 'weibull'.
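
The tau argument is given on the Kendall's tau scale, whereas an Archimedean copula is usually written in terms of its own association parameter. The sketch below shows the textbook mappings from tau to that parameter for the Clayton, Gumbel and Frank families, for positive tau only. The helpers frank_tau and tau_to_theta are hypothetical names introduced here, not functions of the package, and the parameterisation the package uses internally is assumed, not confirmed.

```r
# Hypothetical helpers (not part of the package): standard tau <-> theta relations.

# Kendall's tau implied by a Frank copula with parameter theta > 0:
# tau = 1 - (4 / theta) * (1 - D1(theta)), D1 = Debye function of order 1.
frank_tau <- function(theta) {
  integrand <- function(t) ifelse(t == 0, 1, t / expm1(t))
  debye1 <- integrate(integrand, lower = 0, upper = theta)$value / theta
  1 - 4 / theta * (1 - debye1)
}

# Copula parameter for a given Kendall's tau in (0, 1).
tau_to_theta <- function(tau, copula = c("clayton", "gumbel", "frank")) {
  copula <- match.arg(copula)
  stopifnot(tau > 0, tau < 1)
  switch(copula,
    clayton = 2 * tau / (1 - tau),   # from tau = theta / (theta + 2)
    gumbel  = 1 / (1 - tau),         # from tau = 1 - 1 / theta
    # No closed form for Frank: solve frank_tau(theta) = tau numerically.
    # The bracket is chosen for illustration and covers tau up to about 0.98.
    frank   = uniroot(function(th) frank_tau(th) - tau,
                      interval = c(1e-3, 200), tol = 1e-8)$root
  )
}

theta_clayton <- tau_to_theta(0.8, "clayton")
theta_frank   <- tau_to_theta(0.8, "frank")
```

A larger tau maps to a larger parameter in all three families, which corresponds to stronger positive dependence between $T$ and $C$.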

Value

A sample time-to-event dataset under dependent right censoring, which includes the observed time $X$, the event indicator $\delta$ and the dependent censoring indicator $\eta$.
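
As a small illustration of how the two indicators partition a sample, the sketch below labels each subject as an event, a dependent censoring or an independent censoring. The matrix surv_toy is made up for this example, and its column order (observed time, event indicator, dependent censoring indicator) is assumed from the positional indexing used in the Examples section.

```r
# Toy data mimicking the assumed layout of the returned object (values are made up).
surv_toy <- cbind(X   = c(1.2, 0.7, 3.4, 2.1),
                  del = c(1, 0, 0, 1),
                  eta = c(0, 1, 0, 0))

# A subject with del = 0 and eta = 0 was censored by the independent time A.
status <- ifelse(surv_toy[, 2] == 1, "event",
          ifelse(surv_toy[, 3] == 1, "dependent censoring",
                 "independent censoring"))
status <- factor(status, levels = c("event", "dependent censoring",
                                    "independent censoring"))

# Proportion of subjects in each category.
prop <- prop.table(table(status))
prop
```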

Details

surv_data_dc generates a survival dataset of sample size n under dependent right censoring, based on one of the Archimedean copulas, a given Kendall's tau, and a covariate matrix Z of dimension $n$ by $p$. For example, with p=2, we have Z=cbind(Z1, Z2), where Z1 is a treatment indicator generated from a Bernoulli(0.5) distribution (1 represents the treatment group and 0 the control group) and Z2 is the age generated from a U(-10, 10) distribution.

The generated dataset includes three variables, $X_i$, $\delta_i$ and $\eta_i$, where $X_i=\min(T_i, C_i, A_i)$, $\delta_i=I(X_i=T_i)$ and $\eta_i=I(X_i=C_i)$, for $i=1,\ldots,n$.

$T$ represents the event time, whose hazard function is $$h_T(x)=h_{0T}(x)\exp(Z^{\top}\beta),$$ where the baseline hazard can take the Weibull form, i.e. $h_{0T}(x) = ax^{a-1} / \lambda^a$, or the log-logistic form, i.e. $$ h_{0T}(x) = \frac{ \frac{1}{a \exp(\lambda)} \left( \frac{x}{\exp(\lambda)} \right)^{1/a - 1} }{ 1 + \left( \frac{x}{\exp(\lambda)} \right)^{1/a} }. $$

$C$ represents the dependent censoring time, whose hazard function is $h_{C}(x) = h_{0C}(x)\exp(Z^{\top}\phi)$, where the baseline hazard can take the exponential form, i.e. $h_{0C}(x)=cons7$, or the Weibull form, i.e. $h_{0C}(x) = ax^{a-1} / \lambda^a$.

$A$ represents the administrative or independent censoring time, where $A$~U(0, cons9).
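
To make the construction of $X_i$, $\delta_i$ and $\eta_i$ concrete, here is a minimal base-R sketch for the Weibull event time and exponential dependent censoring time, linked by a Clayton copula. It is not the package's implementation: it assumes the copula is applied to a pair of uniforms that are then inverted through the conditional survival functions, it uses the standard Clayton conditional-inversion sampler, and every object name (u, v, T_ev, C_dc, A_ic, sim) is introduced here for illustration only.

```r
set.seed(1)
n <- 200
Z <- cbind(rbinom(n, 1, 0.5), runif(n, min = -10, max = 10))
a <- 2; lambda <- 2; cons7 <- 0.2; cons9 <- 10
betas <- c(-0.5, 0.1); phis <- c(0.3, 0.2)
tau <- 0.8
theta <- 2 * tau / (1 - tau)          # Clayton parameter implied by Kendall's tau

# Dependent uniforms (u, v) from a Clayton copula by conditional inversion.
u <- runif(n)
w <- runif(n)
v <- (u^(-theta) * (w^(-theta / (1 + theta)) - 1) + 1)^(-1 / theta)

# Invert the survival functions implied by the hazards above:
# S_T(x | Z) = exp(-(x / lambda)^a * exp(Z beta))  (Weibull baseline)
# S_C(x | Z) = exp(-cons7 * x * exp(Z phi))        (exponential baseline)
T_ev <- lambda * (-log(u) / exp(drop(Z %*% betas)))^(1 / a)
C_dc <- -log(v) / (cons7 * exp(drop(Z %*% phis)))
A_ic <- runif(n, min = 0, max = cons9)   # independent censoring time

# Observed time and the two indicators.
X   <- pmin(T_ev, C_dc, A_ic)
del <- as.numeric(X == T_ev)
eta <- as.numeric(X == C_dc)
sim <- cbind(X = X, del = del, eta = eta)

colMeans(sim[, c("del", "eta")])         # event and dependent censoring proportions
```

Because the three latent times are continuous, ties occur with probability zero, so each subject has at most one of del and eta equal to 1; subjects with both equal to 0 are censored by $A$.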

References

Xu J, Ma J, Connors MH, Brodaty H. (2018). "Proportional hazard model estimation under dependent censoring using copulas and penalized likelihood". Statistics in Medicine 37, 2238–2251.

Author

Jing Xu, Jun Ma, Thomas Fung

Examples

 ##-- Copula types
 copula3 <- 'frank'

 ##-- Marginal distribution for T, C, and A
 a <- 2
 lambda <- 2
 cons7 <- 0.2
 cons9 <- 10
 tau <- 0.8
 betas <- c(-0.5, 0.1)
 phis <- c(0.3, 0.2)
 distr.ev <- 'weibull'
 distr.ce <- 'exponential'

 ##-- Sample size
 n <- 200

 ##-- One sample Monte Carlo dataset
 cova <- cbind(rbinom(n, 1, 0.5), runif(n, min=-10, max=10))
 surv <- surv_data_dc(n, a, cova, lambda, betas, phis, cons7, cons9,
                     tau, copula3, distr.ev, distr.ce)
 n <- nrow(cova)
 p <- ncol(cova)
 ##-- event and dependent censoring proportions
 colSums(surv)[c(2,3)]/n
#>   del   eta 
#> 0.480 0.325 
 X <- surv[, 1]   # observed time
 del <- surv[, 2] # event (failure) indicator
 eta <- surv[, 3] # dependent censoring indicator
