My issue concerns an unexpectedly high intercept in AFT regression. Let me explain below:

Suppose you are modelling the time to an event via an Accelerated Failure Time (AFT) regression, i.e. given survival time $T$, we observe covariate values $x_{i1}, \dots, x_{ip}$ and a possibly censored survival time $t_i$, and model:

$$ \log(t_i) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \sigma \epsilon_i $$

Suppose we are looking at a Weibull AFT, i.e. where the $\epsilon_i$ are IID according to a Gumbel distribution (Extreme Value Type 1).
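For reference, this is the standard link between the Gumbel errors and the Weibull survival time: writing $\mu_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}$,

$$ \log(t_i) = \mu_i + \sigma \epsilon_i, \quad \epsilon_i \sim \text{Gumbel (minimum)} \implies T \sim \text{Weibull}\left(\text{shape} = 1/\sigma,\ \text{scale} = e^{\mu_i}\right) $$

which is what makes the expected-value formula used further down valid.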

You are looking at the case of time-varying covariates (assume just one for now), e.g. you have a dataset like the following, with a single time-dependent covariate (TDC_1), where START is the entry time (period start), END is the period end (exit time), and UNIT_ID is the ID for the entity in the study:

```
START  END  EVENT  UNIT_ID  TDC_1
    0    1      0        1    0.1
    1    2      0        1    0.2
    2    3      0        1    0.3
  ...
   19   20      1        1    1.9
    0    1      0        2    0.1
    1    2      0        2    0.2
    2    3      0        2    0.3
  ...
   19   20      1        2    1.9
```
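(For concreteness, a minimal sketch of how such a start/stop dataset could be built in R; the values, including the rows elided by "..." above, are illustrative placeholders:)

```
# Illustrative construction of the start/stop (counting-process) data above.
# Values are placeholders matching the pattern shown, not real data.
df <- do.call(rbind, lapply(1:2, function(unit) {
  data.frame(
    START   = 0:19,                    # interval entry times
    END     = 1:20,                    # interval exit times
    EVENT   = c(rep(0, 19), 1),        # event occurs only in the final interval
    UNIT_ID = unit,                    # entity identifier
    TDC_1   = seq(0.1, 2.0, by = 0.1)  # time-dependent covariate (illustrative)
  )
}))
```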

With the `aftreg` function from the `eha` library in R you can fit a Weibull AFT, e.g.

```
library(eha)

model <- aftreg(Surv(START, END, EVENT) ~ TDC_1,
                dist = "weibull", data = df,
                id = UNIT_ID, param = "lifeExp")
```

Calling `model$coefficients` gives:

```
model$coefficients
TDC_1      -0.905
log(scale)  9.393
log(shape)  0.046
```

The expected time to event when $T$ follows a Weibull distribution is given by:

$$ E(T \mid X_i) = \exp\left(\beta_0 + x_i \beta_1\right)\Gamma(1 + \sigma) = \exp\left(9.393 - 0.905 \cdot \mathrm{TDC}_1\right) \cdot 0.98 $$

since $\beta_0 = \log(\text{scale})$ and $\sigma = \frac{1}{\exp(\log(\text{shape}))}$.
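In R, that calculation looks like the following (a sketch that assumes `coef(model)` returns the named vector printed above):

```
# Back out the AFT parameters from the fitted model and compute E(T | X).
b     <- coef(model)
beta0 <- b["log(scale)"]           # intercept on the log-time scale
beta1 <- b["TDC_1"]
sigma <- 1 / exp(b["log(shape)"])  # sigma = 1 / shape

expected_time <- function(x) exp(beta0 + beta1 * x) * gamma(1 + sigma)
expected_time(0.1)                 # E(T | TDC_1 = 0.1)
```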

My question is about these parameter estimates, in particular the intercept term $\beta_0 = \log(\text{scale})$. No matter how I change the error-term parameterisation, e.g. if the $\epsilon_i$ are distributed normally (so that $T$ is lognormal), or if $\epsilon_i \sim \text{Logistic}$, etc., the intercept is exceptionally high and appears not to be optimal in terms of minimising error on the time to event.

**For example, if I manually subtract 2 from the intercept (9.393 - 2), I can reduce the root mean squared error on the time to event for the fitted dataset:**

```
Intercept  TIME_TO_EVENT_RMSE
9.393      776 days
7.393       97 days
```

Here TIME_TO_EVENT_RMSE is calculated as follows (on a dataset that contains only non-censored events):

$$ \mathrm{RMSE} = \sqrt{\sum_{i=1}^{n} \frac{\left(\exp\left(\beta_0 + x_i \beta_1\right)\Gamma(1 + \sigma) - t_i\right)^2}{n}} $$
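In R (a sketch; `events` is a hypothetical data frame with one row per non-censored unit, holding the observed time $t_i$ in `TIME_TO_EVENT` and the covariate $x_i$ in `TDC_1`):

```
# RMSE of predicted vs. observed time to event, non-censored units only.
# `events` and its column names are hypothetical; beta0, beta1 and sigma
# come from the fitted model as in the sketch above.
pred <- exp(beta0 + beta1 * events$TDC_1) * gamma(1 + sigma)
rmse <- sqrt(mean((pred - events$TIME_TO_EVENT)^2))
rmse
```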

For further illustration, suppose you model the target directly using exponential regression (i.e. linear regression on the logged target variable) with exactly the same dataset (using only non-censored events so the two are comparable). I know they are minimising different loss functions and aren't directly comparable, but just for illustration purposes:

```
TIME_TO_EVENT  UNIT_ID  TDC_1
           19        1    0.1
           18        1    0.2
           17        1    0.3
...
```

Here we have:

$$ E(T \mid X_i) = \exp\left(\beta_0 + x_i \beta_1\right) = \exp\left(8.03 - 0.5 \, x_i\right) $$
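(That fit is simply the following; `df_events` is a hypothetical name for the collapsed dataset above:)

```
# Exponential regression: OLS on the logged target variable.
# `df_events` is a hypothetical name for the non-censored dataset above.
lm_fit <- lm(log(TIME_TO_EVENT) ~ TDC_1, data = df_events)
coef(lm_fit)  # roughly (Intercept) 8.03, TDC_1 -0.5, per the figures quoted above
```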

I know that AFT regression is not directly minimising RMSE, and that in the AFT regression the TDC_1 coefficient magnitude is larger in addition to the larger intercept; however, with the intercept as high as it is, the model isn't particularly useful (it significantly over-predicts the time to event).

Questions:

- Has anyone experienced this before, and do you have any advice on how to improve the AFT model?
- Is there any way to fix the scale parameter with time-varying covariates in AFT regression?