Abstract
This paper illustrates the usage of the betategarch package, a package for the simulation, estimation and forecasting of Beta-Skew-t-EGARCH models. The Beta-Skew-t-EGARCH model is a dynamic model of the scale or volatility of financial returns. The model is characterised by its robustness to jumps or outliers, and by its exponential specification of volatility. The latter enables richer dynamics, since parameters need not be restricted to be positive to ensure positivity of volatility. In addition, the model also allows for heavy tails and skewness in the conditional return (i.e. scaled return), and for leverage and a time-varying long-term component in the volatility specification. More generally, the model can be viewed as a model of the scale of the error in a dynamic regression.It is well known that financial returns are characterised by volatility clustering: Large returns in absolute value are likely to be followed by other large returns in absolute value, and small returns in absolute value are likely to be followed by other small returns in absolute value. This characteristic is usually modelled by specifications in the Autoregressive Conditional Heteroscedasticity (ARCH) class of models by R. Engle (1982). If \(y_t\) denotes the financial return at \(t\) such that \[\notag y_t = \sigma_t\varepsilon_t, \qquad \varepsilon_t\sim IID(0,\sigma_\varepsilon^2), \qquad \sigma_t^2 = h(\sigma_{t-1}, y_{t-1}, \ldots), \qquad t=1,2,\ldots,\] then the scale or volatility \(\sigma_t > 0\) is said to follow an ARCH process. Arguably, the most popular ARCH specification is the first order Generalised ARCH (GARCH) of Bollerslev (1986), where \(\sigma_t^2\) is modelled in an ARMA(1,1)-like manner, \(\sigma_t^2 = \omega + \phi_1\sigma_{t-1}^2 + \kappa_1 y_{t-1}^2\) with \(\sigma_\varepsilon^2 = 1\), see Francq and Zakoïan (2010) for a recent survey of GARCH models. If the financial return in question is predictable, then \(y_t\) can be interpreted as de-meaned return, i.e. the unpredictable part of return. However, more generally, \(y_t\) can be viewed as the error-term in a dynamic regression. Three characteristics that are often exhibited by financial returns are leverage (i.e. volatility asymmetry), conditional fat-tailedness and conditional skewness. The former means volatility tends to be higher after negative returns – this is typically attributed to leverage (hence the name), whereas conditional fat-tailedness means the standardised conditional return (i.e. \(\varepsilon_t\)) is more fat-tailed than the Gaussian. Conditional skewness means the standardised return is not symmetric. For stock returns, the skewness is typically negative, which means the probability of a large negative return is greater than a large positive return, even after controlling or adjusting for the recent level of volatility.
Several R packages provide facilities for the estimation and forecasting of univariate GARCH models that contains one or more of these features. Arguably, the three most important packages are tseries by Trapletti and Hornik (2013), fGarch by Wuertz et al. (2013) and rugarch by Ghalanos (2013). The tseries package probably has the fastest GARCH optimiser, but does not offer state-of-the-art specifications with leverage and fat-tailed skewed densities. This, by contrast, is provided by the fGarch and rugarch packages. The former has been considered by many – including myself – as the benchmark package in R for quite a while, since it provides a wide range of GARCH models coupled with a variety of densities. However, unfortunately, fGarch does not offer the possibility to estimate Exponential GARCH (EGARCH) models, i.e. models where the dynamics is specified in terms of \(\ln\sigma_t^2\) rather than in terms of \(\sigma_t^2\). EGARCH models are attractive, since they enable richer and more flexible dynamics (parameters can be negative), and since the autocorrelation function of returns depends on the conditional density (this is not the case for non-exponential ARCH models). The rugarch package, which despite its relative recent origin (available on CRAN since September 2011) already offers an impressive range of GARCH specifications and variations of these, fills this gap to some extent by providing Nelson’s (1991) EGARCH model. Another package that partially fills this gap is AutoSEARCH by Sucarrat (2012), which offers estimation and automated General-to-Specific model selection of log-ARCH-X models. However, to the best of my knowledge, there are no other R packages that provide facilities for additional EGARCH models. The betategarch package thus contributes to the R world by offering utilities for the simulation, estimation and forecasting of Beta-t-EGARCH models.
The Beta-t-EGARCH model was first proposed by Harvey (2013) and Harvey and Chakravarty (2008), but it can also be viewed as an unrestricted version of the Generalised Autoregressive Score (GAS) model of Creal, Koopmans, and Lucas (2013). The code upon which betategarch is based was originally developed for Harvey and Sucarrat (2013), which extends the Beta-t-EGARCH model to the skewed case. The Beta-Skew-t-EGARCH model has a number of attractions. First, the model is robust to jumps or outliers, and fares very well empirically for a variety of financial returns when compared with other GARCH models, see Harvey and Sucarrat (2013). Second, the model accommodates the most important characteristics of time-varying financial volatility: Leverage, conditional fat-tailedness, conditional skewness and a decomposition of volatility into a short-term and a long-term component. Third, the unconditional moments of return (i.e. \(y_t\)) exist (if the conditional moments exist), which is important in long-term forecasting and for the computation of the autocorrelation function of returns. By contrast, this is generally not the case for Nelson’s (1991) EGARCH when coupled with a \(t\) density: A necessary condition for the unconditional moments of the first-order specification to exist when \(\varepsilon_t\) is \(t\) is that the ARCH parameter is negative, see condition (A1.6) and the subsequent discussion in Nelson (1991, 365). Moreover, in the presence of leverage the ARCH parameter must be even more negative for the unconditional moments to exist. This is why Nelson (1991) proposed his model with a Generalised Error Distribution (GED) instead of a \(t\). Fourth, the asymptotic properties are much easier to derive than those of Nelson’s (1991) EGARCH, see Straumann and Mikosch (2006) and Wintenberger (2012) for a detailed description of the difficulties. Finally, since the conditional score drives the dynamics of the model, the Beta-t-EGARCH acquires some attractive theoretical properties. In particular, a simple transformation of the score is Beta-distributed (hence the name).
The two main functions of the betategarch package (version
3.1) are tegarchSim and tegarch. The first
simulates from a Beta-Skew-t-EGARCH, whereas the latter estimates one.
The tegarch function returns an object (a list) of the
tegarch class, and a collection of S3 methods developed for
this class can be applied to such objects: coef,
fitted, logLik, predict,
print, residuals, summary and
vcov. The rest of the functions in the package are either
auxiliary functions called by the two main functions, or a dataset,
nasdaq, which is included for illustration purposes (see
empirical example below). Finally, it is worth noting that the objects
returned by tegarchSim function and the fitted
and predict methods are of the class zoo
defined in package zoo, see
Zeileis and Grothendieck (2005) and Zeileis, Grothendieck, and Ryan (2013). This
means a large range of useful time-series methods and facilities can be
applied to these objects, including plotting and printing methods.
The martingale difference version of the first order one-component Beta-Skew-t-EGARCH model (see Sections 4 and 6 in Harvey and Sucarrat 2013) is given by \[\begin{aligned} y_t &=& \exp(\lambda_t)\varepsilon_t = \sigma_t\varepsilon_t, \qquad \varepsilon_t\sim st(0,\sigma_\varepsilon^2, \nu, \gamma), \qquad \nu>2, \qquad \gamma \in (0,\infty), \label{eq:betaskewty} \end{aligned} (\#eq:betaskewty) \]
\[\begin{aligned} \lambda_t &=& \omega + \lambda_t^\dagger, \label{eq:betaskewtlambda} \end{aligned} (\#eq:betaskewtlambda) \]
\[\begin{aligned} \lambda_t^\dagger &=& \phi_1\lambda_{t-1}^\dagger + \kappa_1 u_{t-1} + \kappa^* sgn(-y_{t-1})(u_{t-1}+1), \qquad |\phi_1|<1. \label{eq:betaskewtlambdadagger} \end{aligned} (\#eq:betaskewtlambdadagger) \]
The \(\sigma_t\) is the conditional scale or volatility, which need not equal the conditional standard deviation. In other words, \(\varepsilon_t\) is not standardised to have variance one. The conditional standard deviation is obtained as \(\sigma_t\sigma_\varepsilon\), where \(\sigma_\varepsilon^2\) is the variance of \(\varepsilon_t\). The conditional error \(\varepsilon_t\) is distributed as a Skewed \(t\) with zero mean, scale \(\sigma_\varepsilon^2\), degrees of freedom parameter \(\nu\) and skewness parameter \(\gamma\). The conditional error is defined as \(\varepsilon_t = \varepsilon_t^* - \mu_{\varepsilon^*}\), where \(\varepsilon_t^*\) is an uncentred (i.e. mean not necessarily equal to zero) Skewed \(t\) variable with \(\nu\) degrees of freedom, skewness parameter \(\gamma\) and mean \(\mu_{\varepsilon^*}\). A centred and symmetric (i.e. ordinary) \(t\)-distributed variable with mean zero is obtained when \(\gamma=1\), in which \(\mu_{\varepsilon^*}=0\), whereas a left-skewed (right-skewed) \(t\)-variable is obtained when \(\gamma < 1\) (\(\gamma > 1\)). More details on the distribution are given below. The \(\omega\) is the log-scale intercept and can be interpreted as the long-term log-volatility, \(\phi_1\) is the persistence parameter (the bigger, the more clustering), \(\kappa_1\) is the ARCH parameter (the bigger in absolute value, the greater the response to shocks), \(u_t\) is the conditional score (i.e. the derivative of the log-likelihood of \(y_t\) at \(t\) with respect to \(\lambda_t\)) and \(\kappa^*\) is the leverage parameter. A sufficient condition for stability in \(\lambda_t\) is \(|\phi_1| < 1\).
Let \(\epsilon^*\) denote an ordinary (i.e. centred and symmetric) \(t\)-distributed variable (with unit scale), and let \(f(\epsilon^*)\) denote its density. By means of the skewing method of Fernández and Steel (1998), the density of an uncentred Skewed \(t\) variable can be written as
\[f(\varepsilon^*|\gamma)=\frac{2}{\gamma
+\gamma ^{-1}}f\left( \frac{\varepsilon^*}{\gamma
^{sgn(\varepsilon^*)}}\right).
\label{eq:skewtdensityoffernandezsteele} (\#eq:skewtdensityoffernandezsteele) \]
Computation of the density values and of the mean of an uncentred Skewed
\(t\) variable, and random number
generation, can be undertaken with dST, STmean
and rST, respectively. For example, the following code
compares the empirical average of ten thousand random numbers with the
analytical mean:
library(betategarch)
set.seed(123)
eps <- rST(10000, df=5, skew=0.7)
mean(eps)
[1] -0.69805
STmean(df=5, skew=0.7)
[1] -0.6914265
In addition, the functions STvar,
STskewness and STkurtosis return the
analytical values of the variance, skewness (the standardised 3rd
moment) and kurtosis (the standardised 4th moment), respectively, of an
uncentered Skewed \(t\) variable.
The conditional score of the martingale difference version of the Beta-Skew-t-EGARCH model (see Harvey and Sucarrat 2013 Equation (32) in Section 4) is given by
\[\begin{eqnarray} \notag \frac{\partial \ln f_{y}(y_t)}{\partial \lambda_{t}} &=& u_t \\ &=& \frac{(\nu+1)[y_{t}^2 + y_t\mu_{\varepsilon^*}\exp(\lambda_{t})]}{\nu\exp(2\lambda_{t})\gamma^{2sgn(y_{t} + \mu_{\varepsilon^*}\exp(\lambda_{t}))} + (y_{t} + \mu_{\varepsilon^*}\exp(\lambda_{t}))^2} - 1. \label{eq:scoreu} \end{eqnarray} (\#eq:scoreu) \]
It is useful to note, however, that for simulation purposes the score \(u_t\) can also be written more conveniently as
\[u_t = \frac{ (\nu+1)(\varepsilon_t^{*2}-\mu_{\varepsilon^*} \varepsilon_t^*) }{ \nu\gamma^{2sgn(\varepsilon_t^*)} +\varepsilon_t^{*2} } - 1,\] where \(\varepsilon_t^*\) is an uncentred Skewed \(t\) variable. When the conditional distribution is symmetric (i.e. \(\gamma=1\)), then \(\frac{u_t+1}{\nu+1} \sim Beta(1/2,\nu/2)\). This explains the origin of the name Beta-t-EGARCH.
Financial returns that follow the one-component Beta-Skew-t-EGARCH
model given by (@ref(eq:betaskewty))-(@ref(eq:betaskewtlambdadagger))
can be simulated with the tegarchSim function. For example,
in order to generate two thousand returns from a specification with
empirically plausible values on \(\omega\), \(\phi_1\), \(\kappa_1\) and \(\nu\), but without leverage and skewness,
then the following code can be used:
y1 <- tegarchSim(2000, omega=0.1, phi1=0.95, kappa1=0.05, df=10)
Similarly, the following three commands each generate two thousand returns with, respectively, moderate leverage, strong left-skewness, and both moderate leverage and strong left-skewness:
y2 <- tegarchSim(2000, omega=0.1, phi1=0.95, kappa1=0.05, kappastar=0.02, df=10)
y3 <- tegarchSim(2000, omega=0.1, phi1=0.95, kappa1=0.05, df=10, skew=0.8)
y4 <- tegarchSim(2000, omega=0.1, phi1=0.95, kappa1=0.05, kappastar=0.02, df=10,
skew=0.8)
By default, the tegarchSim function returns the values
of \(y_t\) only. However, for the full
set of output one may use the verbose option. For example,
the following code generates 100 observations using the default
parameter values, stores the output in the matrix mY that
is of class zoo and returns the first six observations:
mY <- tegarchSim(100, verbose=TRUE)
head(mY)
y sigma stdev lambda lambdadagg u epsilon
1 0.19977534 1.0000000 1.118034 0.000000000 0.000000000 -0.9562733 0.19977534
2 -1.35118283 0.9904828 1.107393 -0.009562733 -0.009562733 0.7258681 -1.36416581
3 0.15475640 0.9981758 1.115994 -0.001825916 -0.001825916 -0.9736225 0.15503923
4 -0.04853563 0.9885947 1.105282 -0.011470845 -0.011470845 -0.9973492 -0.04909558
5 0.48034223 0.9793455 1.094942 -0.020870795 -0.020870795 -0.7415964 0.49047270
6 0.39742433 0.9731245 1.087986 -0.027243220 -0.027243220 -0.8195400 0.40840028
The last column named "epsilon" contains the centred Skewed \(t\) variable \(\varepsilon_t\) as defined in
(@ref(eq:betaskewty)). The zeros for \(\lambda_t\) and \(\lambda_t^\dagger\) in the first row are
due to the default initial value. This can be changed via the
lambda.initial option.
Squared financial return often exhibit long-memory, see Ding, Engle, and Granger (1993) and Ding and Granger (1996). Two-component models of volatility accommodate the long-memory property by decomposing volatility into one long-term component and one short-term component. The role of the latter is to pick up temporary changes following a shock.
The martingale difference version of the first order two-component Beta-Skew-t-EGARCH model (see Sections 2.5 and 6 in Harvey and Sucarrat 2013) is given by
\[\begin{aligned} y &=& \exp(\lambda_t)\varepsilon_t = \sigma_t\varepsilon_t, \qquad \varepsilon_t\sim st(0,\sigma_\varepsilon^2, \nu, \gamma), \qquad \nu, \gamma \in (0,\infty), \label{eq:2betaskewty} \end{aligned} (\#eq:2betaskewty) \]
\[\begin{aligned} \lambda_t &=& \omega + \lambda_{1,t}^\dagger + \lambda_{2,t}^\dagger, \label{eq:2betaskewtlambda} \end{aligned} (\#eq:2betaskewtlambda) \]
\[\begin{aligned} \lambda_{1,t}^\dagger &=& \phi_1\lambda_{1,t-1}^\dagger + \kappa_1 u_{t-1}, \qquad |\phi_1|<1, \label{eq:2betaskewtlambdadaggerlong} \end{aligned} (\#eq:2betaskewtlambdadaggerlong) \]
\[\begin{aligned} \lambda_{2,t}^\dagger &=& \phi_2\lambda_{2,t-1}^\dagger + \kappa_2 u_{t-1} + \kappa^* sgn(-y_{t-1})(u_{t-1}+1), \qquad |\phi_2|<1, \qquad \phi_1\neq\phi_2. \label{eq:2betaskewtlambdadaggershort} \end{aligned} (\#eq:2betaskewtlambdadaggershort) \] The \(\lambda_{1,t}\) and \(\lambda_{2,t}\) can be interpreted as the time-varying long-term and short-term components of log-volatility, respectively. The conditional score \(u_t\), also here given by (@ref(eq:scoreu)), drives both the long-run and short-run components, but leverage appears only in the short-term component. This is in line with the view that shocks only matter for short-term volatility, see e.g. R. F. Engle and Lee (1999). The model is not identifiable if \(\phi_{2}=\phi_{1}\).
Returns that follow a two-component Beta-Skew-t-EGARCH model given by
(@ref(eq:2betaskewty))-(@ref(eq:2betaskewtlambdadaggershort)) can also
be simulated with the tegarchSim function. For example, the
following code generates three thousand returns from a specification
with empirically plausible values on the other parameters, but without
leverage and skewness:
y1 <- tegarchSim(3000, omega=0.2, phi1=0.98, phi2=0.9, kappa1=0.01, kappa2=0.02, df=5)
Similarly, just as in the one-component case, the following code generates three thousand values of \(y_t\) with, respectively, leverage, skewness, and both leverage and skewness:
y2 <- tegarchSim(3000, omega=0.2, phi1=0.98, phi2=0.9, kappa1=0.01, kappa2=0.02,
kappastar=0.04, df=5)
y3 <- tegarchSim(3000, omega=0.2, phi1=0.98, phi2=0.9, kappa1=0.01, kappa2=0.02,
df=5, skew=0.95)
y4 <- tegarchSim(3000, omega=0.2, phi1=0.98, phi2=0.9, kappa1=0.01, kappa2=0.02,
kappastar=0.04, df=5, skew=0.95)
Also here is the verbose option available for a more
detailed output, and also here can thelambda.initial option
be used to change the initial values.
One-component and two-component specifications can be estimated with
the tegarch function. For example, the following code
generates 5000 values of \(y_t\) with
default parameter values, estimates a one-component specification with
leverage and skewness, and then prints the most important
information:
set.seed(123)
y <- tegarchSim(5000)
onecompmod <- tegarch(y)
onecompmod
Date: Wed Dec 04 19:53:52 2013
Message (nlminb): relative convergence (4)
Coefficients:
omega phi1 kappa1 kappastar df skew
Estimate: -0.002491487 0.92076991 0.014298775 -0.003284977 9.371817 0.99867519
Std. Error: 0.018374338 0.04263685 0.005027258 0.003574193 1.087466 0.01979645
Log-likelihood: -7635.482215
BIC: 3.064414
Estimation without leverage or skewness or both can be achieved by
setting the asym and skew options to
FALSE. For alternative summaries of the estimation results
the summary method can be used, either with its
verbose option set to FALSE (default) or
TRUE (more information is returned). The latter returns,
amongst other, the initial values used, the upper and lower bounds used
(see Section on “Computational challenges” below) and the numerically
estimated Hessian. For additional inference on the parameters the
covariance-matrix of the parameter estimates can be extracted with the
vcov method:
vcov(onecompmod)
omega phi1 kappa1 kappastar df
omega 3.376163e-04 4.821495e-06 -3.369968e-06 3.781322e-06 1.202316e-02
phi1 4.821495e-06 1.817901e-03 -1.321280e-04 7.384223e-05 1.460351e-05
kappa1 -3.369968e-06 -1.321280e-04 2.527332e-05 -5.374307e-06 -5.498031e-04
kappastar 3.781322e-06 7.384223e-05 -5.374307e-06 1.277486e-05 2.261583e-05
df 1.202316e-02 1.460351e-05 -5.498031e-04 2.261583e-05 1.182581e+00
skew 1.808787e-06 -6.327049e-05 3.869190e-06 -7.473051e-06 2.075498e-04
skew
omega 1.808787e-06
phi1 -6.327049e-05
kappa1 3.869190e-06
kappastar -7.473051e-06
df 2.075498e-04
skew 3.918993e-04
In order to estimate a two-component Beta-Skew-t-EGARCH model the
components option has to be set to 2:
twocompmod <- tegarch(y, components=2)
To estimate a two-component model without skewness the
skew argument must be set to FALSE. Leverage,
however, cannot be turned off in the two-component model for
identifiability reasons.
Forecasts of volatility – i.e. either the conditional standard
deviation or the conditional scale – can be generated with the
predict method applied to a tegarch object.
The formula for \(n\)-step ahead
forecasts of conditional scale \(\sigma_t\) is \[E_t(\sigma_{t+n}) =
\exp(\omega+\phi_{1}^{n}\lambda_{t}^\dagger)\cdot \Pi_{i=1}^{n}
E_t\left[ \exp(\phi_1^{n-i}g_{t+i-1}) \right]\] for the
one-component model, where \(g_t\) is
an IID term equal to \(\kappa_1u_t +
\kappa^*sgn(-\varepsilon)(u_t + 1)\). The \(t\) subscript in the conditional
expectation operator \(E_t(\cdot)\)
means the set of conditioning information contains all values up to and
including period \(t\). Accordingly,
for \(i=1\) the value of \(E_t[ \exp(\phi_1^{n-i}g_{t+i-1}) ]\) is
\(\exp(\phi_1^{n-1}g_{t})\). For \(i>1\), however, the expectations are not
available in explicit form, so they are estimated by computing the
sample mean of a large number of simulated IID variates. The default
number of IID variates is 10 000, but this can be changed via the
n.sim option. Another default option is that the
predict method only returns forecasts of the conditional
standard deviation \[E_t(\sigma_{t+n})\sigma_\varepsilon.\]
However, if the verbose option is changed to
TRUE, then forecasts of the conditional scale are also
returned. Similarly, the initial values used are by default those of the
last observation in the estimation sample. This can be altered via the
initial.values option, e.g. for counterfactual or scenario
analysis purposes.
The scale formula for the two-component specification is \[E_t(\sigma_{t+n}) = \exp(\omega+\phi_{1}^{n}\lambda_{1,t}^\dagger+\phi_{2}^{n}\lambda_{2,t}^\dagger)\cdot \Pi_{i=1}^{n} E_t\left[ \exp(\phi_1^{n-i}g_{1,t+i-1}+\phi_2^{n-i}g_{2,t+i-1}) \right],\] where \(g_{1,t} = \kappa_1u_t\) and \(g_{2,t} = \kappa_2u_t + \kappa^*sgn(-\varepsilon)(u_t + 1)\). Again, forecasts of conditional standard deviations are given by \(E_t(\sigma_{t+n})\sigma_\varepsilon\), and the expectations in the last term on the right are estimated by simulation for \(i>1\).
As an example, the following code generates forecasts up to 7-period ahead for both scale and standard deviation for the one-component model estimated above:
set.seed(123)
predict(onecompmod, n.ahead=7, verbose=TRUE)
sigma stdev
1 1.012363 1.141463
2 1.014470 1.143838
3 1.012704 1.141847
4 1.011081 1.140017
5 1.009589 1.138335
6 1.008217 1.136788
7 1.006955 1.135365
The returned object is of class zoo, so a convenient
time-series plotting method is readily available. Similarly, the command
predict(twocompmod, n.ahead=7, verbose=TRUE) generates a
corresponding set of forecasts for the two-component model estimated
above.
Estimation of the Beta-Skew-t-EGARCH model is by exact Maximum
Log-likelihood (ML). The expressions of the first and second derivatives
are not available in explicit form, so the procedure is all numerical.
This leads to four computational challenges. The first is simply that
the model is highly nonlinear, which is readily apparent by simply
looking at the expression for the score (equation (@ref(eq:scoreu))).
Moreover, the degree of non-linearity is compounded in the two-component
specification. However, as no positivity constraints on the ARCH, GARCH
and leverage parameters (i.e. \(\omega,
\phi_1, \phi_2, \kappa_1, \kappa_2, \kappa\)) are needed, the
numerical challenges are in fact not as large as those of, say, the
two-component GARCH model of (R. F. Engle and Lee
1999). There, positivity constraints on all parameters are
necessary. As in all highly nonlinear estimation problems a set of good
initial values is essential. By default, these are 0.02, 0.95, 0.05,
0.01, 10, 0.98 for one-component specifications, and 0.02, 0.95, 0.9,
0.001, 0.01, 0.005, 10, 0.98 for two-component specifications. However,
if the user wishes to do so they can be changed via the
initial.values option in the tegarch function.
The summary method with option verbose=TRUE
returns (amongst other) the initial values used in estimation.
The second computational challenge is that dynamic stability or
stationarity – in practice that \(|\phi_1|
< 1\) (and \(|\phi_2| <
1\) in the two-component case) – is required for estimation,
whereas empirically \(\phi_1\) (and
\(\phi_2\)) is often close to, but just
below, 1. In order to avoid explosive recursions during estimation it is
therefore desirable to restrict \(\phi_1\) (and \(\phi_2\)) such that \(|\phi_1| < 1\) (and \(|\phi_2| < 1\)). My experience suggests
the nlminb function is an excellent choice for this type of
problems. The optim function with the L-BFGS-U option
provides a similar bounding facility, but in practice it does not work
as well as nlminb (it is less precise and fails more often
in my experience). As for bounds, the skewing parameter \(\gamma\) is only defined on strictly
positive values, and on theoretical grounds the degrees of freedom
parameter \(\nu\) must be greater than
2. For these reasons the default lower bounds on \(\phi_1\), \(\phi_2\), \(\nu\) and \(\gamma\) are
-1+.Machine$double.eps,
-1+.Machine$double.eps, 2+.Machine$double.eps
and .Machine$double.eps, and their default upper bounds are
1-.Machine$double.eps, 1-.Machine$double.eps,
+Inf and +Inf (i.e. \(\nu\) and \(\gamma\) are unbounded from above). The
other parameters are unbounded (i.e. their default upper and lower
bounds are +Inf and -Inf, respectively). If
the user wishes to do so the bounds can be changed via the
lower and upper options in the
tegarch function. Additional control options can also be
passed on to the nlminb optimiser (see nlminb
documentation) by introducing them as arguments in the
tegarch function.
A third computational challenge is due to the presence of the sign
function \(sgn\) in the skewed density
(@ref(eq:skewtdensityoffernandezsteele)), which means the log-likelihood
is not differentiable in \(\gamma\) at
the skewness change point. The log-likelihood is continuous, so the
problem is likely to be numerical only and not theoretical. In fact,
consistency and asymptotic normality results often hold regardless (see e.g. Zhu and Galbraith 2010). Most of the
time the user will not encounter problems due to this characteristic,
neither in simulation nor in empirical practice. Occasionally, however,
the numerically estimated gradients (analytical ones are not available
in explicit form) may explode when they iterate towards the proximity of
the skewness change point. The nlminb function together
with its bounding facility resolves this problem to a large extent. For
extra protection against NA and Inf values the
log-likelihood functions (tegarchLogl and
tegarchLogl2) check for NA and
Inf values at each iteration: Whenever such values are
produced, then a small value on the log-likelihood is returned. By
default this small value is the log-likelihood associated with the
initial values, but this can be changed via the
logl.penalty option in the tegarch
function.
The fourth computational challenge is easily resolved. The
expressions for the density function of the \(t\)-distribution and the moments of a \(t\)-distributed variable contain the
gamma function in both the numerator and the denominator.
When the argument of the gamma function is 172 or larger
(i.e. 344 degrees of freedom or higher), then a value of
Inf/Inf = NaN is returned, which
can be detrimental to estimation. This issue is not particular to R but
a consequence of how the Gamma function is implemented numerically in
computer languages in general. Luckily, the issue can be easily resolved
by noting that beta(x,y) = gamma(x) *
gamma(y)/gamma(x+y). The \(t\)-density and the moments of \(t\)-variables are thus readily re-written
in terms of the beta function. The fGarch package
has been using this parametrisation of the \(t\)-distribution for some time (always?),
and I suspect it is for the same reason.
This section illustrates the use of the package in an empirical
example (it is available as a demo by typing
demo(betategarch)). The financial return in question is the
daily log-return (in percent) of the Nasdaq 100 composite index
(adjusted closing values). The study by Harvey
and Sucarrat (2013) suggests stock market indices are
particularly prone to be characterised by leverage, conditional
fat-tailedness, skewness and a long-term component, so a stock market
index is therefore specially suited to illustrate the usefulness of the
Beta-Skew-t-EGARCH model. Also, the Nasdaq index was not included in
Harvey and Sucarrat (2013) study. The
source of the data is http://finance.yahoo.com/ and the period in question is
3 January 2001 to 15 October 2013, i.e. a total of 3215
observations.
The following code loads the data, defines y to equal
daily return in terms of a ‘zoo’ object and plots
y:
data(nasdaq)
y <- zoo(nasdaq[,"nasdaqret"], order.by=as.Date(nasdaq[,"day"], "%Y-%m-%d"))
plot(y, main="The Nasdaq 100 index (daily)", xlab="", ylab="Log-return in %")
The plot appears as the upper graph in Figure 1 and shows that the return series is clearly characterised by time-varying volatility.
To estimate a one-component Beta-Skew-t-EGARCH, to extract its fitted values and to plot the fitted conditional standard deviations, the following code can be used:
nasdaq1comp <- tegarch(y)
nasdaq1stdev <- fitted(nasdaq1comp)
plot(nasdaq1stdev, main="", ylab="1-comp: St.dev.", xlab="")
The estimation results are stored in nasdaq1comp, so
typing nasdaq1comp yields
Date: Mon Dec 09 17:42:06 2013
Message (nlminb): relative convergence (4)
Coefficients:
omega phi1 kappa1 kappastar df skew
Estimate: 1.0421017 0.996543407 0.023508613 0.032033017 10.336337 0.85670426
Std. Error: 0.2412326 0.001184624 0.003542337 0.003065121 1.646172 0.01925872
Log-likelihood: -5586.666891
BIC: 3.490447
The degrees of freedom in the Skewed \(t\) is estimated to be 10.3, a reasonably
fat-tailed conditional \(t\) density,
whereas the skewness is estimated to be about 0.86, which corresponds to
pronounced negative skewness in \(\varepsilon_t\). Also, the test statistic
\((0.8567-1)/0.0193 = -7.4249\) is
significant at all conventional significance levels. For a model without
skewness, the command tegarch(y, skew=FALSE) can be used.
For a closer look at the standardised residuals \(\hat{\varepsilon}_t/\hat{\sigma}_\varepsilon\)
in the estimated model above, the residuals method can be
used for extraction. For the unstandardised residuals \(\hat{\varepsilon}_t\), the
standardised argument must be set to FALSE.
BIC is the Schwarz (1978)
information criterion in terms of the average log-likelihood. By
default, the fitted method returns the fitted conditional
standard deviation. However, more output is available by using the
verbose option, i.e. by typing
fitted(nasdaq1comp, verbose=TRUE). This
returns a matrix with, amongst other, the fitted scale and log-scale,
the estimated score and the residuals. The returned matrix is a
‘zoo’ object that is automatically matched with the dates –
if any – of returns \(y_t\). The middle
graph in Figure 1 contains the plot of the
fitted conditional standard deviations of the one-component model.
To estimate a two-component Beta-Skew-t-EGARCH model, to extract its fitted values and to plot its fitted conditional standard deviation, the following code can be used:
nasdaq2comp <- tegarch(y, components=2)
nasdaq2stdev <- fitted(nasdaq2comp)
plot(nasdaq2stdev, main="", ylab="2-comp: St.dev.", xlab="")
The plot of the fitted conditional standard deviations appears as the
lower graph in Figure 1. Typing
nasdaq2comp returns
Date: Mon Dec 09 17:42:09 2013
Message (nlminb): relative convergence (4)
Coefficients:
omega phi1 phi2 kappa1 kappa2 kappastar
Estimate: 1.4409972 1.0000000000 0.94175462 0.022074604 0.006442775 0.049203985
Std. Error: 0.1991004 0.0004089023 0.01940377 0.004854267 0.006545395 0.006899511
df skew
Estimate: 9.732886 0.89320223
Std. Error: 1.564813 0.02094124
Log-likelihood: -5573.471563
BIC: 3.487262
Comparison of the BIC values of the two models suggests
the latter provides a better fit. This provides further evidence in
favour of two-component models in modelling the volatility of financial
stock-market returns.
To generate out-of-sample forecasts up to 5-days ahead, the following code can be used:
set.seed(123)
predict(nasdaq1comp, n.ahead=5)
1 2 3 4 5
0.8121401 0.8179406 0.8218067 0.8256776 0.8295533
In other words, the predicted conditional standard deviation 5
trading days after the 15th of October 2013 is about 0.8296. By default,
the fitted values of \(\lambda_t\) and
\(\lambda_t^\dagger\) for the last day
of the estimation sample are used as initial values. However,
alternative initial values on \(\lambda_t\) and \(\lambda_t^\dagger\) can be specified by the
user via the initial.values option.
Conceptually the Beta-Skew-t-EGARCH model can appear complicated. So one may ask whether it performs better than conceptually simpler models like, say, an ordinary GARCH model with leverage and the two-component model of R. F. Engle and Lee (1999). An ordinary GARCH(1,1) model with leverage of the Glosten, Jagannathan, and Runkle (1993) type, sometimes referred to as the GJR-GARCH, coupled with a standardised Skewed \(t\) conditional density is given by n\[\begin{eqnarray} y &=& \sigma_t\varepsilon_t, \qquad \varepsilon_t\sim st(0,1, \nu, \gamma), \\ \sigma_t^2 &=& \omega + \phi_1\sigma_{t-1}^2 + \kappa_1 y_{t-1}^2 + \kappa^* I_{\{y_{t-1}<0\}}y_{t-1}^2, \end{eqnarray}\] where the parameters have similar interpretations to those of the Beta-Skew-t-EGARCH specification. The model can be estimated with the fGarch package by means of
library(fGarch)
nasdaqgarch <- garchFit(data=y, cond.dist="sstd",
include.mean=FALSE, include.skew=TRUE, leverage=TRUE)
Next, summary(nasdaqgarch) yields (amongst other)
Estimate Std. Error t value Pr(>|t|)
omega 0.016866 0.004315 3.908 9.29e-05 ***
alpha1 0.040237 0.009882 4.072 4.67e-05 ***
gamma1 0.712143 0.183734 3.876 0.000106 ***
beta1 0.933986 0.008357 111.762 < 2e-16 ***
skew 0.898964 0.021099 42.608 < 2e-16 ***
shape 9.910543 1.562340 6.343 2.25e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log Likelihood:
-5591.593 normalized: -1.73922
Information Criterion Statistics:
AIC BIC SIC HQIC
3.482173 3.493511 3.482166 3.486237
In the table alpha1, gamma1 and
beta1 correspond to \(\kappa_1\), \(\kappa^*\) and \(\phi_1\), respectively. The skewness and
degrees of freedom estimates are relatively similar to those of the
Beta-Skew-t-EGARCH above. However, the BIC value is lower, which means
the Beta-Skew-t-EGARCH provides a better fit according to this
criterion.
In the rugarch package the first order two-component model of R. F. Engle and Lee (1999) coupled with a standardised Skewed \(t\) conditional density is given by n\[\begin{eqnarray} y &=& \sigma_t\varepsilon_t, \qquad \varepsilon_t\sim st(0,1, \nu, \gamma), \\ \sigma_t^2 &=& q_t + \phi_1(\sigma_{t-1}^2-q_{t-1}) + \kappa_1 (y_{t-1}^2-\sigma_{t-1}^2),\\ q_t^2 &=& \omega + \phi_2q_{t-1}^2 + \kappa_1 (y_{t-1}^2-\sigma_{t-1}^2). \end{eqnarray}\] In other words, rugarch does not provide the option to include leverage in the R. F. Engle and Lee (1999) model. Estimation can be undertaken with
library(rugarch)
EngleLeeSpec <- ugarchspec(variance.model = list(model = "csGARCH",
garchOrder = c(1, 1)), mean.model=list(armaOrder=c(0,0),
include.mean=FALSE), distribution.model="sstd")
nasdaqEngleLee <- ugarchfit(EngleLeeSpec, y, solver="nlminb")
Next, summary(nasdaqgarch) yields (amongst other)
Estimate Std. Error t value Pr(>|t|)
omega 0.008419 0.003911 2.1528 0.031337
alpha1 0.021006 0.014765 1.4228 0.154807
beta1 0.931799 0.044489 20.9446 0.000000
eta11 0.995956 0.002383 417.9196 0.000000
eta21 0.044147 0.013570 3.2533 0.001141
skew 0.912973 0.020984 43.5076 0.000000
shape 11.055129 2.095500 5.2757 0.000000
LogLikelihood : -5617.186
Information Criteria
------------------------------------
Akaike 3.4987
Bayes 3.5119
Shibata 3.4987
Hannan-Quinn 3.5035
In the table alpha1, beta1,
eta11 and eta21 correspond to \(\kappa_1\), \(\phi_1\), \(\phi_2\) and \(\kappa_2\), respectively. The skewness and
degrees of freedom estimates are again relatively similar to those of
the Beta-Skew-t-EGARCH above, and again the BIC value is higher. So also
in this case does the Beta-Skew-t-EGARCH provide a better fit according
to BIC. In fact, the log-likelihood of the R. F.
Engle and Lee (1999) model is even lower than that of the
GJR-GARCH, which contains fewer parameters. This suggests leverage,
which is omitted from the rugarch implementation of the R. F. Engle and Lee (1999) model, is an
important determinant of Nasdaq 100 return volatility.
This paper illustrates how the betategarch package can be
used for the simulation, estimation and forecasting of one-component and
two-component first order Beta-Skew-t-EGARCH models. The model allows
for skewed and heavy-tailed \(t\)-distributed conditional errors, and
leverage and a time-varying long-term component in the volatility
specification. The two main functions of the package are
tegarchSim and tegarch. The first simulates
from a Beta-Skew-t-EGARCH model, either a one-component or two-component
specification, whereas the latter function estimates one. The object (a
list) returned by the second function is of class tegarch,
and a collection of S3 methods can be applied to objects from this
class: coef, fitted, logLik,
predict, print, residuals,
summary and vcov. Finally, the empirical
illustration on daily Nasdaq 100 returns provides further in-sample
evidence in favour of the Beta-Skew-t-EGARCH model.