Stochastic Mortality

StMoMo: An R Package for Stochastic Mortality
Andr´es M. Villegas
UNSW Business School
UNSW Sydney
Pietro Millossovich
Cass Business School
City, University of London
DEAM, University of Trieste
Vladimir K. Kaishev
Cass Business School
City, University of London
In this paper we mirror the framework of generalized (non-)linear models to define the
family of generalized age-period-cohort stochastic mortality models which encompasses
the vast majority of stochastic mortality projection models proposed to date, including
the well-known Lee-Carter and Cairns-Blake-Dowd models. We also introduce the R package StMoMo which exploits the unifying framework of the generalized age-period-cohort
family to provide tools for fitting stochastic mortality models, assessing their goodness
of fit and performing mortality projections. We illustrate some of the capabilities of the
package by performing a comparison of several stochastic mortality models applied to the
England and Wales population.
Keywords: mortality modeling; mortality forecasting; age-period-cohort; generalized nonlinear models.
1. Introduction
During the last two centuries developed countries experienced a persistent increase in life
expectancy. For instance, Oeppen and Vaupel (2002) estimate that during the last 160 years
the world record in female life expectancy at birth has increased at an approximate steady
pace of 3 months per year. This increase in life expectancy, though a sign of social progress,
poses a challenge to governments, private pension plans and life insurers because of its impact on pension and health costs. Actuaries and demographers have recognized the problems
caused by an aging population and rising longevity and have thus devoted significant attention
to the development of statistical techniques for the modeling and projection of mortality rates.
One of the most influential approaches to the stochastic modeling of mortality rates is the
parsimonious mortality model proposed by Lee and Carter (1992). This model uses principal
component analysis to decompose the age-time matrix of mortality rates into a bilinear combination of age and period parameters, with the latter being treated as time series to produce
mortality projections. The Lee-Carter model has inspired numerous variants and extensions.
For instance, Lee and Miller (2001), Booth et al. (2002), and Brouhns et al. (2002) have
proposed alternative estimation approaches in order to improve the goodness-of-fit and the
forecasting properties of the model. In particular, Brouhns et al. (2002) propose a more formal
2 StMoMo: An R Package for Stochastic Mortality Modeling
statistical approach to estimating the parameters by embedding the Lee-Carter model into a
Poisson regression setting. Other authors have extended the Lee-Carter model by including
additional terms, such as multiple bilinear age-period components (Renshaw and Haberman
2003; Hyndman and Ullah 2007), or a cohort effect term (Renshaw and Haberman 2006).
The two factor Cairns-Blake-Dowd (CBD) model introduced by Cairns et al. (2006) is one
of the most prominent variants of the Lee-Carter model. The CBD model relies on the linearity of the logit of one-year death probabilities at older ages. Specifically, it assumes that,
for a given, year the logit of the one-year death probability is a linear function of age, and
treats the intercept and slope parameters across years as stochastic processes. Cairns et al.
(2009) consider three extensions to the original CBD model by incorporating combinations
of a quadratic age term and a cohort effect term. Plat (2009) has combined features of the
CBD and the Lee-Carter models to produce a model that is suitable for full age ranges and
captures the cohort effect.
Given the abundance and rapid increase in the number of stochastic mortality models proposed in the literature, there have been some recent attempts to find the commonalities among
these models. Hunt and Blake (2015) review the structure of mortality models and describe
an age-period-cohort model structure which encompasses the vast majority of stochastic mortality models. Currie (2016) shows that many mortality models can be expressed in terms of
generalized linear models or generalized non-linear models.
In this paper, we build upon the works of Hunt and Blake (2015) and Currie (2016) to define
the family of generalized age-period-cohort stochastic mortality models by mirroring the terminology of generalized linear models. We also introduce the R package StMoMo (Villegas
et al. 2017) which exploits the unifying framework of the generalized age-period-cohort family
combined with the powerful fitting function of the gnm package (Turner and Firth 2015) to
provide computational tools for implementing many of the stochastic mortality models proposed to date.1 The StMoMo package is available at
StMoMo. Version 0.4.0 has been used for this paper.
Several packages for mortality modeling are available in the R environment (R Core Team
2016). The package demography (Hyndman et al. 2014), whose usage is explained in detail
in Booth et al. (2014), implements, among other things, the original Lee-Carter model along
with the Lee and Miller (2001), Booth et al. (2002), and Hyndman and Ullah (2007) variants.
The ilc package (Butt et al. 2014) implements the Renshaw and Haberman (2006) cohort
extension of the Lee-Carter model together with the Lee-Carter model under a Poisson regression framework. The LifeMetrics R functions implement the original CBD model and the
three extended CBD models considered in Cairns et al. (2009), along with the Lee-Carter
model (using Poisson maximum likelihood), the traditional age-period-cohort model (see, Osmond (1985)) and the Renshaw and Haberman (2006) model. This package, which is not on
CRAN, is available at
1The acronym StMoMo, pronounced Saint Momo, stands for Stochastic Mortality Modeling. Momo is the
king of Carnivals in numerous Latin American festivities (Wikipedia 2014).
Andr´es M. Villegas, Vladimir K. Kaishev, Pietro Millossovich 3
There are however several drawbacks of the existing packages which our package StMoMo
seeks to overcome. First, the existing packages are based on model-specific fitting algorithms limiting the models available to those already predefined in the packages. By contrast,
StMoMo allows users to easily expand the number of models available. In addition, whilst
StMoMo provides forecasting and simulation functions for any model within the generalized age-period-cohort family, the existing packages only provide such functions for a limited
number of models. For instance, the package ilc only includes forecasting functions for the
Lee-Carter model. Similarly, simulation with the package LifeMetrics is limited to the LeeCarter and the standard CBD models. Finally, StMoMo provides functions which are not
available in existing packages, such as tools for analyzing the goodness-of-fit2 and evaluating
the impact of parameter uncertainty using bootstrapping techniques.
StMoMo comes with a set of functions for defining an abstract model — specifying for instance
the number of period terms, whether coefficients are parametric or not — and for fitting a
given model. This is particularly useful when estimating several models on a given dataset or
a given model to different datasets. The package also provides preset functions for defining
the most common models available in the mortality forecasting literature. In addition, other
models preferred by the user can be created in a very simple fashion, see Section 4 where
several examples are given. Therefore, the flexibility of the package allows a user to quickly
build up a battery of different models, and this is particularly useful when seeking the most
appropriate mortality model, comparing different models or assessing model risk. StMoMo is
particularly appealing for actuaries managing life and pensions portfolios exposed to longevity
risk. The code backing most functions implemented in the package has been extensively used
and tested for the development of multi-population mortality models for assessing basis risk
in longevity risk transactions, see Haberman et al. (2014).
In this paper we describe the statistical framework underlying StMoMo and illustrate its
usage. For this purpose, we use as a running example a comparison of several stochastic
mortality models fitted to the England and Wales population. This example is in the spirit
of the comparison exercises of Cairns et al. (2009, 2011), Haberman and Renshaw (2011) and
Lov´asz (2011), allowing us to show how several of the analysis performed in these papers can
easily be replicated using StMoMo. The structure of the paper is as follows. In Section 2 we
introduce our notation. In Section 3, we mirror the framework of generalized linear models to
define the family of generalized age-period-cohort (GAPC) stochastic mortality models and
demonstrate that many of the mortality models discussed in the literature can be framed
within this family. In Section 4 we explain how the GAPC family of models is implemented
in StMoMo. In Section 5, we describe the fitting of GAPC mortality models and illustrate
how this can be accomplished using StMoMo. In Section 6 we consider the evaluation of
the goodness-of-fit of GAPC models. In Section 7 we discuss the forecasting and simulation
of GAPC models using time series techniques. Section 8 describes the use of bootstrapping
techniques to incorporate parameter uncertainty in the estimation and forecasting of GAPC
mortality models. Finally, in Section 9 we provide some conclusions and discuss possible
extensions of the StMoMo package.
2We note that ilc also provides some graphical tools for assessing the goodness of fit of the models implemented in that package.
4 StMoMo: An R Package for Stochastic Mortality Modeling
2. Notation and data
Let the random variable Dxt denote the number of deaths in a population at age x last
birthday during calendar year t. Also let dxt denote the observed number of deaths, Ext c the
central exposed to risk at age x in year t, and Ext 0 the corresponding initial exposed to risk.
The one-year death probability for an individual aged x last birthday and in calendar year
t, denoted qxt, can be estimated as ˆ qxt = dxtExt 0 . The force of mortality and central death
rates are denoted by µxt and mxt, respectively, with the empirical estimate of the latter being
mˆ xt = dxt/Ext c . Under the assumption that the force of mortality is constant over each year
of age and calendar year, i.e., from age x to age x + 1 and year t to t + 1, then the force of
mortality µxt and the death rate mxt coincide. We assume that this is the case throughout.
In StMoMo and throughout this paper we assume that deaths, dxt, and either central exposures, Ext c , or initial exposures, Ext 0 , are available in a rectangular array format comprising
ages (on the rows) x = x1, x2, . . . , xk, and calendar years (on the columns) t = t1, t2, . . . , tn,
When only central exposures are available and initial exposures are required (or vice-versa),
one can approximate the initial exposures by adding half the matching reported numbers of
deaths to the central exposures, i.e, Ext 0 ≈ Ext c + 1 2dxt. When the context is clear, we may
write Ext to refer to Ext 0 or Ext c .
3. Generalized APC stochastic mortality models
Some authors have recently sought to identify the similarities among stochastic mortality
models. For instance, Hunt and Blake (2015) describe an age-period-cohort (APC) model
structure which encompasses the vast majority of stochastic mortality models. In another
interesting contribution, Currie (2016) shows that many common mortality models can be
expressed in the standard terminology of generalized linear or non-linear models. In this
section, we build upon the aforementioned papers to define the family of generalized ageperiod-cohort (GAPC) stochastic mortality models.
Akin to generalized linear models (see, e.g., McCullagh and Nelder (1989)), a GAPC stochastic
mortality model is comprised of four components:
1. The random component: the numbers of deaths Dxt follow a Poisson distribution or a
Binomial distribution, so that
Dxt ∼ Poisson(Ext c µxt)
Dxt ∼ Binomial(Ext 0 , qxt),
with E (Dxt/Ext c ) = µxt and E DxtExt 0 = qxt, respectively.
2. The systematic component: following Hunt and Blake (2015) the effects of age x, calendar year t and year-of-birth (cohort) c = t – x are captured through a predictor ηxt
given by:
ηxt = αx +
N X i
βx(i)κ( ti) + βx(0)γt-x.
Andr´es M. Villegas, Vladimir K. Kaishev, Pietro Millossovich 5
❼ The term αx is a static age function capturing the general shape of mortality by
❼ N ≥ 0 is an integer indicating the number of age-period terms describing the
mortality trends, with each time index κ( ti), i = 1, . . . , N, contributing in specifying
the mortality trend and βx(i) modulating its effect across ages.
❼ The term γt-x accounts for the cohort effect with βx(0) modulating its effect across
The age modulating terms βx(i), i = 0, 1, . . . , N, can be either pre-specified functions of
age, i.e., βx(i) ≡ fi(x), as in CBD type models, or non-parametric terms without any
prior structure which need to be estimated as in the Lee-Carter model. In the GAPC
family we assume that the period indexes κ( ti), i = 1, . . . , N, and the cohort index γt-x
are stochastic processes. This is the key feature that allows the stochastic projection
of GAPC models and thus the generation of probabilistic forecasts of future mortality
3. The link function g associating the random component and the systematic component
so that
g E DExt xt = ηxt.
Although a number of link functions would be possible, it is convenient to use the socalled canonical link and pair the Poisson distribution with the log link function and
the Binomial distribution with the logit link function (see, e.g., Currie (2016) for a discussion of this in the context of mortality models and McCullagh and Nelder (1989) in
the wider context of GLMs).
4. The set of parameter constraints: most stochastic mortality models are only identifiable
up to a transformation and thus require parameter constraints to ensure unique parameter estimates. These parameter constraints are applied through a constraint function
v which maps an arbitrary vector of parameters
θ := αx, βx(1), …, βx(N), κ(1) t , …, κ( tN), βx(0), γt-x
into a vector of transformed parameters
v(θ) = θ˜ = α˜x, β˜x(1), …, β˜x(N), κ˜(1) t , …, κ˜( tN), β˜x(0), γ˜t-x
satisfying the model constraints with no effect on the predictor ηxt (i.e., θ and θ˜ result
in the same ηxt).
Most stochastic mortality models proposed in the literature belong to the GAPC family.
This includes the original Lee-Carter model, the extensions of the Lee-Carter proposed in
6 StMoMo: An R Package for Stochastic Mortality Modeling
Renshaw and Haberman (2003, 2006), the original CBD model, and the extended CBD models of Cairns et al. (2009). In addition, all the model structures considered in Haberman and
Renshaw (2011), Lov´asz (2011) and van Berkum et al. (2014), as well as the models of Plat
(2009), Aro and Pennanen (2011), O’Hare and Li (2012), B¨ orger et al. (2013) and Alai and
Sherris (2014), are part of the GAPC class of models.3
Next, we describe in detail some of these models highlighting how they can be framed within
the GAPC family.
3.1. Lee-Carter model under a Poisson setting
Brouhns et al. (2002) have implemented the Lee-Carter model assuming a Poisson distribution
of the number of deaths and using the log link function with respect to the force of mortality
µxt. The predictor structure proposed by Lee and Carter (1992) assumes that there is a static
age function, αx, a unique non-parametric age-period term (N = 1), and no cohort effect.
Thus, the predictor is given by:
ηxt = αx + βx(1)κ(1) t (1)
In order to project mortality, the time index κ(1) t is modeled and forecasted using ARIMA
processes. Typically, a random walk with drift has been shown to provide a reasonable fit,
that is,
t = δ + κ(1) t-1 + ξt, ξt ∼ N(0, σκ2) i.i.d.,
where δ is the drift parameter and ξt is a Gaussian white noise process with variance σκ2.
The Lee-Carter model is only identifiable up to a transformation, as for arbitrary real constants c1 and c2 6= 0 the parameters in Equation 1 can be transformed in the following way
αx, βx(1), κ(1) t → αx + c1βx(1), c12 βx(1), c2(κ(1) t – c1) , (2)
leaving ηxt unchanged. To ensure identifiability of the model, Lee and Carter (1992) suggest
the following set of parameter constraints

βx(1) = 1,

t = 0, (3)
which can be imposed by choosing
c1 =
1 n
X t
t , c2 = X
βx(1) (4)
in transformation (2).
3.2. Renshaw and Haberman model: Lee-Carter with cohort effects
3We note however that models which rely on the smoothness of mortality over both age and time, such as the
graduation approach of Renshaw et al. (1996) and the P-Spline model of Currie et al. (2006), do not belong to
the GAPC family. The P-Spline approach is implemented in the MortalitySmooth R package (Camarda 2012).
Andr´es M. Villegas, Vladimir K. Kaishev, Pietro Millossovich 7
Renshaw and Haberman (2006) generalize the Lee-Carter model by incorporating a cohort
effect to obtain the predictor: