# Stochastic Mortality

StMoMo: An R Package for Stochastic Mortality

Modeling

Andr´es M. Villegas

UNSW Business School

UNSW Sydney

Pietro Millossovich

Cass Business School

City, University of London

DEAM, University of Trieste

Vladimir K. Kaishev

Cass Business School

City, University of London

Abstract

In this paper we mirror the framework of generalized (non-)linear models to define the

family of generalized age-period-cohort stochastic mortality models which encompasses

the vast majority of stochastic mortality projection models proposed to date, including

the well-known Lee-Carter and Cairns-Blake-Dowd models. We also introduce the R package StMoMo which exploits the unifying framework of the generalized age-period-cohort

family to provide tools for fitting stochastic mortality models, assessing their goodness

of fit and performing mortality projections. We illustrate some of the capabilities of the

package by performing a comparison of several stochastic mortality models applied to the

England and Wales population.

Keywords: mortality modeling; mortality forecasting; age-period-cohort; generalized nonlinear models.

1. Introduction

During the last two centuries developed countries experienced a persistent increase in life

expectancy. For instance, Oeppen and Vaupel (2002) estimate that during the last 160 years

the world record in female life expectancy at birth has increased at an approximate steady

pace of 3 months per year. This increase in life expectancy, though a sign of social progress,

poses a challenge to governments, private pension plans and life insurers because of its impact on pension and health costs. Actuaries and demographers have recognized the problems

caused by an aging population and rising longevity and have thus devoted significant attention

to the development of statistical techniques for the modeling and projection of mortality rates.

One of the most influential approaches to the stochastic modeling of mortality rates is the

parsimonious mortality model proposed by Lee and Carter (1992). This model uses principal

component analysis to decompose the age-time matrix of mortality rates into a bilinear combination of age and period parameters, with the latter being treated as time series to produce

mortality projections. The Lee-Carter model has inspired numerous variants and extensions.

For instance, Lee and Miller (2001), Booth et al. (2002), and Brouhns et al. (2002) have

proposed alternative estimation approaches in order to improve the goodness-of-fit and the

forecasting properties of the model. In particular, Brouhns et al. (2002) propose a more formal

2 StMoMo: An R Package for Stochastic Mortality Modeling

statistical approach to estimating the parameters by embedding the Lee-Carter model into a

Poisson regression setting. Other authors have extended the Lee-Carter model by including

additional terms, such as multiple bilinear age-period components (Renshaw and Haberman

2003; Hyndman and Ullah 2007), or a cohort effect term (Renshaw and Haberman 2006).

The two factor Cairns-Blake-Dowd (CBD) model introduced by Cairns et al. (2006) is one

of the most prominent variants of the Lee-Carter model. The CBD model relies on the linearity of the logit of one-year death probabilities at older ages. Specifically, it assumes that,

for a given, year the logit of the one-year death probability is a linear function of age, and

treats the intercept and slope parameters across years as stochastic processes. Cairns et al.

(2009) consider three extensions to the original CBD model by incorporating combinations

of a quadratic age term and a cohort effect term. Plat (2009) has combined features of the

CBD and the Lee-Carter models to produce a model that is suitable for full age ranges and

captures the cohort effect.

Given the abundance and rapid increase in the number of stochastic mortality models proposed in the literature, there have been some recent attempts to find the commonalities among

these models. Hunt and Blake (2015) review the structure of mortality models and describe

an age-period-cohort model structure which encompasses the vast majority of stochastic mortality models. Currie (2016) shows that many mortality models can be expressed in terms of

generalized linear models or generalized non-linear models.

In this paper, we build upon the works of Hunt and Blake (2015) and Currie (2016) to define

the family of generalized age-period-cohort stochastic mortality models by mirroring the terminology of generalized linear models. We also introduce the R package StMoMo (Villegas

et al. 2017) which exploits the unifying framework of the generalized age-period-cohort family

combined with the powerful fitting function of the gnm package (Turner and Firth 2015) to

provide computational tools for implementing many of the stochastic mortality models proposed to date.1 The StMoMo package is available at http://cran.r-project.org/package=

StMoMo. Version 0.4.0 has been used for this paper.

Several packages for mortality modeling are available in the R environment (R Core Team

2016). The package demography (Hyndman et al. 2014), whose usage is explained in detail

in Booth et al. (2014), implements, among other things, the original Lee-Carter model along

with the Lee and Miller (2001), Booth et al. (2002), and Hyndman and Ullah (2007) variants.

The ilc package (Butt et al. 2014) implements the Renshaw and Haberman (2006) cohort

extension of the Lee-Carter model together with the Lee-Carter model under a Poisson regression framework. The LifeMetrics R functions implement the original CBD model and the

three extended CBD models considered in Cairns et al. (2009), along with the Lee-Carter

model (using Poisson maximum likelihood), the traditional age-period-cohort model (see, Osmond (1985)) and the Renshaw and Haberman (2006) model. This package, which is not on

CRAN, is available at http://www.macs.hw.ac.uk/~andrewc/lifemetrics/.

1The acronym StMoMo, pronounced Saint Momo, stands for Stochastic Mortality Modeling. Momo is the

king of Carnivals in numerous Latin American festivities (Wikipedia 2014).

Andr´es M. Villegas, Vladimir K. Kaishev, Pietro Millossovich 3

There are however several drawbacks of the existing packages which our package StMoMo

seeks to overcome. First, the existing packages are based on model-specific fitting algorithms limiting the models available to those already predefined in the packages. By contrast,

StMoMo allows users to easily expand the number of models available. In addition, whilst

StMoMo provides forecasting and simulation functions for any model within the generalized age-period-cohort family, the existing packages only provide such functions for a limited

number of models. For instance, the package ilc only includes forecasting functions for the

Lee-Carter model. Similarly, simulation with the package LifeMetrics is limited to the LeeCarter and the standard CBD models. Finally, StMoMo provides functions which are not

available in existing packages, such as tools for analyzing the goodness-of-fit2 and evaluating

the impact of parameter uncertainty using bootstrapping techniques.

StMoMo comes with a set of functions for defining an abstract model — specifying for instance

the number of period terms, whether coefficients are parametric or not — and for fitting a

given model. This is particularly useful when estimating several models on a given dataset or

a given model to different datasets. The package also provides preset functions for defining

the most common models available in the mortality forecasting literature. In addition, other

models preferred by the user can be created in a very simple fashion, see Section 4 where

several examples are given. Therefore, the flexibility of the package allows a user to quickly

build up a battery of different models, and this is particularly useful when seeking the most

appropriate mortality model, comparing different models or assessing model risk. StMoMo is

particularly appealing for actuaries managing life and pensions portfolios exposed to longevity

risk. The code backing most functions implemented in the package has been extensively used

and tested for the development of multi-population mortality models for assessing basis risk

in longevity risk transactions, see Haberman et al. (2014).

In this paper we describe the statistical framework underlying StMoMo and illustrate its

usage. For this purpose, we use as a running example a comparison of several stochastic

mortality models fitted to the England and Wales population. This example is in the spirit

of the comparison exercises of Cairns et al. (2009, 2011), Haberman and Renshaw (2011) and

Lov´asz (2011), allowing us to show how several of the analysis performed in these papers can

easily be replicated using StMoMo. The structure of the paper is as follows. In Section 2 we

introduce our notation. In Section 3, we mirror the framework of generalized linear models to

define the family of generalized age-period-cohort (GAPC) stochastic mortality models and

demonstrate that many of the mortality models discussed in the literature can be framed

within this family. In Section 4 we explain how the GAPC family of models is implemented

in StMoMo. In Section 5, we describe the fitting of GAPC mortality models and illustrate

how this can be accomplished using StMoMo. In Section 6 we consider the evaluation of

the goodness-of-fit of GAPC models. In Section 7 we discuss the forecasting and simulation

of GAPC models using time series techniques. Section 8 describes the use of bootstrapping

techniques to incorporate parameter uncertainty in the estimation and forecasting of GAPC

mortality models. Finally, in Section 9 we provide some conclusions and discuss possible

extensions of the StMoMo package.

2We note that ilc also provides some graphical tools for assessing the goodness of fit of the models implemented in that package.

4 StMoMo: An R Package for Stochastic Mortality Modeling

2. Notation and data

Let the random variable Dxt denote the number of deaths in a population at age x last

birthday during calendar year t. Also let dxt denote the observed number of deaths, Ext c the

central exposed to risk at age x in year t, and Ext 0 the corresponding initial exposed to risk.

The one-year death probability for an individual aged x last birthday and in calendar year

t, denoted qxt, can be estimated as ˆ qxt = dxtExt 0 . The force of mortality and central death

rates are denoted by µxt and mxt, respectively, with the empirical estimate of the latter being

mˆ xt = dxt/Ext c . Under the assumption that the force of mortality is constant over each year

of age and calendar year, i.e., from age x to age x + 1 and year t to t + 1, then the force of

mortality µxt and the death rate mxt coincide. We assume that this is the case throughout.

In StMoMo and throughout this paper we assume that deaths, dxt, and either central exposures, Ext c , or initial exposures, Ext 0 , are available in a rectangular array format comprising

ages (on the rows) x = x1, x2, . . . , xk, and calendar years (on the columns) t = t1, t2, . . . , tn,

When only central exposures are available and initial exposures are required (or vice-versa),

one can approximate the initial exposures by adding half the matching reported numbers of

deaths to the central exposures, i.e, Ext 0 ≈ Ext c + 1 2dxt. When the context is clear, we may

write Ext to refer to Ext 0 or Ext c .

3. Generalized APC stochastic mortality models

Some authors have recently sought to identify the similarities among stochastic mortality

models. For instance, Hunt and Blake (2015) describe an age-period-cohort (APC) model

structure which encompasses the vast majority of stochastic mortality models. In another

interesting contribution, Currie (2016) shows that many common mortality models can be

expressed in the standard terminology of generalized linear or non-linear models. In this

section, we build upon the aforementioned papers to define the family of generalized ageperiod-cohort (GAPC) stochastic mortality models.

Akin to generalized linear models (see, e.g., McCullagh and Nelder (1989)), a GAPC stochastic

mortality model is comprised of four components:

1. The random component: the numbers of deaths Dxt follow a Poisson distribution or a

Binomial distribution, so that

Dxt ∼ Poisson(Ext c µxt)

or

Dxt ∼ Binomial(Ext 0 , qxt),

with E (Dxt/Ext c ) = µxt and E DxtExt 0 = qxt, respectively.

2. The systematic component: following Hunt and Blake (2015) the effects of age x, calendar year t and year-of-birth (cohort) c = t – x are captured through a predictor ηxt

given by:

ηxt = αx +

N X i

=1

βx(i)κ( ti) + βx(0)γt-x.

Andr´es M. Villegas, Vladimir K. Kaishev, Pietro Millossovich 5

Here:

❼ The term αx is a static age function capturing the general shape of mortality by

age.

❼ N ≥ 0 is an integer indicating the number of age-period terms describing the

mortality trends, with each time index κ( ti), i = 1, . . . , N, contributing in specifying

the mortality trend and βx(i) modulating its effect across ages.

❼ The term γt-x accounts for the cohort effect with βx(0) modulating its effect across

ages.

The age modulating terms βx(i), i = 0, 1, . . . , N, can be either pre-specified functions of

age, i.e., βx(i) ≡ fi(x), as in CBD type models, or non-parametric terms without any

prior structure which need to be estimated as in the Lee-Carter model. In the GAPC

family we assume that the period indexes κ( ti), i = 1, . . . , N, and the cohort index γt-x

are stochastic processes. This is the key feature that allows the stochastic projection

of GAPC models and thus the generation of probabilistic forecasts of future mortality

rates.

3. The link function g associating the random component and the systematic component

so that

g E DExt xt = ηxt.

Although a number of link functions would be possible, it is convenient to use the socalled canonical link and pair the Poisson distribution with the log link function and

the Binomial distribution with the logit link function (see, e.g., Currie (2016) for a discussion of this in the context of mortality models and McCullagh and Nelder (1989) in

the wider context of GLMs).

4. The set of parameter constraints: most stochastic mortality models are only identifiable

up to a transformation and thus require parameter constraints to ensure unique parameter estimates. These parameter constraints are applied through a constraint function

v which maps an arbitrary vector of parameters

θ := αx, βx(1), …, βx(N), κ(1) t , …, κ( tN), βx(0), γt-x

into a vector of transformed parameters

v(θ) = θ˜ = α˜x, β˜x(1), …, β˜x(N), κ˜(1) t , …, κ˜( tN), β˜x(0), γ˜t-x

satisfying the model constraints with no effect on the predictor ηxt (i.e., θ and θ˜ result

in the same ηxt).

Most stochastic mortality models proposed in the literature belong to the GAPC family.

This includes the original Lee-Carter model, the extensions of the Lee-Carter proposed in

6 StMoMo: An R Package for Stochastic Mortality Modeling

Renshaw and Haberman (2003, 2006), the original CBD model, and the extended CBD models of Cairns et al. (2009). In addition, all the model structures considered in Haberman and

Renshaw (2011), Lov´asz (2011) and van Berkum et al. (2014), as well as the models of Plat

(2009), Aro and Pennanen (2011), O’Hare and Li (2012), B¨ orger et al. (2013) and Alai and

Sherris (2014), are part of the GAPC class of models.3

Next, we describe in detail some of these models highlighting how they can be framed within

the GAPC family.

3.1. Lee-Carter model under a Poisson setting

Brouhns et al. (2002) have implemented the Lee-Carter model assuming a Poisson distribution

of the number of deaths and using the log link function with respect to the force of mortality

µxt. The predictor structure proposed by Lee and Carter (1992) assumes that there is a static

age function, αx, a unique non-parametric age-period term (N = 1), and no cohort effect.

Thus, the predictor is given by:

ηxt = αx + βx(1)κ(1) t (1)

In order to project mortality, the time index κ(1) t is modeled and forecasted using ARIMA

processes. Typically, a random walk with drift has been shown to provide a reasonable fit,

that is,

κ

(1)

t = δ + κ(1) t-1 + ξt, ξt ∼ N(0, σκ2) i.i.d.,

where δ is the drift parameter and ξt is a Gaussian white noise process with variance σκ2.

The Lee-Carter model is only identifiable up to a transformation, as for arbitrary real constants c1 and c2 6= 0 the parameters in Equation 1 can be transformed in the following way

αx, βx(1), κ(1) t → αx + c1βx(1), c12 βx(1), c2(κ(1) t – c1) , (2)

leaving ηxt unchanged. To ensure identifiability of the model, Lee and Carter (1992) suggest

the following set of parameter constraints

X βx(1) = 1, |
X |

x

t

κ

(1)

t = 0, (3)

which can be imposed by choosing

c1 =

1 n

X t

κ

(1)

t , c2 = X

x

βx(1) (4)

in transformation (2).

3.2. Renshaw and Haberman model: Lee-Carter with cohort effects

3We note however that models which rely on the smoothness of mortality over both age and time, such as the

graduation approach of Renshaw et al. (1996) and the P-Spline model of Currie et al. (2006), do not belong to

the GAPC family. The P-Spline approach is implemented in the MortalitySmooth R package (Camarda 2012).

Andr´es M. Villegas, Vladimir K. Kaishev, Pietro Millossovich 7

Renshaw and Haberman (2006) generalize the Lee-Carter model by incorporating a cohort

effect to obtain the predictor: