Bootstrap
Outline:
- Bias Correction with Bootstrap
- Bootstrap Confidence Intervals
- Bootstrap for Regression
Motivation:
- The sampling distribution of θ̂ is hard to derive.
  − Or we just don't feel like taking the time to.
- Our sample size is not tiny but is too small for "asymptopia".

Nonparametric bootstrap:
- Let Y = [Y₁, …, Yₙ] iid ∼ F_Y (some unknown CDF).
- Let F̂ₙ(y) = P̂r(Y ≤ y) = (1/n) Σᵢ₌₁ⁿ I(Yᵢ ≤ y) (the fraction of the sample ≤ y).
⟹ The distribution of the sample approximates the distribution of the population as n increases.
⟹ Sampling from a sufficiently large sample approximates sampling from the population.

Parametric bootstrap:
- If an estimator θ̂ₙ is consistent and the model is correct, p(·|θ̂ₙ) approximates p(·|θ).
⟹ Sampling from p(·|θ̂ₙ) approximates sampling from the population.

Notation:
- Yˢ_b = [Yˢ_b1, …, Yˢ_bn]: the bth "resample", drawn with replacement from the observed vector Y, or drawn from p(·|θ̂ₙ).
- θ ≡ θ(F_Y): the true parameter value, or some population quantity that is a function of F_Y.
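The two resampling schemes can be sketched in a few lines of R (illustrative only; the object names here are mine, not from the slides):

```r
# Illustrative sketch: the empirical CDF F.hat approximates F, so
# resampling the data (or simulating from the fitted model) mimics
# sampling from the population.
set.seed(1)
y <- rexp(1000, rate = 1)                   # a sample from F = Exp(1)
F.hat <- ecdf(y)                            # nonparametric estimate of F
F.hat(1)                                    # close to F(1) = 1 - exp(-1) ~ 0.632
y.np  <- sample(y, replace = TRUE)          # one nonparametric resample
y.par <- rexp(length(y), rate = 1/mean(y))  # one parametric resample from p(.|theta.hat)
```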
Example: Let Y₁, …, Yₙ iid ∼ f(y; θ) = θe^(−θy), for y > 0 and θ > 0.
- The log-likelihood is ℓ(θ) = Σᵢ₌₁ⁿ (log θ − θYᵢ) = n log θ − θnȲ, which is maximized at θ̂ = (Ȳ)⁻¹.
- But Σᵢ₌₁ⁿ Yᵢ ∼ Ga(a = n, b = θ), so Ȳ ∼ Ga(a = n, b = nθ), and
  E(θ̂) = E(1/Ȳ) = mean of InvGa(a = n, b = nθ) = nθ/(n − 1) > θ.
- Also, by Jensen's inequality: E(Ȳ) = 1/θ and 1/y is a convex function, so
  ∴ E(θ̂) = E(1/Ȳ) > 1/E(Ȳ) = θ.
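The closed-form bias is easy to check by simulation (a quick sketch; the variable names are mine):

```r
# For n = 20 and theta = 1, E(theta.hat) = n*theta/(n-1) = 20/19 ~ 1.053.
set.seed(2)
n.chk <- 20; theta.chk <- 1
mle.draws <- replicate(1e5, 1/mean(rexp(n.chk, theta.chk)))
mean(mle.draws)                   # close to 20/19, not to theta = 1
n.chk*theta.chk/(n.chk - 1)       # the exact mean, 1.0526...
```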
Recall:
- F_Y: the population from which the Y are drawn.
- θ̂(F_Y): a population quantity.
- θ̂ₙ(Y): some estimator of θ̂(F_Y).
- bias_F_Y[θ̂ₙ(Y)] = E_F_Y[θ̂ₙ(Y)] − θ̂(F_Y).
- An estimator θ̂ₙ(Y) is biased if bias_F_Y[θ̂ₙ(Y)] ≠ 0.
- Note that Wakefield (2013, Sec. 2.7.1) implicitly defines it the other way, as θ̂(F_Y) − E_F_Y[θ̂ₙ(Y)].
- Jensen's inequality: if it happens that E(Y) = g(θ) for some g, θ̂ = g⁻¹(Ȳ), and g(y) is strictly concave or convex, then θ̂ will be biased for θ.
```r
n <- 20              # Sample size
theta.true <- 1      # Exponential distribution true rate parameter
S <- 1000            # Number of replications for the simulation
B <- 999             # Number of bootstrap resamples
# A function to generate a data set
mk.data <- function(n, theta=theta.true) rexp(n, theta)
# MLE
theta.mle <- function(y) 1/mean(y)
# MLE sampling distribution
theta.mle.dist <- replicate(S, {
  y.obs <- mk.data(n)
  theta.mle(y.obs)
})
mean(theta.mle.dist)               # Mean of the sampling distribution
## [1] 1.047768
mean(theta.mle.dist) - theta.true  # Bias
## [1] 0.04776839
```

[Figure: sampling distribution of the MLE over the S simulations; * = mean]
- Suppose we have a biased estimator: bias_F_Y[θ̂ₙ(Y)] = E[θ̂ₙ(Y)] − θ̂(F_Y) ≠ 0.
  − We don't know either expectation.
- Idea: treat F̂ₙ as F.
- Yˢ_b ∼ F̂ₙ, so θ̂ₙ(Yˢ_b) will be biased relative to θ̂ₙ(Y) ≡ θ̂(F̂ₙ) by approximately the same amount:
  bias_F̂ₙ[θ̂ₙ(Yˢ_b)] = E_F̂ₙ[θ̂ₙ(Yˢ_b)] − θ̂ₙ(Y) ≈ E_F_Y[θ̂ₙ(Y)] − θ̂(F_Y) = bias_F_Y[θ̂ₙ(Y)].
⟹ Writing θ̄ˢ = (1/B) Σ_b θ̂ₙ(Yˢ_b) for the mean of the bootstrap estimates, subtracting off θ̄ˢ − θ̂ₙ(Y) from θ̂ₙ(Y) can cancel some of the bias:
  θ̂ₙ,u(Y) = θ̂ₙ(Y) − [θ̄ˢ − θ̂ₙ(Y)] = 2θ̂ₙ(Y) − θ̄ˢ.
```r
# Nonparametric bootstrap sampling
myboot.np <- function(B, y.obs, theta.f)
  replicate(B, {
    y.b <- sample(y.obs, replace=TRUE)
    theta.f(y.b)
  })

# Sampling distribution of the mean of the bootstrap samples
theta.boot.dist <- replicate(S, {
  y.obs <- mk.data(n)
  mean(myboot.np(B, y.obs, theta.mle))
})
mean(theta.boot.dist) - theta.true   # Biased by about twice as much
## [1] 0.1177986

# Debiased estimate and simulation
theta.bc <- function(theta.obs, theta.boot) theta.obs - (mean(theta.boot) - theta.obs)

theta.bc.dist <- replicate(S, {
  y.obs <- mk.data(n)
  theta.obs <- theta.mle(y.obs)
  theta.boot <- myboot.np(B, y.obs, theta.mle)
  theta.bc(theta.obs, theta.boot)
})
mean(theta.bc.dist) - theta.true     # Much less biased
## [1] -0.001975362
```

[Figure: sampling distributions of the MLE, the bootstrap mean, and the bias-corrected estimate; * = mean]
Bootstrap Confidence Intervals
- Basic approaches
- Refined approaches
- Demonstration

Wakefield (2013, Sec. 2.7.1); Carpenter and Bithell (2000)
Given:
- Yˢ_1, …, Yˢ_B: B resamples of the dataset.
- θ̂(Yˢ_b), b = 1..B: univariate parameter estimates of interest.
- θ̂ˢ_q: the qth quantile of the θ̂(Yˢ_b)s.
- α = 1 − CL.

Pivotal (basic) interval:
  [θ̂ − (θ̂ˢ_(1−α/2) − θ̂), θ̂ − (θ̂ˢ_(α/2) − θ̂)]
− Assumes that the distribution of θ̂ˢ − θ̂ is the same as the distribution of θ̂ − θ.
  Intuition: 97.5% of the time, θ̂ will overestimate θ by less than θ̂ˢ_0.975 − θ̂, so θ will be below θ̂ − (θ̂ˢ_0.975 − θ̂) only 2.5% of the time.
− Often has poor coverage.
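A minimal R sketch of this interval (the function name is mine; `theta.obs` is the estimate from the data and `theta.boot` a vector of B bootstrap estimates, e.g. from the `myboot.np` function defined earlier):

```r
# Pivotal (basic) interval: reflect the bootstrap quantiles around theta.obs.
ci.pivotal <- function(theta.obs, theta.boot, cl = 0.95) {
  alpha <- 1 - cl
  q <- quantile(theta.boot, c(alpha/2, 1 - alpha/2), names = FALSE)
  c(lower = theta.obs - (q[2] - theta.obs),  # = 2*theta.obs - upper quantile
    upper = theta.obs - (q[1] - theta.obs))  # = 2*theta.obs - lower quantile
}
```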
Normal interval:
  θ̂(Y) − [θ̄ˢ − θ̂(Y)] ± z_(1−α/2) √(v̂ar_F̂(θ̂)),
where θ̄ˢ = (1/B) Σ_b θ̂(Yˢ_b) and v̂ar_F̂(θ̂) = (1/B) Σ_b [θ̂(Yˢ_b) − θ̄ˢ]².
− Assumes that the sampling distribution of θ̂ is symmetric and not too weird.
  Intuition: use the bootstrap to debias θ̂ and to estimate its variance.
+ Has good coverage if it is.
+ Can be useful if var(θ̂) is hard to get analytically.
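A matching sketch in R (names are mine; note that `sd` divides by B − 1 rather than the B in the formula above, which is immaterial for large B):

```r
# Normal interval: debias with the bootstrap mean and use the bootstrap
# standard deviation as the standard error.
ci.normal <- function(theta.obs, theta.boot, cl = 0.95) {
  alpha <- 1 - cl
  est <- theta.obs - (mean(theta.boot) - theta.obs)  # bias-corrected estimate
  se  <- sd(theta.boot)
  c(lower = est - qnorm(1 - alpha/2)*se,
    upper = est + qnorm(1 - alpha/2)*se)
}
```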
Percentile interval:
  [θ̂ˢ_(α/2), θ̂ˢ_(1−α/2)]
+ Invariant to transformation.
− Effectively assumes that there exists a monotone transformation g(θ) s.t.
  g(θ̂) ∼ N(g(θ), σ²) and g(θ̂ˢ) ∼ N(g(θ̂), σ²).
  Intuition: if such a g exists, we could get a 1 − α CI by taking g⁻¹[g(θ̂) ± z_(1−α/2) σ]; but g(θ̂) + z_(1−α/2) σ ≡ g(θ̂ˢ_(1−α/2)), so we can just use the quantiles of the θ̂ˢ directly.
− Performs poorly if θ̂ doesn't behave that way (e.g., when there is a mean–variance relationship).
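In R this is just `quantile()` applied to the bootstrap estimates (sketch, name mine); the transformation invariance follows because quantiles commute with monotone transformations:

```r
# Percentile interval: the alpha/2 and 1 - alpha/2 quantiles of the
# bootstrap estimates themselves.
ci.percentile <- function(theta.boot, cl = 0.95) {
  alpha <- 1 - cl
  quantile(theta.boot, c(alpha/2, 1 - alpha/2), names = FALSE)
}
```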
Studentized (bootstrap-t) interval:
- A refinement of the pivotal interval.
+ Adjusts for the difference between var(θ̂ˢ − θ̂) and var(θ̂ − θ).
- Estimate σ̂ ≈ √var(θ̂).
- For each θ̂ˢ_b, estimate σ̂ˢ_b ≈ √var_θ̂ˢ_b(θ̂), e.g., by running a bootstrap for every θ̂ˢ_b.
  − Very computationally intensive, unless a formula exists for σ̂.
- Calculate tˢ_b = (θ̂ˢ_b − θ̂)/σ̂ˢ_b for b = 1..B, and let tˢ_q be its qth quantile.
- The CI is then [θ̂ − σ̂ tˢ_(1−α/2), θ̂ − σ̂ tˢ_(α/2)].
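A sketch of the nested-bootstrap version in R (all names are mine; `B.inner` resamples estimate each σ̂ˢ_b, which is what makes this expensive):

```r
# Studentized (bootstrap-t) interval with a nested bootstrap for the
# standard error of each resample's estimate.
ci.student <- function(y.obs, theta.f, B = 999, B.inner = 50, cl = 0.95) {
  alpha <- 1 - cl
  boot.se <- function(y)             # bootstrap standard error of theta.f on y
    sd(replicate(B.inner, theta.f(sample(y, replace = TRUE))))
  theta.obs <- theta.f(y.obs)
  sigma.obs <- boot.se(y.obs)
  t.star <- replicate(B, {           # studentized statistic per resample
    y.b <- sample(y.obs, replace = TRUE)
    (theta.f(y.b) - theta.obs)/boot.se(y.b)
  })
  q <- quantile(t.star, c(alpha/2, 1 - alpha/2), names = FALSE)
  c(lower = theta.obs - sigma.obs*q[2],
    upper = theta.obs - sigma.obs*q[1])
}
```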
BCa (bias-corrected and accelerated) interval:
- A refinement of the percentile interval.
+ Adjusts for skewness in θ̂ and for how it changes with θ.
− Loosely, assumes that a transformation g(θ) exists s.t.
  g(θ̂) ∼ N[g(θ) − b(1 + a g(θ)), (1 + a g(θ))²]
  g(θ̂ˢ) ∼ N[g(θ̂) − b(1 + a g(θ̂)), (1 + a g(θ̂))²]
  for constants a (acceleration) and b (bias correction).
+ Works very well.
− Can misbehave badly if α is very small.
- Then
  q_u = Φ( b − (z_(α/2) − b) / (1 + a(z_(α/2) − b)) ),
  q_l = Φ( b − (z_(1−α/2) − b) / (1 + a(z_(1−α/2) − b)) ),
  and the CI is [θ̂ˢ_(q_l), θ̂ˢ_(q_u)].
− a is more complicated for the parametric bootstrap.
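In practice these intervals don't need to be hand-rolled: the recommended `boot` package (shipped with R) implements all of them, including BCa with a and b estimated from the data. A minimal sketch:

```r
library(boot)
set.seed(7)
y.obs <- rexp(20, rate = 1)
# boot() statistics take the data and a vector of resampled indices.
b.out <- boot(y.obs, function(y, idx) 1/mean(y[idx]), R = 999)
boot.ci(b.out, conf = 0.95, type = c("norm", "basic", "perc", "bca"))
```

Here `"basic"` is the pivotal interval and `"perc"` the percentile interval; `type = "stud"` is also available but requires the statistic to return a variance estimate as its second component.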