A Simple Introduction to Bayesian Data Analysis

Christopher Gandrud

5 June 2014

Slides


This slide deck is available at:


http://christophergandrud.github.io/BasicBayesianPresent.

Overview


  • Theoretical: What does Bayesian Data Analysis bring to inference?

  • Examples from my research

  • A few practical tips

Motivation


We know that Bayesian data analysis (BDA) is hot in social science.


We also know that popularity \(\neq\) widespread mastery or even literacy.

Talk Aim




Begin a discussion aiming towards BDA literacy.

What is statistical inference?


Drawing conclusions based on data that is subject to random variation, such as observational errors and sampling variation. - Upton (2008), via Wikipedia


BDA and ‘frequentist’ methods simply provide different ways to draw conclusions from data and address random variation.

Basic Idea behind BDA




Update prior information with new data to create a posterior probability distribution.
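
In symbols, with \(\theta\) the unknown parameters and \(y\) the observed data, Bayes' theorem gives:

\[ p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)} \propto p(y \mid \theta)\, p(\theta) \]

That is, posterior \(\propto\) likelihood \(\times\) prior.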

Gelman et al.'s (2014, 3) three-step process


  1. Set up a full probability model (prior).

  2. Condition on observed data (new data).

  3. Evaluate the fit of the model and posterior distribution’s implications (posterior).

A Key Contribution: Emphasising uncertainty estimation


  • “The central feature of Bayesian inference [is] the direct quantification of uncertainty” (Gelman et al. 2014, 4).

    • Fewer coefficient tables! Less emphasis on p-value hypothesis testing. Rise of the confidence and probability intervals.
  • Many researchers actually interpret ‘frequentist’ confidence intervals as if they were Bayesian probability intervals.

Uncertainty in Frequentist and Bayesian Approaches (1)


Both involve the estimation of unknown quantities of interest, e.g. coefficient parameters \(\beta\).



The estimates they produce have different interpretations.

Uncertainty in Frequentist and Bayesian Approaches (2)


Frequentist

  • 95% Confidence interval: if the sampling were repeated many times, 95% of the intervals constructed in this way would contain the true parameter (a small simulation below illustrates this).



Bayesian ‘common sense interpretation’

  • 95% Probability (credible) interval: There is a 95% probability that the unknown parameter is actually in the interval.
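
A small R simulation makes the repeated-sampling interpretation concrete (illustrative only; the true mean, sample size, and number of repetitions are arbitrary choices):

set.seed(1234)
true_mu <- 5

# Repeatedly sample and check whether the 95% confidence interval
# contains the true mean.
covered <- replicate(1000, {
  x <- rnorm(50, mean = true_mu, sd = 2)
  ci <- t.test(x)$conf.int
  ci[1] <= true_mu & true_mu <= ci[2]
})

mean(covered)  # proportion of intervals containing the truth; close to 0.95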

Priors & Modelling flexibility


Can include prior information from different sources, including previous studies, while also incorporating uncertainty.


Very flexible:

  • Hierarchical models

  • Item response models

  • Missing data imputation

  • . . .

A middle ground? (1)


  • King, Tomz, and Wittenberg (2000) offer a middle ground for estimating common sense uncertainty with post-estimation simulations.

  • Often less computationally intensive.

  • However, not as flexible.

A middle ground? (2)


  1. Estimate parameters (using preferred model)

  2. Use post-estimation simulations to estimate uncertainty around quantities of interest.

    • Draw \(n\) simulated parameter vectors from a multivariate normal distribution with mean equal to the point estimates (\(\hat{\beta}\)) and variance equal to the parameters’ estimated covariance matrix.
  3. Plot the central interval of the results (a minimal R sketch follows this list).
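
A minimal R sketch of these steps, using the built-in mtcars data and a deliberately simple linear model (the model and quantity of interest are purely illustrative):

library(MASS)  # for mvrnorm

m1 <- lm(mpg ~ wt, data = mtcars)  # 1. estimate parameters

# 2. Draw 1000 simulated coefficient vectors from a multivariate normal with
#    mean = the point estimates and variance = the estimated covariance matrix.
sims <- mvrnorm(n = 1000, mu = coef(m1), Sigma = vcov(m1))

# Quantity of interest: expected mpg for a car weighing 3,000 lbs (wt = 3).
qi <- sims %*% c(1, 3)

# 3. Central 95% interval (and median) of the simulated quantity of interest.
quantile(qi, probs = c(0.025, 0.5, 0.975))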

A middle ground? (3)


Similar overall to Markov chain Monte Carlo (MCMC) techniques, but the parameters are drawn differently.


Zelig has good capabilities for doing this. I have also implemented it for interactions in survival models (simPH) and for dynamic autoregressive relationships (dynsim, with Laron Williams and Guy Whitten).

BDA Software (1)


Predefined Models (for R)


  • e.g. Zelig (its normal.bayes model appears in the research example below)

Open-ended Models (program + R interface)


  • JAGS + rjags (part of the BUGS family, but cross-platform); a minimal sketch follows this list

  • Stan + RStan (potentially faster than the BUGS family)
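
A minimal, hedged sketch of the JAGS + rjags workflow (the model, priors, and data below are toy examples, not taken from the research discussed later):

library(rjags)

# Toy data: estimate the mean and precision of a normal variable.
jags_data <- list(y = rnorm(100, mean = 2), N = 100)

toy_model <- "
model {
  for (i in 1:N) {
    y[i] ~ dnorm(mu, tau)
  }
  mu ~ dnorm(0, 0.001)      # vague prior on the mean
  tau ~ dgamma(0.01, 0.01)  # vague prior on the precision
}
"
writeLines(toy_model, 'toy_model.bug')

m <- jags.model('toy_model.bug', data = jags_data, n.chains = 2)
update(m, 1000)  # burn-in
samples <- coda.samples(m, variable.names = c('mu', 'tau'), n.iter = 5000)
summary(samples)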

BDA Software (2)


Visualise diagnostics + results (for R)
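
For example, the coda package (loaded alongside rjags) provides basic convergence diagnostics and plots. A minimal sketch, continuing from the samples object created in the rjags example above:

library(coda)

gelman.diag(samples)    # potential scale reduction factors (R-hat)
effectiveSize(samples)  # effective sample size per parameter
plot(samples)           # trace and density plots for each parameter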


My own research: Predefined models


Federal Reserve Inflation Forecast Errors (with Cassandra Grafström)


  • Research question: does US presidential party ID affect Federal Reserve staff inflation forecast errors?

  • Bayesian methods: predefined model in Zelig (normal.bayes), used primarily to examine model dependence (a sketch of the call follows this list).

  • All code available on GitHub.
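
A hedged sketch of the predefined-model call (the variable and data frame names are hypothetical stand-ins, not the actual forecast-error data):

library(Zelig)

# Hypothetical variables: inflation forecast errors regressed on presidential
# party and a control.
z.out <- zelig(forecast_error ~ dem_president + unemployment,
               model = "normal.bayes", data = fed_data)
summary(z.out)  # posterior summaries of the coefficients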

My own research: Open-ended Models


Financial Transparency Index (with Mark Copelovitch and Mark Hallerberg)


  • Research question: what causes financial regulatory transparency and what are its consequences for financial stability?

  • Bayesian methods: created a Bayesian item response model to estimate an unobserved quantity (financial regulatory transparency) from whether or not countries reported data to the World Bank’s Global Financial Development Database (a sketch of such a model follows this list).

  • All code available on GitHub.
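
A hedged sketch of what a two-parameter item response model can look like in JAGS (illustrative variable names; not the paper's exact specification):

# reported[i, j] = 1 if country i reported indicator j to the GFDD, 0 otherwise.
irt_model <- "
model {
  for (i in 1:N_country) {
    for (j in 1:N_item) {
      reported[i, j] ~ dbern(p[i, j])
      logit(p[i, j]) <- gamma[j] * (transparency[i] - beta[j])
    }
    transparency[i] ~ dnorm(0, 1)   # latent transparency score
  }
  for (j in 1:N_item) {
    gamma[j] ~ dlnorm(0, 1)         # item discrimination (kept positive)
    beta[j]  ~ dnorm(0, 0.25)       # item difficulty
  }
}
"
writeLines(irt_model, 'irt_model.bug')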

Financial Transparency Index in 2011

Discrimination Plot

Practical tip (1)




Good idea to become familiar with BDA notation.


A nice place to start is: Gelman et al. (2014) pages 4-6.

Practical tip (2)


Critically use others’ code as a starting point


Practical tip (3)


Use cat to dynamically build probability models


  • Setting up a probability model is an iterative process: you adjust which parameters and data to include as you go.

  • So, it can be good to develop source code that dynamically creates the probability model.

  • Use your programming language’s version of concatenate.

    • cat in R

    • + in Python

ExampleSource.R

# Snippet, not self-contained: Num, Xs, Ps, Qs, Vs, NCountry, and Nyear are
# objects built earlier in the script. cat() writes the assembled JAGS model
# to BasicModel_V1.bug.
cat(paste0('
model{
    for (n in 1:', Num, '){\n',
      Xs, '\n',
      Ps, '\n',
      Qs, '\n',
      Vs,
    '\n }',
  '\n# Model priors\n',
    '
mu[1] <- 0
mu[2] <- 0

alpha[1,1] <- 0.01
alpha[1,2] <- 0
alpha[2,1] <- 0

. . .

  for (n in 1:', NCountry, '){
    transparency[n,1] <- transcentered[n]
    tau[n] ~ dgamma(20, 4)

    for (j in 2:', Nyear, '){
      transparency[n,j] ~ dnorm(transparency[n,(j-1)], tau[n])
    }
  }',
  '\n}'), file = 'BasicModel_V1.bug')

Full source code available on GitHub.



Practical tip (4)


Use RStudio Server on (something like) Amazon EC2


  • Conditioning a probability model on observed data can be computationally intensive, i.e. it can slow your desktop computer to a crawl.

  • So, it can be useful to run on a remote server.

  • Relatively easy to set up (see here and my FRT code).

    • Though there is some Linux command line work involved.
  • Nonetheless, aim for computational efficiency.

    • A computational problem may stem from a substantive problem with your model.

Resources


  • Gelman, Andrew, et al. 2014. Bayesian Data Analysis. 3rd Edition. Boca Raton, FL: CRC Press.

  • Enjoyable historical context: McGrayne, Sharon Bertsch. 2012. The Theory That Would Not Die. New Haven: Yale University Press.

  • Intro to Bayesian methods using Python.

  • Nice comparison of Bayesian vs. Frequentist methods in data driven business from the developers at Lyst.

  • Prediction intervals and confidence intervals: nice explanatory post by Rob Hyndman.

  • Pretty good practical blog post for setting up JAGS: John Myles White. 2010. Using JAGS in R with the rjags Package.