This slide deck is available at:
Theoretical: What does Bayesian Data Analysis bring to inference?
Examples from my research
A few practical tips
We know that Bayesian data analysis (BDA) is hot in social science.
We also know that popularity \(\neq\) widespread mastery or even literacy.
Begin a discussion aimed at BDA literacy.
Drawing conclusions based on data that is subject to random variation, such as observational errors and sampling variation. - Upton (2008), via Wikipedia
BDA and ‘frequentist’ methods simply provide different ways to draw conclusions from data and address random variation.
Update prior information with new data to create a posterior probability distribution.
Set up a full probability model (prior).
Condition on observed data (new data).
Evaluate the fit of the model and posterior distribution’s implications (posterior).
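In notation: Bayes' theorem gives the posterior distribution of the parameters \(\theta\) given the data \(y\) as
\[
p(\theta \mid y) = \frac{p(y \mid \theta)\,p(\theta)}{p(y)} \propto p(y \mid \theta)\,p(\theta),
\]
i.e. posterior \(\propto\) likelihood \(\times\) prior.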
“The central feature of Bayesian inference [is] the direct quantification of uncertainty” (Gelman et al. 2014, 4).
Many researchers actually interpret ‘frequentist’ confidence intervals as if they were Bayesian probability intervals.
Both involve the estimation of unknown quantities of interest, e.g. coefficient parameters \(\beta\).
The estimates they produce have different interpretations.
Can include prior information from different sources, including previous studies, while also incorporating uncertainty.
Very flexible:
Hierarchical models
Item response models
Missing data imputation
. . .
King, Tomz, and Wittenberg (2000) offer a middle ground for estimating common sense uncertainty with post-estimation simulations.
Often less computationally intensive.
However, not as flexible.
Estimate parameters (using your preferred model).
Use post-estimation simulations to estimate uncertainty around quantities of interest.
Plot the central interval of the results (see the sketch below).
Similar overall to Markov chain Monte Carlo techniques, but different in how the parameters are drawn.
Zelig has good capabilities for doing this, and I have implemented it for interactions in survival models (simPH) and dynamic autoregressive relationships (dynsim, with Laron Williams and Guy Whitten).
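A minimal sketch of these steps in base R, assuming a simple linear model; the data, variable names, and covariate scenario are hypothetical:

library(MASS)  # for mvrnorm()

# 1. Estimate parameters with your preferred model ('my_data' is hypothetical).
m1 <- lm(y ~ x1 + x2, data = my_data)

# 2. Simulate parameter values from their estimated sampling distribution.
n_sims <- 1000
beta_sims <- mvrnorm(n_sims, mu = coef(m1), Sigma = vcov(m1))

# Quantity of interest: expected y at a chosen covariate scenario.
scenario <- c(1, 2, 0.5)  # intercept, x1 = 2, x2 = 0.5
qi <- beta_sims %*% scenario

# 3. Central 95% interval (and median) of the quantity of interest.
quantile(qi, probs = c(0.025, 0.5, 0.975))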
Research question: does US presidential party ID affect Federal Reserve staff inflation forecast errors?
Bayesian methods: predefined model in Zelig (normal.bayes), used primarily to examine model dependence (see the hypothetical sketch below).
All code available on GitHub.
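For illustration, a hypothetical sketch of such a call; the variable and data names are invented, so see the GitHub repository for the real code:

library(Zelig)

# Hypothetical: Bayesian normal linear regression with Zelig's
# predefined 'normal.bayes' model. Names below are illustrative.
z_out <- zelig(forecast_error ~ pres_party + cpi_change,
               model = 'normal.bayes', data = fed_data)
summary(z_out)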
Research question: what causes financial regulatory transparency and what are its consequences for financial stability?
Bayesian methods: created a Bayesian Item Response model to estimate an unobserved quantity (financial transparency) based on whether or not countries had reported data to the World Bank’s Global Financial Development Database.
All code available on GitHub.
Good idea to become familiar with BDA notation.
A nice place to start is: Gelman et al. (2014) pages 4-6.
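For example, Gelman et al. write \(\theta\) for unobservable parameters, \(y\) for observed data, and \(\tilde{y}\) for unknown but potentially observable quantities, so that
\[
p(\theta \mid y) \quad \text{and} \quad p(\tilde{y} \mid y)
\]
denote the posterior and posterior predictive distributions, respectively.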
Building on existing code avoids duplicated effort.
Code gets better with use.
Hollyer, Rosendorff, and Vreeland (2014) was the starting point of our FRT Index.
Use cat to dynamically build probability models.
Setting up a probability model is an iterative process, as you adjust the parameters to include in the model and the data.
So, it can be good to develop source code that dynamically creates the probability model.
Use your programming language's version of concatenate: cat in R, + in Python.
Example: Source.R
# Dynamically write a JAGS model file with cat()/paste0().
# Num, NCountry, and Nyear are counts, and Xs, Ps, Qs, Vs are strings of
# model lines, all built earlier in the script.
cat(paste0('
model{
for (n in 1:', Num, '){\n',
    Xs, '\n',
    Ps, '\n',
    Qs, '\n',
    Vs,
'\n}',
'\n# Model priors\n',
'
mu[1] <- 0
mu[2] <- 0
alpha[1,1] <- 0.01
alpha[1,2] <- 0
alpha[2,1] <- 0
# . . .
for (n in 1:', NCountry, '){
    transparency[n,1] <- transcentered[n]
    tau[n] ~ dgamma(20, 4)
    for (j in 2:', Nyear, '){
        transparency[n,j] ~ dnorm(transparency[n,(j-1)], tau[n])
    }
}',
'\n}'), file = 'BasicModel_V1.bug')
Conditioning a probability model on observed data can be computationally intensive, i.e. it can slow your desktop computer to a crawl.
So, it can be useful to run on a remote server.
Relatively easy to set up (see here and my FRT code).
Nonetheless, aim for computational efficiency.
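As a hypothetical sketch, here is how a model file like the one written above could be conditioned on data with rjags; the data list, monitored node, and iteration counts are illustrative:

library(rjags)

# Hypothetical data list; names must match the data nodes in your model file.
jags_data <- list(y = y, N = length(y))

# Compile the model written by cat() above and adapt the samplers.
mod <- jags.model('BasicModel_V1.bug', data = jags_data,
                  n.chains = 4, n.adapt = 1000)

update(mod, n.iter = 5000)  # burn-in

# Draw and summarize posterior samples for a monitored parameter.
post <- coda.samples(mod, variable.names = 'transparency', n.iter = 10000)
summary(post)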
Gelman, Andrew, et al. 2014. Bayesian Data Analysis. 3rd Edition. Boca Raton, FL: CRC Press.
Enjoyable historical context: McGrayne, Sharon Bertsch. 2012. The Theory That Would Not Die. New Haven: Yale University Press.
Nice comparison of Bayesian vs. frequentist methods in data-driven business from the developers at Lyst.
Prediction intervals and confidence intervals: nice explanatory post by Rob Hyndman.
Pretty good practical blog post for setting up JAGS: John Myles White. 2010. Using JAGS in R with the rjags Package.