Reproducible Research

with R & RStudio
2nd Edition

Christopher Gandrud

Reproducible Research with R & RStudio gives you tools for data gathering, analysis, and presentation of results so that you can create dynamic and highly reproducible research.

Tools you will learn as part of a reproducible research workflow:

R: a programming language primarily for statistics and graphics. We will focus on using it for dynamic data gathering and presenting results.

knitr and rmarkdown: R packages for literate programming, i.e. they allow you to combine your statistical analysis and the presentation of the results into one document. They work with R and a number of other languages such as Bash, Python, and Ruby.

Markup languages: instructions for how to format a presentation document. Specifically, we cover LaTeX for creating PDF articles and slide shows, as well as Markdown and a little HTML for presenting results on the web.

Unix-like shell programs: These tools are useful for working with large research projects. They also allow us to use command line tools including GNU Make for compiling projects and Pandoc, a program useful for converting documents from one markup language to another.

Cloud storage & versioning: Services such as Dropbox and Git/GitHub that can store data, code, and presentation files, save previous versions of these files, and make this information widely available.

RStudio: an integrated development environment (IDE) for R that tightly integrates these reproducible research tools in one place.

Academic Researchers: This book is intended to be a practical guide for how to actually make your research reproducible. Even if you already use tools such as R and LaTeX you may not be leveraging their full potential. This book will teach you useful ways to get the most out of them as part of a reproducible research workflow.

Students: If you are an upper-level undergraduate or graduate student conducting original computational research, you should make your research reproducible. Forcing yourself to clearly document the steps you took will also encourage you to think more clearly about what you are doing and reinforce what you are learning. It will hopefully give you a greater appreciation of research accountability and integrity early in your career.

Instructors: When instructors incorporate the tools of reproducible research into their assignments they not only build students’ understanding of research best practice, but are also better able to evaluate and provide meaningful feedback on students’ work. This book provides a resource that you can use with students to put reproducibility into practice.

Editors: Beyond a lack of reproducible research skills among researchers, an impediment to actually creating reproducible research is a lack of infrastructure to publish it. Hopefully, this book will be useful for editors at academic publishers who want to be better at evaluating reproducible research, editing it, and developing systems to make it more widely available.

Private Sector Researchers: Researchers in the private sector may or may not want to make their work easily reproducible outside of their organization. However, they can still gain significant benefits from using the methods of reproducible research. Making your research reproducible to members of your organization can spread valuable information about how analyses were done and how data was collected. This will help build your organization’s knowledge and avoid duplicated effort.

Download the full table of contents:

Click the links below to download sample chapters or visit Google Books.

    Part I: Getting Started

  1. Introducing Reproducible Research
  2. Getting Started with Reproducible Research
  3. Getting Started with R, RStudio, and knitr/rmarkdown
  4. Getting Started with File Management

    Part II: Data Gathering and Storage

  5. Storing, Collaborating, Accessing Files, and Versioning
  6. Gathering Data with R
  7. Preparing Data for Analysis

    Part III: Analysis and Results

  8. Statistical Modeling and knitr
  9. Showing Results with Tables
  10. Showing Results with Figures

    Part IV: Presentation Documents

  11. Presenting with knitr/LaTeX
  12. Large knitr/LaTeX Documents: Theses, Books, and Batch Reports
  13. Presenting on the Web with R Markdown
  14. Conclusion

You can order it from:

You can freely download supplementary files used for examples discussed in the book's chapters, as well as a short but complete reproducible research project.

I am a lecturer in quantitative international political economy at City University London and a research fellow at the Fiscal Governance Centre, Hertie School of Governance. My research focuses on the international political economy of public financial and monetary institutions, as well as applied social science statistics. My work has been published in the Journal of Common Market Studies, Review of International Political Economy, Political Science Research and Methods, Journal of Statistical Software, and International Political Science Review.

I have been a lecturer at Yonsei University and a Fellow in Government at the London School of Economics. In 2012 I completed a PhD in political science and quantitative methods at LSE.

When I stumble upon a tool that helps me do research better, I share it on my blog or on Twitter at @ChrisGandrud.

The software underlying Reproducible Research with R and RStudio is constantly changing. Please see the Book's Updates page for a list of major software updates.

If you notice software changes that should be noted on the Updates page or have any suggestions for future editions of the book, including new reproducible research technologies to cover, feel free to leave a comment on the Book's development page.

Corrections to typos in the current printed version can be found on the Errata page.