Illustration by Photo Royalty/Shutterstock

Iterated PDFs with R Markdown

Getting evidence into decisionmakers’ hands elevates the debate. As I outlined in Iterated fact sheets with R Markdown, R Markdown can be used to iterate fact sheets that distill large amounts of research into many smaller documents, split by geography, time period, or organization, each tailored to an individual decisionmaker’s interests.

The defaults and design of R Markdown are best for making styled HTML documents, but audiences are often better served by PDFs. Unfortunately, heavily styled PDFs are tougher to make in R Markdown. At the Urban Institute, we created some tools that make branded PDF fact sheets as easy to produce in R Markdown as HTML documents.

HTML vs. PDF

But not everyone is comfortable with HTML documents. Even people who effortlessly browse the internet are sometimes uncomfortable managing and navigating HTML files outside a web browser. For this audience, a PDF’s inflexibility is a strength — these files have a single, set layout that does not vary when opened by different software or on different devices, and they are easy to print.

This is a major benefit for researchers and policymakers. So we built tools for creating iterated, branded PDFs in R Markdown that are as simple as the existing tools for creating iterated HTML documents.

The tools needed to achieve a few goals:

  • The tools needed to be simple enough that authors could focus on analytic R code and narrative instead of getting bogged down in complicated LaTeX.
  • The output needed to consistently match Urban Institute publication standards on project after project. That meant matching simple styling, such as the Lato font and appropriate margins, and meeting tougher demands, such as perfectly positioned Urban Institute contact information, logo images, and boilerplate.
  • The document needed to be reproducible. The point is to save time iterating the fact sheets. Changes need to be made against tight deadlines, and fact-checkers need to be confident in the output.

The existing process for creating fact sheets uses Microsoft Word documents exported as PDFs. To create a template that can match the existing templates and be automated, we used a combination of a .Rmd template, a large LaTeX preamble, and LaTeX macros and environments. All of the resources are available in this public GitHub repository.
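Iterating over a template like this usually comes down to rendering the same .Rmd file once per geography, time period, or organization. Here is a minimal sketch of that loop, assuming a parameterized template saved as template.Rmd that declares a state parameter in its YAML header. The file name, the parameter name, and the helper functions are hypothetical and are not taken from the repository.

```r
# Hypothetical inputs: the values to iterate over.
states <- c("Alabama", "Alaska", "Arizona")

# Build one output file name per state, e.g. "fact-sheet-alabama.pdf".
output_name <- function(state) {
  paste0("fact-sheet-", gsub(" ", "-", tolower(state)), ".pdf")
}

# Render the template once for a single state.
# Assumes template.Rmd declares `params: state` in its YAML header.
render_fact_sheet <- function(state, template = "template.Rmd") {
  rmarkdown::render(
    input = template,
    output_file = output_name(state),
    params = list(state = state),
    envir = new.env()  # fresh environment so iterations do not share objects
  )
}

# Render every fact sheet, but only if the assumed template is present.
if (file.exists("template.Rmd")) {
  for (state in states) render_fact_sheet(state)
}
```

Inside the template, the current value would be available as params$state, so the analytic code and the narrative can both refer to it.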

New tools

Template

Preamble

Some settings in our preamble are simple. \definecolor{urbanblue}{HTML}{1696D2} defines the color urbanblue as the hex color code #1696d2. \pagenumbering{gobble} disables page numbering. Some settings are more subtle. \usepackage[hang,flushmargin]{footmisc} drops the indentation of the footnotes.

All of this code is automatically included when Urban Institute researchers use the template because of the following code in the YAML header:
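A minimal sketch of what such a header might look like, assuming the preamble is saved next to the template as preamble.tex (a hypothetical file name; the repository may use a different name and additional options):

```yaml
output:
  pdf_document:
    includes:
      # Hypothetical file name: the LaTeX preamble to inject into every document
      in_header: preamble.tex
```

The in_header option of pdf_document tells R Markdown to insert the contents of that file into the LaTeX preamble, so authors never have to copy the settings by hand.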

Macros and environments

If a researcher wants to add an Urban blue, centered, Lato 12-point subtitle, she need only wrap the text in \urbansubtitle{}. This macro calls the following code:
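A definition along these lines would produce that styling. This is an illustrative sketch rather than the repository's actual definition. It assumes the preamble loads xcolor, defines urbanblue as shown earlier, and already sets Lato as the document font.

```latex
% Assumed to be present in the preamble already:
\usepackage{xcolor}
\definecolor{urbanblue}{HTML}{1696D2}

% Hypothetical definition: a centered, 12-point, Urban blue subtitle.
% The text inherits the document font, assumed here to be Lato.
\newcommand{\urbansubtitle}[1]{%
  \begin{center}
    {\fontsize{12pt}{14.4pt}\selectfont\color{urbanblue}#1}
  \end{center}
}
```

In the body of the .Rmd file, the author would then write something like \urbansubtitle{A subtitle for this fact sheet}.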

We provide macros for contact information, title, subtitle, authors, two types of headers, figure numbers, figure titles, figure sources, figure notes, and the boilerplate that appears on every Urban Institute publication.

LaTeX environments, which are similar to macros, are used to style bulleted and numbered lists. For example, the following code adds blue bullet points:
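One way such an environment could be defined is sketched below. The environment name urbanbullets is hypothetical, and the sketch assumes xcolor is loaded and urbanblue is defined in the preamble.

```latex
% Hypothetical environment: an itemize list with Urban blue bullets.
% The redefinition of \labelitemi is local to the environment,
% so ordinary itemize lists elsewhere keep their default bullets.
\newenvironment{urbanbullets}{%
  \renewcommand{\labelitemi}{\textcolor{urbanblue}{\textbullet}}%
  \begin{itemize}%
}{%
  \end{itemize}%
}

% Usage in the document body:
% \begin{urbanbullets}
%   \item First point
%   \item Second point
% \end{urbanbullets}
```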

TinyTeX

Yihui Xie’s new tinytex R package changes everything. It is lightweight, is low maintenance, and can be installed like any other R package using install.packages(). Furthermore, Xie is a gifted technical writer and prolific debugger, so the package is clearly documented and works well. If you don’t have a LaTeX distribution, run the following code and you won’t have to think about your LaTeX distribution again:
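The usual two-step setup, using functions documented by the tinytex package, looks like this:

```r
# Install the tinytex R package from CRAN.
install.packages("tinytex")

# Use the package to download and install the TinyTeX LaTeX distribution.
tinytex::install_tinytex()
```

Both steps need an internet connection, and the second can take a few minutes because it downloads the distribution.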

cairo_pdf

ggplot2 theme

PDF fact sheets cater to the wide and numerous insights of Urban Institute researchers and to the narrower needs of decisionmakers.

Under the hood, these tools are complex. In practice, users are pleasantly ignorant of most of the complexity. A researcher can copy the repository, edit the template, and style her fact sheet with macros reminiscent of R code.

Researchers can focus on their insights instead of LaTeX, the product closely mirrors the needs and desires of our communications team, and the output clearly communicates Urban Institute brands and styles while remaining reproducible and easy to scale.

-Aaron Williams
