Note: I was not involved in creating the original figure, nor in collecting or processing the data behind it. Given the importance of the issue, I wanted to share my own take on showing the values from the different studies in a way that is easier to read.
I recently came across the following figure, tweeted by Katharine Hayhoe on March 10, 2017.
"How much of our current warming is human-induced? Likely more than 100% - because according to natural factors, we should be cooling." pic.twitter.com/LyUbDjIxDn (Katharine Hayhoe, @KHayhoe, March 10, 2017)
The figure (original source) is one of the graphical resources used in articles on Skeptical Science, a website dedicated to explaining climate change science and rebutting global warming misinformation.
Looking at the figure and at the original caption, we can see that the bars represent a dependent variable (% contribution) for two categories (natural/human), with a color scheme to represent the different studies. I was puzzled as to why the natural vs. human bars weren’t side-by-side (dodged), but I suppose that using color for the different studies precluded using color to tell the natural bars from the human ones.
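For readers unfamiliar with the term, this is roughly what dodged bars look like in ggplot2. It is a minimal sketch: the values and the study names (`Study A`, `Study B`) are made-up placeholders, not the data behind the figure.

```r
library(ggplot2)

# Placeholder values for illustration only; NOT the data behind the figure.
# "Study A" and "Study B" are hypothetical names.
toy <- data.frame(
  study       = rep(c("Study A", "Study B"), each = 2),
  contributor = rep(c("Natural", "Human"), times = 2),
  pct         = c(-5, 105, 10, 90)
)

# position = "dodge" places the two bars of each study next to each other.
# Fill is used here for natural vs. human, so it cannot also encode the study.
p_dodged <- ggplot(toy, aes(x = study, y = pct, fill = contributor)) +
  geom_col(position = "dodge") +
  labs(x = NULL, y = "% contribution", fill = NULL)
```

Printing `p_dodged` draws the plot. Because the fill aesthetic is already taken by the natural/human grouping, the studies have to be identified on the axis instead.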
I tweeted my version of the figure, after which Katharine Hayhoe shared it on her own account. The tweet has been shared widely, so I thought I should write this brief post with a few extra details on the data visualization process.
I had previously written about diverging bar plots, and this looked like a good example to show some data visualization principles using my existing R code. In this case, a diverging stacked bar plot helps by:
- removing the need to match bar colors to the abbreviated study names in the legend across two separate sections of the axis
- showing more information on each study directly on the axis
- being able to compare positive and negative values directly
- getting rid of the gradients and drop shadows in the original figure
The code to reproduce the figure is below, and here are the steps I used to produce it.
- Digitize the original figure using PlotDigitizer and set up columns with the studies and the grouping variable.
- Round the values.
- Plot the data, keeping the title and axis labels consistent with the original figure. The plot was made using ggplot2 and several of my favorite packages to improve its overall appearance. I used one of Google’s Roboto fonts, but if you don’t have it installed, you can use any other font family.
- Rotate the axes and reorder the x axis.
- Highlight the y axis in a lighter color for extra coolness.
Note how I used functions from the forcats package within the ggplot arguments to wrangle the factor levels.
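A minimal sketch of the steps above could look like the following. The values and study names are made-up placeholders, not the digitized data. The ordering function (`max`), the axis styling, and the `"sans"` font fallback are assumptions made for illustration, not necessarily the choices behind the published figure.

```r
library(ggplot2)
library(forcats)

# Step 1 (stand-in): placeholder values for illustration only.
# These are NOT the digitized values, and the study names are hypothetical.
contrib <- data.frame(
  study       = rep(c("Study A", "Study B", "Study C"), each = 2),
  contributor = rep(c("Natural", "Human"), times = 3),
  pct         = c(-4.6, 104.2, 12.3, 87.9, 0.8, 98.7)
)

# Step 2: round the values.
contrib$pct <- round(contrib$pct)

# Assumption: generic fallback font; use a Roboto family if it is installed.
font_family <- "sans"

# Steps 3 to 5: plot, flip the axes, reorder the studies, style the y axis.
p <- ggplot(
  contrib,
  # fct_reorder() is called inside aes(), so the data frame itself is untouched.
  # Ordering by the largest value per study is an assumption.
  aes(x = fct_reorder(study, pct, .fun = max), y = pct, fill = contributor)
) +
  geom_col() +    # default stacking puts negative values below zero, positive above
  coord_flip() +  # studies end up on the vertical axis
  labs(x = NULL, y = "% contribution", fill = NULL) +
  theme_minimal(base_family = font_family) +
  theme(axis.line.y = element_line(colour = "grey70"))  # lighter axis line (assumed styling)
```

Printing `p` draws the plot. Reordering inside `aes()` keeps the factor wrangling next to the plotting call, which is handy when the ordering is only needed for display.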
Feel free to contact me with any questions or if the code doesn’t work for you.