Drawing polygons around groups of points in ggplot

Updated in February 2020 to include geoms provided by ggforce.
This post is also available in Spanish here.

For various kinds of analyses, we often end up plotting point data in two dimensions for two or more groups. This includes Principal Component Analyses, bioclimatic profiles, or any other combination of values on two axes. Here are some alternatives for drawing polygons around groups of points, with code and examples.

These methods are for ggplot, but I assume there are ways to do the same things using base or other plotting engines. I wanted to use real data, so the following examples use data from this paper on the physiology of the Japanese quail. After loading (or installing if necessary) the required packages and downloading the data from Dryad, we can wrangle the data so we can plot length and mass data from several individual birds at 30 vs 40 days of age.
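As a minimal sketch of the starting point, the block below builds a small data frame and a basic scatterplot. The data are simulated stand-ins, not the quail measurements from the paper; the column names (`age`, `length`, `mass`) and all of the values are assumptions made only so the example runs on its own.

```r
library(ggplot2)

# Simulated stand-in data (NOT the quail dataset): two age groups, 20 birds each.
# Column names and values are hypothetical.
set.seed(42)
birds <- data.frame(
  age    = factor(rep(c("30 days", "40 days"), each = 20)),
  length = c(rnorm(20, mean = 14, sd = 0.8), rnorm(20, mean = 16, sd = 0.8)),
  mass   = c(rnorm(20, mean = 110, sd = 10), rnorm(20, mean = 150, sd = 12))
)

# Basic scatterplot of length vs. mass, colored by age group
base_plot <- ggplot(birds, aes(x = length, y = mass, color = age)) +
  geom_point()
base_plot
```

With the real data, only the data frame would change; the plotting call stays the same.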

Convex hulls

Convex hulls are one of the most common methods for grouping points. Convex hulls have a formal geometric definition, but basically they are like stretching a rubber band around the outermost points in the group. We can now calculate the convex hulls for many groups using ggforce.
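One way to sketch this is with `geom_mark_hull()` from ggforce, which needs the concaveman package to be installed. The data below are simulated stand-ins with hypothetical column names, and the parameter values are illustrative choices: a large `concavity` pushes the hull toward a convex shape, while setting `expand` and `radius` to zero keeps the outline on the outermost points with sharp corners.

```r
library(ggplot2)
library(ggforce)  # geom_mark_hull() relies on the concaveman package

# Simulated stand-in data (NOT the quail dataset); names and values are hypothetical
set.seed(42)
birds <- data.frame(
  age    = factor(rep(c("30 days", "40 days"), each = 20)),
  length = c(rnorm(20, mean = 14, sd = 0.8), rnorm(20, mean = 16, sd = 0.8)),
  mass   = c(rnorm(20, mean = 110, sd = 10), rnorm(20, mean = 150, sd = 12))
)

# One hull per age group; a high concavity value approaches the convex hull
hull_plot <- ggplot(birds, aes(x = length, y = mass)) +
  geom_mark_hull(aes(fill = age), concavity = 5, expand = 0, radius = 0) +
  geom_point(aes(color = age))
hull_plot
```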

Convex hulls often include large areas with no points in them. Tweaking the parameters can give us a tighter hull with nice round corners.
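A tighter, rounder version might look like the sketch below. The specific values are assumptions chosen for illustration: a low `concavity` lets the outline follow the points more closely, and non-zero `expand` and `radius` values pad the hull and round its corners. The data are again simulated stand-ins.

```r
library(ggplot2)
library(ggforce)  # geom_mark_hull() relies on the concaveman package

# Simulated stand-in data (NOT the quail dataset); names and values are hypothetical
set.seed(42)
birds <- data.frame(
  age    = factor(rep(c("30 days", "40 days"), each = 20)),
  length = c(rnorm(20, mean = 14, sd = 0.8), rnorm(20, mean = 16, sd = 0.8)),
  mass   = c(rnorm(20, mean = 110, sd = 10), rnorm(20, mean = 150, sd = 12))
)

# Low concavity gives a tighter (concave) hull; expand and radius round it off
tight_hull_plot <- ggplot(birds, aes(x = length, y = mass)) +
  geom_mark_hull(
    aes(fill = age),
    concavity = 1,
    expand = unit(3, "mm"),
    radius = unit(3, "mm")
  ) +
  geom_point(aes(color = age))
tight_hull_plot
```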

Ellipses

Another common alternative is to group points using ellipses. We can plot the ellipses with ggforce, although ggplot2::stat_ellipse is also an option.
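Both options can be sketched side by side, again with simulated stand-in data. Note that the two are not equivalent: `geom_mark_ellipse()` draws an ellipse that encloses all the points in each group, whereas `stat_ellipse()` draws a data ellipse based on an assumed distribution (here a 95% normal ellipse), so some points may fall outside it.

```r
library(ggplot2)
library(ggforce)

# Simulated stand-in data (NOT the quail dataset); names and values are hypothetical
set.seed(42)
birds <- data.frame(
  age    = factor(rep(c("30 days", "40 days"), each = 20)),
  length = c(rnorm(20, mean = 14, sd = 0.8), rnorm(20, mean = 16, sd = 0.8)),
  mass   = c(rnorm(20, mean = 110, sd = 10), rnorm(20, mean = 150, sd = 12))
)

# Option 1: enclosing ellipses from ggforce
mark_ellipse_plot <- ggplot(birds, aes(x = length, y = mass)) +
  geom_mark_ellipse(aes(fill = age)) +
  geom_point(aes(color = age))

# Option 2: 95% normal data ellipses from ggplot2
stat_ellipse_plot <- ggplot(birds, aes(x = length, y = mass, color = age)) +
  stat_ellipse(type = "norm", level = 0.95) +
  geom_point()

mark_ellipse_plot
stat_ellipse_plot
```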

Encircle

This option is what I ended up using for my own figures. It uses geom_encircle, a new geometry provided in the ggalt package. This geom uses polynomial splines to draw nice smoothed polygons around the groups of points. It has flexible options for color, fill, and the smoothness of the polygons that it draws. This method works well for highlighting groups visually and indicating cohesion, but not necessarily for performing any further analyses on the polygons themselves (e.g. using the areas or the amount of overlap in subsequent tests).
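A minimal sketch of this approach, using simulated stand-in data, is shown below. The `s_shape` and `expand` values are illustrative assumptions: `s_shape` controls how smooth the outline is and `expand` controls how far it sits from the points.

```r
library(ggplot2)
library(ggalt)

# Simulated stand-in data (NOT the quail dataset); names and values are hypothetical
set.seed(42)
birds <- data.frame(
  age    = factor(rep(c("30 days", "40 days"), each = 20)),
  length = c(rnorm(20, mean = 14, sd = 0.8), rnorm(20, mean = 16, sd = 0.8)),
  mass   = c(rnorm(20, mean = 110, sd = 10), rnorm(20, mean = 150, sd = 12))
)

# Smoothed polygon around each age group
encircle_plot <- ggplot(birds, aes(x = length, y = mass)) +
  geom_encircle(aes(fill = age), s_shape = 0.5, expand = 0.05, alpha = 0.3) +
  geom_point(aes(color = age))
encircle_plot
```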

We can change the transparency and fill values of the different polygons for all the methods. This can be useful to highlight overlap between groups.
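As a sketch of how this might look, the example below uses filled ellipses from `stat_ellipse()` with a low `alpha`, so that any region where the two groups overlap appears darker. The colors and the alpha value are arbitrary choices, and the data are simulated stand-ins; the same `alpha` and fill-scale approach should carry over to the other geoms.

```r
library(ggplot2)

# Simulated stand-in data (NOT the quail dataset); names and values are hypothetical
set.seed(42)
birds <- data.frame(
  age    = factor(rep(c("30 days", "40 days"), each = 20)),
  length = c(rnorm(20, mean = 14, sd = 0.8), rnorm(20, mean = 16, sd = 0.8)),
  mass   = c(rnorm(20, mean = 110, sd = 10), rnorm(20, mean = 150, sd = 12))
)

# Semi-transparent filled ellipses with manually chosen fill colors
group_colors <- c("30 days" = "#1b9e77", "40 days" = "#d95f02")
overlap_plot <- ggplot(birds, aes(x = length, y = mass)) +
  stat_ellipse(aes(fill = age), geom = "polygon", alpha = 0.2) +
  geom_point(aes(color = age)) +
  scale_fill_manual(values = group_colors) +
  scale_color_manual(values = group_colors)
overlap_plot
```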