This is an update to a previous Spanish-language post for working with spatial raster and vector data in R, prompted by recent developments such as the stars package, its integration with sf and raster, and a particularly useful wrapper in nngeo.
Extracting the underlying values from a raster is a fairly common task, so here is a workflow to calculate and extract values (mean, max, min, etc.) from a raster file across the distribution of a few different primate species in Mexico and Central America (“Mesoamerica”), loaded from a .shp file into a simple feature collection.
We use a single raster file with gridded values for the Human Footprint index (a compound measure of direct and indirect human pressures on the environment globally), calculated for 2009 by Venter et al. (2016), and a multipolygon shapefile of mammalian distribution maps downloaded from the IUCN Red List Spatial Data and Mapping Resources page (free & open download for educational use but an account is needed). All the required packages can be installed from CRAN, so this example can be reproduced by everyone (willing to download these large amounts of data).
These are the main steps in the process:
Load raster and polygon data
First, read_stars() can read the geotiff raster file into a stars object and st_read() to load the shapefile. Country boundaries can be loaded from rnaturalearth. All the polygons were projected into Mollweide +proj to match the raster.
Mask and crop the raster layer
We only need raster data for our study region, so we can crop it using st_crop().
This is the cropped raster:
Subset the multipolygon feature collection
Immediately after loading the mammal range maps we used a simple dplyr operation to keep only the primates (our Order of interest), now we can spatially subset the data to keep only the polygons within our study area. Afterwards we group by species and summarize their geometries (this operation goes by many names in traditional GIS jargon, such as dissolve or aggregate).
Because they overlap, we can make a faceted plot of the seven species present in the study area to see their ranges.
Extract the underlying raster values for each feature in the polygon layer
The nngeo::raster_extract() function comes in handy to apply raster::extract() on stars objects. It’s used here inside mutate to add new variables with the extracted values for each species (it’s vectorized for each feature 😎).
We can pull the data sans the geometry using st_set_geometry(NULL) and have a look at the extracted values.
Lastly, we can plot everything together with ggplot2. This quick and simple map could be customized further with various grammar of graphics touches, but for now it looks pretty good.
One last plot, again using one facet per species to avoid overlap. Note the use of ggnewscale functions to use one fill for the raster and one for the polygon features.