Update - January 2020: The raster_ functions from nngeo were moved to geobgu. Thanks to @imaginary_nums for pointing this out.
This is an update to a previous Spanish-language post for working with spatial raster and vector data in R, prompted by recent developments such as the stars package, its integration with sf and raster, and a particularly useful wrapper in geobgu.
Extracting the underlying values from a raster is a fairly common task, so here is a workflow to calculate and extract values (mean, max, min, etc.) from a raster file across the distribution of a few different primate species in Mexico and Central America (“Mesoamerica”), loaded from a .shp file into a simple feature collection.
We use a single raster file with gridded values for the Human Footprint index (a compound measure of direct and indirect human pressures on the environment globally), calculated for 2009 by Venter et al. (2016), and a multipolygon shapefile of mammalian distribution maps downloaded from the IUCN Red List Spatial Data and Mapping Resources page (free & open download for educational use but an account is needed). All the required packages can be installed from CRAN, so this example can be reproduced by everyone (willing to download these large amounts of data).
These are the main steps in the process:
Load raster and polygon data
First, read_stars() can read the geotiff raster file into a stars object and st_read() to load the shapefile. Country boundaries can be loaded from rnaturalearth. All the polygons were projected into Mollweide +proj to match the raster.
Mask and crop the raster layer
We only need raster data for our study region, so we can crop it using st_crop().
This is the cropped raster:
Subset the multipolygon feature collection
Immediately after loading the mammal range maps we used a simple dplyr operation to keep only the primates (our Order of interest), now we can spatially subset the data to keep only the polygons within our study area. Afterwards we group by species and summarize their geometries (this operation goes by many names in traditional GIS jargon, such as dissolve or aggregate).
Because they overlap, we can make a faceted plot of the seven species present in the study area to see their ranges.
Extract the underlying raster values for each feature in the polygon layer
The raster_extract() function comes in handy to apply raster::extract() on stars objects. It’s used here inside mutate to add new variables with the extracted values for each species (it’s vectorized for each feature 😎).
We can pull the data sans the geometry using st_set_geometry(NULL) and have a look at the extracted values.
binomial
hfpMean
hfpMax
hfpMin
Alouatta palliata
9.087961
50.00000
0.000000
Alouatta pigra
5.716039
47.00000
0.000000
Aotus zonalis
7.390708
42.41053
1.000000
Ateles fusciceps
3.870575
18.04213
1.000000
Ateles geoffroyi
8.495973
50.00000
0.000000
Saguinus geoffroyi
6.306356
42.41053
1.000000
Saimiri oerstedii
10.727800
40.78853
3.002395
Lastly, we can plot everything together with ggplot2. This quick and simple map could be customized further with various grammar of graphics touches, but for now it looks pretty good.
One last plot, again using one facet per species to avoid overlap. Note the use of ggnewscale functions to use one fill for the raster and one for the polygon features.