Summary
- Zonal extraxtion processing is most commonly thought of using continuous raster data.
- Several insightful analysis can include categorical raster data, or using continous and categorical raster layers together.
- This vignette builds upon the exacextractr coastal populations tutorial by incoporating land cover analysis.
Introduction
This vignette is the second in a two-part series demonstrating the use of the extactextractr
package. In the first vignette, we demonstrated several considerations for performing zonal extractions on meandering coastal regions such as São Miguel. In this vignette, we illustrate the the use of exactextractr
with categorical land cover data.
We will use the following packages for this vignette.
library(exactextractr)
library(dplyr)
library(sf)
library(raster)
Loading the sample data
First, we load the CORINE land cover data and the concelho boundaries used in the first vignette. A sample of the CORINE 2018 raster dataset is included in exactextractr
.
<- raster::raster(system.file('sao_miguel/clc2018_v2020_20u1.tif', clc
package = 'exactextractr'))
<- sf::st_read(system.file('sao_miguel/concelhos.gpkg', concelhos
package = 'exactextractr'),
quiet = TRUE)
The land cover class descriptions are provided in a separate DBF file. We read this in to a data frame, then use levels()
to associate the class descriptions with the raster.
<- foreign::read.dbf(system.file('sao_miguel/clc2018_v2020_20u1.tif.vat.dbf', clc_classes
package = 'exactextractr'),
as.is = TRUE) %>%
::select(value = Value, dplyr
landcov = LABEL3)
levels(clc) <- list(data.frame(ID = clc_classes$value,
landcov = clc_classes$landcov))
This association provides us with a way to look up the description for a given ID. Alternatively, we can relate the values using merge
or one of dplyr
’s joins.
::factorValues(clc, c(2, 18, 24)) raster
landcov |
---|
Discontinuous urban fabric |
Pastures |
Coniferous forest |
Summarizing Land Cover Classifications
One of the first questions we might have is which land cover classification is predominant in each concelho. We can do this with exactextractr
’s mode
summary operation. The minority
and variety
operations are also applicable to categorical data and provide the least-common classification and number of distinct classifications, respectively.
<- exactextractr::exact_extract(clc, concelhos, 'mode', landcov_mode
append_cols = 'name', progress = FALSE) %>%
::inner_join(clc_classes, by=c(mode = 'value')) dplyr
name | landcov |
---|---|
Lagoa | Land principally occupied by agriculture, with significant areas of natural vegetation |
Nordeste | Coniferous forest |
Ponta Delgada | Pastures |
Povoação | Broad-leaved forest |
Ribeira Grande | Land principally occupied by agriculture, with significant areas of natural vegetation |
Vila Franca do Campo | Land principally occupied by agriculture, with significant areas of natural vegetation |
Summary Functions
While mode
provides an easy way to see the most common land cover category, we need to write a custom summary function to see the frequency of different land cover types in an area.
Summary functions are called once per feature from the input sf
object. They can return either:
A scalar, in which case the return value of
exact_extract
will be a vector whose entries correspond with the rows of the inputsf
object, orA data frame, in which case
exact_extract
will return a row-wise combination of the data frames for each feature. If the data frame returned by the summary function will have than a single row, it is useful for some identifying information to be included in the returned data frame.
If we are going to perform typical data frame operations on the raster values and coverage fractions, it can be more convenient for the summary function to accept a single data frame argument instead of separate arguments for the cell values and coverage fractions. This behavior can be enabled with the summarize_df
argument.
Using this method, we can calculate the fraction of each concelho that is covered by each land cover category:
<- exactextractr::exact_extract(clc, concelhos, function(df) { landcov_fracs
%>% df
::mutate(frac_total = coverage_fraction / sum(coverage_fraction)) %>% dplyr
::group_by(name, value) %>% dplyr
::summarize(freq = sum(frac_total)) dplyr
summarize_df = TRUE, include_cols = 'name', progress = FALSE) },
Here we use the include_cols
argument to include the name
column from concelhos
in the data frame passed to the summary function. Although the value of name
will be the same for all rows in df
, we include name
in the group_by
expression so that it is not removed by summarize
. Other available arguments include include_xy
to get the cell center coordinates, include_area
to get the cell area, and include_cell
to get the cell index used by the raster
package.
This creates a data frame with the frequency of land cover type in each concelho:
head(landcov_fracs)
name | value | freq |
---|---|---|
Lagoa | 2 | 0.0629700 |
Lagoa | 3 | 0.0135845 |
Lagoa | 7 | 0.0065947 |
Lagoa | 9 | 0.0063711 |
Lagoa | 12 | 0.1697042 |
Lagoa | 18 | 0.1543908 |
You can select the top 3 cover types in each region, and join the data frame with clc_classes
to create a descriptive summary table for each concelho using kableExtra
:
<- fracs_summary
%>% landcov_fracs
::inner_join(clc_classes, by = 'value') %>% dplyr
::group_by(name) %>% dplyr
::arrange(dplyr::desc(freq)) %>% dplyr
::slice_head(n = 3) %>% dplyr
::mutate(freq = sprintf('%0.1f%%', 100 * freq)) dplyr
::kbl(fracs_summary[, -1], kableExtra
col.names = c("Cover", "Frequency", "Cover Description"),
align = c('c',"c", "l"),
full_width = F,
position = "center") %>%
::pack_rows(index = table(fracs_summary$name)) kableExtra
Cover | Frequency | Cover Description |
---|---|---|
Lagoa | ||
21 | 26.8% | Land principally occupied by agriculture, with significant areas of natural vegetation |
12 | 17.0% | Non-irrigated arable land |
20 | 15.6% | Complex cultivation patterns |
Nordeste | ||
24 | 24.3% | Coniferous forest |
18 | 22.5% | Pastures |
21 | 14.2% | Land principally occupied by agriculture, with significant areas of natural vegetation |
Ponta Delgada | ||
18 | 29.9% | Pastures |
21 | 26.5% | Land principally occupied by agriculture, with significant areas of natural vegetation |
12 | 11.3% | Non-irrigated arable land |
Povoação | ||
23 | 24.2% | Broad-leaved forest |
18 | 20.1% | Pastures |
21 | 16.7% | Land principally occupied by agriculture, with significant areas of natural vegetation |
Ribeira Grande | ||
21 | 25.1% | Land principally occupied by agriculture, with significant areas of natural vegetation |
18 | 22.7% | Pastures |
12 | 16.4% | Non-irrigated arable land |
Vila Franca do Campo | ||
21 | 36.0% | Land principally occupied by agriculture, with significant areas of natural vegetation |
18 | 15.5% | Pastures |
27 | 15.0% | Moors and heathland |
Similarly, we can find the top land covers by area:
<- exactextractr::exact_extract(clc, concelhos, function(df) { landcov_areas
%>% df
::group_by(name, value) %>% dplyr
::summarize(area_km2 = round(sum(coverage_area) / 1e6)) dplyr
summarize_df = TRUE, coverage_area = TRUE, include_cols = 'name', progress = FALSE) },
Area (km2) | Cover Description |
---|---|
Lagoa | |
12 | Land principally occupied by agriculture, with significant areas of natural vegetation |
8 | Non-irrigated arable land |
7 | Pastures |
Nordeste | |
25 | Coniferous forest |
23 | Pastures |
14 | Land principally occupied by agriculture, with significant areas of natural vegetation |
Ponta Delgada | |
70 | Pastures |
62 | Land principally occupied by agriculture, with significant areas of natural vegetation |
26 | Non-irrigated arable land |
Povoação | |
26 | Broad-leaved forest |
21 | Pastures |
18 | Land principally occupied by agriculture, with significant areas of natural vegetation |
Ribeira Grande | |
45 | Land principally occupied by agriculture, with significant areas of natural vegetation |
41 | Pastures |
30 | Non-irrigated arable land |
Vila Franca do Campo | |
28 | Land principally occupied by agriculture, with significant areas of natural vegetation |
12 | Pastures |
12 | Moors and heathland |
Summarizing Population Land Cover
Taking this a step further, one may be interested in the correlation between population and land cover class in a given concelho; i.e. is the population primarily urban or rural? As described in the previous vignette, using the population density raster is more accurate when pixels are partially covered by the boundary layer.
<- raster::raster(system.file('sao_miguel/gpw_v411_2020_density_2020.tif', pop_density
package = 'exactextractr'))
This analysis is only possible because the sample CORINE dataset packaged with exactextractr
has been reprojected from its native Lambert Equal Area projection into geographic coordinates used for the Gridded Population of the World; the projections must match. If you are working with raster data in different projections, you must transform them to a common projection using a tool like raster::projectRaster
, the gdalwarp
command-line utility, or a projection tool bundled with a standalone GIS product like QGIS.
We can use a similar call to exact_extract
to incorporate the population data into our analysis. We set weights = pop_density
and, since we are using the summarize_df
option, a column called weight
will be added to the data frame passed to the summary function. We also set coverage_area = TRUE
so that we can multiply the density by the covered pixel area to get a population count.
<- exactextractr::exact_extract(clc, concelhos, function(df) { landcov_pop_areas
%>% df
::group_by(name, value) %>% dplyr
::summarize(pop = sum(coverage_area * weight / 1e6)) %>% dplyr
::mutate(pop_frac = pop / sum(pop)) dplyr
weights = pop_density, coverage_area = TRUE, },
summarize_df = TRUE, include_cols = 'name')
Looking at the highest-population land cover type in each concelho, we can can see that the western/central concelhos of Lagoa, Ponta Delgada, Ribeira Grande, and Vila Franca do Campo have a more urban population than Nordeste or Povoação to the east.
Concelho | Cover Description | Population | Fraction |
---|---|---|---|
Lagoa | Discontinuous urban fabric | 6548 | 0.415 |
Nordeste | Complex cultivation patterns | 1280 | 0.282 |
Ponta Delgada | Discontinuous urban fabric | 27224 | 0.381 |
Povoação | Complex cultivation patterns | 1568 | 0.261 |
Ribeira Grande | Discontinuous urban fabric | 13497 | 0.374 |
Vila Franca do Campo | Discontinuous urban fabric | 4442 | 0.377 |
Add new comment