Module 10: Raster Analysis

10.3 Focal and Zonal Statistics

Neighbourhood analysis and per-zone summaries — the workhorses of continuous-surface GIS.

Lesson 52 of 100 · 17 min read

Key takeaways

Focal statistics compute per-cell values from a moving-window neighbourhood.

Zonal statistics summarise raster values per polygon (or categorical raster zone).

Both are core building blocks of terrain analysis, landscape ecology, and environmental modelling.

Introduction

If local map algebra is arithmetic on rasters, focal and zonal operations are where geography asserts itself. Focal looks at a cell's neighbours; zonal summarises cells within a region. This lesson covers both in practical terms.

Focal statistics

A focal operation applies a function over a neighbourhood (window / kernel) centred on each cell, producing an output cell value. Typical functions:

Mean — smoothing / low-pass filtering.
Standard deviation — roughness.
Maximum / minimum — local extremes.
Sum — density (population per window).
Majority — per-cell most common class (denoising classified rasters).

Window shapes

Square (3×3, 5×5, 11×11) — most common.
Circle — rotationally symmetric, better for natural phenomena.
Custom kernel — weighted by distance or direction.
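As a rough illustration of the window shapes listed above, the sketch below builds a circular footprint and a distance-weighted kernel with NumPy. The helper names (`circular_footprint`, `inverse_distance_kernel`) and the 1 / (1 + distance) weighting are assumptions chosen for the example, not a standard from any particular GIS package.

```python
import numpy as np

def circular_footprint(radius):
    """Boolean mask that is True within `radius` cells of the centre."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x**2 + y**2 <= radius**2

def inverse_distance_kernel(radius):
    """Weights proportional to 1 / (1 + distance), normalised to sum to 1."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    dist = np.hypot(x, y)
    # Cells outside the circle get zero weight.
    weights = np.where(dist <= radius, 1.0 / (1.0 + dist), 0.0)
    return weights / weights.sum()

footprint = circular_footprint(2)    # 5x5 window with the corners masked out
kernel = inverse_distance_kernel(2)  # centre cell carries the largest weight
```

A boolean footprint like this can be passed to SciPy's filters through their `footprint` argument, while a weighted kernel is typically used with a convolution.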

Implementation

Python

from scipy.ndimage import generic_filter
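The import above is all the lesson shows, so here is one way it might be used: a minimal sketch of a 3×3 focal mean and focal standard deviation. The small `dem` array is a made-up stand-in for a raster band, and `mode='nearest'` is just one possible edge treatment.

```python
import numpy as np
from scipy.ndimage import generic_filter

# Hypothetical 5x5 elevation grid; in practice this would be read from a raster band.
dem = np.arange(25, dtype=float).reshape(5, 5)

# Focal mean and standard deviation over a 3x3 window.
# mode='nearest' repeats the edge value where the window runs off the array.
focal_mean = generic_filter(dem, np.mean, size=3, mode='nearest')
focal_std = generic_filter(dem, np.std, size=3, mode='nearest')
```

`generic_filter` calls the Python function once per cell, which makes it flexible (any function of the window works) but slow on large rasters.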

Or with a convolution kernel:

Python

from scipy.ndimage import convolve
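A possible use of that import, sketched under the assumption that a simple centre-weighted 3×3 smoothing kernel is wanted. The kernel values and the `dem` array are illustrative only.

```python
import numpy as np
from scipy.ndimage import convolve

# Hypothetical 5x5 elevation grid.
dem = np.arange(25, dtype=float).reshape(5, 5)

# Centre-weighted 3x3 kernel, normalised so the weights sum to 1.
kernel = np.array([[1.0, 2.0, 1.0],
                   [2.0, 4.0, 2.0],
                   [1.0, 2.0, 1.0]]) / 16.0

# Each output cell is the weighted sum of its 3x3 neighbourhood.
smoothed = convolve(dem, kernel, mode='nearest')
```

Statistics that can be written as a weighted sum (mean, sum, weighted smoothing) are usually much faster as a convolution than through `generic_filter`. Note that `convolve` flips the kernel, which makes no difference for a symmetric kernel like this one.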

Specific focal operations

Slope — first derivative of elevation (Module 11.2).
Aspect — direction of steepest descent.
Curvature — second derivative.
Hillshade — shaded relief from slope + aspect (Module 11.3).
Terrain Ruggedness Index (TRI) — mean absolute difference from neighbours.
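To make the TRI definition above concrete, the following sketch computes the mean absolute difference between each cell and its eight neighbours. The function name and the edge-value padding are assumptions for the example; some published TRI definitions use squared differences instead, so check which variant your tool implements.

```python
import numpy as np

def terrain_ruggedness_index(dem):
    """Mean absolute difference between each cell and its 8 neighbours."""
    # Repeat edge values so border cells also have 8 neighbours (an assumption).
    padded = np.pad(dem, 1, mode='edge')
    rows, cols = dem.shape
    total = np.zeros_like(dem, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the centre cell itself
            neighbour = padded[1 + dy:1 + dy + rows, 1 + dx:1 + dx + cols]
            total += np.abs(neighbour - dem)
    return total / 8.0

# Hypothetical DEM: an 8 m bump in the middle of flat ground.
dem = np.array([[10.0, 10.0, 10.0],
                [10.0, 18.0, 10.0],
                [10.0, 10.0, 10.0]])
tri = terrain_ruggedness_index(dem)
```

The centre cell differs from all eight neighbours by 8 m, so its TRI is 8, while each surrounding cell has only one differing neighbour.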

Edge effects

At the raster edge, the neighbourhood extends off the array. Strategies:

Pad with nodata — edge cells inherit nodata.
Pad with edge value — reflect or repeat.
Shrink output — drop edge cells.

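Document the choice; it matters for mosaicking, because tiles processed separately can show seams where their edge cells were computed from padded rather than real neighbours.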
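The three strategies can be compared on a toy grid. This is a sketch using SciPy's `mode` and `cval` arguments on a 3×3 focal mean; the `dem` array and variable names are illustrative.

```python
import numpy as np
from scipy.ndimage import generic_filter

# Hypothetical 4x4 elevation grid.
dem = np.arange(16, dtype=float).reshape(4, 4)

# 1. Pad with nodata: any window that leaves the array returns NaN.
nodata_edges = generic_filter(dem, np.mean, size=3, mode='constant', cval=np.nan)

# 2. Pad with the edge value: mode='nearest' repeats it, mode='reflect' mirrors it.
repeated_edges = generic_filter(dem, np.mean, size=3, mode='nearest')

# 3. Shrink the output: keep only cells whose full window lies inside the array.
shrunk = repeated_edges[1:-1, 1:-1]
```

Interior cells are identical under all three strategies; only the one-cell border differs, which is exactly where neighbouring tiles meet.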

Zonal statistics

A zonal operation summarises a raster's cells within each zone, where zones are defined by another layer:

A polygon feature class (each polygon is a zone).
A categorical raster (each class code is a zone).

Common statistics: mean, sum, count, min, max, median, standard deviation, majority.
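When the zones are a categorical raster aligned with the value raster, a zonal mean reduces to a grouped average. The sketch below shows one way to do that with `np.bincount`; the function name is hypothetical, and it assumes zone ids are non-negative integers and that nodata is stored as NaN.

```python
import numpy as np

def zonal_mean(values, zones):
    """Mean of `values` for each integer zone id in `zones` (same shape)."""
    valid = ~np.isnan(values)          # ignore nodata cells
    ids = zones[valid]
    sums = np.bincount(ids, weights=values[valid])
    counts = np.bincount(ids)
    with np.errstate(invalid='ignore'):
        means = sums / counts          # unused ids give 0/0 and are skipped below
    return {int(z): float(means[z]) for z in np.unique(ids)}

# Hypothetical 3x3 value raster and matching zone raster.
values = np.array([[1.0, 2.0, 3.0],
                   [4.0, np.nan, 6.0],
                   [7.0, 8.0, 9.0]])
zones = np.array([[1, 1, 2],
                  [1, 1, 2],
                  [3, 3, 3]])
result = zonal_mean(values, zones)
```

Zone 1 averages its three valid cells (the NaN is dropped), zone 2 averages 3 and 6, and zone 3 averages 7, 8 and 9.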

With rasterstats

Python

from rasterstats import zonal_stats

stats = zonal_stats(
    'districts.gpkg',
    'population.tif',
    stats=['mean', 'sum', 'min', 'max', 'count'],
    nodata=-9999,
)
# stats is a list of dicts, one per polygon

Join the results back to the polygon layer:

Python

import geopandas as gpd
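The lesson only shows the import, so the following is a sketch of how the join might look. It relies on the list of dicts being in the same order as the polygon rows. The helper `attach_stats`, the `pop_` prefix, and the sample values are hypothetical, and a plain pandas DataFrame stands in for the district layer so the example runs without any files.

```python
import pandas as pd

def attach_stats(zones, stats, prefix='pop_'):
    """Return a copy of `zones` with one new column per statistic.

    `stats` is the list of dicts from zonal_stats, in the same order as the
    rows of `zones`, so rows are matched by position.
    """
    stats_df = pd.DataFrame(stats, index=zones.index).add_prefix(prefix)
    return zones.join(stats_df)

# Hypothetical stand-ins for the district layer and the zonal_stats output.
districts = pd.DataFrame({'name': ['North', 'South']})
stats = [{'mean': 12.5, 'sum': 250.0}, {'mean': 8.0, 'sum': 96.0}]
joined = attach_stats(districts, stats)
```

With real data, `districts` would typically come from `gpd.read_file('districts.gpkg')`; a GeoDataFrame supports the same `join` call, so the geometry column should carry through to the result.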

In QGIS

Processing Toolbox → Raster analysis → Zonal statistics. Older QGIS releases exposed the same tool as the Zonal Statistics plugin.

In PostGIS raster

SQL

-- ST_Union merges the clipped tiles so each district yields one row
SELECT d.id, d.name, (ST_SummaryStats(ST_Union(ST_Clip(r.rast, d.geom)))).*
FROM districts d
JOIN rasters r ON ST_Intersects(r.rast, d.geom)
GROUP BY d.id, d.name;

In Google Earth Engine

JavaScript

var stats = image.reduceRegions({
  collection: districts,
  reducer: ee.Reducer.mean(),
  scale: 30
});

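GEE runs the reduction on Google's servers, so the same call scales from a single district to continental extents without local tiling.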

Coverage (fractional zonal stats)

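By default, most tools assign a cell to a zone only when the cell centre falls inside the polygon, which ignores partial overlaps. For small zones and coarse rasters, this matters. A quick workaround in rasterstats is to include every cell the polygon touches, each at full weight:

Python

stats = zonal_stats('zones.gpkg', 'population.tif',
                    stats='mean',
                    all_touched=True)

Note that all_touched=True is not area weighting: it keeps small zones from coming back empty but counts edge cells as if they were fully inside. For true fractional / area-weighted zonal statistics, which weight each cell by the fraction inside the zone, use exactextract.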
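To see how much the weighting can matter, the sketch below compares a coverage-weighted mean with a plain mean over every touched cell. The `coverage` array is hypothetical; in practice a tool such as exactextract would derive those fractions from the polygon geometry.

```python
import numpy as np

def weighted_zonal_mean(values, coverage):
    """Mean of `values` weighted by the fraction of each cell inside the zone."""
    valid = ~np.isnan(values) & (coverage > 0)
    return float(np.sum(values[valid] * coverage[valid]) / np.sum(coverage[valid]))

# Hypothetical 2x2 window: one cell fully inside the zone, two partly inside,
# one entirely outside.
values = np.array([[100.0, 200.0],
                   [300.0, 400.0]])
coverage = np.array([[1.0, 0.5],
                     [0.25, 0.0]])

weighted = weighted_zonal_mean(values, coverage)   # area-weighted mean
touched = float(values[coverage > 0].mean())       # every touched cell at full weight
```

Here the weighted mean is about 157, while treating every touched cell as fully inside gives 200, because the partly covered high-value cells are over-counted.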

Majority / diversity

For categorical rasters (land cover):

Majority — most common class per zone.
Diversity — count of unique classes.
Shannon / Simpson indices — diversity metrics from ecology.

Python

stats = zonal_stats('zones.gpkg', 'landcover.tif',
                    categorical=True, stats=['count'])
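The per-class pixel counts returned for each zone are enough to derive the metrics listed above. The sketch below is one way to do it in plain Python. The function name is hypothetical, the Simpson value is the 1 − Σp² form, and any non-class key such as `'count'` is assumed to have been removed from the dict first.

```python
import math

def class_metrics(class_counts):
    """Summarise one zone's {class_code: pixel_count} dict."""
    total = sum(class_counts.values())
    shares = [n / total for n in class_counts.values() if n > 0]
    return {
        'majority': max(class_counts, key=class_counts.get),  # most common class
        'diversity': len(shares),                             # number of classes present
        'shannon': -sum(p * math.log(p) for p in shares),     # Shannon index H'
        'simpson': 1.0 - sum(p * p for p in shares),          # 1 - sum of squared shares
    }

# Hypothetical counts for one zone: 50 px of class 1, 25 px each of classes 2 and 3.
metrics = class_metrics({1: 50, 2: 25, 3: 25})
```

For this zone the majority class is 1, three classes are present, and the Simpson value is 0.625.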

Performance

Zonal stats on millions of zones and continent-scale rasters is slow naively. Tips:

Reproject zones to raster CRS (not vice versa — reprojecting polygons is much cheaper).
Pre-filter zones by bounding box intersection.
Tile the raster and process in parallel.
Earth Engine / Planetary Computer for large jobs.
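The bounding-box pre-filter from the list above can be sketched in a few lines of plain Python. The function names and coordinates are illustrative; real bounds would come from the zone geometries and the raster's extent.

```python
def boxes_intersect(a, b):
    """True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def prefilter_zones(zone_bounds, raster_bounds):
    """Indices of zones whose bounding box intersects the raster extent."""
    return [i for i, box in enumerate(zone_bounds)
            if boxes_intersect(box, raster_bounds)]

# Hypothetical extents in the raster's CRS.
raster_bounds = (0.0, 0.0, 100.0, 100.0)
zone_bounds = [
    (10.0, 10.0, 20.0, 20.0),      # inside the raster
    (90.0, 90.0, 120.0, 120.0),    # overlaps the corner
    (200.0, 200.0, 210.0, 210.0),  # entirely outside
]
keep = prefilter_zones(zone_bounds, raster_bounds)
```

Only the zones in `keep` need to be passed on to the more expensive rasterisation step. The same test applied per raster tile tells you which zones each worker has to handle.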

Self-check exercises

1. When would you pick circular over square focal windows?

When the phenomenon has no directional bias. A square window reaches about 41% farther along its diagonals than along its axes (a factor of √2), so corner cells introduce an artificial bias at the 45° diagonals, whereas a circular window treats every direction equally. That matters for distance-based questions such as a search radius around wildlife habitat. Square is fine for general smoothing, and in practice most tools default to it for simplicity.

2. Your zonal mean elevation per country gives nonsense in very small countries. Why?

If the raster resolution (say 1 km) is coarse relative to the zone (say a small island of 10 km²), only a handful of cell centres fall inside the polygon, and partially covered cells are either dropped or counted whole. Use fractional/area-weighted zonal stats (exactextract), or resample the raster to a finer resolution that captures the zone. Setting all_touched=True at least keeps the zone from coming back empty, but it counts edge cells at full weight.

3. Focal mean smooths a DEM. Why might you not want to smooth before slope calculation?

Smoothing reduces local variation — peaks shrink and valleys fill. Slope calculations made on smoothed DEMs underestimate real slopes. For hydrologic or erosion modelling, keep the original DEM for slope and only smooth for presentation purposes.
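A deliberately exaggerated sketch of that effect: a single 90 m summit cell on flat ground, smoothed with a 3×3 focal mean, then compared on steepest slope. The `max_slope` helper and the toy DEM are assumptions for illustration. Real terrain loses less, and planar hillsides are left unchanged by a mean filter.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def max_slope(dem, cell_size=1.0):
    """Steepest gradient magnitude (rise over run) anywhere in the grid."""
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    return float(np.hypot(dz_dx, dz_dy).max())

# Hypothetical DEM: one 90 m summit cell in the middle of a flat 7x7 grid.
dem = np.zeros((7, 7))
dem[3, 3] = 90.0

# 3x3 focal mean spreads the summit over nine cells.
smoothed = uniform_filter(dem, size=3, mode='nearest')

raw_slope = max_slope(dem)          # slope next to the original summit
smooth_slope = max_slope(smoothed)  # slope at the edge of the flattened summit
```

The summit drops from 90 m to 10 m and the steepest slope falls from 45 to roughly 7 (rise over run), which is the underestimation described above.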

Summary

Focal ops compute per-cell statistics from a neighbourhood window.
Zonal ops summarise raster values per polygon or categorical zone.
Both scale from hectares to continents with the right tools.
Edge effects, alignment, and fractional cells are the real-world wrinkles.
