10.5 Lab — End-to-End Raster Analysis

Combine reclassification, focal stats, zonal stats, and cost surfaces into a single realistic workflow.

Key takeaways

  • Real raster analyses chain the operations you've learned into multi-step pipelines.
  • Each step is testable; save intermediates and visualise them to catch bugs early.
  • The same pattern applies to conservation planning, utility corridors, agriculture, and disaster response.

Introduction

This lab replicates a realistic conservation-planning workflow. You'll identify the least-cost wildlife corridor connecting two protected areas, using a chain of raster operations.

Scenario: Two national parks are separated by a human-modified landscape. Your task is to propose a 1 km-wide corridor that balances ecological value, avoids infrastructure, and minimises human conflict.

Prerequisites

  • QGIS and Python with rasterio, numpy, scipy, scikit-image, and rasterstats installed.
  • Sample data: DEM, land cover raster, roads vector, existing protected areas vector. Use the bundled sample, any national park dataset, or synthesise with gdal_rasterize on OSM extracts.
  • About 1 hour.

Step 1 — Prepare aligned rasters

Shell
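# Reproject and align everything to a common grid.
# Replace <xmin ymin xmax ymax> with one shared extent (in EPSG:3857) in all three commands;
# without a common -te the outputs can differ in shape and origin.
gdalwarp -t_srs EPSG:3857 -tr 30 30 -te <xmin ymin xmax ymax> -r bilinear dem.tif dem_aligned.tif
gdalwarp -t_srs EPSG:3857 -tr 30 30 -te <xmin ymin xmax ymax> -r near landcover.tif lc_aligned.tif
# roads.gpkg must already be in EPSG:3857 for the extent and resolution to match
gdal_rasterize -tr 30 30 -te <xmin ymin xmax ymax> -burn 1 roads.gpkg roads.tif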

Alignment check:

Python
import rasterio

for fn in ['dem_aligned.tif', 'lc_aligned.tif', 'roads.tif']:
    with rasterio.open(fn) as r:
        print(fn, r.shape, r.transform, r.crs)

All three should print identical shapes, transforms, and coordinate reference systems.
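Comparing printed output by eye is easy to get wrong, so one option is to turn the check into a comparison that can fail loudly. The sketch below is only an illustration: `find_misaligned` is a hypothetical helper name, and it works on plain `(shape, transform, crs)` tuples so it can be tried without any files. With rasterio you could fill the dictionary from `r.shape`, `tuple(r.transform)[:6]`, and `str(r.crs)`. The metadata values are invented for the example.

```python
def find_misaligned(grids):
    """Return names whose (shape, transform, crs) differ from the first entry.

    `grids` maps a file name to a (shape, transform, crs) tuple.
    """
    names = list(grids)
    reference = grids[names[0]]
    return [name for name in names[1:] if grids[name] != reference]


# Hypothetical metadata for illustration: a 30 m grid in EPSG:3857.
transform = (30.0, 0.0, 500000.0, 0.0, -30.0, 4200000.0)
grids = {
    "dem_aligned.tif": ((1000, 2000), transform, "EPSG:3857"),
    "lc_aligned.tif": ((1000, 2000), transform, "EPSG:3857"),
    # One row short: a typical symptom of rasterising with a different extent.
    "roads.tif": ((999, 2000), transform, "EPSG:3857"),
}
misaligned = find_misaligned(grids)
print(misaligned)
```

An empty list means every grid matches the first one; anything else names the files to re-warp or re-rasterise.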

Step 2 — Reclassify land cover to habitat suitability

Assume land cover codes: 1 urban, 2 cropland, 3 grassland, 4 forest, 5 wetland, 6 water.

Python
import numpy as np

with rasterio.open('lc_aligned.tif') as src:
    lc = src.read(1)
    profile = src.profile  # raster metadata, useful when writing outputs
# Cells with any other code (including nodata) keep a score of 0
habitat = np.zeros_like(lc, dtype='uint8')
habitat[lc == 1] = 1    # urban: low habitat
habitat[lc == 2] = 3    # cropland
habitat[lc == 3] = 7    # grassland
habitat[lc == 4] = 10   # forest: high habitat
habitat[lc == 5] = 9    # wetland
habitat[lc == 6] = 1    # water: impassable for terrestrial species
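The same reclassification can also be written as a lookup table, which keeps the whole mapping in one place. This is a minimal sketch, assuming the land cover codes are non-negative integers from 0 to 6 (with 0 meaning unclassified); the small `lc` array is toy data standing in for `lc_aligned.tif`.

```python
import numpy as np

# Index = land cover code, value = habitat score (same mapping as the assignments above).
# Index 0 covers unclassified cells, which keep a score of 0.
lut = np.array([0, 1, 3, 7, 10, 9, 1], dtype="uint8")

# Toy land cover grid standing in for lc_aligned.tif
lc = np.array([[1, 2, 3],
               [4, 5, 6],
               [0, 4, 4]], dtype="uint8")

# Codes above 6 would raise IndexError here, which doubles as a data check.
habitat = lut[lc]
print(habitat)
```

Changing a score then means editing one array entry, which is convenient when you later experiment with different suitability values.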

Step 3 — Compute slope from DEM

Python
import numpy as np
import rasterio
from scipy.ndimage import sobel

with rasterio.open('dem_aligned.tif') as src:
    dem = src.read(1).astype('float32')
    cell_size = src.transform.a  # cell width in metres
# The Sobel kernel returns 8x the per-cell gradient, hence the division by 8 * cell_size
dzdx = sobel(dem, axis=1) / (8 * cell_size)
dzdy = sobel(dem, axis=0) / (8 * cell_size)
slope_rad = np.arctan(np.sqrt(dzdx**2 + dzdy**2))
slope_deg = np.degrees(slope_rad)
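A quick way to test a slope calculation is to run it on a synthetic surface whose slope is known in advance. The sketch below uses `np.gradient`, a simpler central-difference estimator than the Sobel kernel used above, purely as an independent cross-check; the tilted plane and its dimensions are invented for the example.

```python
import numpy as np

cell_size = 30.0                      # metres, matching the -tr 30 30 grid
rows, cols = np.indices((50, 50))
dem = (3.0 * cols).astype("float32")  # rises 3 m per 30 m cell, eastwards

# np.gradient returns derivatives along axis 0 (rows) then axis 1 (columns)
dzdy, dzdx = np.gradient(dem, cell_size)
slope_deg = np.degrees(np.arctan(np.sqrt(dzdx**2 + dzdy**2)))

expected = np.degrees(np.arctan(3.0 / 30.0))  # about 5.71 degrees
print(slope_deg.min(), slope_deg.max(), expected)
```

On a plane every cell should report the same slope. If your own slope raster disagrees with a check like this by a constant factor, the cell size or kernel divisor is the first thing to inspect.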

Step 4 — Build a cost surface

Higher cost = harder for wildlife. Combine habitat (inverted), slope, and proximity to roads.

Python
from scipy.ndimage import distance_transform_edt

# Distance to roads (metres)
with rasterio.open('roads.tif') as src:
    roads = src.read(1)
dist_roads = distance_transform_edt(roads == 0) * cell_size

# Normalise components to 0-1
habitat_inv = 1 - (habitat / 10.0)                # 0 = great, 1 = bad
slope_norm = np.clip(slope_deg / 45.0, 0, 1)
road_proximity = np.clip(1 - dist_roads / 2000.0, 0, 1)

# Weighted sum; the baseline of 1.0 keeps every cell's cost positive
cost = 1.0 + 3.0 * habitat_inv + 2.0 * slope_norm + 4.0 * road_proximity
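To see how the formula behaves, it can help to evaluate it for a couple of individual cells. The sketch below wraps the same normalisation and default weights in a function; `build_cost` and its parameter names are hypothetical, chosen here for illustration, and the two cells are made-up inputs.

```python
import numpy as np


def build_cost(habitat, slope_deg, dist_roads,
               w_habitat=3.0, w_slope=2.0, w_roads=4.0):
    """Weighted cost surface using the same normalisation as the step above."""
    habitat_inv = 1 - (np.asarray(habitat) / 10.0)
    slope_norm = np.clip(np.asarray(slope_deg) / 45.0, 0, 1)
    road_proximity = np.clip(1 - np.asarray(dist_roads) / 2000.0, 0, 1)
    return 1.0 + w_habitat * habitat_inv + w_slope * slope_norm + w_roads * road_proximity


# Cell 1: forest (score 10), 9 degree slope, 500 m from a road
# Cell 2: urban (score 1), flat, directly on a road
cost = build_cost(habitat=[10, 1], slope_deg=[9.0, 0.0], dist_roads=[500.0, 0.0])
print(cost)
```

The forest cell works out to about 4.4 (1 + 0 + 2 × 0.2 + 4 × 0.75) and the urban road cell to about 7.7 (1 + 3 × 0.9 + 0 + 4 × 1), so even good habitat is penalised noticeably when it lies close to a road.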

Step 5 — Compute least-cost path between park centroids

Python
from skimage.graph import route_through_array

park_a = (500, 200)   # (row, col) of Park A centroid in the raster
park_b = (300, 1500)  # (row, col) of Park B centroid in the raster
# fully_connected=True allows diagonal moves between cells
route, total_cost = route_through_array(
    cost, start=park_a, end=park_b, fully_connected=True
)
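The start and end points above are given as (row, col) indices, while park centroids usually come as map coordinates. A minimal sketch of the conversion, assuming a north-up grid with square cells: `xy_to_rowcol` is a hypothetical helper, and the origin and centroid coordinates are invented for illustration. With a dataset open in rasterio, `src.index(x, y)` is the library's way to get the same answer.

```python
import math


def xy_to_rowcol(x, y, x_origin, y_origin, cell_size):
    """Convert map coordinates to (row, col) on a north-up grid.

    (x_origin, y_origin) is the top-left corner of the raster, i.e.
    transform.c and transform.f in rasterio terms.
    """
    col = math.floor((x - x_origin) / cell_size)
    row = math.floor((y_origin - y) / cell_size)
    return row, col


# Hypothetical grid origin and centroid coordinates, for illustration only
x_origin, y_origin, cell_size = 500000.0, 4200000.0, 30.0
park_a = xy_to_rowcol(506015.0, 4184985.0, x_origin, y_origin, cell_size)
print(park_a)
```

Rows increase southwards, which is why the y term is subtracted from the origin rather than the other way round.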

Rasterise the route:

Python
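path_raster = np.zeros_like(cost, dtype='uint8')
for r, c in route:
    path_raster[r, c] = 1
profile.update(dtype='uint8', count=1, nodata=0)  # reuse the Step 2 metadata
with rasterio.open('path.tif', 'w', **profile) as dst:
    dst.write(path_raster, 1)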

Step 6 — Buffer the path into a 1 km corridor

Save the path as path.tif, load it in QGIS, convert it to a line, and buffer that line by 500 m on each side to form a 1 km-wide corridor. Or in Python:

Python
from scipy.ndimage import distance_transform_edt

# Distance from every cell to the nearest path cell, in metres
dist_path = distance_transform_edt(path_raster == 0) * cell_size
corridor = (dist_path <= 500).astype('uint8')
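Once the corridor mask exists, a rough area and width check can confirm that the buffer did what was intended. The sketch below uses a toy mask in place of the real `corridor` array: a straight band 34 cells wide, with its length taken along the columns. For a real, winding corridor you would estimate the length from the route instead.

```python
import numpy as np

cell_size = 30.0
# Toy corridor mask standing in for the `corridor` array: a band 34 cells wide
corridor = np.zeros((100, 200), dtype="uint8")
corridor[33:67, :] = 1

area_km2 = corridor.sum() * cell_size**2 / 1e6
# Mean width = area / length, taking the band's length along the columns
length_km = corridor.shape[1] * cell_size / 1000
mean_width_km = area_km2 / length_km
print(area_km2, mean_width_km)
```

A mean width far from 1 km suggests the buffer distance or the cell size is not what you expect.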

Step 7 — Zonal stats — what's inside the corridor?

Python
from rasterstats import zonal_stats

A high mean habitat score and a forest majority in the land cover indicate a good corridor.
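Because the corridor is already an in-memory mask on the same grid as the other rasters, the two summary numbers can also be computed directly with numpy. This is a sketch of that alternative, not the rasterstats workflow; the three small arrays are toy data standing in for `lc`, `habitat`, and `corridor` from the earlier steps.

```python
import numpy as np

# Toy arrays standing in for lc, habitat, and corridor from the earlier steps
lc = np.array([[4, 4, 2],
               [4, 3, 1],
               [6, 4, 4]], dtype="uint8")
habitat = np.array([[10, 10, 3],
                    [10, 7, 1],
                    [1, 10, 10]], dtype="uint8")
corridor = np.array([[1, 1, 0],
                     [1, 1, 0],
                     [0, 1, 1]], dtype="uint8")

inside = corridor == 1
mean_habitat = habitat[inside].mean()
# Majority land cover: the most frequent class code inside the corridor
counts = np.bincount(lc[inside], minlength=7)
majority_class = int(counts.argmax())
print(mean_habitat, majority_class)
```

Here the six corridor cells average a habitat score of 9.5 and the majority class is 4 (forest), which is the pattern a good corridor should show.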

Step 8 — Map it

Load DEM, land cover, the corridor, and the two parks into QGIS. Produce a print layout (reuse Lab 5.5 template). Add legend, scale bar, data sources.

Step 9 — Reflection

Answer in your notes:

  • Which criterion dominated the cost surface? How sensitive is the corridor to the weights?
  • What would happen if you doubled the slope weight? Halved it?
  • Where does the path take unexpected detours, and why?
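One way to approach the first question is to measure how much each weighted component contributes to the cost surface on average. The sketch below uses synthetic 0–1 arrays in place of `habitat_inv`, `slope_norm`, and `road_proximity`, so the numbers mean nothing for a real landscape; only the pattern of the calculation carries over.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 0-1 components standing in for habitat_inv, slope_norm, road_proximity
components = {
    "habitat": rng.random((50, 50)),
    "slope": rng.random((50, 50)) * 0.3,   # mostly gentle terrain
    "roads": rng.random((50, 50)) * 0.5,
}
weights = {"habitat": 3.0, "slope": 2.0, "roads": 4.0}

# Mean weighted contribution of each criterion to the cost surface
contribution = {k: float((weights[k] * components[k]).mean()) for k in components}
total = sum(contribution.values())
share = {k: contribution[k] / total for k in contribution}
dominant = max(share, key=share.get)
print(share, dominant)
```

Note that the criterion with the largest weight is not automatically the dominant one: a component whose values are mostly near zero contributes little however it is weighted.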

Troubleshooting

  • Path hugs one row/column — enable fully_connected=True; normalise cost components before combining.
  • Path goes through water — ensure water has very high cost or nodata; don't use exactly inf (some algorithms don't handle it).
  • Raster alignment errors — verify shape and transform match across all inputs; reproject if not.
  • Corridor is too narrow / wide — adjust the buffer distance.
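Several of these problems can be caught before routing by inspecting the cost surface itself. The following is a minimal sketch of such a pre-flight check; `check_cost_surface` is a hypothetical helper, and the two conditions it tests are examples of values that commonly cause surprises, not an exhaustive list.

```python
import numpy as np


def check_cost_surface(cost):
    """Return a list of problems that commonly break least-cost routing."""
    problems = []
    if not np.isfinite(cost).all():
        problems.append("non-finite values (nan or inf)")
    if (cost[np.isfinite(cost)] <= 0).any():
        problems.append("zero or negative costs")
    return problems


good = np.array([[1.0, 2.5], [4.0, 10.0]])
bad = np.array([[1.0, np.inf], [0.0, 10.0]])
print(check_cost_surface(good), check_cost_surface(bad))
```

An empty list is a reasonable precondition for running the path step; anything else is worth fixing or at least understanding first.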

Self-check exercises

1. Why normalise each cost component before summing?

Without normalisation, a component with a larger numeric range dominates the total: slope in degrees spans 0–45, while habitat scores span only 0–10. Scaling each to 0–1 (or comparable units) and applying explicit weights makes the trade-offs clear and controllable. Weights should sum to a meaningful total (like 10 or 1) for interpretability; in this lab the baseline of 1 plus weights of 3, 2, and 4 caps the cost at 10.
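The effect is easy to demonstrate numerically. The sketch below adds a 0–1 habitat term to a slope term, once in raw degrees and once scaled to 0–1, and measures how strongly the habitat term still correlates with each total. The inputs are random numbers generated for the example, not real terrain.

```python
import numpy as np

rng = np.random.default_rng(1)
habitat_inv = rng.random(10_000)            # already 0-1
slope_deg = rng.random(10_000) * 45.0       # raw degrees, 0-45

raw_total = habitat_inv + slope_deg                 # unnormalised sum
norm_total = habitat_inv + slope_deg / 45.0         # both scaled to 0-1

# Correlation of the habitat term with each total: near 0 means habitat is drowned out
r_raw = np.corrcoef(habitat_inv, raw_total)[0, 1]
r_norm = np.corrcoef(habitat_inv, norm_total)[0, 1]
print(round(r_raw, 2), round(r_norm, 2))
```

In the unnormalised sum the habitat term has almost no influence on the total, while after scaling the two terms contribute equally.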

2. Your corridor crosses a major highway. How would you force it to find an alternative?

Assign the highway cells a very high cost (say 100× the baseline). The least-cost algorithm will detour around them unless no alternative exists. For absolute barriers, set nodata on those cells — but then ensure there's a feasible path, or the algorithm fails. Verifying the cost surface visually before running the path is key.
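As a rough illustration of the penalty approach, the sketch below overwrites highway cells with 100 times the baseline cost while leaving one crossing open. The 5 × 5 grid, the highway mask, and the open cell (imagined here as an underpass) are all invented for the example.

```python
import numpy as np

cost = np.full((5, 5), 2.0)            # toy cost surface
highway = np.zeros((5, 5), dtype=bool)
highway[:, 2] = True                   # a highway running north-south
highway[4, 2] = False                  # hypothetical underpass left open

baseline = 1.0                         # the constant term in the cost formula
penalised = np.where(highway, 100.0 * baseline, cost)
print(penalised)
```

Viewing `penalised` as an image before routing makes it obvious whether the barrier is continuous and where the remaining crossings are.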

3. You want a sensitivity analysis across different weighting schemes. What workflow?

Parameterise the weights (w_habitat, w_slope, w_roads) and run the pipeline in a loop, saving each resulting corridor with a descriptive name. Compare outcomes visually (overlay in QGIS) and quantitatively (total cost, length, mean habitat). Monte Carlo sampling of weights with summary statistics is the rigorous approach; a grid search is usually enough for three or four weights.
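A small grid search along these lines might look like the sketch below. It only builds the cost surface for each weighting scheme and records its mean; the path, corridor, and statistics steps would slot into the loop where the comment indicates. The component arrays are synthetic, and the weight values and naming pattern are arbitrary choices for illustration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
# Synthetic 0-1 components standing in for the Step 4 arrays
habitat_inv, slope_norm, road_proximity = (rng.random((20, 20)) for _ in range(3))

results = {}
for w_habitat, w_slope, w_roads in itertools.product([1.0, 3.0], [1.0, 2.0], [2.0, 4.0]):
    cost = 1.0 + w_habitat * habitat_inv + w_slope * slope_norm + w_roads * road_proximity
    name = f"corridor_h{w_habitat:g}_s{w_slope:g}_r{w_roads:g}"
    # In the full pipeline, Steps 5-7 would run here and the corridor would be saved under `name`
    results[name] = float(cost.mean())

for name, mean_cost in sorted(results.items()):
    print(name, round(mean_cost, 2))
```

Encoding the weights in the output name keeps each corridor traceable to the scheme that produced it when the layers are overlaid in QGIS.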

Summary

  • Chained raster operations model real decision problems: reclass → slope → cost → path → corridor → stats.
  • Alignment up front prevents dozens of bugs downstream.
  • Sensitivity analysis on weights is the difference between a defensible recommendation and hand-waving.

Further reading

  • Beier, P. & Majka, D. — Conservation Corridor Design.
  • Adriaensen et al. — The application of least-cost modeling as a functional landscape model.
  • Earth Engine documentation on cost-distance.
  • SCALGO Live — practical hydrological analysis at scale.