15.1 Spatial Interpolation — Overview
Estimating values where you have no observations — the family of methods and when to use each.
Key takeaways
- Interpolation estimates values at unsampled locations from nearby sampled points.
- Methods split into deterministic (IDW, spline) and geostatistical (kriging).
- Choice depends on sample density, spatial autocorrelation, and whether you need uncertainty.
Introduction
Real-world measurements are almost always sparse: a dozen rain gauges may cover a whole county, or a thousand sampled trees may stand in for an entire forest. Interpolation estimates the value of a variable at unobserved locations from nearby observations. This lesson gives an overview of the method families; 15.2 and 15.3 dig into the specific algorithms.
The problem setup
Given N sampled points (x_i, y_i, z_i) and a target location (x, y), estimate z(x, y). Under Tobler's First Law, nearby observations should influence the estimate more than far ones.
Deterministic vs geostatistical
Deterministic methods
Apply a fixed formula driven by the geometry of the sample points (typically their distances to the target location). They produce a single prediction per location, with no accompanying uncertainty estimate.
Examples:
- Inverse Distance Weighting (IDW).
- Natural neighbour.
- Spline / thin-plate spline.
- Nearest neighbour.
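To make the "fixed formula" idea concrete, here is a minimal sketch of IDW in plain Python. The function name `idw`, the default exponent of 2, and the three sample points are illustrative assumptions, not taken from any particular library.

```python
import math

def idw(samples, x, y, power=2.0):
    """Inverse Distance Weighting estimate at (x, y).

    samples: iterable of (x_i, y_i, z_i) tuples.
    power:   distance exponent; 2 is a common default (assumed here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, zi in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return zi  # target coincides with a sample: honour it exactly
        w = 1.0 / d ** power
        weighted_sum += w * zi
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical samples: three points at the corners of a unit triangle.
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (0.0, 1.0, 30.0)]
estimate = idw(samples, 0.5, 0.5)
print(estimate)
```

The target (0.5, 0.5) is equidistant from all three samples, so the weights are equal and the estimate is simply their mean. Moving the target closer to any one sample pulls the estimate toward that sample's value.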
Geostatistical methods
Use a statistical model of spatial autocorrelation (the variogram). Produce predictions and uncertainty estimates.
Examples:
- Ordinary Kriging.
- Simple Kriging.
- Universal Kriging.
- Co-kriging.
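The variogram that all of these methods rely on is estimated from the data before any prediction happens. A minimal sketch of the empirical semivariogram follows, assuming equal-width lag bins; the function name, bin layout, and transect data are hypothetical choices for illustration.

```python
import math
from itertools import combinations

def empirical_semivariogram(samples, lag_width, n_lags):
    """Return (mean_distance, semivariance, pair_count) for each non-empty lag bin.

    Semivariance in a bin is half the mean squared difference in z
    over all point pairs whose separation falls in that bin.
    """
    sq_diff = [0.0] * n_lags   # sum of squared z differences per bin
    dist = [0.0] * n_lags      # sum of pair distances per bin
    count = [0] * n_lags       # number of pairs per bin
    for (x1, y1, z1), (x2, y2, z2) in combinations(samples, 2):
        h = math.hypot(x1 - x2, y1 - y2)
        k = int(h // lag_width)
        if k < n_lags:
            sq_diff[k] += (z1 - z2) ** 2
            dist[k] += h
            count[k] += 1
    return [
        (dist[k] / count[k], sq_diff[k] / (2 * count[k]), count[k])
        for k in range(n_lags)
        if count[k] > 0
    ]

# Hypothetical transect: z rises steadily, so semivariance grows with lag.
samples = [(float(i), 0.0, 2.0 * i) for i in range(6)]
for h, gamma, n in empirical_semivariogram(samples, lag_width=1.0, n_lags=4):
    print(f"lag ~{h:.1f}: semivariance {gamma:.1f} from {n} pairs")
```

A kriging package would then fit a model curve (spherical, exponential, and so on) to these binned values; that fitting step is not shown here.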
Choosing a method
| Situation | Suggested method |
|---|---|
| Sparse points, rough estimate | IDW |
| Smooth physical phenomenon (elevation) | Spline |
| Need uncertainty quantification | Kriging |
| Multiple related variables | Co-kriging |
| Categorical / class data | Nearest neighbour, indicator kriging |
| Dense grid (e.g., satellite) | Bilinear / bicubic resampling (not "interpolation" in the sparse sense) |
Validation
Interpolation accuracy depends on sample density and the true spatial structure. Validate by:
- Hold-out — remove some points, predict them from the rest.
- Leave-one-out cross-validation — remove each point in turn and predict it from the others.
- RMSE / MAE — numerical error metrics.
- Visual inspection — does the surface look plausible?
Every serious interpolation analysis reports cross-validated errors. "The surface looked right" is not enough.
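The cross-validation and error metrics listed above can be sketched in a few lines. This example uses a simple IDW predictor as a stand-in; the function names and the five sample points are assumptions made for illustration, and any predictor with the same call signature could be swapped in.

```python
import math

def idw(samples, x, y, power=2.0):
    """Simple IDW predictor used here only as a stand-in interpolator."""
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, zi in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return zi
        w = 1.0 / d ** power
        weighted_sum += w * zi
        weight_total += w
    return weighted_sum / weight_total

def leave_one_out(samples, predict):
    """Predict each sample from all the others; return (rmse, mae)."""
    errors = []
    for i, (x, y, z) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]   # every point except the i-th
        errors.append(predict(rest, x, y) - z)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return rmse, mae

# Hypothetical measurements (x, y, z).
samples = [(0, 0, 10.0), (1, 0, 12.0), (0, 1, 11.0), (1, 1, 13.0), (0.5, 0.5, 11.5)]
rmse, mae = leave_one_out(samples, idw)
print(f"RMSE = {rmse:.3f}, MAE = {mae:.3f}")
```

RMSE penalises large individual errors more heavily than MAE, so reporting both gives a rough sense of whether a few bad predictions dominate.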
Barriers and anisotropy
- Barriers — features (roads, ridges) that break spatial autocorrelation. Some methods can incorporate them.
- Anisotropy — spatial autocorrelation that's stronger in one direction than another (common with wind-driven phenomena).
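Kriging can model anisotropy directly through a directional variogram, whereas plain IDW assumes isotropy unless the coordinates are rescaled first. Barriers are not handled by either method in its standard form; they require a barrier-aware variant or a modified distance calculation.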
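One common way to model anisotropy (often called geometric anisotropy) is to rotate and stretch coordinate offsets before computing distances, so that separation across the direction of strongest correlation counts for more. The sketch below assumes the angle is measured counter-clockwise from the +x axis; tools differ on this convention, so treat the parameter definitions as hypothetical.

```python
import math

def anisotropic_distance(dx, dy, angle_deg, ratio):
    """Effective distance under geometric anisotropy.

    angle_deg: direction of strongest correlation, counter-clockwise
               from +x (an assumed convention).
    ratio:     major range divided by minor range; 1 means isotropic.
    """
    a = math.radians(angle_deg)
    along = dx * math.cos(a) + dy * math.sin(a)     # offset along the major axis
    across = -dx * math.sin(a) + dy * math.cos(a)   # offset across the major axis
    return math.hypot(along, across * ratio)

# With correlation strongest east-west and a 3:1 ratio, one unit of
# north-south separation counts as three units of effective distance.
print(anisotropic_distance(1.0, 0.0, 0.0, 3.0))
print(anisotropic_distance(0.0, 1.0, 0.0, 3.0))
```

Feeding this effective distance into a distance-based method such as IDW, or into variogram estimation, makes points along the major axis more influential than points across it.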
Smoothing vs exact
- Exact interpolators pass through every sample point exactly (honouring measurements).
- Smoothing interpolators approximate — useful when samples are noisy.
Choose based on whether samples are precise measurements (survey elevations: use exact) or noisy observations (rain gauges: smoothing may be appropriate).
Output resolution
The pixel size of the output surface should reflect sample density:
- 100 samples over 100 km² (mean spacing about 1 km) → 1 km pixel is reasonable.
- 1 000 samples over 1 km² (mean spacing about 32 m) → 10 m pixel is reasonable.
Predicting at finer resolution than your data supports is false precision.
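A quick sanity check on pixel size is to compare it with the mean sample spacing, taken here as the square root of area divided by sample count. This assumes samples are spread roughly evenly, which is a simplification; the function name is hypothetical.

```python
import math

def mean_sample_spacing(area_m2, n_samples):
    """Average spacing if samples were spread evenly: sqrt(area / n)."""
    return math.sqrt(area_m2 / n_samples)

# The two cases from the list above, with areas converted to square metres.
cases = [("100 samples over 100 km2", 100.0e6, 100),
         ("1 000 samples over 1 km2", 1.0e6, 1000)]
for label, area_m2, n in cases:
    print(f"{label}: mean spacing ~ {mean_sample_spacing(area_m2, n):.0f} m")
```

A pixel much smaller than this spacing adds rows and columns to the raster without adding information.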
Common pitfalls
- Extrapolation beyond the sampled area — all methods become unreliable.
- Non-stationarity — if the phenomenon behaves differently in different regions, global interpolation misleads.
- Outliers — a single wrong sample distorts IDW dramatically; robust methods mitigate.
- Unit mismatch — don't mix temperature in °C and °F in the same interpolation.
Tools
- gdal_grid — inverse distance weighting, moving average, nearest neighbour, and linear (triangulation-based) interpolation; it does not implement kriging.
- QGIS Interpolation plugin.
- Python: scipy.interpolate, pykrige, gstat-python, verde.
- R: gstat, fields.
- ArcGIS Geostatistical Analyst — comprehensive commercial suite.
A worked example
Estimating rainfall across a county from 50 gauge stations:
from pykrige.ok import OrdinaryKriging
import numpy as np
Self-check exercises
1. Why does the choice between IDW and kriging matter for scientific reports?
Kriging provides uncertainty estimates (the kriging variance); IDW does not. For scientific or regulatory reports you often need to communicate not just "the value here is X" but also "we're roughly 68 % confident it's within [X − σ, X + σ]" (assuming approximately Gaussian errors). IDW gives you a surface but no honest way to quantify its reliability, which limits it to descriptive rather than inferential use.
2. You have 12 rainfall gauges over a 100 km² area. Is kriging appropriate?
Probably not reliably. Variogram fitting needs enough point pairs to estimate the spatial autocorrelation structure; 12 points give only 66 pairs, usually too few for a stable variogram once they are split across lag bins. Use IDW or a physically based interpolation (e.g., a PRISM-style lapse rate), or add more gauges. Kriging shines with 30+ well-distributed samples.
3. Your interpolated surface has suspiciously smooth minima that don't match observed data. What's happening?
Some interpolators (especially smoothing thin-plate splines and kriging with a nugget effect) pull the surface toward local means, so the output is smoother than reality and local extremes are flattened. For sharp features (peaks, valleys), exact interpolators (IDW with a small neighbourhood, kriging with no nugget) may honour the data better. Use cross-validation to confirm the surface is faithful.
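The pair count quoted in exercise 2 follows from N(N − 1)/2. A short sketch of the arithmetic is shown below; the choice of 8 lag bins is a hypothetical example, not a recommendation.

```python
import math

def n_pairs(n_points):
    """Number of distinct point pairs available for variogram estimation."""
    return math.comb(n_points, 2)

n_lags = 8  # hypothetical number of lag bins
for n in (12, 30, 50):
    pairs = n_pairs(n)
    print(f"{n} points: {pairs} pairs, about {pairs / n_lags:.0f} per lag bin")
```

With 12 points the pairs are spread very thinly across the bins, which is why the fitted variogram tends to be unstable. A commonly quoted rule of thumb asks for roughly 30 pairs per lag, which 12 points cannot supply.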
Summary
- Interpolation estimates values at unobserved locations from nearby samples.
- Deterministic (IDW, spline) vs geostatistical (kriging, co-kriging).
- Kriging quantifies uncertainty; deterministic methods don't.
- Cross-validate, don't extrapolate, and match resolution to sample density.
Further reading
- Isaaks, E. H. & Srivastava, R. M. — An Introduction to Applied Geostatistics.
- Cressie, N. — Statistics for Spatial Data.
- gstat-python documentation.
- Verde documentation (regular-grid interpolation in Python).