15.2 IDW, Nearest Neighbour, and Spline Interpolation
Three deterministic interpolators — fast, transparent, and widely used.
Key takeaways
- IDW weights neighbours by inverse distance; power and neighbourhood size tune behaviour.
- Nearest-neighbour gives every point the value of its closest sample — blocky but honest.
- Splines produce smooth surfaces through control points — good for continuous physical phenomena.
Introduction
Deterministic interpolators rely on fixed formulas rather than statistical models. They're fast, transparent, and often "good enough". This lesson covers three: IDW, nearest neighbour, and spline.
Inverse Distance Weighting (IDW)
The most popular deterministic interpolator:
$$\hat{z}(x) = \frac{\sum_i w_i z_i}{\sum_i w_i}, \qquad w_i = \frac{1}{d_i^p}$$
- d_i is the distance from the target to sample i.
- p is the power parameter (usually 1–3).
- The sum runs over a neighbourhood (the n nearest samples, or all samples within a radius).
Higher p → closer neighbours dominate (sharper surface). Lower p → more equal weighting (smoother).
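A small worked example shows the effect of p. Assume, purely for illustration, two samples: z_1 = 10 at distance d_1 = 1 and z_2 = 20 at distance d_2 = 3 (these numbers are invented, not taken from a dataset):
$$p = 1:\quad \hat{z} = \frac{10 \cdot 1 + 20 \cdot \tfrac{1}{3}}{1 + \tfrac{1}{3}} = 12.5 \qquad\qquad p = 2:\quad \hat{z} = \frac{10 \cdot 1 + 20 \cdot \tfrac{1}{9}}{1 + \tfrac{1}{9}} = 11$$
Raising p from 1 to 2 pulls the prediction from 12.5 towards the nearer sample's value of 10.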
from scipy.spatial import cKDTree
import numpy as np
Strengths
- Simple, fast, intuitive.
- No distribution assumptions.
- Predictions at samples ≈ sample values (exact or near-exact).
Weaknesses
- No uncertainty estimates.
- Bull's-eye artefacts around samples.
- Poor extrapolation.
- Struggles with anisotropy.
Good uses
- Quick visualisations.
- Dense, well-distributed samples.
- When speed matters.
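The IDW formula can be sketched in a few lines with a k-d tree for the neighbour search. This is a minimal illustration, not a reference implementation: the function name `idw_predict` and its parameters `k` and `power` are assumptions made for this example, and it uses the "n nearest" neighbourhood rather than a fixed radius.

```python
import numpy as np
from scipy.spatial import cKDTree


def idw_predict(xy_samples, z_samples, xy_query, k=8, power=2.0):
    """IDW prediction from the k nearest samples (illustrative helper)."""
    xy_samples = np.asarray(xy_samples, dtype=float)
    z_samples = np.asarray(z_samples, dtype=float)
    xy_query = np.atleast_2d(np.asarray(xy_query, dtype=float))

    k = min(k, len(xy_samples))
    tree = cKDTree(xy_samples)
    dist, idx = tree.query(xy_query, k=k)
    # query() drops the neighbour axis when k == 1; restore it
    dist = dist.reshape(len(xy_query), k)
    idx = idx.reshape(len(xy_query), k)

    z_pred = np.empty(len(xy_query))
    # A query that coincides with a sample takes that sample's value,
    # which avoids dividing by a zero distance
    exact = dist[:, 0] == 0.0
    z_pred[exact] = z_samples[idx[exact, 0]]

    # w_i = 1 / d_i^p, then the weighted mean of the neighbours
    w = 1.0 / dist[~exact] ** power
    z_pred[~exact] = (w * z_samples[idx[~exact]]).sum(axis=1) / w.sum(axis=1)
    return z_pred


# Two made-up samples; the query sits at distances 1 and 3 from them
xy = np.array([[0.0, 0.0], [4.0, 0.0]])
z = np.array([10.0, 20.0])
print(idw_predict(xy, z, [[1.0, 0.0]], k=2, power=1))  # → [12.5]
print(idw_predict(xy, z, [[1.0, 0.0]], k=2, power=2))  # → [11.]
```

Because the prediction is a weighted mean with positive weights, it always stays within the range of the neighbouring sample values.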
Nearest Neighbour
Simplest interpolator — assign to each target the value of its nearest sample.
- Voronoi tessellation: every cell within each Voronoi polygon gets the same value.
- Blocky output.
- Preserves categorical values (use for land-cover, classified rasters).
- When resampling rasters, nearest is the right choice for categorical data.
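A nearest-neighbour lookup for class codes might look like the sketch below. The sample coordinates, the class codes, and the helper name `nearest_class` are all hypothetical; the point is only that the output contains original class values and never an averaged one.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical land-cover samples: coordinates and integer class codes
xy_samples = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
classes = np.array([1, 2, 3])  # e.g. 1 = water, 2 = forest, 3 = urban


def nearest_class(xy_samples, classes, xy_query):
    """Return the class of the closest sample for each query point."""
    _, idx = cKDTree(xy_samples).query(xy_query, k=1)
    return classes[idx]  # plain indexing, so the label dtype is preserved


xy_query = np.array([[1.0, 1.0], [9.0, 2.0], [2.0, 8.0]])
print(nearest_class(xy_samples, classes, xy_query))  # → [1 2 3]
```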
Spline / Thin-plate spline (TPS)
Fits a smooth surface that passes through (or near) all sample points, minimising the total "bending energy" of the surface:
$$\text{minimise} \iint \left[ \left(\frac{\partial^2 f}{\partial x^2}\right)^2 + 2\left(\frac{\partial^2 f}{\partial x \partial y}\right)^2 + \left(\frac{\partial^2 f}{\partial y^2}\right)^2 \right] \, dx \, dy$$
Think of a thin sheet of metal bent to touch each sample point with minimal curvature.
Strengths
- Smooth, visually pleasing surfaces.
- Natural for continuous phenomena (elevation, temperature).
- Regular, well-defined behaviour.
Weaknesses
- Can "overshoot" — predicted values outside the sample range.
- Slower than IDW for large N.
- No uncertainty estimates.
- Struggles with noisy data.
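from scipy.interpolate import Rbf  # legacy SciPy API; newer releases recommend RBFInterpolator
# x, y, z: 1-D arrays of sample coordinates and values; xq, yq: query coordinates
rbfi = Rbf(x, y, z, function='thin_plate', smooth=0)  # smooth=0 gives an exact spline
z_pred = rbfi(xq, yq)
Regularised vs exact splines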
- Exact spline — passes through every sample.
- Regularised / smoothing spline — approximates; good for noisy measurements.
A smoothing parameter controls how closely the surface follows the samples. QGIS and ArcGIS each offer both variants.
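The difference between the two variants can be checked numerically. The sketch below uses SciPy's `RBFInterpolator` (available from SciPy 1.7) with its `smoothing` argument. The synthetic data, the noise level, and the value `smoothing=1.0` are arbitrary choices for illustration, not recommended settings.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Synthetic noisy measurements (made up for this example)
rng = np.random.default_rng(0)
xy = rng.uniform(0, 10, size=(40, 2))
z = np.sin(xy[:, 0] / 2) + 0.1 * xy[:, 1] + rng.normal(0, 0.2, size=len(xy))

exact = RBFInterpolator(xy, z, kernel="thin_plate_spline", smoothing=0.0)
smooth = RBFInterpolator(xy, z, kernel="thin_plate_spline", smoothing=1.0)

# Largest misfit at the sample locations: close to zero for the exact
# spline, clearly non-zero for the smoothing spline
resid_exact = np.abs(exact(xy) - z).max()
resid_smooth = np.abs(smooth(xy) - z).max()
print(resid_exact, resid_smooth)

# Compare the predicted range on a grid with the sample range to spot overshoot
gx, gy = np.meshgrid(np.linspace(0, 10, 50), np.linspace(0, 10, 50))
grid = np.column_stack([gx.ravel(), gy.ravel()])
z_grid = exact(grid)
print("samples:", z.min(), z.max(), "grid:", z_grid.min(), z_grid.max())

# Optional: clip to the observed range, only if the quantity is physically bounded
z_clipped = np.clip(z_grid, z.min(), z.max())
```

A sensible smoothing value depends on the noise in the data; cross-validation is one way to choose it.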
Natural neighbour
A refinement of Voronoi-based interpolation that weights nearby samples by the fraction of a new Voronoi polygon they "would have occupied" if the target were included. Smoother than nearest neighbour; honours sample values exactly; handles irregular distributions well.
Pros: smooth, exact, no parameters to tune. Cons: computationally heavier; limited software support.
When deterministic is right
- You need a surface quickly.
- You don't need uncertainty.
- Sample distribution is reasonable.
- The phenomenon has no known statistical structure you'd model.
For rigorous uncertainty and anisotropy, move to kriging (Module 15.3).
Common pitfalls
- IDW power too low / high — experiment; visualise.
- Neighbourhood size — too small, patchy; too large, overly smooth.
- Extrapolation — all deterministic methods fail outside the convex hull of samples.
- Noisy data — exact interpolators honour every measurement, including outliers. Smooth first.
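One way to guard against the extrapolation pitfall is to mask predictions that fall outside the convex hull of the samples. The sketch below does this with a Delaunay triangulation. The helper name `inside_hull` and the coordinates are invented for the example.

```python
import numpy as np
from scipy.spatial import Delaunay


def inside_hull(xy_samples, xy_query):
    """True where a query point lies inside the convex hull of the samples."""
    tri = Delaunay(xy_samples)
    # find_simplex returns -1 for points outside the triangulation
    return tri.find_simplex(xy_query) >= 0


xy_samples = np.array([[0.0, 0.0], [10.0, 0.0], [9.0, 8.0], [1.0, 10.0], [5.0, 4.0]])
xy_query = np.array([[5.0, 5.0], [12.0, 5.0]])

mask = inside_hull(xy_samples, xy_query)
print(mask)  # → [ True False]

# Blank out extrapolated predictions from any interpolator
z_pred = np.array([3.2, 7.9])  # placeholder predictions, not real output
z_pred[~mask] = np.nan
```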
Cross-validation
Leave-one-out cross-validation (LOOCV):
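import numpy as np
from sklearn.metrics import mean_squared_error
preds = []
for i in range(len(x)):
    # Hold out sample i; fit on the remaining samples
    rest_x = np.delete(x, i)
    rest_y = np.delete(y, i)
    rest_z = np.delete(z, i)
    # ... fit IDW / spline on rest, predict z_hat at (x[i], y[i])
    preds.append(z_hat)  # z_hat: the held-out prediction from the step above
# Root of the MSE; the squared=False argument is deprecated in recent scikit-learn
rmse = np.sqrt(mean_squared_error(z, preds))
Self-check exercises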
1. Why do IDW maps often show bull's-eye artefacts around sample locations?
Near a sample point, IDW's weight 1/d^p grows without bound, so the surface must pass very close to that sample's value. Away from the sample, the prediction relaxes towards a weighted average of the neighbours, so any sample that differs from its surroundings appears as an isolated peak or pit ringed by concentric contours: the "bull's-eye". A higher IDW power amplifies the effect.
2. Which interpolator should you use for categorical data (land cover classes)?
Nearest neighbour — it assigns each target location the class value of its nearest sample without averaging, which would invent meaningless intermediate class values. Using IDW or spline on categorical data (e.g., soil class codes) produces nonsense like "class 3.7" that no taxonomy recognises.
3. Your thin-plate spline produces predictions exceeding all observed values by 20 %. What's happening?
Overshooting — exact splines can produce oscillations between sample points, particularly in areas of sparse samples or steep gradients. Mitigations: use a smoothing (regularised) spline; clip outputs to the observed range if the physical quantity is bounded; or switch to IDW or kriging, which don't overshoot this way.
Summary
- IDW: fast, tunable, bull's-eye prone. No uncertainty.
- Nearest neighbour: blocky, honest; right choice for categorical data.
- Spline: smooth, natural for continuous phenomena; may overshoot.
- Cross-validate to compare methods on your data.
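To make the last point concrete, here is a sketch of a leave-one-out comparison between IDW and a thin-plate spline. The helper names (`idw`, `tps`, `loocv_rmse`), the synthetic surface, and the parameter defaults are all assumptions for illustration. Real data will rank the methods differently.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.spatial import cKDTree


def idw(xy_train, z_train, xy_query, power=2.0, k=8):
    """Compact IDW over the k nearest samples."""
    k = min(k, len(xy_train))
    dist, idx = cKDTree(xy_train).query(xy_query, k=k)
    dist = np.maximum(dist.reshape(len(xy_query), k), 1e-12)  # guard against zero distance
    idx = idx.reshape(len(xy_query), k)
    w = 1.0 / dist ** power
    return (w * z_train[idx]).sum(axis=1) / w.sum(axis=1)


def tps(xy_train, z_train, xy_query):
    """Exact thin-plate spline via RBFInterpolator."""
    return RBFInterpolator(xy_train, z_train, kernel="thin_plate_spline")(xy_query)


def loocv_rmse(predict, xy, z):
    """Hold out each sample in turn, predict it from the rest, return the RMSE."""
    preds = np.empty(len(z))
    for i in range(len(z)):
        keep = np.arange(len(z)) != i
        preds[i] = predict(xy[keep], z[keep], xy[i:i + 1])[0]
    return float(np.sqrt(np.mean((preds - z) ** 2)))


# Synthetic smooth surface sampled at random locations (made up for this example)
rng = np.random.default_rng(42)
xy = rng.uniform(0, 10, size=(60, 2))
z = np.sin(xy[:, 0] / 2) + 0.1 * xy[:, 1]

for name, fn in [("IDW", idw), ("TPS", tps)]:
    print(name, loocv_rmse(fn, xy, z))
```

The same loop can be reused to tune a single method, for example by comparing several IDW powers or neighbourhood sizes.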
Further reading
- Burrough & McDonnell — Principles of Geographical Information Systems (interpolation chapters).
- ArcGIS Geostatistical Analyst help — IDW and spline documentation.
- scipy.interpolate documentation.
- Hengl, T. — Practical Guide to Geostatistical Mapping.