15.3 Kriging Fundamentals
The geostatistical interpolator — how it models spatial autocorrelation and delivers uncertainty estimates.
Key takeaways
- Kriging models spatial autocorrelation with a variogram, then uses it to compute the best linear unbiased prediction at each location.
- The kriging variance quantifies uncertainty — a feature no deterministic method offers.
- Variogram choice and fit are the practical art of kriging.
Introduction
Kriging, named for South African mining engineer Danie Krige, is widely treated as the reference method for spatial interpolation. Unlike IDW, it uses the observed spatial autocorrelation structure in the data to compute weights, and it returns both a prediction and its uncertainty. This lesson covers the basics.
The variogram
The variogram (also semivariogram) describes how similarity between two measurements falls off with distance:
$$\gamma(h) = \frac{1}{2} E[(z(x) - z(x+h))^2]$$
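Plotting γ(h) against lag distance h reveals three parameters:
- Nugget — the value γ approaches as h → 0 (γ(0) itself is zero by definition), representing measurement error and microscale variability.
- Range — distance beyond which γ levels off; points further apart are effectively uncorrelated.
- Sill — the plateau γ reaches at the range; for a stationary field it approximates the overall variance.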
Fitted models: spherical, exponential, Gaussian, Matérn. Different shapes suit different phenomena.
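To make the three parameters concrete, here is a minimal NumPy sketch of three of the model families. It assumes the "practical range" convention (the factor 3 in the exponential and Gaussian forms, so the curve reaches about 95% of the sill at the range); other texts and libraries parameterise these differently, and the function and argument names are illustrative.

```python
import numpy as np

def spherical(h, nugget, sill, vrange):
    """Spherical model: reaches the sill exactly at h = vrange."""
    h = np.asarray(h, dtype=float)
    s = np.minimum(h / vrange, 1.0)
    gamma = nugget + (sill - nugget) * (1.5 * s - 0.5 * s**3)
    return np.where(h > 0, gamma, 0.0)  # gamma(0) = 0 by definition

def exponential(h, nugget, sill, vrange):
    """Exponential model: approaches the sill asymptotically."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / vrange))
    return np.where(h > 0, gamma, 0.0)

def gaussian(h, nugget, sill, vrange):
    """Gaussian model: parabolic near the origin, implying a very smooth field."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * (h / vrange) ** 2))
    return np.where(h > 0, gamma, 0.0)

# Compare the three shapes at a few lags (parameter values are made up)
lags = np.array([0.0, 2.5, 5.0, 10.0, 20.0])
for model in (spherical, exponential, gaussian):
    print(model.__name__, np.round(model(lags, nugget=0.1, sill=1.0, vrange=10.0), 3))
```

All three jump from 0 to the nugget just past h = 0 and level off at the sill; they differ mainly in how they behave near the origin.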
Empirical vs model variograms
- Empirical variogram — computed from data, binned by distance.
- Model variogram — a parametric curve fit to the empirical values.
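Only the model is used in kriging. It must be a valid variogram function (conditionally negative-definite, which is equivalent to its covariance counterpart being positive-definite) so that the kriging system has a unique solution and prediction variances are never negative.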
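The empirical variogram can be sketched in a few lines of NumPy: form every pair of points once, bin the pairs by separation distance, and average half the squared differences in each bin. This is an illustrative isotropic version; the function name, the equal-width bins, and the default maximum lag of half the largest distance are assumptions, not a fixed standard.

```python
import numpy as np

def empirical_variogram(coords, values, n_lags=10, max_lag=None):
    """Return (mean lag, semivariance, pair count) for each non-empty distance bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)           # each pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    if max_lag is None:
        max_lag = dist.max() / 2.0                     # assumed rule of thumb
    edges = np.linspace(0.0, max_lag, n_lags + 1)
    which = np.digitize(dist, edges) - 1               # bin index of each pair
    lags, gamma, counts = [], [], []
    for b in range(n_lags):
        mask = which == b
        if mask.any():                                 # skip empty bins
            lags.append(dist[mask].mean())
            gamma.append(0.5 * sqdiff[mask].mean())
            counts.append(int(mask.sum()))
    return np.array(lags), np.array(gamma), np.array(counts)

# Tiny transect example: values rise linearly, so semivariance grows with lag
lags, gamma, counts = empirical_variogram(
    [[0, 0], [1, 0], [2, 0], [3, 0]], [0.0, 1.0, 2.0, 3.0], n_lags=4, max_lag=4.0
)
print(lags, gamma, counts)
```

The pair counts are worth returning alongside the semivariances, because bins with only a handful of pairs are the ones most likely to mislead the model fit.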
Simple kriging
Assumes the mean is known and constant. Rarely used in practice because we don't usually know the mean.
Ordinary kriging
The most common flavour. Assumes the mean is unknown but constant, at least within the local search neighbourhood. Predicts:
$$\hat{z}(x_0) = \sum_i \lambda_i z(x_i)$$
with weights λ_i chosen to minimise the prediction variance subject to the constraint that they sum to 1, which guarantees unbiasedness.
Kriging variance:
$$\sigma^2(x_0) = \sum_i \lambda_i \gamma(x_0, x_i) + \mu$$
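Here γ(x_0, x_i) is the model variogram evaluated at the separation between x_0 and x_i, and μ is the Lagrange multiplier that enforces the sum-to-one constraint. This is the uncertainty estimate: higher variance means a less confident prediction.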
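The weights and the variance both come from one linear system: the variogram matrix between samples, bordered by a row and column of ones for the sum-to-one constraint. A minimal NumPy sketch follows, assuming an exponential variogram with made-up parameters and using every sample (no search neighbourhood); the function names are illustrative.

```python
import numpy as np

def variogram(h, nugget=0.0, sill=1.0, vrange=5.0):
    """Exponential model variogram (assumed for illustration); gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    g = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / vrange))
    return np.where(h > 0, g, 0.0)

def ordinary_kriging(coords, values, x0):
    """Return (prediction, kriging variance, weights) at a single location x0."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Left-hand side: sample-to-sample variogram values plus the constraint border
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    # Right-hand side: sample-to-target variogram values, then 1 for the constraint
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(coords - np.asarray(x0, dtype=float), axis=1))
    sol = np.linalg.solve(A, b)
    lam, mu = sol[:n], sol[n]
    prediction = lam @ values
    variance = lam @ b[:n] + mu        # sum_i lambda_i * gamma(x0, x_i) + mu
    return prediction, variance, lam

coords = [[0, 0], [1, 0], [0, 1], [2, 2]]
values = [1.0, 2.0, 3.0, 4.0]
pred, var, lam = ordinary_kriging(coords, values, [0.5, 0.5])
print(pred, var, lam.sum())
```

Note that the variance depends only on the sample geometry and the variogram, not on the measured values themselves.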
Universal kriging
Handles a trend: z(x) = m(x) + ε(x) where m(x) is a deterministic trend (e.g., linear drift) and ε is a zero-mean stationary random field. Use when data shows a clear gradient (temperature decreasing with altitude, pollution increasing toward an emission source).
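In system form, the trend enters as extra unbiasedness constraints, one per drift function. A minimal NumPy sketch, assuming a linear drift with basis (1, x, y) and an exponential variogram for the residual with made-up parameters; the function names are illustrative, and real software also has to estimate the residual variogram, which is skipped here.

```python
import numpy as np

def variogram(h, sill=1.0, vrange=5.0):
    """Exponential variogram for the residual field (assumed); gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, sill * (1.0 - np.exp(-3.0 * h / vrange)), 0.0)

def drift(xy):
    """Linear drift basis functions evaluated at each point: 1, x, y."""
    xy = np.atleast_2d(np.asarray(xy, dtype=float))
    return np.column_stack([np.ones(len(xy)), xy[:, 0], xy[:, 1]])

def universal_kriging(coords, values, x0):
    """Return (prediction, weights) at x0 under a linear drift."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    F = drift(coords)
    p = F.shape[1]
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.zeros((n + p, n + p))
    A[:n, :n] = variogram(d)
    A[:n, n:] = F                      # one constraint per drift function
    A[n:, :n] = F.T
    gamma0 = variogram(np.linalg.norm(coords - np.asarray(x0, dtype=float), axis=1))
    b = np.concatenate([gamma0, drift(x0)[0]])
    lam = np.linalg.solve(A, b)[:n]
    return lam @ values, lam

# Data that follow an exact plane are reproduced exactly by the drift constraints
coords = np.array([[0, 0], [1, 0], [0, 1], [2, 2], [3, 1]], dtype=float)
values = 2.0 + 3.0 * coords[:, 0] - coords[:, 1]
pred, lam = universal_kriging(coords, values, [1.5, 0.5])
print(pred)
```

With a constant-only basis this reduces to the ordinary kriging system, which is one way to see ordinary kriging as a special case.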
Co-kriging
Exploits correlated auxiliary variables. If temperature correlates with elevation, you can include elevation as a covariate when kriging temperature — improving predictions where temperature samples are sparse but elevation is well-mapped.
Indicator kriging
For categorical or exceedance probability mapping — "what's the probability that concentration exceeds 10 ppm?"
Practical workflow
```python
import numpy as np
from pykrige.ok import OrdinaryKriging
```
Fitting a variogram
Key decisions:
- Binning lags — narrow enough to capture structure, wide enough to have enough pairs per bin.
- Model choice — spherical, exponential, Gaussian, Matérn. Match the visual shape.
- Nugget — fit or fix based on known measurement error.
- Anisotropy — if structure differs by direction, fit directional variograms.
```python
# OK is a fitted OrdinaryKriging instance
params = OK.variogram_model_parameters
# Inspect fitted parameters
print(params)
```
Visually check the fit. If the model systematically under- or overshoots the empirical points, reconsider the model family.
Anisotropy
Spatial autocorrelation often has direction. Wind-blown pollution has longer correlation in the downwind direction. Kriging handles this by fitting anisotropic variograms:
- Major axis — direction of maximum range.
- Minor axis — perpendicular.
- Anisotropy ratio — major / minor.
Most software automates fitting; you verify direction with a directional variogram plot.
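Geometric anisotropy is usually handled by transforming coordinates so that an isotropic variogram applies: rotate the major axis onto x, then stretch the minor-axis coordinate by the anisotropy ratio. A minimal NumPy sketch, assuming the angle is measured counter-clockwise from the +x axis; angle conventions differ between packages, so check the one you use. The function name is illustrative.

```python
import numpy as np

def to_isotropic(coords, angle_deg, ratio, origin=(0.0, 0.0)):
    """Map coordinates into a space where correlation is isotropic.

    angle_deg: direction of the major axis, counter-clockwise from +x (assumed convention).
    ratio:     major range / minor range (>= 1).
    """
    shifted = np.asarray(coords, dtype=float) - np.asarray(origin, dtype=float)
    t = np.deg2rad(angle_deg)
    # Rotation by -t brings the major axis onto the x axis
    R = np.array([[np.cos(t), np.sin(t)],
                  [-np.sin(t), np.cos(t)]])
    rotated = shifted @ R.T
    rotated[:, 1] *= ratio             # stretch distances along the minor axis
    return rotated

# Unit steps along the major and minor axes of a 30-degree, 3:1 anisotropy
t = np.deg2rad(30.0)
steps = [[np.cos(t), np.sin(t)], [-np.sin(t), np.cos(t)]]
print(to_isotropic(steps, angle_deg=30.0, ratio=3.0))
```

After the transform, a unit step along the minor axis counts as three units of distance, which is how a single range parameter can describe correlation that decays faster across the major direction than along it.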
Validation
Cross-validation is essential:
```python
from pykrige.ok import OrdinaryKriging
```
The ratio of each leave-one-out (LOO) error to the kriging standard deviation at that point (the z-score) should be approximately standard normal, with mean near 0 and variance near 1, if the variogram fits well.
When kriging fails or misleads
- Too few samples (under ~30) — variogram is noisy, predictions unstable.
- Strong non-stationarity — one variogram doesn't capture regional changes; split into zones or use non-stationary methods.
- Wrong variogram model — a Gaussian variogram implies smoothness that reality doesn't have.
- Sparse data in the target area — the variance is large, which is honest but not satisfying.
When to use what
| Situation | Recommendation |
|---|---|
| Quick visualisation, dense samples | IDW |
| Need a smooth surface for a continuous physical phenomenon | Spline |
| Need uncertainty estimates | Kriging |
| Have auxiliary variables | Co-kriging |
| Strong trend across data | Universal kriging |
| Categorical data | Nearest neighbour or indicator kriging |
Self-check exercises
1. What does the variogram tell you that IDW implicitly ignores?
The variogram reveals the actual spatial autocorrelation structure — how quickly values become uncorrelated with distance, the range of correlation, and the anisotropy. IDW assumes a fixed power-of-distance weighting regardless of true structure; kriging adapts weights to what the data actually shows, producing better predictions and — uniquely — an uncertainty estimate.
2. Why does kriging provide uncertainty estimates but IDW doesn't?
Kriging treats values as realisations of a random field with a modelled covariance structure. That model implies prediction variance: closer to many samples = lower variance; far from samples = higher variance. IDW is a deterministic formula that doesn't model randomness, so it can only produce a point estimate. The underlying statistical framing is what makes the uncertainty available.
3. Your kriging variance is highest in a corner of the study area. Is that sensible?
Almost certainly — kriging variance is large wherever samples are sparse or distant. Map corners often have few samples (edge effects) and so should have higher uncertainty. A surface with uniformly low uncertainty suggests the variogram model may be too optimistic. Always overlay predictions and variance maps side by side; high-confidence predictions should cluster near sample locations.
Summary
- Kriging = statistical interpolation with variogram-based weights.
- Produces both predictions and uncertainty.
- Ordinary kriging is the workhorse; universal and co-kriging add flexibility.
- Variogram fitting and validation are the essential skill.
Further reading
- Isaaks & Srivastava — An Introduction to Applied Geostatistics.
- Chilès & Delfiner — Geostatistics: Modeling Spatial Uncertainty.
- pykrige documentation.
- Hengl, T. — A Practical Guide to Geostatistical Mapping.