Module 15: Spatial Interpolation & Geostatistics

15.3 Kriging Fundamentals

The geostatistical interpolator — how it models spatial autocorrelation and delivers uncertainty estimates.

Lesson 76 of 100·20 min read

Key takeaways

Kriging models spatial autocorrelation with a variogram, then uses it to produce optimal unbiased predictions.

The kriging variance quantifies uncertainty — a feature no deterministic method offers.

Variogram choice and fit are the practical art of kriging.

Introduction

Kriging, named for South African mining engineer Danie Krige, is widely treated as the reference method for geostatistical interpolation. Unlike inverse distance weighting (IDW), it derives its weights from the spatial autocorrelation observed in the data, and it returns both a prediction and an estimate of that prediction's uncertainty. This lesson covers the basics.

The variogram

The variogram (also semivariogram) describes how the dissimilarity between two measurements grows with their separation distance h:

$$\gamma(h) = \frac{1}{2} E[(z(x) - z(x+h))^2]$$

Plot γ(h) against lag distance h:

Nugget — the value γ approaches as h → 0 (γ(0) itself is 0 by definition), representing measurement error and microscale variability.
Range — distance at which γ levels off; points farther apart are effectively uncorrelated.
Sill — the plateau γ reaches at the range, approximately equal to the overall variance of a stationary field.

Fitted models: spherical, exponential, Gaussian, Matérn. Different shapes suit different phenomena.
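To make the three parameters concrete, here is a minimal sketch of two of the model families named above. The function names and the convention that `sill` means the total sill (nugget included) are assumptions made for this illustration; software packages differ on both points.

```python
import numpy as np

def spherical(h, nugget, sill, rng):
    """Spherical model: rises from the nugget and reaches the sill exactly at h = rng."""
    h = np.asarray(h, dtype=float)
    r = np.clip(h / rng, 0.0, 1.0)
    gamma = nugget + (sill - nugget) * (1.5 * r - 0.5 * r**3)
    return np.where(h > 0, gamma, 0.0)  # gamma(0) is 0 by definition

def exponential(h, nugget, sill, rng):
    """Exponential model: approaches the sill asymptotically (about 95% at h = rng)."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / rng))
    return np.where(h > 0, gamma, 0.0)

lags = np.array([0.0, 50.0, 100.0, 200.0])
print(spherical(lags, nugget=0.1, sill=1.0, rng=100.0))
print(exponential(lags, nugget=0.1, sill=1.0, rng=100.0))
```

The spherical curve is flat beyond the range, while the exponential curve only approaches the sill, which is why the factor 3 is used here to define a "practical" range.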

Empirical vs model variograms

Empirical variogram — computed from data, binned by distance.
Model variogram — a parametric curve fit to the empirical values.

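Only the model is used in kriging. It must be a valid variogram function (conditionally negative-definite, which is equivalent to a positive-definite covariance), and that is why kriging draws on a small set of standard model families. Validity guarantees a solvable kriging system and non-negative kriging variances.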
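The empirical variogram can be sketched in a few lines of NumPy: form all point pairs, take half the squared difference of their values, and average within distance bins. The function name, the default of 10 bins, and the half-maximum-distance cutoff are assumptions for this example, not fixed rules.

```python
import numpy as np

def empirical_variogram(x, y, z, n_lags=10, max_dist=None):
    """Return (bin centres, semivariance, pair count) for each non-empty distance bin."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    i, j = np.triu_indices(len(x), k=1)           # every unordered pair once
    dist = np.hypot(x[i] - x[j], y[i] - y[j])
    half_sq_diff = 0.5 * (z[i] - z[j]) ** 2
    if max_dist is None:
        max_dist = dist.max() / 2.0               # a common rule of thumb
    edges = np.linspace(0.0, max_dist, n_lags + 1)
    which = np.digitize(dist, edges) - 1          # bin index per pair
    centres, gamma, counts = [], [], []
    for b in range(n_lags):
        in_bin = which == b
        if in_bin.any():
            centres.append(0.5 * (edges[b] + edges[b + 1]))
            gamma.append(half_sq_diff[in_bin].mean())
            counts.append(int(in_bin.sum()))
    return np.array(centres), np.array(gamma), np.array(counts)

# Synthetic, spatially structured data purely for illustration
gen = np.random.default_rng(0)
x, y = gen.uniform(0, 100, 200), gen.uniform(0, 100, 200)
z = np.sin(x / 20.0) + 0.1 * gen.normal(size=200)
h, g, n = empirical_variogram(x, y, z)
```

The pair counts are returned alongside the semivariances because bins with very few pairs are unreliable and are usually down-weighted when a model is fitted.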

Simple kriging

Assumes the mean is known and constant. Rarely used in practice because we don't usually know the mean.
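For comparison with the variants that follow, here is a rough sketch of simple kriging at a single location. It works with a covariance function rather than a variogram, and the exponential covariance, its parameters, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def cov_exponential(h, sill=1.0, rng=30.0):
    """Exponential covariance with no nugget: C(h) = sill * exp(-3h / rng)."""
    return sill * np.exp(-3.0 * np.asarray(h, dtype=float) / rng)

def simple_krige(xy, z, xy0, mean, cov=cov_exponential):
    """Simple kriging at one location xy0 with a known, constant mean."""
    xy, z, xy0 = np.asarray(xy, float), np.asarray(z, float), np.asarray(xy0, float)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    C = cov(d)                                    # sample-to-sample covariances
    c0 = cov(np.linalg.norm(xy - xy0, axis=1))    # sample-to-target covariances
    w = np.linalg.solve(C, c0)                    # no sum-to-one constraint
    pred = mean + w @ (z - mean)                  # weights act on residuals from the mean
    var = cov(0.0) - w @ c0
    return pred, var

xy = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
z = np.array([1.0, 2.0, 3.0])
print(simple_krige(xy, z, [4.0, 4.0], mean=2.0))
```

Far from every sample the weights shrink to zero, so the prediction falls back to the supplied mean. That is exactly why the method depends on knowing the mean in advance.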

Ordinary kriging

The most common flavour. Assumes the mean is unknown but constant, at least within the search neighbourhood. Predicts:

$$\hat{z}(x_0) = \sum_i \lambda_i z(x_i)$$

With weights λ_i chosen to minimise prediction variance subject to the constraint that they sum to 1 (unbiasedness).

Kriging variance:

$$\sigma^2(x_0) = \sum_i \lambda_i \gamma(x_0, x_i) + \mu$$

Here μ is the Lagrange multiplier that enforces the sum-to-one constraint. The kriging variance is the uncertainty estimate: higher variance means a less confident prediction.
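The weights and the multiplier μ come from one linear system: the variogram values between samples, bordered by a row and column of ones for the constraint. The sketch below solves that system for a single target location. The spherical model, its parameters, and the function names are assumptions for illustration; production libraries add neighbourhood search and numerical safeguards.

```python
import numpy as np

def variogram_spherical(h, nugget=0.0, sill=1.0, rng=50.0):
    h = np.asarray(h, dtype=float)
    r = np.clip(h / rng, 0.0, 1.0)
    return np.where(h > 0, nugget + (sill - nugget) * (1.5 * r - 0.5 * r**3), 0.0)

def ordinary_krige(xy, z, xy0, variogram=variogram_spherical):
    """Return (prediction, kriging variance, weights) at one location xy0."""
    xy, z, xy0 = np.asarray(xy, float), np.asarray(z, float), np.asarray(xy0, float)
    n = len(z)
    A = np.ones((n + 1, n + 1))                   # border of ones = sum-to-one constraint
    A[:n, :n] = variogram(np.linalg.norm(xy[:, None] - xy[None, :], axis=-1))
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - xy0, axis=1))
    sol = np.linalg.solve(A, b)
    lam, mu = sol[:n], sol[n]                     # weights and Lagrange multiplier
    return lam @ z, lam @ b[:n] + mu, lam

pts = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
vals = np.array([1.0, 2.0, 3.0, 4.0])
pred, var, weights = ordinary_krige(pts, vals, [3.0, 4.0])
print(pred, var, weights.sum())
```

With a zero nugget the predictor is exact: at a sample location it returns the sample value with zero variance.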

Universal kriging

Handles a trend: z(x) = m(x) + ε(x) where m(x) is a deterministic trend (e.g., linear drift) and ε is a zero-mean stationary random field. Use when data shows a clear gradient (temperature decreasing with altitude, pollution increasing toward an emission source).
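One way to see how the trend enters is to extend the kriging system with drift basis functions, here 1, x and y for a linear drift. The following is a sketch under assumed names and an assumed linear variogram for the residual field. In practice the residual variogram must be estimated, which is the hard part of universal kriging.

```python
import numpy as np

def variogram_linear(h, slope=1.0):
    """Linear variogram; an assumed placeholder for a fitted residual model."""
    return slope * np.asarray(h, dtype=float)

def drift(xy):
    """Drift basis functions for m(x) = a + b*x + c*y."""
    xy = np.atleast_2d(xy)
    return np.column_stack([np.ones(len(xy)), xy[:, 0], xy[:, 1]])

def universal_krige(xy, z, xy0, variogram=variogram_linear):
    """Return (prediction, kriging variance) at one location xy0."""
    xy, z, xy0 = np.asarray(xy, float), np.asarray(z, float), np.asarray(xy0, float)
    n = len(z)
    F = drift(xy)
    p = F.shape[1]
    A = np.zeros((n + p, n + p))
    A[:n, :n] = variogram(np.linalg.norm(xy[:, None] - xy[None, :], axis=-1))
    A[:n, n:] = F                                 # one constraint per drift term
    A[n:, :n] = F.T
    b = np.concatenate([variogram(np.linalg.norm(xy - xy0, axis=1)), drift(xy0)[0]])
    sol = np.linalg.solve(A, b)
    lam = sol[:n]
    return lam @ z, lam @ b[:n] + sol[n:] @ b[n:]

gen = np.random.default_rng(1)
pts = gen.uniform(0, 100, size=(8, 2))
vals = 2.0 + 0.5 * pts[:, 0] - 0.3 * pts[:, 1]   # a purely linear surface
print(universal_krige(pts, vals, [120.0, -10.0]))
```

Because the constraints force the weights to reproduce each drift term, a purely linear surface is recovered exactly, even when extrapolating outside the samples.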

Co-kriging

Exploits correlated auxiliary variables. If temperature correlates with elevation, you can include elevation as a secondary variable when kriging temperature, modelling the cross-variogram between the two. This improves predictions where temperature samples are sparse but elevation is well mapped.

Indicator kriging

For categorical or exceedance probability mapping — "what's the probability that concentration exceeds 10 ppm?"
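The step specific to indicator kriging is the transform applied before interpolation: each value becomes 1 if it exceeds the threshold and 0 otherwise, and the kriged indicator is then read as an exceedance probability. The snippet below shows only that transform and the usual clipping afterwards. The helper names and the concentrations are made up for illustration.

```python
import numpy as np

def indicator(z, threshold):
    """1.0 where the value exceeds the threshold, else 0.0."""
    return (np.asarray(z, dtype=float) > threshold).astype(float)

def as_probability(kriged_indicator):
    """Kriged indicators can fall slightly outside [0, 1]; clip before reading as probabilities."""
    return np.clip(kriged_indicator, 0.0, 1.0)

conc_ppm = np.array([3.2, 12.5, 9.9, 10.0, 18.1])   # made-up concentrations
ind = indicator(conc_ppm, threshold=10.0)
print(ind)  # [0. 1. 0. 0. 1.]
```

The indicator values, not the raw concentrations, are what get a variogram and are kriged. The value 10.0 maps to 0 here because the comparison is strict.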

Practical workflow

Python

import numpy as np
from pykrige.ok import OrdinaryKriging

# `data` is assumed to be a table (e.g. a pandas DataFrame) with these columns
x, y, z = data['x'], data['y'], data['values']

# 1. Fit the variogram automatically (or manually)
OK = OrdinaryKriging(
    x, y, z,
    variogram_model='spherical',
    nlags=20,
    verbose=True,
    enable_plotting=True
)

# 2. Predict on a grid
gx = np.linspace(x.min(), x.max(), 200)
gy = np.linspace(y.min(), y.max(), 200)
z_pred, variance = OK.execute('grid', gx, gy)

# 3. z_pred is the interpolated surface
#    variance is the uncertainty (higher = less certain)

Fitting a variogram

Key decisions:

Binning lags — narrow enough to capture structure, wide enough to have enough pairs per bin.
Model choice — spherical, exponential, Gaussian, Matérn. Match the visual shape.
Nugget — fit or fix based on known measurement error.
Anisotropy — if structure differs by direction, fit directional variograms.

Python

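# Inspect the fitted parameters on the OrdinaryKriging object created earlier
print(OK.variogram_model)             # model family, e.g. 'spherical'
print(OK.variogram_model_parameters)  # spherical: [partial sill, range, nugget]
OK.display_variogram_model()          # plot empirical points against the fitted curve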

Visually check the fit. If the model systematically under/over-shoots the empirical points, reconsider the model family.
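To show what an automatic fit does, here is a simple weighted least-squares sketch for the spherical model. For a fixed range the model is linear in the nugget and partial sill, so the code scans candidate ranges and solves a small linear problem for each. This scan is a deliberately simple stand-in for a proper non-linear optimiser, and all names and defaults are assumptions.

```python
import numpy as np

def spherical_shape(h, rng):
    r = np.clip(np.asarray(h, dtype=float) / rng, 0.0, 1.0)
    return 1.5 * r - 0.5 * r**3

def fit_spherical(lags, gamma, counts=None, candidate_ranges=None):
    """Weighted least-squares fit; returns (nugget, sill, range)."""
    lags, gamma = np.asarray(lags, float), np.asarray(gamma, float)
    w = np.ones_like(lags) if counts is None else np.asarray(counts, float)
    if candidate_ranges is None:
        candidate_ranges = np.linspace(lags.max() / 20.0, lags.max() * 1.5, 200)
    sw = np.sqrt(w)
    best = None
    for rng in candidate_ranges:
        # For a fixed range the model is linear in (nugget, partial sill).
        X = np.column_stack([np.ones_like(lags), spherical_shape(lags, rng)])
        coef, *_ = np.linalg.lstsq(X * sw[:, None], gamma * sw, rcond=None)
        nugget, psill = max(coef[0], 0.0), max(coef[1], 0.0)   # crude non-negativity
        resid = gamma - (nugget + psill * spherical_shape(lags, rng))
        sse = np.sum(w * resid**2)
        if best is None or sse < best[0]:
            best = (sse, nugget, nugget + psill, rng)
    return best[1:]

# Noise-free check: values generated from a known model should be recovered
lags = np.arange(5.0, 105.0, 5.0)
gamma = 0.2 + 0.8 * spherical_shape(lags, 40.0)
print(fit_spherical(lags, gamma, candidate_ranges=np.linspace(10.0, 100.0, 91)))
```

Passing the pair counts per bin as `counts` weights well-populated bins more heavily, which is one common way to stop sparse long-distance bins from dominating the fit.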

Anisotropy

Spatial autocorrelation often has direction. Wind-blown pollution has longer correlation in the downwind direction. Kriging handles this by fitting anisotropic variograms:

Major axis — direction of maximum range.
Minor axis — the perpendicular direction, with the minimum range.
Anisotropy ratio — major range / minor range.

Most software automates fitting; you verify direction with a directional variogram plot.
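Geometric anisotropy is commonly handled by transforming coordinates so that a single isotropic variogram applies: rotate so the major axis lies along one coordinate, then stretch the other by the anisotropy ratio. A minimal sketch follows. The angle convention (counter-clockwise from the +x axis) and the function name are assumptions, and packages differ on both the angle and the ratio convention.

```python
import numpy as np

def to_isotropic(xy, angle_deg, ratio):
    """Rotate so the major axis lies along x', then stretch y' by ratio = major / minor."""
    xy = np.atleast_2d(np.asarray(xy, dtype=float))
    t = np.deg2rad(angle_deg)                # major-axis direction, counter-clockwise from +x
    rot = np.array([[np.cos(t), np.sin(t)],
                    [-np.sin(t), np.cos(t)]])
    out = xy @ rot.T                         # new array; the input is not modified
    out[:, 1] *= ratio                       # distances across the major axis count for more
    return out

# Major axis pointing along +y, correlation reaching twice as far along it
along_major = to_isotropic([[0.0, 10.0]], angle_deg=90.0, ratio=2.0)
along_minor = to_isotropic([[10.0, 0.0]], angle_deg=90.0, ratio=2.0)
print(np.linalg.norm(along_major), np.linalg.norm(along_minor))
```

After the transform, a separation of 10 units across the major axis is treated like 20 units along it, so an isotropic model with the major-axis range reproduces the shorter minor-axis range.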

Validation

Cross-validation is essential:

Python

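import numpy as np
from pykrige.ok import OrdinaryKriging

# Leave-one-out: refit without sample i, then predict at its location
x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
errors, variances = [], []
for i in range(len(x)):
    rest = np.delete(np.arange(len(x)), i)
    OK = OrdinaryKriging(x[rest], y[rest], z[rest], variogram_model='spherical')
    pred, var = OK.execute('points', [x[i]], [y[i]])
    errors.append(z[i] - pred[0])
    variances.append(var[0])  # kept for the z-score check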

Dividing each LOO error by the kriging standard deviation at that point gives a z-score. If the variogram fits well, the z-scores should be approximately standard normal: mean near 0 and variance near 1.
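The z-score check can be sketched as follows, assuming the kriging variance at each held-out point was stored alongside the error. The `variances` input, the function name, and the sample numbers are hypothetical. A zero kriging variance would need special handling that is omitted here.

```python
import numpy as np

def loo_diagnostics(errors, variances):
    """Summarise leave-one-out errors and their standardised z-scores."""
    errors = np.asarray(errors, dtype=float)
    sd = np.sqrt(np.asarray(variances, dtype=float))
    zscores = errors / sd
    return {
        "mean_error": errors.mean(),            # near 0 if predictions are unbiased
        "rmse": np.sqrt(np.mean(errors**2)),
        "z_mean": zscores.mean(),               # near 0
        "z_std": zscores.std(ddof=1),           # near 1 if the variance is well calibrated
    }

# Hypothetical numbers purely to show the call
stats = loo_diagnostics(errors=[1.0, -1.0, 2.0, -2.0], variances=[1.0, 1.0, 4.0, 4.0])
print(stats)
```

A `z_std` well above 1 suggests the kriging variance understates the real error, and a value well below 1 suggests it overstates it.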

When kriging fails or misleads

Too few samples (under ~30) — variogram is noisy, predictions unstable.
Strong non-stationarity — one variogram doesn't capture regional changes; split into zones or use non-stationary methods.
Wrong variogram model — a Gaussian variogram implies smoothness that reality doesn't have.
Sparse data in the target area — the variance is large, which is honest but not satisfying.

When to use what

Situation	Recommendation
Quick visualisation, dense samples	IDW
Need a smooth surface, continuous physical phenomenon	Spline
Need uncertainty estimates	Kriging
Have auxiliary variables	Co-kriging
Strong trend across data	Universal kriging
Categorical data	Nearest neighbour or indicator kriging

Self-check exercises

1. What does the variogram tell you that IDW implicitly ignores?

The variogram reveals the actual spatial autocorrelation structure — how quickly values become uncorrelated with distance, the range of correlation, and the anisotropy. IDW assumes a fixed power-of-distance weighting regardless of true structure; kriging adapts weights to what the data actually shows, producing better predictions and — uniquely — an uncertainty estimate.

2. Why does kriging provide uncertainty estimates but IDW doesn't?

Kriging treats values as realisations of a random field with a modelled covariance structure. That model implies prediction variance: closer to many samples = lower variance; far from samples = higher variance. IDW is a deterministic formula that doesn't model randomness, so it can only produce a point estimate. The underlying statistical framing is what makes the uncertainty available.

3. Your kriging variance is highest in a corner of the study area. Is that sensible?

Almost certainly — kriging variance is large wherever samples are sparse or distant. Map corners often have few samples (edge effects) and so should have higher uncertainty. A surface with uniformly low uncertainty suggests the variogram model may be too optimistic. Always overlay predictions and variance maps side by side; high-confidence predictions should cluster near sample locations.

Summary

Kriging = statistical interpolation with variogram-based weights.
Produces both predictions and uncertainty.
Ordinary kriging is the workhorse; universal and co-kriging add flexibility.
Variogram fitting and validation are the essential skill.
