gdal_grid

What is gdal_grid?

gdal_grid converts scattered vector point data into a regularly gridded raster by interpolating pixel values from nearby samples. It reads any OGR source (CSV, Shapefile, GeoPackage, PostGIS) and writes to any GDAL raster format with write support. The interpolation algorithms are inverse distance weighting (IDW, with or without nearest-neighbour search), moving average, nearest neighbour, and linear. A family of data-metric outputs (minimum, maximum, range, count, average distance) is also available for QA.

Shell
gdal_grid [options] -a <algorithm>[:<param>=<value>...] <src_datasource> <dst_filename>

Commonly used options:

  • -a <algorithm>[:params] — algorithm: invdist, invdistnn, average, nearest, linear, minimum, maximum, range, count, average_distance, average_distance_pts
  • -zfield <field> — vector attribute carrying the Z value (else uses geometry Z)
  • -txe <xmin> <xmax> / -tye <ymin> <ymax> — output extent
  • -outsize <xsize> <ysize> — output raster dimensions
  • -tr <xres> <yres> — output pixel size (alternative to -outsize)
  • -of <format>, -ot <type>, -co <NAME>=<VALUE>
  • -l <layer> / -where <expr> / -sql <statement> — source selection

Algorithm parameters are passed as -a invdist:power=2.0:smoothing=0.0:radius1=100:radius2=100.
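When the command line is assembled by a script, the colon-separated syntax can be generated from keyword arguments. The following is a minimal Python sketch; the helper name algorithm_spec is hypothetical and not part of GDAL, and it does not validate parameter names against the chosen algorithm.

```python
def algorithm_spec(name, **params):
    """Build the value passed to gdal_grid's -a option.

    Hypothetical helper: it only joins strings and does not check that
    the parameter names are valid for the chosen algorithm.
    """
    parts = [name] + [f"{key}={value}" for key, value in params.items()]
    return ":".join(parts)


spec = algorithm_spec("invdist", power=2.0, smoothing=0.0, radius1=100, radius2=100)
print(spec)
```

An algorithm with no parameters, such as algorithm_spec("nearest"), yields just the algorithm name, so gdal_grid falls back to its defaults.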

When would you use gdal_grid?

Use gdal_grid whenever you have point measurements and need a continuous raster. Typical jobs: interpolating soil sample pH across a farm, converting bathymetric soundings into a seabed DEM, producing a temperature raster from weather station observations, or rasterising LiDAR returns when you need a per-cell average elevation.

Example:

Shell
gdal_grid -a invdistnn:power=2:radius=500:max_points=12 -zfield ph -txe 390000 395000 -tye 6100000 6105000 -outsize 500 500 samples.gpkg ph.tif

The data-metric algorithms are useful for QA before interpolation: -a count reports how many points fall within the search ellipse around each cell, so you can spot sparse areas, and -a range reports the difference between the largest and smallest value found there, which flags cells with a suspiciously large spread. Use both to size your search radius before committing to an invdist output.
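To make the two metrics concrete, here is a rough pure-Python sketch of what count and range report for a single grid node. It assumes a circular search area and uses made-up sample values; gdal_grid itself searches an ellipse defined by radius1, radius2 and angle, so treat this as an illustration rather than a reimplementation.

```python
import math


def node_metrics(node, samples, radius):
    """Return (count, value_range) for samples within `radius` of `node`.

    `samples` is a list of (x, y, z) tuples. Illustrative only: a circular
    search is assumed here instead of gdal_grid's search ellipse.
    """
    nx, ny = node
    values = [z for x, y, z in samples if math.hypot(x - nx, y - ny) <= radius]
    if not values:
        return 0, None  # nothing in range: gdal_grid would write its nodata value
    return len(values), max(values) - min(values)


# Made-up pH samples as (x, y, ph)
samples = [(0, 0, 6.1), (30, 40, 6.4), (60, 0, 7.9), (400, 400, 5.5)]
count, spread = node_metrics((10, 10), samples, radius=100)
print(count, round(spread, 2))
```

A node with a low count suggests the radius is too small for the local point density, while a large spread suggests either real local variation or an outlier worth checking.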

FAQs

Which interpolation algorithm should I pick?

invdistnn (inverse distance with nearest-neighbour search) is the pragmatic default: deterministic, tuneable, and bounded in compute time via max_points. Plain invdist with no search radius evaluates every input point for every output pixel, so its cost grows with pixels × points; avoid it above a few thousand points. nearest is fast and appropriate for categorical data. For geostatistically rigorous work, gdal_grid is not enough: use a dedicated package such as gstat or PyKrige for proper kriging with variograms.

How do I avoid bullseye artefacts in IDW output?

Increase smoothing (e.g. smoothing=1.0) to soften local peaks, raise max_points so more neighbours contribute, and widen the search radius (radius for invdistnn, radius1/radius2 for invdist). Bullseyes are intrinsic to IDW when samples are sparse relative to the search radius, so if tuning does not remove them, consider a different algorithm.
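The effect of smoothing can be seen in a small pure-Python sketch of the IDW estimate at one node. It assumes smoothing is combined with the distance as r = sqrt(dx² + dy² + smoothing²) before the weight 1/r^power is computed; treat that as an assumption about the formula rather than a statement of GDAL's exact implementation. The sample values are made up.

```python
def idw(node, samples, power=2.0, smoothing=0.0):
    """Inverse distance weighted estimate at `node` from (x, y, z) samples.

    Assumption: effective squared distance is dx^2 + dy^2 + smoothing^2.
    """
    nx, ny = node
    num = den = 0.0
    for x, y, z in samples:
        r2 = (x - nx) ** 2 + (y - ny) ** 2 + smoothing ** 2
        if r2 == 0.0:
            return z  # node coincides with a sample and smoothing is zero
        w = 1.0 / r2 ** (power / 2.0)
        num += w * z
        den += w
    return num / den


# One high sample surrounded by four lower ones (made-up values)
samples = [(0, 0, 10.0), (100, 0, 2.0), (-100, 0, 2.0), (0, 100, 2.0), (0, -100, 2.0)]
for s in (0.0, 25.0, 100.0):
    print(s, round(idw((5, 0), samples, smoothing=s), 2))
```

As smoothing grows, the estimate next to the isolated high sample is pulled toward its neighbours, which is the flattening that suppresses bullseyes.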

Can I interpolate multi-attribute points in one pass?

No. gdal_grid writes a single band per invocation, taken from one -zfield. Loop over attributes in a shell or Python wrapper, then stack the resulting bands with gdalbuildvrt -separate if you need a multi-band product.
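A Python wrapper for that loop might look like the sketch below. It only builds and prints the command lines; the function name grid_commands, the output naming scheme (<attribute>.tif, stack.vrt) and the field name nitrogen are assumptions made for illustration.

```python
import shlex


def grid_commands(src, attributes, algorithm, extent, size):
    """Return one gdal_grid argv per attribute plus a gdalbuildvrt argv.

    Hypothetical wrapper: outputs are named <attribute>.tif and the
    stacked product is written to stack.vrt.
    """
    xmin, xmax, ymin, ymax = extent
    commands, outputs = [], []
    for attr in attributes:
        dst = f"{attr}.tif"
        commands.append([
            "gdal_grid", "-a", algorithm, "-zfield", attr,
            "-txe", str(xmin), str(xmax), "-tye", str(ymin), str(ymax),
            "-outsize", str(size[0]), str(size[1]),
            src, dst,
        ])
        outputs.append(dst)
    # -separate places each input file in its own band of the VRT
    commands.append(["gdalbuildvrt", "-separate", "stack.vrt", *outputs])
    return commands


cmds = grid_commands(
    "samples.gpkg", ["ph", "nitrogen"],  # "nitrogen" is a made-up field name
    "invdistnn:power=2:radius=500:max_points=12",
    (390000, 395000, 6100000, 6105000), (500, 500),
)
for argv in cmds:
    print(shlex.join(argv))  # to execute: subprocess.run(argv, check=True)
```

Using the same extent and size for every attribute keeps the rasters aligned, so the bands in the VRT line up pixel for pixel.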

Why is gdal_grid slow on large datasets?

IDW with an unbounded search is O(n·m), where n is the pixel count and m is the point count. Switch to invdistnn, which uses a spatial index and caps neighbours via max_points. Also reduce output resolution during prototyping, and consider pre-splitting the input into tiles that you interpolate in parallel.
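The cost estimate and the tiling idea can be sketched in a few lines of Python. The point count is made up, the helper tile_extents is hypothetical, and the extent reuses the values from the earlier example; the sketch only prints the per-tile extent options and does not run gdal_grid.

```python
def tile_extents(xmin, xmax, ymin, ymax, nx, ny):
    """Split an extent into nx * ny equal tiles as (xmin, xmax, ymin, ymax)."""
    dx = (xmax - xmin) / nx
    dy = (ymax - ymin) / ny
    return [
        (xmin + i * dx, xmin + (i + 1) * dx, ymin + j * dy, ymin + (j + 1) * dy)
        for j in range(ny)
        for i in range(nx)
    ]


# Unbounded IDW evaluates every point for every pixel
pixels, points = 500 * 500, 200_000  # made-up point count
print(f"{pixels * points:,} distance evaluations")

for x0, x1, y0, y1 in tile_extents(390000, 395000, 6100000, 6105000, 2, 2):
    # 250 x 250 pixels per tile keeps the 10 m pixel size of the full grid
    print(f"-txe {x0:.0f} {x1:.0f} -tye {y0:.0f} {y1:.0f} -outsize 250 250")
```

If the point input is also split per tile, each subset should include the points within one search radius beyond the tile edge; otherwise seams appear along tile boundaries.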