Module 9: Vector Spatial Analysis

9.2 Overlay Operations

Union, intersection, difference, and symmetric difference — combining layers by geometry.

Lesson 45 of 100·18 min read

Key takeaways

  • Overlay operations combine layers to reveal where features overlap or differ.
  • Union, intersection, difference, and symmetric difference cover every two-layer combination.
  • Choose the operation based on exactly what you want the result to contain.

Introduction

Overlay is the act of combining two or more spatial layers to produce a new layer whose geometry encodes the spatial relationship between inputs. The classic four operations — union, intersection, difference, symmetric difference — correspond precisely to set operations on point-sets, extended to 2D.

The four operations

Given two polygon layers A and B:

OperationProduces
Intersection (A ∩ B)Parts of A that are also in B.
Union (A ∪ B)Everything in A, B, or both — with attribute rows for each piece.
Difference (A − B)Parts of A not in B.
Symmetric difference (A ⊖ B)Parts in A or B but not both.

In PostGIS:

  • ST_Intersection(a, b)
  • ST_Union(a, b) (pairwise) or ST_Union(geom) (aggregate)
  • ST_Difference(a, b)
  • ST_SymDifference(a, b)

Worked examples

Intersection — land use × floodplain

"How much of each land-use polygon is inside the floodplain?"

SQL
1SELECT l.landuse,
2       SUM(ST_Area(ST_Intersection(l.geom, f.geom))) AS flooded_m2
3FROM landuse l
4JOIN floodplain f ON ST_Intersects(l.geom, f.geom)
5GROUP BY l.landuse;

Union — full overlay of two layers

"Produce a polygon layer with both landuse and flood class attributes":

SQL
1CREATE TABLE combined AS
2SELECT l.landuse, f.flood_class, ST_Intersection(l.geom, f.geom) AS geom
3FROM landuse l
4JOIN floodplain f ON ST_Intersects(l.geom, f.geom);

(Technically intersection; true "union" produces pieces from all of A, B, and overlap. Many GIS tools use the term "Identity" or "Update" for these subtler variants.)

Difference — developable land

"Land that is not inside any protected area":

SQL
1SELECT ST_Difference(l.geom, p.all_protected) AS developable_geom
2FROM landuse l
3CROSS JOIN (
4  SELECT ST_Union(geom) AS all_protected FROM protected_areas
5) p;

Pre-unionising the protected-area polygons into a single geometry makes the difference simpler and faster.

Symmetric difference — only-A-or-only-B

Rarely used directly; useful for change detection (where layers differ).

Aggregate ST_Union

Combining many geometries into one is common:

SQL
SELECT ST_Union(geom) FROM parks;

PostGIS is clever about this — it uses a fast cascaded union algorithm internally.

Area-weighted attribute transfer

A common follow-up: estimate an attribute of one layer based on overlap with another.

Example: population by district, but you have it by census tract (which don't align with districts).

SQL
1SELECT d.id, d.name,
2       SUM(
3         t.population *
4         ST_Area(ST_Intersection(t.geom, d.geom)) /
5         ST_Area(t.geom)
6       ) AS estimated_pop
7FROM districts d
8JOIN census_tracts t ON ST_Intersects(d.geom, t.geom)
9GROUP BY d.id, d.name;

This is areal interpolation under a "homogeneous population density" assumption — imperfect but widely used. Module 16 covers more sophisticated alternatives.

Validity and precision

Overlay operations amplify invalid geometries. If input polygons self-intersect or have malformed rings, the intersection may fail or produce meaningless output. Clean first:

SQL
UPDATE landuse SET geom = ST_MakeValid(geom) WHERE NOT ST_IsValid(geom);

Floating-point coordinate precision also causes slivers — extremely thin polygons along shared boundaries. Snap before overlay:

SQL
ST_Snap(a.geom, b.geom, 0.001)

Then intersect.

Performance

For large layers, overlay operations are expensive. Strategies:

  • Pre-filter with a bounding box join (ST_Intersects) before computing the exact intersection.
  • Use spatial indexes on both inputs.
  • Parallelise by tiling if available (PostGIS 15+ supports parallel workers).
  • For huge jobs, use GPU-backed tools (CPU → GPU: cuSpatial) or distributed compute (GeoMesa, Sedona).

Topology-aware overlays

Given a topologically-managed dataset (shared edges), overlays can produce output without slivers by reusing exact edges. PostGIS topology extension, GRASS's vector engine, and Esri's geodatabase topology all support this.

In GeoPandas

Python
1import geopandas as gpd
2[object Object]
3

gpd.overlay handles all four operations via the how parameter.

After an overlay, inspect the output visually before trusting the numbers. Uploading the result to Atlas is a lightweight way to click through attributes, look for slivers, and compare the output against the original input layers with a shared reviewer.

Self-check exercises

1. What's the difference between ST_Intersects and ST_Intersection?

ST_Intersects is a predicate — returns true / false for whether two geometries share any point. ST_Intersection is a geometry constructor — returns the geometry that is the intersection (possibly empty). Predicates are cheap and index-aware; constructors are expensive. Use predicates for filtering and constructors when you actually need the overlapping shape.

2. Why pre-union protected areas before computing difference from landuse?

Without pre-union you'd compute a difference against each protected-area polygon one at a time, each time reducing the landuse geometry — quadratic in both layers. Pre-unionising into one "all protected" geometry turns it into a single difference call per landuse polygon, and lets the spatial index operate on one summary geometry.

3. You overlay two polygon layers and get slivers along the boundaries. What's the fix?

Two common tactics: (1) Snap the boundaries of the two layers together (ST_Snap with a small tolerance) before overlay. (2) Use a topology-aware overlay if your data is in a topological structure (shared edges). After overlay you can also remove slivers by dropping polygons below a minimum area threshold, though that's treating the symptom, not the cause.

Summary

  • Intersection, union, difference, and symmetric difference cover two-layer overlays.
  • Aggregate ST_Union pre-combines many geometries; useful to speed downstream operations.
  • Area-weighted attribute transfer is the simplest areal interpolation method.
  • Clean invalid geometries before overlay; snap to avoid slivers.

Further reading

  • PostGIS documentation — geometry constructors.
  • PostGIS in Action — overlay operations chapter.
  • Geocomputation with R — overlay chapter.
  • GEOS documentation — the library implementing most overlay math.