Module 2: Spatial Thinking & Geographic Concepts

2.4 Spatial Relationships and Topology

The DE-9IM model, nine predicates, and why topology lets GIS answer questions plain SQL cannot.

Lesson 9 of 100·22 min read

Key takeaways

Spatial relationships (contains, touches, intersects, etc.) have precise mathematical definitions.

The DE-9IM matrix formalises every possible relationship between two geometries.

Topology allows a GIS to reason about connectivity, containment, and neighbours — the core of spatial analysis.

Introduction

"Does this road cross that river?" "Is the factory inside the buffer zone?" "Do these two parcels share a boundary?" Every such question is a spatial predicate — a function that returns true or false based on the geometric relationship between two features. This lesson introduces the nine standard predicates, the DE-9IM model that underpins them, and why topology is the secret sauce of GIS.

The nine standard spatial predicates

Every spatial library (PostGIS, Shapely, Turf.js, JTS, GEOS) exposes essentially the same set:

Predicate	Meaning (informal)
Equals	Geometries occupy the same space.
Disjoint	No points in common.
Intersects	At least one point in common (inverse of Disjoint).
Touches	Boundaries touch; interiors don't intersect.
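Crosses	Geometries share some but not all interior points, and the shared part has lower dimension than the larger of the two (e.g. a road line crossing a river polygon).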
Within	A lies entirely inside B.
Contains	B lies entirely inside A.
Overlaps	Same dimension, partial overlap.
Covers / CoveredBy	Like Contains / Within, but still true when the inner geometry lies only on the outer one's boundary (e.g. a point on a polygon's edge).

A concrete example: a river polygon and a city polygon. If they share a boundary but no interior point, they touch. If part of the river is inside the city, they overlap. If the whole river is inside the city, the river is within the city.
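
To make the informal definitions concrete, here is a minimal Python sketch that names the relationship between two axis-aligned rectangles given as (xmin, ymin, xmax, ymax). The rectangle representation and the classify function are illustrative assumptions, not part of any library; real polygons need a geometry engine such as GEOS or Shapely.

```python
def classify(a, b):
    """Name the relationship of rectangle a to rectangle b.

    Each rectangle is (xmin, ymin, xmax, ymax) with positive width and height.
    This is a simplification for illustration; it returns the most specific
    predicate from the table that applies.
    """
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    if a == b:
        return "equals"

    # The closed rectangles share no point at all.
    if ax2 < bx1 or bx2 < ax1 or ay2 < by1 or by2 < ay1:
        return "disjoint"

    # Interiors meet only if the shared region has positive width and height.
    interiors_meet = (max(ax1, bx1) < min(ax2, bx2)
                      and max(ay1, by1) < min(ay2, by2))
    if not interiors_meet:
        return "touches"

    if bx1 <= ax1 and ax2 <= bx2 and by1 <= ay1 and ay2 <= by2:
        return "within"
    if ax1 <= bx1 and bx2 <= ax2 and ay1 <= by1 and by2 <= ay2:
        return "contains"
    return "overlaps"


def intersects(a, b):
    """Intersects is simply the negation of disjoint."""
    return classify(a, b) != "disjoint"


city = (0, 0, 10, 10)
print(classify((10, 2, 14, 4), city))  # touches: river runs along the city edge
print(classify((8, 2, 14, 4), city))   # overlaps: river is partly inside
print(classify((2, 2, 6, 4), city))    # within: river is entirely inside
```

Note that every case except disjoint also satisfies intersects, which is why intersects is the usual first filter in a spatial join.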

The DE-9IM model

Behind those plain-English predicates sits a precise model: the Dimensionally Extended 9-Intersection Model (DE-9IM), developed by Clementini, Di Felice, and van Oosterom in the early 1990s on top of Egenhofer's 9-intersection model, and later standardised by the OGC.

Every geometry has three parts:

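Interior (I): the points of the geometry that are not on its boundary. For a point that is the point itself; for a line, everything except its endpoints.
Boundary (B): the edge of a polygon or the endpoints of a line (empty for a single point).
Exterior (E): everything else in the plane.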

For two geometries A and B, you can compute the intersection of each part with each part: I(A)∩I(B), I(A)∩B(B), I(A)∩E(B), and so on, nine combinations in all. The dimension of each intersection (−1 for empty, written F in the string form most libraries return; 0 for a point; 1 for a line; 2 for an area) fills a 3×3 matrix:

Code

       I(B)  B(B)  E(B)
I(A) [  2     1     2  ]
B(A) [  1     0     1  ]
E(A) [  2     1     2  ]

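The nine predicates are defined as specific patterns over this matrix. For example, touches requires the interior-interior cell to be empty while at least one of the three cells involving a boundary (I(A)∩B(B), B(A)∩I(B), or B(A)∩B(B)) is non-empty. The matrix above, which reads 212101212 row by row, is the typical result for two partially overlapping polygons.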

You rarely compute the matrix by hand — libraries expose the predicates directly — but the DE-9IM framework is what guarantees they behave consistently across software.
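
As a rough illustration of both steps (filling the nine cells, then matching a pattern), the sketch below works on a one-dimensional analogue: two closed intervals on a number line, where the interior is the open interval, the boundary is the two endpoints, and the exterior is everything else. The function names are hypothetical, and because the setting is a line rather than a plane, the exterior-exterior cell is 1 here where it would be 2 for real 2D geometries.

```python
def interval_de9im(a, b):
    """DE-9IM string for two closed intervals a=(a1, a2) and b=(b1, b2)."""
    (a1, a2), (b1, b2) = a, b
    if not (a1 < a2 and b1 < b2):
        raise ValueError("intervals must have positive length")

    def cell(non_empty, dimension):
        # 'F' marks an empty intersection, otherwise record its dimension.
        return str(dimension) if non_empty else "F"

    def strictly_inside(x, lo, hi):
        return lo < x < hi

    def outside(x, lo, hi):
        return x < lo or x > hi

    return "".join([
        cell(max(a1, b1) < min(a2, b2), 1),                   # I(A) ∩ I(B)
        cell(any(strictly_inside(x, a1, a2) for x in b), 0),  # I(A) ∩ B(B)
        cell(a1 < b1 or a2 > b2, 1),                          # I(A) ∩ E(B)
        cell(any(strictly_inside(x, b1, b2) for x in a), 0),  # B(A) ∩ I(B)
        cell(bool(set(a) & set(b)), 0),                       # B(A) ∩ B(B)
        cell(any(outside(x, b1, b2) for x in a), 0),          # B(A) ∩ E(B)
        cell(b1 < a1 or b2 > a2, 1),                          # E(A) ∩ I(B)
        cell(any(outside(x, a1, a2) for x in b), 0),          # E(A) ∩ B(B)
        "1",                                                  # E(A) ∩ E(B)
    ])


def matches(matrix, pattern):
    """Check a 9-character matrix against a DE-9IM pattern.

    '*' accepts anything, 'T' accepts any non-empty cell (0, 1 or 2),
    and 'F', '0', '1', '2' must match exactly.
    """
    def cell_ok(m, p):
        if p == "*":
            return True
        if p == "T":
            return m in ("0", "1", "2")
        return m == p

    return (len(matrix) == len(pattern) == 9
            and all(cell_ok(m, p) for m, p in zip(matrix, pattern)))


# The three standard patterns for touches: interiors disjoint,
# and at least one boundary cell non-empty.
TOUCHES = ("FT*******", "F**T*****", "F***T****")


def touches(matrix):
    return any(matches(matrix, p) for p in TOUCHES)


print(interval_de9im((0, 1), (1, 2)))  # FF1F00101 (end to end)
print(interval_de9im((0, 2), (1, 3)))  # 1010F0101 (partial overlap)
print(touches("212101212"))            # False: interiors intersect
```

The same matches function works unchanged on matrix strings produced by a real library, which is essentially what a relate-with-pattern call does.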

Topology as data structure

Beyond predicates, topology is also a way of storing geometry. Instead of each polygon carrying its own boundary, polygons share edges with their neighbours. Changing one shared edge updates all adjacent polygons.

Advantages:

Consistency — two neighbouring parcels cannot develop a gap or an overlap.
Efficiency — shared boundaries are stored once.
Rich queries — "which polygons share an edge with this one?" is instant.

OpenStreetMap stores geometry topologically (ways reference shared nodes), and Esri's geodatabase layers topology rules and shared-edge editing on top of its feature classes. Plain shapefiles do neither: each polygon stores its own boundary, so keeping a large dataset consistent takes additional tooling.
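
The shared-geometry idea can be sketched in a few lines of Python. This is a simplified node-sharing model (closer to OpenStreetMap's nodes and ways than to a full edge-and-face topology), and the names nodes, faces, ring_edges, and neighbours are made up for illustration.

```python
# Coordinates are stored once, keyed by node id.
nodes = {
    "n1": (0.0, 0.0), "n2": (10.0, 0.0), "n3": (10.0, 10.0), "n4": (0.0, 10.0),
    "n5": (20.0, 0.0), "n6": (20.0, 10.0), "n7": (20.0, 20.0), "n8": (10.0, 20.0),
}

# Each polygon is a ring of node ids, not a list of coordinates.
faces = {
    "A": ["n1", "n2", "n3", "n4"],
    "B": ["n2", "n5", "n6", "n3"],  # shares edge n2-n3 with A
    "C": ["n3", "n6", "n7", "n8"],  # shares edge n3-n6 with B, only a corner with A
}


def ring_edges(ring):
    """Undirected edges of a closed ring, each as a frozenset of two node ids."""
    return {frozenset((ring[i], ring[(i + 1) % len(ring)]))
            for i in range(len(ring))}


def neighbours(faces, face_id):
    """Faces that share at least one whole edge with face_id."""
    mine = ring_edges(faces[face_id])
    return {other for other, ring in faces.items()
            if other != face_id and mine & ring_edges(ring)}


def coordinates(nodes, faces, face_id):
    """Resolve a face's ring to coordinates at read time."""
    return [nodes[n] for n in faces[face_id]]


print(neighbours(faces, "A"))  # {'B'}

# Moving one shared node changes every polygon that references it,
# so no gap or overlap can appear along the shared edges.
nodes["n3"] = (10.0, 11.0)
print(coordinates(nodes, faces, "A"))
print(coordinates(nodes, faces, "B"))
```

Because A and C share only the node n3 and no edge, the neighbour query distinguishes edge adjacency from corner contact without any geometric computation.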

Why topology matters practically

Consider a cadastre (land-parcel registry). If parcels are stored independently and you digitise a new boundary slightly differently from its neighbour, you'll get either a sliver (a tiny polygon between them) or an overlap. Both are data errors that cascade into tax bills, legal disputes, and analyses.

A topologically clean cadastre:

Has no gaps or overlaps between adjacent parcels.
Guarantees that lot lines meet exactly at shared nodes.
Updates neighbouring polygons automatically when a boundary changes.

Tools like QGIS, PostGIS topology, and Esri's topology rules provide validation to catch these errors.
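
One cheap sanity check that hints at what such validation looks for: if a set of parcels is supposed to tile a block exactly, their areas should add up to the area of the block outline. The Python sketch below uses the shoelace formula; the function names and sample coordinates are illustrative assumptions. It is a necessary condition only, since a gap and an overlap of equal size would cancel out, so it does not replace proper topology rules.

```python
def ring_area(ring):
    """Area of a simple polygon ring via the shoelace formula."""
    twice_area = 0.0
    for i, (x1, y1) in enumerate(ring):
        x2, y2 = ring[(i + 1) % len(ring)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def coverage_discrepancy(parcels, outline):
    """Sum of parcel areas minus outline area.

    0 suggests a clean tiling, a negative value a net gap,
    a positive value a net overlap.
    """
    return sum(ring_area(p) for p in parcels) - ring_area(outline)


block = [(0, 0), (20, 0), (20, 10), (0, 10)]
parcel_a = [(0, 0), (10, 0), (10, 10), (0, 10)]

# Neighbour digitised correctly, then 0.5 units too far right, then too far left.
parcel_b_clean = [(10, 0), (20, 0), (20, 10), (10, 10)]
parcel_b_gap = [(10.5, 0), (20, 0), (20, 10), (10.5, 10)]
parcel_b_overlap = [(9.5, 0), (20, 0), (20, 10), (9.5, 10)]

print(coverage_discrepancy([parcel_a, parcel_b_clean], block))    # 0.0
print(coverage_discrepancy([parcel_a, parcel_b_gap], block))      # -5.0
print(coverage_discrepancy([parcel_a, parcel_b_overlap], block))  # 5.0
```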

Spatial indexes and predicate performance

Checking "does feature A intersect any of these 10 million features?" naively is O(N) per query — unusable. Every GIS uses a spatial index to accelerate predicates:

R-tree — nested bounding boxes (the most common general-purpose index).
Quadtree — recursively subdivided grid, good for point-heavy workloads.
Geohash / H3 / S2 — hierarchical cell systems for global indexing.
BRIN — block range indexes for clustered data in PostgreSQL.

You rarely build the index yourself; you let PostGIS or GeoPandas create it. But knowing the index exists explains why spatial queries on millions of features complete in milliseconds.
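
To show why an index helps, here is a minimal Python sketch of a uniform grid index over bounding boxes. A grid is used instead of an R-tree purely because it fits in a few lines; the class and method names are hypothetical. The pattern is the same one production indexes follow: use the index to get a small candidate set, then run the exact test only on those candidates.

```python
from collections import defaultdict


def boxes_intersect(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes share at least one point."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])


class GridIndex:
    """Buckets feature ids by the grid cells their bounding box touches."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (col, row) -> feature ids
        self.boxes = {}                # feature id -> bounding box

    def _cells_for(self, box):
        xmin, ymin, xmax, ymax = box
        s = self.cell_size
        for col in range(int(xmin // s), int(xmax // s) + 1):
            for row in range(int(ymin // s), int(ymax // s) + 1):
                yield (col, row)

    def insert(self, fid, box):
        self.boxes[fid] = box
        for cell in self._cells_for(box):
            self.cells[cell].add(fid)

    def candidates(self, box):
        """Cheap filter: every feature stored in a cell the query box touches."""
        found = set()
        for cell in self._cells_for(box):
            found |= self.cells.get(cell, set())
        return found

    def query(self, box):
        """Exact bounding-box test, run only on the candidates."""
        return {fid for fid in self.candidates(box)
                if boxes_intersect(self.boxes[fid], box)}


# 100 small boxes laid out on a 10 x 10 grid.
index = GridIndex(cell_size=10)
for i in range(10):
    for j in range(10):
        index.insert((i, j), (i * 10 + 1, j * 10 + 1, i * 10 + 4, j * 10 + 4))

window = (12, 12, 25, 25)
print(len(index.candidates(window)), "candidates out of", len(index.boxes))
print(sorted(index.query(window)))
```

A grid works well when features are evenly spread and similar in size; R-trees adapt better to clustered or mixed-size data, which is one reason they are the common default.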

A worked SQL example

SQL

-- For each fire, find the closest hydrant within 300 m (NULL if there is none).
SELECT
  f.fire_id,
  h.hydrant_id,
  ST_Distance(f.geom::geography, h.geom::geography) AS metres
FROM fires f
LEFT JOIN LATERAL (
  SELECT hy.hydrant_id, hy.geom
  FROM hydrants hy
  WHERE ST_DWithin(f.geom::geography, hy.geom::geography, 300)
  ORDER BY f.geom <-> hy.geom    -- <-> can use the spatial index on hydrants.geom
  LIMIT 1
) h ON true;

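The <-> operator is PostGIS's nearest-neighbour (KNN) distance operator. Used in ORDER BY with LIMIT, it lets the planner walk the spatial index in distance order, which is dramatically faster than computing and sorting the distance to every hydrant. Since PostGIS 2.2 it returns the true distance between geometries (bounding-box distance is the separate <#> operator). On geometry columns the distance is planar and in the units of the SRID, so for longitude/latitude data the ordering is only approximate.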

Common predicate gotchas

Touches vs intersects. Intersects is true even when only boundaries touch. Touches excludes interior overlap. Pick the right one.
Within vs contains. A within B ≡ B contains A. They're identical up to argument order.
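Open vs closed geometries. Most GIS treat polygons as closed (the boundary is part of the geometry), so a point on the edge intersects the polygon and is covered by it. It is not within the polygon, though, because within requires the interiors to intersect.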
Precision. Two polygons that look adjacent on screen may not touch precisely due to floating-point representation. Snap or round coordinates before testing.
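
The precision problem is easy to reproduce. The Python sketch below shows two vertices that should coincide but differ in the last floating-point digit, and a simple snap-to-grid step that makes them identical. The snap function and the 1e-6 grid size are illustrative assumptions; choose a tolerance that suits your coordinate units, and note that two points straddling a grid-cell boundary can still snap to different cells.

```python
def snap(point, grid=1e-6):
    """Snap an (x, y) point to the nearest node of a regular grid."""
    x, y = point
    return (round(x / grid) * grid, round(y / grid) * grid)


# The same corner digitised along two different parcel boundaries.
corner_from_parcel_a = (0.1 + 0.2, 1.0)  # x is 0.30000000000000004
corner_from_parcel_b = (0.3, 1.0)

print(corner_from_parcel_a == corner_from_parcel_b)              # False
print(snap(corner_from_parcel_a) == snap(corner_from_parcel_b))  # True
```

Spatial libraries usually offer an equivalent built-in (for example a snap-to-grid or precision-reduction function), which is preferable to rolling your own on real data.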

Self-check exercises

1. What's the difference between touches and overlaps?

Touches requires the interiors to be disjoint but the boundaries to share at least one point, like two polygons that share an edge but no interior area. Overlaps requires the two geometries to have the same dimension and their interiors to intersect, with neither geometry contained in the other.

2. Why do shapefiles sometimes produce slivers when you dissolve neighbouring parcels?

Each shapefile polygon stores its own boundary, so neighbouring polygons may have slightly different vertex positions (due to digitising error or floating-point precision). A dissolve operation then leaves tiny sliver polygons where the edges should have been identical. Topologically managed data structures avoid this by storing shared edges once.

3. A spatial query needs to find the 5 nearest hospitals to each of 1 million patients. Naive scan is 10^12 comparisons. What data structure makes this tractable?

A spatial index, typically an R-tree or a quadtree. With a KNN-aware operator (like PostGIS's <->), each nearest-neighbour lookup costs roughly O(log N) rather than O(N). The 10^12 figure implies about 1 million candidate locations, and log2(10^6) ≈ 20, so the total drops to roughly 20 million index steps, which is quite tractable on a laptop.

Summary

Spatial predicates (equals, disjoint, intersects, touches, crosses, within, contains, overlaps, covers) name the relationships between two geometries that analyses ask about most often.
The DE-9IM model provides the mathematical backbone.
Topological data structures guarantee consistency and enable rich neighbour queries.
Spatial indexes make predicate-heavy workloads feasible at scale.