9.3 Clip, Erase, and Dissolve
Three everyday operations that shape vector data for analysis and display.
Key takeaways
- Clip keeps the part of a layer that overlaps another; erase removes it.
- Dissolve merges adjacent features sharing an attribute value.
- Together they prepare data for cartography and modelling.
Introduction
Three vector operations appear in almost every workflow: clip, erase, and dissolve. They're conceptually simple but deceptively powerful. This short lesson covers each, their subtleties, and how they interact.
Clip
Clip cuts the input layer by a polygon mask, keeping only the parts that overlap.
SQL:
SELECT ST_Intersection(r.geom, b.geom) AS geom, r.attributes
FROM rivers r, boundary b
WHERE ST_Intersects(r.geom, b.geom);
QGIS: Vector → Geoprocessing Tools → Clip.
GeoPandas: gpd.clip(roads, city_boundary).
Use cases:
- Reduce a continental dataset to the area of interest.
- Produce a map-ready layer that doesn't extend beyond the study region.
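Note: clipping trims geometry and leaves attributes untouched. Each surviving feature keeps its row, and a feature cut into several pieces by the mask usually becomes one multipart geometry rather than several rows, unless you explicitly split it. Features that lie entirely outside the mask are dropped, and stored measurements such as a length or area column are not recalculated.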
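The behaviour described in the note can be checked on toy data. This is a minimal sketch using gpd.clip; the three roads, their names, and the mask rectangle are hypothetical, and no CRS is set to keep the example short.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

# Hypothetical data: three roads, one of which lies entirely outside the mask.
roads = gpd.GeoDataFrame(
    {"name": ["A1", "B2", "C3"]},
    geometry=[
        LineString([(0, 1), (10, 1)]),  # crosses the mask
        LineString([(2, 2), (3, 3)]),   # fully inside the mask
        LineString([(8, 8), (9, 9)]),   # fully outside the mask
    ],
)
boundary = gpd.GeoDataFrame(geometry=[box(1, 0, 5, 5)])

clipped = gpd.clip(roads, boundary)

# C3 is dropped; A1 keeps its name, but its geometry is trimmed to the mask.
names = sorted(clipped["name"])
a1 = clipped.loc[clipped["name"] == "A1", "geometry"].iloc[0]
print(names)
print(a1.length)
```

Only the geometry column changes for the surviving rows, so any length or area attribute carried over from the source would need to be recomputed after clipping.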
Erase (difference)
Erase removes the parts of the input that overlap the mask.
SQL:
SELECT ST_Difference(l.geom, p.all_protected) AS geom
FROM landuse l
CROSS JOIN (SELECT ST_Union(geom) AS all_protected FROM protected) p;
Use cases:
- Identify developable land (all - protected).
- Produce the "rest of the world" outside a study area.
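The SQL above unions the mask before subtracting it. A small Shapely sketch of the same idea follows; the parcel and the two protected zones are hypothetical. Subtracting each zone in a plain join would instead give one output per zone, each missing only that zone.

```python
from shapely.geometry import box
from shapely.ops import unary_union

parcel = box(0, 0, 10, 10)                      # hypothetical land-use polygon
protected = [box(0, 0, 4, 4), box(2, 2, 6, 6)]  # two overlapping protected zones

# Union the mask first, then subtract it once from the input.
mask = unary_union(protected)
developable = parcel.difference(mask)

print(mask.area)         # combined protected area, overlap counted once
print(developable.area)  # what remains of the parcel
```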
Dissolve
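Dissolve merges features that share an attribute value, producing fewer, larger features. Shared boundaries between adjacent or overlapping features disappear; features with the same value that do not touch end up as parts of one multipart geometry.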
SQL:
SELECT class, ST_Union(geom) AS geom
FROM landuse
GROUP BY class;
GeoPandas:
dissolved = landuse.dissolve(by='class', aggfunc='sum')
Use cases:
- Simplify a cadastre by owner (all parcels owned by one entity become one multipolygon).
- Merge census blocks into districts by aggregating attributes.
- Remove internal boundaries within a country.
Dissolved features typically carry aggregated attributes (sum of population, count of members, average of density). Decide aggregation functions intentionally.
Attribute handling
When dissolving:
- Sum / count — straightforward for numeric attributes.
- Average — may need weighting (area-weighted average).
- Mode / first — pick one value for text attributes.
- Drop — for attributes that don't aggregate meaningfully.
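The area-weighted average in the list above can be sketched in GeoPandas as follows. The column names (district, density, block_area, density_x_area, density_aw) and the toy blocks are assumptions for illustration, not part of any real dataset.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical blocks: two of unequal size in district "N", one in "S".
blocks = gpd.GeoDataFrame(
    {
        "district": ["N", "N", "S"],
        "density": [100.0, 400.0, 50.0],
    },
    geometry=[box(0, 0, 3, 1), box(3, 0, 4, 1), box(0, 1, 4, 2)],
)

# Weight each density by its block's area before dissolving.
blocks["block_area"] = blocks.geometry.area
blocks["density_x_area"] = blocks["density"] * blocks["block_area"]

# Sum the weighted values and the weights per district, then normalise.
districts = blocks.dissolve(
    by="district",
    aggfunc={"density_x_area": "sum", "block_area": "sum"},
)
districts["density_aw"] = districts["density_x_area"] / districts["block_area"]

print(districts["density_aw"])
```

For district N the plain mean of 100 and 400 would be 250, while the area-weighted value is pulled toward the larger block.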
Combining the three
A common pipeline: clip → dissolve → erase → display.
Example — "forests inside our study area, merged by species, minus protected zones":
-- Step 1: clip forests to the study area (assumes study_area holds a single row)
WITH forests AS (
  SELECT species, ST_Intersection(geom, (SELECT geom FROM study_area)) AS geom
  FROM forests_layer
  WHERE ST_Intersects(geom, (SELECT geom FROM study_area))
),
-- Step 2: dissolve by species
merged AS (
  SELECT species, ST_Union(geom) AS geom
  FROM forests
  GROUP BY species
),
-- Step 3: erase the union of all protected zones
unprotected AS (
  SELECT m.species,
         ST_Difference(m.geom, (SELECT ST_Union(geom) FROM protected)) AS geom
  FROM merged m
)
-- Step 4: drop species that were erased completely
SELECT * FROM unprotected WHERE NOT ST_IsEmpty(geom);
Four steps of logic (clip, dissolve, erase, filter), written one at a time and debuggable at each.
Pitfalls
- Dissolve with invalid geometries can fail outright or silently produce wrong results. Validate and repair first.
- Clip without preserving attributes — make sure you select the attributes you need.
- Erase of huge layers — can be slow without pre-union and indexes.
- Dissolve doesn't always merge adjacent features if they don't touch exactly (slivers, snapping issues).
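A validity check before dissolving might look like the sketch below. It uses Shapely's make_valid and explain_validity (available from shapely.validation in Shapely 1.8 and later); the self-intersecting "bowtie" polygon is a hypothetical stand-in for a badly digitised feature.

```python
from shapely.geometry import Polygon
from shapely.validation import explain_validity, make_valid

# A self-intersecting ring: its two edges cross at (1, 1).
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])

print(bowtie.is_valid)           # False for this ring
print(explain_validity(bowtie))  # human-readable reason

# Repair the geometry before passing it to a union or dissolve.
fixed = make_valid(bowtie)
print(fixed.is_valid, fixed.geom_type, fixed.area)
```

In PostGIS the analogous functions are ST_IsValid and ST_MakeValid.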
In GeoPandas
import geopandas as gpd

# Clip: keep only the parts of roads that fall inside boundary
clipped = gpd.clip(roads, boundary)
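Erase and dissolve have GeoPandas counterparts as well. A minimal sketch, assuming gpd.overlay with how="difference" for erase and GeoDataFrame.dissolve for dissolve; the landuse and protected layers and the area_ha column are hypothetical toy data.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical layers standing in for real data.
landuse = gpd.GeoDataFrame(
    {"class": ["forest", "forest", "farm"], "area_ha": [4.0, 4.0, 8.0]},
    geometry=[box(0, 0, 2, 2), box(2, 0, 4, 2), box(0, 2, 4, 4)],
)
protected = gpd.GeoDataFrame(geometry=[box(3, 0, 4, 4)])

# Erase: keep the parts of landuse that lie outside the protected polygons.
erased = gpd.overlay(landuse, protected, how="difference")

# Dissolve: one row per class, summing the numeric attribute.
dissolved = landuse.dissolve(by="class", aggfunc="sum")

print(erased.geometry.area.sum())
print(dissolved[["area_ha"]])
```

The area_ha values in erased are copied from the input rows, so they no longer match the trimmed geometry and would need recomputing.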
Self-check exercises
1. Your dissolve produces non-merged adjacent polygons. What's likely wrong?
The polygons aren't exactly touching — there's a micro-gap or sliver from earlier digitising or floating-point precision. Dissolve merges topologically connected features. Fix by snapping vertices with a small tolerance (ST_Snap, QGIS Snap Geometries) or buffering by a tiny amount to ensure overlap before dissolving.
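The effect of a micro-gap, and the snapping fix, can be reproduced with Shapely. This is a sketch under assumed values: the gap of 1e-6 units and the tolerance of 1e-5 are illustrative, and a real tolerance should be chosen from the data's coordinate precision.

```python
from shapely.geometry import box
from shapely.ops import snap, unary_union

left = box(0, 0, 1, 1)
right = box(1.000001, 0, 2, 1)  # hypothetical digitising gap of 1e-6 units

# With the gap, the union stays in two separate parts.
before = unary_union([left, right])
print(before.geom_type)

# Snap the vertices of `right` onto those of `left`, then union again.
right_snapped = snap(right, left, 1e-5)
merged = unary_union([left, right_snapped])
print(merged.geom_type)
```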
2. What's the difference between clip and erase?
Clip keeps only the parts of the input inside the mask (intersection). Erase keeps only the parts outside the mask (difference). They're complementary: for the same mask, the clip and erase results together cover exactly the original input, with features split along the mask boundary.
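The complementarity can be verified on a single geometry. A small Shapely sketch with a hypothetical parcel and mask:

```python
from shapely.geometry import box

parcel = box(0, 0, 4, 2)   # hypothetical input feature
mask = box(2, -1, 6, 3)    # hypothetical mask covering its right half

clipped = parcel.intersection(mask)  # clip: the part inside the mask
erased = parcel.difference(mask)     # erase: the part outside the mask

# Together the two pieces cover the original, with no overlap between them.
rebuilt = clipped.union(erased)
print(rebuilt.equals(parcel))
print(clipped.area + erased.area, parcel.area)
```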
3. Why specify aggregation functions explicitly when dissolving?
Each dissolved feature represents many original features; attribute values must be combined somehow. Without explicit functions, tools use defaults (typically "first" or "sum") that may not be what you want. Specifying {'population': 'sum', 'density': 'mean'} makes your intent clear and your result predictable.
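The difference between the default and an explicit mapping shows up even on two rows. A sketch assuming GeoPandas' dissolve, whose aggfunc defaults to "first" and also accepts a dict of column names to functions; the blocks and their values are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical census blocks belonging to one district.
blocks = gpd.GeoDataFrame(
    {
        "district": ["N", "N"],
        "population": [1200, 300],
        "density": [400.0, 300.0],
    },
    geometry=[box(0, 0, 3, 1), box(3, 0, 4, 1)],
)

# Default: every attribute takes the value of the first member.
default = blocks.dissolve(by="district")

# Explicit: each attribute gets the function that suits it.
explicit = blocks.dissolve(
    by="district",
    aggfunc={"population": "sum", "density": "mean"},
)

print(default.loc["N", "population"], explicit.loc["N", "population"])
print(explicit.loc["N", "density"])
```

With the default, the district's population would be reported as that of its first block only.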
Summary
- Clip keeps the overlapping portion; erase removes it.
- Dissolve merges features by attribute, aggregating the rest.
- A typical pipeline chains clip → dissolve → erase; work step by step.
- Validate and snap before large operations to avoid surprises.
Further reading
- QGIS Documentation: geoprocessing tools reference.
- GeoPandas Documentation: overlay, clip, dissolve.
- Geocomputation with R: vector operations chapter.
- PostGIS cheat sheets for common spatial SQL patterns.