7.5 Digitising and Georeferencing
Creating vector data by tracing, and tying historical maps or scans to real-world coordinates.
Key takeaways
- Digitising creates vector features by tracing imagery or paper maps.
- Georeferencing places a raster (scan) in real-world coordinates using control points.
- Both are everyday tasks in historical GIS, field mapping, and data creation.
Introduction
Most GIS data exists because someone created it. When no open dataset covers your area, you make your own — usually by tracing features from imagery (digitising) or anchoring a scanned map to coordinates (georeferencing). This short lesson covers both.
Digitising
The workflow: load a base image (satellite, drone, or aerial photo) in QGIS / Atlas / ArcGIS, enable editing mode on an empty vector layer, and click along feature boundaries.
Best practices
- Fixed zoom — digitise at a consistent scale so feature sizes remain comparable.
- Snap — enable vertex snapping with a small tolerance to avoid slivers between adjacent polygons.
- Topology editing — use tools that share edges between neighbours, so you never have two polygons with slightly different boundaries.
- Attribute as you go — fill in required attributes for each feature before moving on; easy to forget later.
- Take breaks — accuracy drops after about an hour of continuous tracing, so pause regularly.
- Peer review — have a second digitiser check a random sample of features to catch operator bias.
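The snapping rule in the list above can be sketched in a few lines. This is only an illustration of the idea, not how any particular GIS implements it; the function name snap_vertex, the coordinates, and the 0.5 m tolerance are all assumptions made up for the example.

```python
import math

def snap_vertex(point, existing_vertices, tolerance):
    """Return the nearest existing vertex if it lies within `tolerance`
    of `point`; otherwise return `point` unchanged."""
    best = None
    best_dist = tolerance
    for vertex in existing_vertices:
        dist = math.dist(point, vertex)
        if dist <= best_dist:
            best, best_dist = vertex, dist
    return best if best is not None else point

# Hypothetical parcel already digitised (coordinates in metres).
parcel_a = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

# Clicks for the neighbouring parcel; two of them fall close to parcel_a corners.
clicked = [(10.2, 0.1), (20.0, 0.0), (20.0, 10.0), (9.9, 10.15)]

# With snapping, the near-miss clicks reuse parcel_a's corners exactly,
# so the shared edge has identical coordinates in both polygons.
parcel_b = [snap_vertex(p, parcel_a, tolerance=0.5) for p in clicked]
print(parcel_b)
```

Without the snap step the two parcels would overlap or leave a gap of a few decimetres along the shared edge, which is exactly the sliver problem the tolerance is meant to prevent.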
Heads-up vs heads-down digitising
Old-school "heads-down" digitising used a digitising table: the paper map was fixed to a physical tablet and the operator traced features with a crosshair puck. "Heads-up" digitising is done on screen and is now the universal default.
AI-assisted digitising
Tools such as Meta's Segment Anything Model (SAM) and vendor-specific equivalents propose feature boundaries from imagery: you click a building and the tool outlines the polygon automatically. This greatly speeds up work but requires validation, because AI makes confident mistakes.
Quality metrics
For any digitised dataset, record:
- Source imagery and date.
- Operator(s).
- Scale digitised at.
- Positional accuracy estimate (e.g., ±1 m at the chosen scale).
- QA procedure (spot checks, full second pass, etc.).
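One way to keep these items together is a small structured record saved alongside the dataset. The sketch below is an assumption about how that might look; the class name DigitisingRecord, its field names, and all the values are hypothetical and not part of any metadata standard.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class DigitisingRecord:
    source_imagery: str           # what was traced
    imagery_date: str             # ISO 8601 date of the imagery
    operators: list               # who digitised
    capture_scale: str            # scale the operator worked at
    positional_accuracy_m: float  # estimated +/- accuracy in metres
    qa_procedure: str             # how the data was checked

# Hypothetical values for illustration only.
record = DigitisingRecord(
    source_imagery="hypothetical aerial orthophoto",
    imagery_date="2023-06-15",
    operators=["operator_a", "operator_b"],
    capture_scale="1:1000",
    positional_accuracy_m=1.0,
    qa_procedure="random spot check by a second digitiser",
)

# Serialise to JSON so the record can travel with the vector file.
metadata_json = json.dumps(asdict(record), indent=2)
print(metadata_json)
```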
Georeferencing
Georeferencing ties a raster (a scanned historical map, a PDF of a building floor plan, an aerial photo without geotags) to real-world coordinates.
The process
- Load the scan.
- Identify control points — places whose real coordinates are known. Typical sources: road intersections, building corners, survey monuments, or corresponding points in a georeferenced reference image.
- Add control points — click the scan location + the real-world location. Repeat for 8–20 points distributed across the image.
- Choose a transformation:
- Polynomial order 1 (affine) — translation, scale, rotation, and shear; needs at least 3 control points; good when the scan is already approximately rectilinear.
- Polynomial order 2 / 3 — allows curved warping across the whole image; needs at least 6 (order 2) or 10 (order 3) points; can overfit.
- Thin-plate spline (TPS) — smooth non-linear warping; flexible, great for historical maps with known distortion.
- Check residuals — the per-point error. Remove outliers.
- Resample to a new georeferenced raster.
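To make the "choose a transformation" and "check residuals" steps concrete, here is a minimal sketch of a least-squares fit of the order 1 (affine) transformation, followed by per-point residuals and their RMSE. It is written in plain Python for illustration and is not how QGIS or GDAL implement it; the function names and the control point values (pixel column, pixel row, map x, map y in a made-up local grid) are assumptions. The fifth point is deliberately misplaced by a few metres.

```python
import math

def solve_3x3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(a[r][i]))
        if abs(a[pivot][i]) < 1e-12:
            raise ValueError("control points are too few or (nearly) collinear")
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(i + 1, 3):
            factor = a[r][i] / a[i][i]
            for c in range(i, 4):
                a[r][c] -= factor * a[i][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        s = sum(a[i][c] * x[c] for c in range(i + 1, 3))
        x[i] = (a[i][3] - s) / a[i][i]
    return x

def fit_affine(gcps):
    """Fit x = a0 + a1*col + a2*row and y = b0 + b1*col + b2*row
    to control points given as (col, row, x, y), via the normal equations."""
    if len(gcps) < 3:
        raise ValueError("an affine fit needs at least 3 control points")
    design = [(1.0, col, row) for col, row, _, _ in gcps]
    ata = [[sum(p[i] * p[j] for p in design) for j in range(3)] for i in range(3)]
    atx = [sum(p[i] * g[2] for p, g in zip(design, gcps)) for i in range(3)]
    aty = [sum(p[i] * g[3] for p, g in zip(design, gcps)) for i in range(3)]
    return solve_3x3(ata, atx), solve_3x3(ata, aty)

def residuals(gcps, coef_x, coef_y):
    """Distance between each control point's known and predicted map position."""
    out = []
    for col, row, x, y in gcps:
        px = coef_x[0] + coef_x[1] * col + coef_x[2] * row
        py = coef_y[0] + coef_y[1] * col + coef_y[2] * row
        out.append(math.hypot(px - x, py - y))
    return out

# Hypothetical control points: (col, row, map_x, map_y).
gcps = [
    (0, 0, 1000.0, 5000.0),
    (1000, 0, 1500.0, 5000.0),
    (0, 1000, 1000.0, 4500.0),
    (1000, 1000, 1500.0, 4500.0),
    (500, 500, 1253.0, 4746.0),  # deliberately misplaced point
]

coef_x, coef_y = fit_affine(gcps)
res = residuals(gcps, coef_x, coef_y)
rmse = math.sqrt(sum(r * r for r in res) / len(res))
worst = max(range(len(res)), key=lambda i: res[i])
print("residuals (m):", [round(r, 2) for r in res])
print("RMSE (m):", round(rmse, 2), "worst point index:", worst)
```

The point with the largest residual is the first candidate to re-check or remove. Note that part of its error is absorbed by the fit and spread over the other points, which is why one bad point raises every residual a little.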
Expected accuracy
Depends on source:
- Modern survey-grade scan: centimetres.
- 20th-century national topographic map: metres.
- 19th-century city plan: tens of metres.
- Hand-drawn sketch: highly variable.
Pitfalls
- Uneven distribution of control points causes high error in sparsely controlled areas.
- Collinear points (all along a road) underconstrain the transformation.
- Wrong projection assumption — the historical map might use a local projection you haven't recognised.
- Paper shrinkage / warping of scans — introduces non-linear distortion that an affine transformation cannot correct; TPS usually handles it best.
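The collinearity pitfall can be screened for before fitting. The sketch below is one possible heuristic, not a standard test: it compares the spread of the control points along their narrowest and widest directions (the square root of the ratio of the eigenvalues of their 2x2 covariance matrix). The function name spread_ratio and the sample coordinates are assumptions; what counts as "too low" is a judgment call.

```python
import math

def spread_ratio(points):
    """Return a value in [0, 1]: near 0 means the points lie almost on a line,
    near 1 means they are spread equally in all directions."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Eigenvalues of the covariance matrix [[sxx, sxy], [sxy, syy]].
    mean = (sxx + syy) / 2.0
    diff = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    major = mean + diff
    minor = max(mean - diff, 0.0)  # clamp tiny negative rounding errors
    if major == 0.0:
        return 0.0  # all points coincide
    return math.sqrt(minor / major)

# Hypothetical pixel positions of control points.
along_one_road = [(0, 0), (100, 10), (200, 20), (300, 30)]
at_four_corners = [(0, 0), (100, 0), (0, 100), (100, 100)]

print(spread_ratio(along_one_road))   # close to 0: underconstrained
print(spread_ratio(at_four_corners))  # close to 1: well spread
```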
Tools
- QGIS Georeferencer — excellent free tool; handles polynomial 1–3 and TPS.
- Atlas — browser-based georeferencing.
- GDAL — gdal_translate with GCPs + gdalwarp for batch workflows.
- ArcGIS Pro Georeferencing — commercial equivalent.
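For the GDAL route, a batch workflow typically attaches the control points with gdal_translate and then warps with gdalwarp. The sketch below only builds and prints the two command lines; it does not run GDAL. The helper name build_gdal_commands, the file names, the coordinates, and the choice of EPSG:3857 are assumptions, and the flags (-gcp, -a_srs, -order, -tps, -r, -t_srs) are used as I understand them, so check them against the documentation for your GDAL version.

```python
import shlex

def build_gdal_commands(scan, gcps, srs, tagged="scan_gcps.tif",
                        output="scan_georef.tif", order=1, use_tps=False):
    """Build (but do not execute) the gdal_translate and gdalwarp commands.
    gcps is a list of (pixel_col, pixel_row, map_x, map_y)."""
    translate = ["gdal_translate", "-of", "GTiff", "-a_srs", srs]
    for col, row, x, y in gcps:
        translate += ["-gcp", str(col), str(row), str(x), str(y)]
    translate += [scan, tagged]

    warp = ["gdalwarp", "-r", "bilinear", "-t_srs", srs]
    # Thin-plate spline and polynomial order are alternative choices.
    warp += ["-tps"] if use_tps else ["-order", str(order)]
    warp += [tagged, output]
    return translate, warp

# Hypothetical control points: (pixel_col, pixel_row, map_x, map_y).
example_gcps = [
    (120, 340, 1000.0, 5000.0),
    (2210, 310, 1500.0, 5010.0),
    (1180, 1890, 1250.0, 4600.0),
]

translate_cmd, warp_cmd = build_gdal_commands(
    "plan_1890.tif", example_gcps, "EPSG:3857", order=1
)
print(shlex.join(translate_cmd))
print(shlex.join(warp_cmd))
```

Returning argument lists rather than strings means they can be passed to subprocess.run as they are once the commands have been reviewed.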
A worked example: georeferencing a historical city map
- Find an 1890 plan of your city as TIFF/JPEG.
- Load in QGIS → Raster → Georeferencer.
- Drop 10 control points at road intersections you can identify in both the old plan and current OSM.
- Choose polynomial order 2, then check the residuals and re-place any outlier points.
- Resample and add to the map — you can now overlay historical streets on modern data.
Typical residuals: 5–15 m in well-controlled areas, 30+ m in sparsely controlled edges.
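The worked example pairs 10 control points with polynomial order 2. A quick way to see why that is a reasonable margin: a polynomial of order n has (n + 1)(n + 2) / 2 coefficients per axis, which is also the minimum number of control points needed. The helper below is a small sketch of that arithmetic; the function name is an assumption.

```python
def min_control_points(order):
    """Minimum control points for a 2D polynomial transformation of the
    given order: one point per polynomial coefficient."""
    if order < 1:
        raise ValueError("order must be 1 or higher")
    return (order + 1) * (order + 2) // 2

points_placed = 10
for order in (1, 2, 3):
    needed = min_control_points(order)
    spare = points_placed - needed
    print(f"order {order}: needs {needed}, spare points with 10 placed: {spare}")
```

With exactly the minimum number of points the fit passes through every point and all residuals are zero, so there is nothing to check. Spare points are what make the residuals meaningful.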
Rubber sheeting
A specific form of georeferencing used to align two vector datasets: pick matching points, warp one to fit the other. Useful for integrating legacy and modern surveys.
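As a rough illustration of the idea, the sketch below moves each vertex by an inverse-distance-weighted average of the displacements defined by the matching points. This is a simplification chosen for brevity; real rubber-sheeting tools may use triangulation or spline-based warps instead. The function name, the links, and the coordinates are all hypothetical.

```python
import math

def rubber_sheet(vertex, links, power=2.0):
    """Shift `vertex` by a weighted average of link displacements.
    links is a list of ((src_x, src_y), (dst_x, dst_y)) matching points;
    closer links get more weight (inverse distance to the given power)."""
    if not links:
        return vertex
    vx, vy = vertex
    weight_sum = shift_x = shift_y = 0.0
    for (sx, sy), (tx, ty) in links:
        dist = math.hypot(vx - sx, vy - sy)
        if dist == 0.0:
            return (tx, ty)  # vertex sits on a link: move it exactly
        w = 1.0 / dist ** power
        weight_sum += w
        shift_x += w * (tx - sx)
        shift_y += w * (ty - sy)
    return (vx + shift_x / weight_sum, vy + shift_y / weight_sum)

# Hypothetical matching points: legacy survey position -> modern position.
links = [
    ((0.0, 0.0), (2.0, -1.0)),
    ((100.0, 0.0), (102.0, -1.0)),
    ((0.0, 100.0), (2.0, 99.0)),
]

legacy_line = [(0.0, 0.0), (50.0, 50.0), (100.0, 0.0)]
adjusted_line = [rubber_sheet(v, links) for v in legacy_line]
print(adjusted_line)
```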
Self-check exercises
1. Why are well-distributed control points more important than many control points?
A transformation is fit to the control points; between them it interpolates. Many points clustered in one region leave other areas unconstrained and poorly aligned. Eight or ten points distributed evenly usually beat fifty points bunched in one corner.
2. When would you choose thin-plate spline over polynomial order 1?
When the scan has non-linear distortion you need to correct — for example, a paper map that has shrunk unevenly with age, or a hand-drawn sketch that isn't to a consistent scale. TPS smoothly warps the image to match all control points exactly. Polynomial order 1 (affine) applies a single translation, rotation, scale, and shear to the whole image, so it cannot correct distortion that varies from place to place.
3. A digitised polygon dataset has ~30 cm of sliver overlap between neighbouring parcels. What went wrong, and how do you fix it?
Snapping wasn't enabled during digitising, or its tolerance was too small, so each polygon was traced independently. For new work, fix this by enabling snapping and topology editing so that neighbours share boundaries. For existing data, use a topology-aware tool to snap neighbours' shared boundaries within a tolerance (in QGIS, for example, the Topology Checker plugin to find the overlaps, then a snapping or geometry-repair tool to fix them).
Summary
- Digitising creates vector data by tracing; enforce topology and attributes during capture.
- Georeferencing anchors rasters to coordinates via control points.
- Control point distribution matters more than raw count.
- Modern AI-assisted tooling speeds up digitising but still needs validation.
Further reading
- QGIS Documentation — Georeferencer Plugin.
- Harley, J. B. — The New Nature of Maps (historical cartography).
- GDAL gdaltransform and gdalwarp GCP workflows.
- OpenHistoricalMap — collaborative historical cartography project.
Module 7: Data Sources & Acquisition