3.4 Vector vs Raster — When to Use Which
A practical decision guide for choosing between vector and raster representations.
Key takeaways
- Vector excels for discrete features with well-defined boundaries; raster excels for continuous phenomena.
- The right choice depends on the question being asked, not just the data at hand.
- Many real workflows mix both — convert thoughtfully, aware of what's lost.
Introduction
You'll hear the debate "vector vs raster" so often it starts to sound like Coke vs Pepsi. It isn't — they're tools for different jobs. This lesson gives you a fast decision framework, a comparison table, and guidance for the inevitable vector/raster conversions.
Quick comparison
| Criterion | Vector | Raster |
|---|---|---|
| Represents | Discrete features | Continuous surfaces |
| Geometry | Points, lines, polygons | Grid of cells |
| Resolution | Defined by vertex precision | Defined by cell size |
| File size (small features) | Smaller | Larger (empty cells) |
| File size (full-coverage imagery) | Larger (many polygons) | Smaller (dense grid is natural) |
| Accuracy | Limited by coordinate precision and source survey | Limited by cell size |
| Topology | First-class | Implicit |
| Common operations | Overlays, buffers, spatial joins | Map algebra, focal stats |
| Good at cartography for | Roads, parcels, points of interest | Hillshade, imagery, heatmaps |
| Typical formats | Shapefile, GeoJSON, GeoPackage | GeoTIFF, COG, NetCDF |
When vector is the right choice
- Features are discrete and have meaningful boundaries. Roads, buildings, parcels, administrative units.
- Attributes vary per-feature, not per-location.
- You need precise geometric queries (containment, adjacency, topology).
- Data is sparse in space (a million points across a continent).
- Storage must scale with feature complexity, not area.
- You need to preserve exact coordinates from a survey or authoritative source.
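As an illustration of the "precise geometric queries" point, here is a minimal sketch of a containment test using the ray-casting rule. The function name `point_in_polygon` and the `parcel` coordinates are made up for this example; real projects would normally rely on a geometry library rather than hand-rolled code.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Ray casting: count how many polygon edges a ray from pt towards +x crosses."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        # Does this edge straddle the horizontal line through pt?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing flips the state
    return inside


# Hypothetical rectangular parcel, coordinates in metres.
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(point_in_polygon((1.5, 1.25), parcel))
print(point_in_polygon((5.0, 1.0), parcel))
```

The answer depends only on the stored vertices, not on any grid resolution, which is why exact containment queries are a natural fit for vector data.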
When raster is the right choice
- Data is a continuous surface — elevation, temperature, reflectance, population density.
- Every cell has a value (or nodata).
- You need fast cell-by-cell arithmetic (NDVI = (NIR − Red) / (NIR + Red)).
- Resolution is uniform and known.
- The analysis is inherently local-neighbourhood (slope, focal mean).
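The cell-by-cell arithmetic mentioned above can be sketched in plain Python. The two small grids are invented reflectance values, and the choice to return `None` where NIR + Red is zero is an assumption of this sketch; production pipelines would typically do the same thing with array libraries.

```python
from typing import List, Optional

Grid = List[List[float]]


def ndvi(nir: Grid, red: Grid) -> List[List[Optional[float]]]:
    """Apply NDVI = (NIR - Red) / (NIR + Red) independently to every cell."""
    out = []
    for nir_row, red_row in zip(nir, red):
        out_row = []
        for n, r in zip(nir_row, red_row):
            total = n + r
            # Assumption: treat a zero denominator as "no data" (None).
            out_row.append(None if total == 0 else (n - r) / total)
        out.append(out_row)
    return out


# Invented 2 x 2 reflectance grids; both bands share the same cell layout.
nir = [[0.5, 0.4], [0.3, 0.0]]
red = [[0.1, 0.1], [0.3, 0.0]]
result = ndvi(nir, red)
print(result)
```

Because both bands share one grid, no geometry is involved at all: the operation is just arithmetic on aligned positions.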
Grey areas
Some phenomena can be modelled either way. A land-cover map can be:
- Vector: polygons with a `class` attribute. Compact for small areas, easy to edit manually, exact boundaries.
- Raster: each cell carries a class code. Scales to large areas, fits well in raster-analytic pipelines.
Similarly, population can be polygons (census tracts) or a raster surface (WorldPop-style 100 m grid). Pick based on the downstream analysis.
Conversion pitfalls
Vector → Raster (rasterisation)
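Aliasing: polygon edges snap to cell boundaries, and polygons narrower than a cell may disappear entirely, because by default gdal_rasterize burns a cell only when its centre falls inside the polygon (the -at flag burns every touched cell instead). Always set a resolution fine enough for the smallest feature you care about.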
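To see why resolution matters, the sketch below rasterises a narrow rectangle with a cell-centre rule. It is a simplification: `rasterize_rect` is a hypothetical helper, the grid origin is placed at (0, 0), and only axis-aligned rectangles are handled.

```python
from typing import List


def rasterize_rect(xmin: float, ymin: float, xmax: float, ymax: float,
                   cell: float, width: int, height: int) -> List[List[int]]:
    """Burn 1 into every cell whose centre lies inside the rectangle, else 0.

    Simplification: grid origin is (0, 0) and row 0 is the southernmost row.
    """
    grid = []
    for row in range(height):
        cy = (row + 0.5) * cell  # y of the cell centre
        grid.append([
            1 if (xmin <= (col + 0.5) * cell <= xmax and ymin <= cy <= ymax) else 0
            for col in range(width)
        ])
    return grid


def burned_cells(grid: List[List[int]]) -> int:
    return sum(sum(row) for row in grid)


# Hypothetical 3 m wide strip (say, a hedgerow polygon) across a 40 m x 40 m tile.
strip = (11.5, 0.0, 14.5, 40.0)
coarse = rasterize_rect(*strip, cell=10.0, width=4, height=4)
fine = rasterize_rect(*strip, cell=2.0, width=20, height=20)
print(burned_cells(coarse), burned_cells(fine))
```

At 10 m cells no cell centre falls inside the strip, so the feature vanishes; at 2 m cells it survives as a one-cell-wide column.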
gdal_rasterize -a class -tr 10 10 -l landuse landuse.gpkg landuse.tif
Raster → Vector (polygonisation)
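Contiguous same-valued cells become polygons. The result has cell-aligned (stair-step) boundaries, which can be unrealistic for natural features. Post-processing after vectorisation is common: Douglas-Peucker simplification with a tolerance on the order of the cell size removes most stair-step vertices, though it thins vertices rather than truly smoothing the line.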
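For readers who want to see what Douglas-Peucker does to a stair-step boundary, here is a minimal recursive sketch on an open polyline. The `stairs` coordinates imitate the edge of a 1 m raster along a diagonal and are invented for the example; GIS libraries ship tuned implementations that should be preferred in practice.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def _perp_distance(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the infinite line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (a[1] - p[1]) - dy * (a[0] - p[0])) / length


def douglas_peucker(line: List[Point], tolerance: float) -> List[Point]:
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(line) < 3:
        return list(line)
    # Find the interior vertex farthest from the chord joining the endpoints.
    index, dmax = 0, 0.0
    for i in range(1, len(line) - 1):
        d = _perp_distance(line[i], line[0], line[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [line[0], line[-1]]  # everything in between is negligible
    # Otherwise keep that vertex and simplify each half recursively.
    left = douglas_peucker(line[: index + 1], tolerance)
    right = douglas_peucker(line[index:], tolerance)
    return left[:-1] + right  # avoid duplicating the shared vertex


# Invented stair-step boundary from 1 m cells along a diagonal.
stairs = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0),
          (2.0, 2.0), (3.0, 2.0), (3.0, 3.0)]
print(douglas_peucker(stairs, 1.0))
print(len(douglas_peucker(stairs, 0.1)))
```

On this staircase the corners sit about 0.71 cells from the diagonal, so a tolerance of one cell collapses it to a straight segment, while a tolerance of 0.1 leaves every step in place.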
gdal_polygonize.py classified.tif landuse.gpkg
Performance considerations
Vector performance
- Scales with feature count and vertex count, not with geographic extent.
- A million point features is routine; a billion stretches even modern databases.
- Spatial indexes (R-tree) make common queries roughly O(log N) on average, instead of an O(N) scan over every feature.
Raster performance
- Scales with total cell count (width × height × bands × time steps).
- 10 m resolution Europe-wide = ~100 billion cells per band. Plan storage carefully.
- Use tiled/pyramided formats (COG), cloud-native access, and lazy computation (Dask, Xarray).
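The cell-count arithmetic can be checked with a few lines. The figure of 10 million km² for Europe is a rounded assumption, and `cell_count` is a helper name invented for this sketch.

```python
def cell_count(area_km2: float, cell_size_m: float,
               bands: int = 1, time_steps: int = 1) -> float:
    """Total cells = area x cells per km2 x bands x time steps."""
    cells_per_km2 = (1000.0 / cell_size_m) ** 2
    return area_km2 * cells_per_km2 * bands * time_steps


EUROPE_AREA_KM2 = 10_000_000  # assumption: rounded continental area

cells = cell_count(EUROPE_AREA_KM2, cell_size_m=10)
gb_float32 = cells * 4 / 1e9  # 4 bytes per cell, uncompressed, decimal GB
print(f"{cells:.2e} cells per band, about {gb_float32:.0f} GB as float32")
```

Halving the cell size quadruples the count, so resolution is usually the first parameter to question when a raster job does not fit.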
Storage: a rough cost comparison
A 1 km × 1 km area mapped at high fidelity:
- Vector — 100 building polygons × ~20 vertices × 8 bytes ≈ 16 KB.
- Raster (1 m cells, 8-bit) — 1 000 × 1 000 cells ≈ 1 MB per band.
For a 10 km × 10 km area:
- Vector — ~10 000 buildings × 20 vertices × 8 bytes ≈ 1.6 MB.
- Raster (1 m cells, 8-bit) — 10 000 × 10 000 cells ≈ 100 MB.
Vector wins for sparse features; raster wins for true full-coverage surfaces.
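The rough comparison above can be reproduced as follows. The defaults mirror the figures in the text (20 vertices per building, 8 bytes per vertex, 1 m cells at 8 bits); the function names are hypothetical, and storing each vertex as two 8-byte doubles would double the vector totals without changing the conclusion.

```python
def vector_bytes(features: int, vertices_per_feature: int = 20,
                 bytes_per_vertex: int = 8) -> int:
    """Geometry only; attributes, indexes and format overhead are ignored."""
    return features * vertices_per_feature * bytes_per_vertex


def raster_bytes(side_m: float, cell_size_m: float = 1.0,
                 bytes_per_cell: int = 1, bands: int = 1) -> int:
    """Uncompressed size of a square tile with the given side length."""
    cells_per_side = round(side_m / cell_size_m)
    return cells_per_side ** 2 * bytes_per_cell * bands


# (tile side in km, number of buildings) as used in the text
for side_km, buildings in [(1, 100), (10, 10_000)]:
    v = vector_bytes(buildings)
    r = raster_bytes(side_km * 1000)
    print(f"{side_km} km tile: vector {v / 1e3:.0f} KB, "
          f"raster {r / 1e6:.0f} MB, ratio {r / v:.1f}x")
```

The ratio stays constant here only because building density is held constant; emptier areas shrink the vector side further while the raster cost is unchanged.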
Common analysis patterns mixing both
- Zonal statistics. Vector polygons + raster values → per-polygon mean/sum/majority. (Module 10.3.)
- Point sampling. Vector points + raster → sample raster at point coords.
- Distance surfaces. Vector features → raster of distance to nearest feature.
- Rasterised overlays. Convert vector categories to raster, then combine with other rasters via map algebra.
- Classified imagery → vector polygons. Turn a land-cover classification into editable polygons for reporting.
The best analysts move fluidly between models; the worst insist their favourite does everything.
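As a sketch of the first pattern, zonal statistics reduces to grouping cell values by zone once the polygons have been rasterised onto the same grid. The grids below are invented, zone id 0 is assumed to mean "outside every polygon", and `zonal_mean` is a hypothetical helper rather than any library's API.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def zonal_mean(zones: List[List[int]], values: List[List[float]],
               nodata: Optional[float] = None) -> Dict[int, float]:
    """Mean of `values` per zone id; both grids must share the same layout."""
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for zone_row, value_row in zip(zones, values):
        for z, v in zip(zone_row, value_row):
            # Assumption: zone 0 means "no polygon here"; skip nodata cells too.
            if z == 0 or v == nodata:
                continue
            sums[z] += v
            counts[z] += 1
    return {z: sums[z] / counts[z] for z in sums}


# Invented example: two watersheds (ids 1 and 2) over a 3 x 3 elevation grid.
zones = [[1, 1, 2],
         [1, 2, 2],
         [0, 2, 2]]
elevation = [[100.0, 110.0, 200.0],
             [120.0, 210.0, 220.0],
             [50.0, 230.0, 240.0]]
means = zonal_mean(zones, elevation)
print(means)
```

The output is one number per polygon, which can then be joined back to the vector layer as an attribute.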
Decision checklist
When you get a new question:
- Is the phenomenon discrete or continuous? → vector / raster.
- How big is the area? → small favours vector, continental favours raster.
- What downstream analyses? → overlays lean vector; neighbourhood operations lean raster.
- Do I need exact source coordinates? → vector preserves them; raster snaps to grid.
- What are my performance constraints? → sparse features favour vector; dense uniform data favours raster.
Self-check exercises
1. Would you store a world land-cover dataset at 100 m resolution as vector or raster?
Raster. At 100 m, a grid covering the whole globe is roughly 50 billion cells (about 15 billion over land): huge, but tractable with tiled COGs. Converting to vector would produce hundreds of millions of polygons with stair-stepped boundaries, which would be slower to query and bulkier despite being "simpler" in the abstract.
2. A drone produces a 2 cm resolution orthophoto of a 5 ha field. How much does that raster weigh (8-bit, 3 bands)?
5 ha = 50 000 m²; at 2 cm × 2 cm, that's 50 000 / 0.0004 = 125 million cells per band; × 3 bands × 1 byte = 375 MB uncompressed. Compression (LZW, JPEG in GeoTIFF) typically halves or quarters this. Still large — plan storage and cloud transfer accordingly.
3. You want to compute "average elevation per watershed" from a DEM and a watershed polygon layer. Which analysis pattern is that?
Zonal statistics — a classic vector/raster hybrid operation. Each polygon becomes a zone; the raster provides per-cell values; the result is one statistic per polygon. Available in QGIS, rasterstats (Python), and ST_SummaryStats (PostGIS raster).
Summary
- Vector for discrete features; raster for continuous surfaces.
- The question drives the model more than the data source does.
- Conversions lose information — be explicit about what's lost and why.
- Real workflows blend both; zonal statistics is the most common bridge.
Further reading
- Burrough & McDonnell — Principles of Geographical Information Systems, vector/raster chapters.
- Gorelick et al. — Google Earth Engine paper.
- Geocomputation with R (Lovelace, Nowosad, Muenchow) — open textbook with parallel vector/raster treatments.
- Esri — Fundamentals of Raster Data.