Module 3: Spatial Data Models

3.4 Vector vs Raster — When to Use Which

A practical decision guide for choosing between vector and raster representations.

Lesson 13 of 100 · 18 min read

Key takeaways

  • Vector excels for discrete features with well-defined boundaries; raster excels for continuous phenomena.
  • The right choice depends on the question being asked, not just the data at hand.
  • Many real workflows mix both — convert thoughtfully, aware of what's lost.

Introduction

You'll hear the debate "vector vs raster" so often it starts to sound like Coke vs Pepsi. It isn't — they're tools for different jobs. This lesson gives you a fast decision framework, a comparison table, and guidance for the inevitable vector/raster conversions.

Quick comparison

Criterion                         | Vector                                         | Raster
Represents                        | Discrete features                              | Continuous surfaces
Geometry                          | Points, lines, polygons                        | Grid of cells
Resolution                        | Defined by vertex precision                    | Defined by cell size
File size (small features)        | Smaller                                        | Larger (empty cells)
File size (full-coverage imagery) | Larger (many polygons)                         | Smaller (dense grid is natural)
Accuracy                          | Limited by source data and coordinate precision | Limited by cell size
Topology                          | First-class                                    | Implicit
Common operations                 | Overlays, buffers, spatial joins               | Map algebra, focal stats
Cartographic strengths            | Roads, parcels, points of interest             | Hillshade, imagery, heatmaps
Typical formats                   | Shapefile, GeoJSON, GeoPackage                 | GeoTIFF, COG, NetCDF

When vector is the right choice

  • Features are discrete and have meaningful boundaries. Roads, buildings, parcels, administrative units.
  • Attributes vary per-feature, not per-location.
  • You need precise geometric queries (containment, adjacency, topology).
  • Data is sparse in space (a million points across a continent).
  • Storage must scale with feature complexity, not area.
  • You need to preserve exact coordinates from a survey or authoritative source.

When raster is the right choice

  • Data is a continuous surface — elevation, temperature, reflectance, population density.
  • Every cell has a value (or nodata).
  • You need fast cell-by-cell arithmetic (NDVI = (NIR − Red) / (NIR + Red)).
  • Resolution is uniform and known.
  • The analysis is inherently local-neighbourhood (slope, focal mean).
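
To make the cell-by-cell idea concrete, here is a minimal pure-Python sketch of the NDVI formula above. The tiny 2 × 2 bands and the `nodata` handling are illustrative assumptions; a real workflow would normally use an array library rather than nested lists.

```python
def ndvi(nir, red, nodata=None):
    """Per-cell NDVI for two equally sized grids given as lists of rows."""
    out = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            total = n + r
            # Guard against division by zero (both bands are 0 in this cell).
            row.append(nodata if total == 0 else (n - r) / total)
        out.append(row)
    return out


# Hypothetical reflectance values, purely for illustration.
nir_band = [[0.50, 0.40], [0.30, 0.00]]
red_band = [[0.10, 0.20], [0.30, 0.00]]
result = ndvi(nir_band, red_band)
```

Each output cell depends only on the matching input cells, which is why this kind of map algebra is so fast on rasters.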

Grey areas

Some phenomena can be modelled either way. A land-cover map can be:

  • Vector — polygons with a class attribute. Compact for small areas, easy to edit manually, exact boundaries.
  • Raster — each cell carries a class code. Scales to large areas, fits well in raster-analytic pipelines.

Similarly, population can be stored as polygons (census tracts) or as a raster surface (a WorldPop-style 100 m grid). Pick based on the downstream analysis.

Conversion pitfalls

Vector → Raster (rasterisation)

Aliasing: polygon boundaries snap to cell edges, and features smaller or narrower than a cell can be exaggerated or can disappear entirely. Always set a resolution fine enough for the smallest feature you care about.

Shell
# Burn the "class" attribute of layer "landuse" into a grid of 10 × 10 map-unit cells
gdal_rasterize -a class -tr 10 10 -l landuse landuse.gpkg landuse.tif
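
The resolution effect can be sketched without GDAL. The toy rasteriser below burns a cell when its centre falls inside an axis-aligned rectangle. That is one common burn rule, and real tools also offer an "all touched" rule that behaves differently. The extent, the 2 m strip, and the function names are hypothetical.

```python
def rasterize_rect(rect, extent, cell):
    """Burn 1 into every cell whose centre lies inside an axis-aligned rectangle.

    rect and extent are (xmin, ymin, xmax, ymax); cell is the cell size.
    """
    xmin, ymin, xmax, ymax = extent
    cols = int(round((xmax - xmin) / cell))
    rows = int(round((ymax - ymin) / cell))
    rxmin, rymin, rxmax, rymax = rect
    grid = []
    for r in range(rows):
        cy = ymax - (r + 0.5) * cell  # y of the cell centre, top row first
        row = []
        for c in range(cols):
            cx = xmin + (c + 0.5) * cell  # x of the cell centre
            inside = rxmin <= cx <= rxmax and rymin <= cy <= rymax
            row.append(1 if inside else 0)
        grid.append(row)
    return grid


def burned_cells(grid):
    """Count the cells that received a value."""
    return sum(sum(row) for row in grid)


extent = (0, 0, 100, 100)
thin_strip = (0, 46, 100, 48)  # hypothetical feature, 2 m wide and 100 m long

coarse = rasterize_rect(thin_strip, extent, cell=10)
fine = rasterize_rect(thin_strip, extent, cell=1)
```

With 10 m cells no cell centre falls inside the strip, so `burned_cells(coarse)` is 0 and the feature vanishes. With 1 m cells, `burned_cells(fine)` is 200, matching the strip's 200 m² area.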

Raster → Vector (polygonisation)

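Contiguous same-valued cells become polygons. The result has cell-aligned (stair-step) boundaries, which can be unrealistic for natural features. Post-processing after vectorisation is common: simplify with Douglas-Peucker at a small tolerance, or apply a smoothing algorithm. Note that Douglas-Peucker removes vertices; it does not round corners.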

Shell
# Trace contiguous same-valued cells of classified.tif into polygons in landuse.gpkg
gdal_polygonize.py classified.tif landuse.gpkg
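
As an illustration of the Douglas-Peucker step mentioned above, here is a minimal recursive sketch applied to a stair-step line of the kind polygonisation produces. The staircase coordinates and tolerance values are made up for the example; production code would normally rely on a geometry library.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to segment a-b (distance to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop vertices that lie within tolerance of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify each side of it separately.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


# Stair-step boundary along a diagonal, with 1-unit cells.
staircase = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3)]
simplified = douglas_peucker(staircase, tolerance=0.75)
light = douglas_peucker(staircase, tolerance=0.1)
```

The stair corners sit about 0.71 cell widths from the diagonal, so a tolerance of 0.75 collapses the staircase to its two endpoints, while 0.1 keeps the steps. Simplifying each polygon independently can open gaps or overlaps along shared boundaries, so check topology afterwards.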

Performance considerations

Vector performance

  • Scales with feature count and vertex count, not with geographic extent.
  • A million point features is routine; a billion stretches even modern databases.
  • Spatial indexes (R-tree) make common queries run in roughly O(log N) time instead of scanning all N features.

Raster performance

  • Scales with total cell count (width × height × bands × time steps).
  • 10 m resolution Europe-wide (roughly 10 million km²) ≈ 100 billion cells per band. Plan storage carefully.
  • Use tiled/pyramided formats (COG), cloud-native access, and lazy computation (Dask, Xarray).

Storage: a rough cost comparison

A 1 km × 1 km area mapped at high fidelity:

  • Vector: 100 building polygons × ~20 vertices × 16 bytes (two 64-bit coordinates per vertex) ≈ 32 KB.
  • Raster (1 m cells, 8-bit): 1 000 × 1 000 cells ≈ 1 MB per band.

For a 10 km × 10 km area:

  • Vector: ~10 000 buildings × 20 vertices × 16 bytes ≈ 3.2 MB.
  • Raster (1 m cells, 8-bit): 10 000 × 10 000 cells ≈ 100 MB per band.

Vector wins for sparse features; raster wins for true full-coverage surfaces.
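
The same back-of-envelope estimate can be written as two small functions. This is a rough sketch, not a file-size predictor: it counts geometry and cell bytes only, ignoring attributes, headers, indexes, and compression. `bytes_per_vertex` is left as an explicit parameter because it depends on the coordinate type (two 64-bit floats take 16 bytes).

```python
def vector_bytes(features, vertices_per_feature, bytes_per_vertex):
    """Geometry-only size estimate for a vector layer."""
    return features * vertices_per_feature * bytes_per_vertex


def raster_bytes(width_m, height_m, cell_m, bands=1, bytes_per_cell=1):
    """Uncompressed size estimate for a raster covering width × height."""
    cols = round(width_m / cell_m)
    rows = round(height_m / cell_m)
    return cols * rows * bands * bytes_per_cell


# 1 km × 1 km: 100 buildings vs a 1 m, 8-bit, single-band grid.
small_vector = vector_bytes(100, 20, bytes_per_vertex=16)
small_raster = raster_bytes(1_000, 1_000, cell_m=1)

# 10 km × 10 km: 10 000 buildings vs the same grid at 100× the area.
large_vector = vector_bytes(10_000, 20, bytes_per_vertex=16)
large_raster = raster_bytes(10_000, 10_000, cell_m=1)
```

Here the vector estimate grows with the number of buildings and the raster estimate with the area, whether or not anything is there. In both cases the raster comes out roughly 30 times larger.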

Common analysis patterns mixing both

  1. Zonal statistics. Vector polygons + raster values → per-polygon mean/sum/majority (see Module 10.3).
  2. Point sampling. Vector points + raster → sample raster at point coords.
  3. Distance surfaces. Vector features → raster of distance to nearest feature.
  4. Rasterised overlays. Convert vector categories to raster, then combine with other rasters via map algebra.
  5. Classified imagery → vector polygons. Turn a land-cover classification into editable polygons for reporting.
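
Patterns 1 and 2 can be sketched in plain Python to show what the tools do under the hood. This is an illustrative toy under stated assumptions: a raster stored as a list of rows with a top-left origin, zone membership decided by cell centre, and a hypothetical 4 × 4 elevation grid. Libraries such as rasterstats handle real data, edge cases, and nodata.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices, not closed."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample(raster, origin, cell, x, y):
    """Point sampling: value of the cell containing (x, y), or None if outside."""
    x0, y0 = origin  # top-left corner of the raster
    col = int((x - x0) // cell)
    row = int((y0 - y) // cell)
    if 0 <= row < len(raster) and 0 <= col < len(raster[0]):
        return raster[row][col]
    return None


def zonal_mean(raster, origin, cell, polygon):
    """Zonal statistics: mean of cells whose centre lies inside the polygon."""
    x0, y0 = origin
    values = []
    for r, row in enumerate(raster):
        cy = y0 - (r + 0.5) * cell
        for c, value in enumerate(row):
            cx = x0 + (c + 0.5) * cell
            if point_in_polygon(cx, cy, polygon):
                values.append(value)
    return sum(values) / len(values) if values else None


# Hypothetical 4 × 4 elevation grid, 10 m cells, top-left corner at (0, 40).
dem = [
    [100, 110, 120, 130],
    [105, 115, 125, 135],
    [110, 120, 130, 140],
    [115, 125, 135, 145],
]
zone = [(0, 20), (20, 20), (20, 40), (0, 40)]  # covers the top-left 2 × 2 cells

mean_elevation = zonal_mean(dem, (0, 40), 10, zone)
value_at_point = sample(dem, (0, 40), 10, 25, 12)
```

The zone covers the cells holding 100, 110, 105, and 115, so the zonal mean is 107.5. The sampled point falls in row 2, column 2 and returns 130.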

The best analysts move fluidly between models; the worst insist their favourite does everything.

Decision checklist

When you get a new question:

  1. Is the phenomenon discrete or continuous? → vector / raster.
  2. How big is the area? → small favours vector, continental favours raster.
  3. What downstream analyses? → overlays lean vector; neighbourhood operations lean raster.
  4. Do I need exact source coordinates? → vector preserves them; raster snaps to grid.
  5. What are my performance constraints? → sparse features favour vector; dense uniform data favours raster.

Self-check exercises

1. Would you store a world land-cover dataset at 100 m resolution as vector or raster?

Raster. At 100 m, a global raster is on the order of 50 billion cells (about 15 billion over land): huge, but tractable with tiled COGs. Converting to vector would produce hundreds of millions of polygons with stair-stepped boundaries, which would be slower to query and bulkier despite being "simpler" in the abstract.

2. A drone produces a 2 cm resolution orthophoto of a 5 ha field. How much does that raster weigh (8-bit, 3 bands)?

5 ha = 50 000 m²; at 2 cm × 2 cm (0.0004 m² per cell), that's 50 000 / 0.0004 = 125 million cells per band; × 3 bands × 1 byte = 375 MB uncompressed. Compression helps: lossless LZW or DEFLATE gives modest savings on imagery, while lossy JPEG-in-GeoTIFF can cut the size several-fold. The file is still large, so plan storage and cloud transfer accordingly.

3. You want to compute "average elevation per watershed" from a DEM and a watershed polygon layer. Which analysis pattern is that?

Zonal statistics — a classic vector/raster hybrid operation. Each polygon becomes a zone; the raster provides per-cell values; the result is one statistic per polygon. Available in QGIS, rasterstats (Python), and ST_SummaryStats (PostGIS raster).

Summary

  • Vector for discrete features; raster for continuous surfaces.
  • The question drives the model more than the data source does.
  • Conversions lose information — be explicit about what's lost and why.
  • Real workflows blend both; zonal statistics is the most common bridge.

Further reading

  • Burrough & McDonnell — Principles of Geographical Information Systems, vector/raster chapters.
  • Gorelick et al. — Google Earth Engine paper.
  • Geocomputation with R (Lovelace, Nowosad, Muenchow) — open textbook with parallel vector/raster treatments.
  • Esri — Fundamentals of Raster Data.