Module 20: Data Quality, Ethics & Careers

20.3 Locational Privacy and Ethics

Where GIS crosses into surveillance, discrimination, and harm — and how to stay on the right side.

Lesson 97 of 100 · 16 min read

Key takeaways

  • Location data is personally identifiable information (PII); treating it otherwise risks real harm.
  • Anonymisation is harder than it looks; k-anonymity and differential privacy help.
  • GIS practitioners carry ethical responsibilities — data decisions affect real people.

Introduction

GIS isn't neutral. A map showing the clinics with abortion services reveals patients; a delivery dataset reveals home addresses; a choropleth of ethnicity can fuel gerrymandering. Location carries more than latitude and longitude — it carries identity. This lesson covers privacy and ethics for the professional.

Why location is PII

  • A single coordinate plus a timestamp identifies a person's home (where they sleep).
  • Workplace from daytime locations.
  • Relationships from repeated co-locations.
  • Health status (visits to hospitals, specialists).
  • Political/religious views (visits to organisations).

Four data points per person (home, work, and two frequent destinations) are enough to uniquely identify 95% of individuals in a typical mobility dataset.

Legal frameworks

  • GDPR (EU) — location is personal data; requires a lawful basis (such as consent), purpose limitation, and subject access rights.
  • CCPA (California) — grants consumers rights to know, delete, and opt out of the sale of personal information, including location.
  • HIPAA (US) — strict handling of location information when linked to health data.
  • LGPD (Brazil) — closely mirrors GDPR.
  • PIPEDA (Canada) — consent-based rules for commercial handling of personal data, including location.

Compliance isn't just bureaucracy — the fines and reputational damage are significant.

Anonymisation techniques

Naive removal

Drop the name column and hope for the best. This fails reliably: the locations alone re-identify individuals.

Coarsening

Round coordinates to 1 km grid cells, or aggregate to neighbourhoods. Simple and often sufficient for aggregate analysis.

k-anonymity

Each record must share its location with at least k-1 others in the dataset. Achieved by clustering rare coordinates or aggregating to zones containing at least k people.
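A sketch of enforcing k-anonymity by suppression, assuming records have already been coarsened to hashable cell IDs (names are illustrative):

```python
from collections import Counter

def k_anonymise(cells: list, k: int = 5) -> list:
    """Suppress grid cells occupied by fewer than k records.

    `cells` holds one cell ID per record (e.g. (row, col) tuples).
    Records in cells shared by fewer than k records are replaced
    with None so no location is attributable to fewer than k people.
    """
    counts = Counter(cells)
    return [c if counts[c] >= k else None for c in cells]
```

In practice one would merge rare cells into larger zones rather than discard them, but suppression is the simplest way to guarantee the k threshold.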

Differential privacy

Add carefully calibrated noise so that no individual's contribution to the dataset can be inferred, even by adversaries with auxiliary knowledge. This is the most rigorous approach, and has been used by the US Census Bureau since the 2020 census.
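A sketch of the Laplace mechanism for a counting query (sensitivity 1), using the fact that the difference of two independent exponential draws is Laplace-distributed; the epsilon value is an illustrative privacy budget, not a recommendation:

```python
import random

def dp_count(true_count: int, epsilon: float = 1.0) -> float:
    """Release a count with Laplace noise calibrated for a sensitivity-1 query.

    The difference of two Exp(epsilon) draws is Laplace(0, 1/epsilon).
    Smaller epsilon means stronger privacy but noisier answers.
    """
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise
```

Each released query consumes privacy budget, so real deployments track cumulative epsilon across all queries, not just one.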

Data minimisation

Only collect the locations you actually need. Don't store exact GPS traces when hourly centroids suffice.
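A sketch of minimising a stored trace by reducing raw GPS fixes to hourly centroids; the `(unix_ts, lat, lon)` tuple layout is an assumption for illustration:

```python
from statistics import mean

def hourly_centroids(trace: list[tuple[int, float, float]]) -> dict:
    """Reduce a GPS trace of (unix_ts, lat, lon) fixes to one centroid per hour.

    Storing only the hourly centroid discards the fine-grained movement
    pattern that makes raw traces so re-identifiable.
    """
    buckets: dict[int, list[tuple[float, float]]] = {}
    for ts, lat, lon in trace:
        buckets.setdefault(ts // 3600, []).append((lat, lon))
    return {hour: (mean(p[0] for p in pts), mean(p[1] for p in pts))
            for hour, pts in sorted(buckets.items())}
```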

Common anti-patterns

  • "Just leave the coordinates" — anonymisation by removing names alone is easily defeated.
  • Choropleths over small zones — a 5-person zone with 1 case shows a 20% rate and effectively identifies the patient.
  • Open delivery datasets — vehicle GPS traces expose driver homes.
  • Public Wi-Fi traces — easily linked to individuals.

The re-identification literature

Key papers to know:

  • Sweeney, L. (2002) — 87% of Americans identifiable by 5-digit ZIP + birth date + sex.
  • de Montjoye et al. (2013) — four spatio-temporal points identify 95% of people in mobile-phone datasets.
  • Hern (2018) — Strava heatmap revealed military bases.

Ethical considerations beyond privacy

Surveillance

High-resolution imagery + AI = mass visual surveillance. Who's watching? For what purpose? Under whose authority?

Algorithmic bias

Models trained on biased data perpetuate bias. Crime-prediction algorithms trained on past arrests reinforce over-policing of certain neighbourhoods.

Environmental justice

GIS can reveal disparities (industrial pollution siting) — and it can obscure them (aggregation hiding hotspots in minority communities).

Dual-use dilemmas

Open source satellite imagery + ML = deforestation monitoring AND military targeting. Releasing tools freely has unpredictable consequences.

Practical principles

  1. Consent — collect only with informed consent where possible.
  2. Minimise — the less data, the less risk.
  3. Coarsen — aggregate when possible.
  4. Secure — encrypt at rest and in transit.
  5. Limit access — role-based permissions.
  6. Delete — retention schedules matter.
  7. Audit — log access; review regularly.
  8. Transparent — tell subjects how their data is used.
  9. Accountable — document decisions; review periodically.
  10. Proportionate — the sensitivity of location demands handling proportional to potential harm.

Organisational practice

  • Appoint a data protection officer.
  • Conduct Data Protection Impact Assessments (DPIAs) for new spatial datasets.
  • Train analysts on privacy principles.
  • Code reviews for location-handling logic.
  • Cross-functional ethics committees for contentious projects.

Self-check exercises

1. Why does removing names from a GPS dataset not anonymise it?

Location itself identifies people. Four data points (home, work, common places) re-identify 95% of individuals in typical datasets. An adversary with any auxiliary information (employee directory, residential address list) can link "anonymised" traces back to names in minutes. True anonymisation requires coarsening, k-anonymity, or differential privacy — not just name removal.

2. Your client wants to publish a dashboard of asthma patient addresses for research. Ethical issues?

Many. Rooftop-level addresses + medical condition = severe privacy violation. Reframe: aggregate to neighbourhood or census tract; ensure each zone has ≥10 patients (k-anonymity); avoid combining with other sensitive attributes; get IRB / ethics board approval; publish under restricted access for qualified researchers only. Never publish rooftop patient data publicly.

3. What's one positive use of spatial privacy techniques?

Enabling life-saving research while protecting subjects. Differentially-private medical geography lets epidemiologists identify disease hotspots without revealing individual patients. Aggregated mobility data (e.g., Google COVID mobility reports) informed public-health decisions while preserving individual privacy. The techniques make useful data sharing compatible with strong privacy.

Summary

  • Location is PII; treat it accordingly.
  • Anonymisation needs more than name removal; use k-anonymity or differential privacy.
  • Legal frameworks (GDPR, CCPA, HIPAA) require specific handling.
  • Ethics extends beyond privacy — surveillance, bias, dual-use all matter.
  • Professional practice requires explicit decisions, not defaults.

Further reading

  • de Montjoye et al. (2013) — Unique in the Crowd: The privacy bounds of human mobility.
  • Sweeney, L. (2002) — k-anonymity: A model for protecting privacy.
  • US Census Bureau — 2020 differential privacy documentation.
  • Open Data Charter — ethical data sharing principles.