GDAL Raster Processing

nearblack

What is nearblack?

nearblack scans inward from the edges of a raster and replaces pixels that are "nearly" black (or nearly white) with exact black or white, optionally flagging them as transparent in an alpha or mask band. This is the standard preprocessing step for scanned or JPEG-compressed aerial imagery, where lossy compression leaves near-black pixels in the edge collar that would otherwise blend unevenly during mosaicking. After running nearblack, the collar is a clean uniform value that downstream cutlines and alpha masks can treat as transparent.

Usage:
nearblack [options] [-o <outfile>] [-of <format>] <infile>

Commonly used options:

  • -o <outfile> — output file (otherwise modifies input in place)
  • -of <format> — output driver
  • -white — process near-white collar (default is near-black)
  • -near <distance> — tolerance for "nearly black/white", measured in pixel-value (grey-level) units per band (default 15)
  • -nb <count> — number of non-collar pixels the inward search may encounter before it gives up (default 2)
  • -setalpha — add an alpha band set to 0 in the collar and 255 elsewhere
  • -setmask — add a mask band instead of an alpha band
  • -color <R,G,B[,A...]> — treat this colour as the collar target (repeatable)
  • -co <NAME>=<VALUE> — creation options
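
The options above can be assembled programmatically when nearblack is driven from a script. The following is a minimal sketch, not part of GDAL: build_nearblack_cmd is a hypothetical helper that covers only a subset of the flags listed above, and the file names are placeholders.

```python
import shlex
import subprocess


def build_nearblack_cmd(infile, outfile=None, near=15, nb=2, white=False,
                        set_alpha=False, set_mask=False, fmt=None,
                        creation_options=()):
    """Return a nearblack argument list (hypothetical helper, not a GDAL API)."""
    cmd = ["nearblack"]
    if fmt:
        cmd += ["-of", fmt]
    for opt in creation_options:      # e.g. "COMPRESS=DEFLATE"
        cmd += ["-co", opt]
    if white:
        cmd.append("-white")
    cmd += ["-near", str(near), "-nb", str(nb)]
    if set_alpha:
        cmd.append("-setalpha")
    if set_mask:
        cmd.append("-setmask")
    if outfile:                        # without -o the input is modified in place
        cmd += ["-o", outfile]
    cmd.append(infile)
    return cmd


def run(cmd):
    """Execute the command; requires the nearblack binary on PATH."""
    subprocess.run(cmd, check=True)


cmd = build_nearblack_cmd("scanned.tif", outfile="cleaned.tif",
                          near=20, set_alpha=True)
print(shlex.join(cmd))
# prints: nearblack -near 20 -nb 2 -setalpha -o cleaned.tif scanned.tif
```

Passing the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.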

When would you use nearblack?

Use nearblack on any RGB imagery that has a collar of near-black or near-white edge pixels, typically left behind by JPEG compression, scanning, or an earlier mosaicking pass. Typical jobs include cleaning up a scanned historical aerial image so its black collar becomes transparent (nearblack -near 20 -setalpha -o cleaned.tif scanned.tif), removing the near-white collar from a snow-masked satellite scene before mosaicking, and preparing a batch of aerial tiles for a seamless mosaic with gdal_merge.py -n 0.

The two tuning knobs that matter are -near (how close to black a pixel must be to count as collar) and -nb (how many non-collar pixels the inward search tolerates before it stops). Generous -near values clean more aggressively but risk eating into legitimate dark areas; conservative values leave residual collar. Add -setalpha to produce an alpha band that downstream mosaickers and web-map tilers can treat as transparency.
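
To make the interplay between -near and -nb concrete, here is a simplified one-dimensional model of the inward search on a single grey-level scanline. It is an illustrative sketch, not GDAL's implementation: the function name clean_scanline, the single-band input, and the choice to overwrite tolerated non-collar pixels are all assumptions made for this example.

```python
def clean_scanline(pixels, near=15, nb=2, collar_value=0):
    """Toy model of a collar search from the left edge of one scanline.

    Assumptions (not taken from GDAL's source): one band, near-black mode,
    and every pixel visited before the search gives up is overwritten.
    """
    out = list(pixels)
    run = 0  # consecutive non-collar pixels seen so far
    for i, value in enumerate(pixels):
        if value <= near:
            run = 0                # back in the collar, reset the counter
        else:
            run += 1
            if run > nb:           # too many non-collar pixels: stop searching
                break
        out[i] = collar_value
    return out


# Collar with one bright compression artefact (40), then real image data.
line = [3, 8, 40, 5, 2, 90, 120, 130, 110]

print(clean_scanline(line, near=15, nb=2))
# prints: [0, 0, 0, 0, 0, 0, 0, 130, 110]
print(clean_scanline(line, near=15, nb=0))
# prints: [0, 0, 40, 5, 2, 90, 120, 130, 110]
print(clean_scanline(line, near=45, nb=0))
# prints: [0, 0, 0, 0, 0, 90, 120, 130, 110]
```

In this model a larger nb lets the search step over the artefact but also costs a couple of real pixels at the image edge, while a larger near absorbs the artefact without that cost, at the price of treating darker image content as collar.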

FAQs

How do I pick a good -near value?

Start at the default (15) and inspect the result. JPEG-compressed imagery often needs 20–30. Heavily compressed scans may need 40 or more. Too high and dark foreground (water, shadow, dense forest) starts disappearing. Always QA a few tiles visually before running across a batch.
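
One way to organise that visual QA is to render the same tile at several -near values and compare the outputs side by side. The sketch below only builds the commands; trial_commands, the qa output directory, and the tile name are hypothetical, and the candidate values simply mirror the ranges mentioned above.

```python
import shlex
from pathlib import Path


def trial_commands(infile, near_values=(15, 20, 30, 40), outdir="qa"):
    """Build one nearblack command per candidate -near value (hypothetical helper)."""
    src = Path(infile)
    cmds = []
    for near in near_values:
        # Encode the tolerance in the file name so results are easy to compare.
        dst = Path(outdir) / f"{src.stem}_near{near}{src.suffix}"
        cmds.append(["nearblack", "-near", str(near), "-setalpha",
                     "-o", dst.as_posix(), src.as_posix()])
    return cmds


for cmd in trial_commands("tile_001.tif"):
    print(shlex.join(cmd))
```

Each printed line can be run in a shell or passed to subprocess.run once the output directory exists.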

Why did nearblack erode into my image?

Either -near was too aggressive, or the image has legitimately dark regions touching the collar. Reduce -near first. Also check -nb: it sets how many non-collar pixels the inward search tolerates before giving up, so a high value lets the search push further into the image, and lowering it (e.g. -nb 1 or -nb 0) stops the search sooner. If collar removal still misbehaves, consider masking the collar with a vector cutline and gdalwarp -cutline instead.

How does nearblack relate to -dstalpha in gdalwarp?

They solve different pieces of the problem. nearblack -setalpha produces a well-defined alpha band keyed to collar pixels. gdalwarp -dstalpha adds an alpha band to the warped output that marks NoData (unset) pixels as transparent. Used together, nearblack is typically run first to normalise the edges, then gdalwarp carries the resulting alpha through reprojection and resampling.
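
The two-step workflow can be sketched as a short script. This is an illustration under assumptions: collar_then_warp is a hypothetical helper, the file names and the EPSG:3857 target are placeholders, and the default dry_run=True only prints the commands so the example runs without GDAL installed.

```python
import shlex
import subprocess


def collar_then_warp(src, cleaned, warped, near=20, t_srs="EPSG:3857",
                     dry_run=True):
    """Run nearblack, then gdalwarp, on one image (hypothetical helper)."""
    steps = [
        # Step 1: normalise the collar and record it in an alpha band.
        ["nearblack", "-near", str(near), "-setalpha", "-o", cleaned, src],
        # Step 2: reproject; -dstalpha requests an alpha band on the output.
        ["gdalwarp", "-t_srs", t_srs, "-dstalpha", cleaned, warped],
    ]
    for cmd in steps:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # needs GDAL utilities on PATH
    return steps


steps = collar_then_warp("scanned.tif", "cleaned.tif", "warped.tif")
```

Setting dry_run=False executes both commands in order and stops at the first failure, because check=True raises on a non-zero exit status.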

Can nearblack process multispectral imagery?

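Yes, with caveats. nearblack was designed around 8-bit RGB imagery and does its processing in 8 bits, so higher bit-depth multispectral data is a poor fit. By default a pixel counts as "near black" only when every band is near zero (or near 255 with -white); to target a different collar colour, pass -color with one value per band. For single-band masking, gdal_calc.py with a threshold is usually cleaner.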