ML for satellite imagery: CNNs and U-Net segmentation
Deep learning has rewritten remote sensing. CNNs (object detection) and U-Nets (semantic segmentation) are now standard. This week you train one on real GOES data.
Could a machine learning model spot bleaching coral from satellite imagery before a human diver does?
Yes — and several already do. This week you'll learn U-Net, the same architecture researchers use for reef-bleaching detection, lava-flow mapping, and (yes) rocket-plume segmentation.
Learning objectives
- Train a CNN for object detection in raster imagery
- Build a U-Net for semantic segmentation of clouds / plumes / fires
- Generate training data via thresholding + manual labels
- Evaluate with IoU and confusion matrices
Try it: tune a confidence threshold
Every classifier has a knob: how confident must it be to flag a detection? Higher threshold → fewer false positives, more missed detections. Try it.
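The trade-off can be seen in a few lines of NumPy. This is a minimal sketch with invented scores and labels (not GOES data); the `sweep` helper is a hypothetical name used only for illustration.

```python
import numpy as np

# Illustrative classifier scores and ground truth (1 = real detection).
# These values are invented for the demo, not taken from GOES data.
scores = np.array([0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.80, 0.90, 0.95])
truth = np.array([0, 0, 0, 1, 0, 1, 1, 0, 1, 1])

def sweep(scores, truth, thresholds):
    """Return (threshold, false_positives, missed_detections) per threshold."""
    rows = []
    for t in thresholds:
        flagged = scores >= t
        false_pos = int(np.sum(flagged & (truth == 0)))
        missed = int(np.sum(~flagged & (truth == 1)))
        rows.append((float(t), false_pos, missed))
    return rows

results = sweep(scores, truth, thresholds=[0.1, 0.3, 0.5, 0.7, 0.9])
for t, fp, fn in results:
    print(f"threshold={t:.1f}  false positives={fp}  missed={fn}")
```

Reading the rows top to bottom, the false-positive count falls while the missed-detection count rises. Where to set the knob depends on which error costs more downstream.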
Primer
Deep learning has rewritten the playbook for satellite imagery analysis over the past decade. Convolutional neural networks now do object detection, semantic segmentation, super-resolution, and change detection at production scale across every major Earth-observation platform. This week is the practical primer: when to use deep learning vs threshold rules, the U-Net architecture, and how to train one on real GOES data.
When deep learning beats thresholding
Threshold rules (Week 14's Band 7 > 320 K) work when the discriminator is a single scalar feature. They break down when:
- The discriminator is spatial-contextual (a plume looks different from a wildfire in shape and spatial neighborhood, not just brightness).
- You need probabilistic output (confidence scores) for downstream cost-of-error decisions.
- You have many labeled examples and want a single classifier that captures complex patterns.
Threshold rules are great for fast, explainable, debuggable baseline detection. Deep learning shines for the next layer: scoring, classification, and segmentation refinement.
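For comparison, the single-scalar baseline fits in one line of NumPy. A sketch, assuming `bt` holds Band 7 brightness temperatures in kelvin; the sample values and the `threshold_mask` name are invented for illustration.

```python
import numpy as np

def threshold_mask(bt_kelvin, threshold_k=320.0):
    """Flag pixels whose brightness temperature exceeds the threshold."""
    return bt_kelvin > threshold_k

# Invented 3x3 patch of brightness temperatures (K).
bt = np.array([[295.0, 301.5, 298.2],
               [300.1, 341.7, 325.4],
               [297.8, 318.9, 296.3]])
mask = threshold_mask(bt)
print(mask.astype(int))
print("hot pixels:", int(mask.sum()))
```

Each pixel is judged in isolation, which is why the rule is easy to debug and also why it cannot use shape or neighborhood.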
The U-Net architecture
U-Net (Ronneberger et al. 2015) is the workhorse for image segmentation in remote sensing. It's an encoder-decoder with skip connections:
- Encoder — successive 3x3 convolutions + 2x2 max-pool, halving the spatial dimensions and doubling the channel count at each level. By the bottleneck, the feature map is small but channel-rich.
- Decoder — successive 2x2 transpose convolutions + 3x3 convolutions, doubling the spatial dimensions and halving channels. Reconstructs the original resolution.
- Skip connections — at each decoder level, concatenate the corresponding encoder feature map. This preserves fine-grained spatial detail that would otherwise be lost in the bottleneck.
The output is a same-size map of per-pixel class probabilities. For plume segmentation, the classes are {background, plume}; for multi-class fire/plume/cloud, expand accordingly.
```python
import torch.nn as nn

class UNet(nn.Module):
    def __init__(self, in_ch=1, out_ch=1, n_features=32):
        super().__init__()
        # ... 4 down blocks + bottleneck + 4 up blocks ...
        # Each block: Conv3x3 → BatchNorm → ReLU → Conv3x3 → BatchNorm → ReLU
        # Down: Conv block + MaxPool2x2
        # Up: ConvTranspose2x2 + concat with skip + Conv block
```
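One way the skeleton above could be filled in is sketched below. It is not the course's reference implementation: the names `SmallUNet` and `conv_block`, the `depth` argument, and the padding step are assumptions made for illustration. The padding matters because a 200×200 window is not divisible by 16, so four rounds of 2x2 pooling lose a pixel (25 pools to 12, which upsamples to 24).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    """Conv3x3 -> BatchNorm -> ReLU, applied twice."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class SmallUNet(nn.Module):
    """Illustrative U-Net: `depth` down blocks, a bottleneck, `depth` up blocks."""

    def __init__(self, in_ch=1, out_ch=1, n_features=32, depth=4):
        super().__init__()
        widths = [n_features * 2 ** i for i in range(depth + 1)]  # 32, 64, ...
        self.downs = nn.ModuleList()
        prev = in_ch
        for w in widths[:-1]:
            self.downs.append(conv_block(prev, w))
            prev = w
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(widths[-2], widths[-1])
        self.ups = nn.ModuleList()
        self.up_blocks = nn.ModuleList()
        for w in reversed(widths[:-1]):
            self.ups.append(nn.ConvTranspose2d(w * 2, w, kernel_size=2, stride=2))
            self.up_blocks.append(conv_block(w * 2, w))
        self.head = nn.Conv2d(n_features, out_ch, kernel_size=1)

    def forward(self, x):
        skips = []
        for down in self.downs:
            x = down(x)
            skips.append(x)          # saved for the matching decoder level
            x = self.pool(x)
        x = self.bottleneck(x)
        for up, block, skip in zip(self.ups, self.up_blocks, reversed(skips)):
            x = up(x)
            # Odd sizes lose a pixel when pooled; pad back to the skip's size.
            dh = skip.shape[-2] - x.shape[-2]
            dw = skip.shape[-1] - x.shape[-1]
            x = F.pad(x, [dw // 2, dw - dw // 2, dh // 2, dh - dh // 2])
            x = block(torch.cat([skip, x], dim=1))
        return self.head(x)          # raw logits, one channel per class

model = SmallUNet()
model.eval()
with torch.no_grad():
    logits = model(torch.randn(1, 1, 200, 200))
probs = torch.sigmoid(logits)        # per-pixel plume probability
print(logits.shape, float(probs.min()), float(probs.max()))
```

The model returns logits rather than probabilities so that a numerically stable loss can be applied during training. For the two-class case a sigmoid turns them into plume probabilities; a multi-class variant would set `out_ch` to the number of classes and apply a softmax over the channel dimension instead.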
Weak supervision
The training-data problem: who hand-labels rocket plumes in tens of thousands of GOES frames? Nobody. The trick is weak supervision — generate the training labels programmatically.
For plumes: run Week 14's threshold detector + Week 20's morphology cleanup over a year of GOES frames around known launches. Cross-check against the published launch schedule and discard detections that don't line up with a known launch. Use the surviving pixel masks as training labels. The labels are noisy (some false positives, some false negatives), but with enough volume the U-Net can learn to denoise: it picks up on spatial context the threshold rule can't see.
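The labeling recipe can be sketched as follows. Everything specific here is an assumption for illustration: the `weak_label` and `near_launch` helpers, the 3x3 morphological opening standing in for the Week 20 cleanup, the `min_pixels` cutoff, the 10-minute window, and the invented temperatures and launch time.

```python
from datetime import datetime, timedelta

import numpy as np
from scipy import ndimage

def weak_label(bt_kelvin, threshold_k=320.0, min_pixels=4):
    """Threshold, clean up with a morphological opening, drop tiny blobs."""
    raw = bt_kelvin > threshold_k
    opened = ndimage.binary_opening(raw, structure=np.ones((3, 3), dtype=bool))
    labeled, n_blobs = ndimage.label(opened)
    mask = np.zeros_like(opened)
    for blob_id in range(1, n_blobs + 1):
        blob = labeled == blob_id
        if blob.sum() >= min_pixels:
            mask |= blob
    return mask

def near_launch(frame_time, launch_times, window=timedelta(minutes=10)):
    """True if the frame falls within `window` after any known launch."""
    return any(timedelta(0) <= frame_time - t <= window for t in launch_times)

frame = np.full((12, 12), 295.0)
frame[4:8, 5:9] = 335.0      # plume-like blob (invented values)
frame[1, 1] = 340.0          # isolated hot pixel, likely noise

launches = [datetime(2024, 1, 1, 12, 0)]   # hypothetical schedule entry
frame_time = datetime(2024, 1, 1, 12, 4)

mask = weak_label(frame)
if not near_launch(frame_time, launches):
    mask[:] = False          # no known launch: treat detections as false alarms
print("label pixels:", int(mask.sum()))
```

The opening removes the isolated hot pixel while leaving the larger blob intact, and the schedule check is the only step that brings in information the threshold rule does not have.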
Evaluation: IoU and confusion matrices
For segmentation, accuracy is misleading (a network that predicts "no plume everywhere" gets 99.99% accuracy because most pixels really are no plume). Use:
- IoU (Intersection over Union): area of overlap / area of union, or TP / (TP + FP + FN) in confusion-matrix terms. 1.0 is perfect, 0.0 is no overlap. Compute per-class IoU and report the mean (mIoU).
- Confusion matrix: true positives, false positives, false negatives, true negatives at the pixel level. Derive precision (TP / (TP + FP)) and recall (TP / (TP + FN)). For a typical production detector, the gate is that the false positive rate must stay below a few percent, because alarm fatigue kills the user experience long before missed detections do.
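A small worked example shows why accuracy misleads and IoU does not. This is a sketch on synthetic masks; the helper names `confusion_counts`, `ratio`, and `metrics` are hypothetical.

```python
import numpy as np

def confusion_counts(pred, truth):
    """Pixel-level TP, FP, FN, TN for boolean masks of equal shape."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    tn = int(np.sum(~pred & ~truth))
    return tp, fp, fn, tn

def ratio(num, den):
    """Division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def metrics(pred, truth):
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }

truth = np.zeros((100, 100), dtype=bool)
truth[40:44, 40:45] = True                 # 20 plume pixels out of 10,000

empty = np.zeros_like(truth)               # predicts "no plume" everywhere
shifted = np.zeros_like(truth)
shifted[41:45, 40:45] = True               # right shape, one row too low

for name, pred in [("empty", empty), ("shifted", shifted)]:
    print(name, {k: round(v, 4) for k, v in metrics(pred, truth).items()})
```

Both predictions score above 99% accuracy, yet the empty mask has an IoU of zero. True negatives never enter the IoU formula, so the background cannot inflate the score.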
Small models, not big
For thermal plume segmentation in 200×200 pixel windows, a U-Net with 32 base features is more than enough. With channels doubling at each level, that works out to roughly 2M parameters with three down blocks and roughly 8M with four. Don't reach for big pretrained models: they need huge training sets, they're slow to deploy, and the feature distribution of satellite imagery is far enough from ImageNet that pretrained weights help less than you'd expect.
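Parameter count depends on depth as much as on base width, so it is worth checking the budget with plain arithmetic before building anything. The sketch below assumes the block layout described earlier (two Conv3x3 layers with bias, each followed by BatchNorm, channels doubling per level, a final 1x1 conv); `conv_block_params` and `unet_params` are hypothetical helper names.

```python
def conv_block_params(in_ch, out_ch):
    """Two Conv3x3 (with bias), each followed by BatchNorm (weight + bias)."""
    first = 9 * in_ch * out_ch + out_ch + 2 * out_ch
    second = 9 * out_ch * out_ch + out_ch + 2 * out_ch
    return first + second

def unet_params(in_ch=1, out_ch=1, n_features=32, depth=4):
    """Parameter count for a U-Net that doubles channels at each level."""
    widths = [n_features * 2 ** i for i in range(depth + 1)]
    total, prev = 0, in_ch
    for w in widths:                       # encoder blocks + bottleneck
        total += conv_block_params(prev, w)
        prev = w
    for w in reversed(widths[:-1]):        # decoder levels
        total += 4 * (2 * w) * w + w       # ConvTranspose2x2: 2w -> w
        total += conv_block_params(2 * w, w)
    total += n_features * out_ch + out_ch  # final 1x1 conv
    return total

for depth in (2, 3, 4):
    print(f"depth={depth}: {unet_params(depth=depth):,} parameters")
```

Each extra level roughly quadruples the total, because the deepest blocks hold most of the weights.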
The lab
You'll generate weakly-supervised training data from threshold detections + morphology over a year of GOES Band 7 frames, train a small U-Net in PyTorch, and evaluate on held-out scenes with IoU + confusion matrices. The data recipe is generic: it applies equally to plume detection, wildfire mapping, gas-flare inventories, or volcanic-hotspot classification. The same architecture (encoder-decoder with skip connections, small parameter count, weak-supervision pretraining) underpins most operational hotspot-classification layers in industry. What each operator does on top (feature stacking, gating, fusion) is the secret sauce that doesn't ship in an open curriculum.
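The training step itself is short. Below is a minimal sketch on synthetic data, with a two-layer stand-in model so the example stays self-contained; the batch, the learning rate, the step count, and the `pos_weight` heuristic are all assumptions, not the lab's settings.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for the U-Net so the sketch stays short; any nn.Module that maps
# (N, 1, H, W) to (N, 1, H, W) logits can be dropped in here.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=3, padding=1),
)

# Synthetic batch: noise plus a bright 6x6 square labeled as "plume".
images = torch.randn(16, 1, 32, 32)
masks = torch.zeros(16, 1, 32, 32)
images[:, :, 10:16, 10:16] += 3.0
masks[:, :, 10:16, 10:16] = 1.0

# Plume pixels are rare, so weight the positive class (an assumed heuristic:
# ratio of background to plume pixels in the batch).
pos_weight = ((masks.numel() - masks.sum()) / masks.sum()).reshape(1)
loss_fn = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
model.train()
for step in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(images), masks)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

`BCEWithLogitsLoss` takes raw logits, so the model should not apply a sigmoid before the loss. Without some form of class weighting, a network can lower its loss simply by predicting background everywhere.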
Connecting to Hawaiʻi: ML for reef health
Researchers at the Hawaiʻi Institute of Marine Biology, at NOAA Coral Reef Watch, and at the University of Hawaiʻi have published work using U-Net and similar CNNs to detect coral bleaching from satellite imagery, automating what used to require human diver surveys. The same architecture (encoder-decoder with skip connections) you'll learn this week powers those reef-monitoring systems. Weak supervision (training on imperfect labels) was a key technique because hand-labeling bleaching is expensive.
Hands-on lab: U-Net for plume segmentation
Generate training data from threshold-detected plumes in GOES Band 7. Train a small U-Net to segment plume pixels. Evaluate on held-out launches with IoU and confusion matrices.
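Holding out whole launches, rather than random frames, keeps near-identical consecutive frames from landing on both sides of the split. A sketch, assuming each sample is tagged with a launch identifier; `split_by_launch` and the sample layout are hypothetical.

```python
import random

def split_by_launch(samples, holdout_fraction=0.2, seed=0):
    """Split (launch_id, frame_id) samples so no launch spans both sets."""
    launch_ids = sorted({launch_id for launch_id, _ in samples})
    rng = random.Random(seed)
    rng.shuffle(launch_ids)
    n_holdout = max(1, round(len(launch_ids) * holdout_fraction))
    holdout_ids = set(launch_ids[:n_holdout])
    train = [s for s in samples if s[0] not in holdout_ids]
    test = [s for s in samples if s[0] in holdout_ids]
    return train, test

# Hypothetical dataset: 10 launches, 30 frames each.
samples = [(f"launch_{i:02d}", frame) for i in range(10) for frame in range(30)]
train, test = split_by_launch(samples)
print(len(train), "training frames,", len(test), "held-out frames")
```

A frame-level random split would tend to report an optimistic IoU, because the model would be tested on plumes it has effectively already seen.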
Quiz — click an answer to check it
No grade, no shame. Tap any option; you'll see if it's right plus the answer if not. The point is to notice what you already know and what's still settling.
- Encoder-decoder with skip connections, ideal for segmentation
- Just a CNN
- Recurrent
- Transformer-only

- Overlap between predicted and ground-truth mask
- Loss only
- Reprojection error
- Compression ratio

- Weak supervision (programmatic labels)
- Manual labeling
- Synthetic data
- Augmentation

- Translation invariance and locality
- They're newest
- Only choice
- Marketing

- Faster inference, less overfitting on small training sets, deployable to edge
- Always smaller is worse
- Required by law
- No reason
Reflection
Take five minutes with this. Write your answer somewhere. Carry it into next week.