Building the Pipeline: From Drone Imagery to Prescription Endpoint
Article II described the physics of what the M3M captures, the H3 spatial indexing that turns flight coverage into a verifiable checklist, and the GCP processing architecture that carries raw imagery through WebODM. That article ended with a prescription file on a John Deere terminal. This one describes the engineering between those two points — the containerized pipeline that accepts imagery from one or both drone platforms, processes it through platform-specific photogrammetric profiles, and publishes prescription tiles to an edge endpoint where the Atlas web application and a farmer’s USB drive can both reach them.
The pipeline handles three ingest scenarios: M3M multispectral only, Cine RGB only, or both platforms from the same flight window. The M3M produces the prescription data — NDVI, NDRE, CIre, calibrated reflectance. The Cine produces the visual validation layer — high-resolution natural color that confirms whether a spectral anomaly corresponds to something you can actually see in the canopy. Either platform can enter the pipeline independently. When both are present, they process sequentially through the same NodeODM instance — M3M first (fast orthophoto, lower memory, produces the analytical artifacts), then Cine (full Structure-from-Motion reconstruction, higher memory, produces the visual QA layer and a denser point cloud) — and their outputs merge into a single H3 hexagonal grid carrying properties from both sensors.
The decision to containerize this pipeline — and to containerize it the way it is containerized — is not incidental. Every architectural choice in this system has a precedent in the applied computer science literature, from Docker’s role in computational reproducibility [1, 2] to the Hilbert-curve ordering that makes edge-cached tile delivery possible [8]. What follows is both a build guide and a literature-grounded argument for why these specific patterns compose into a system that works.
Phase 1 — Pipeline Portability
The first engineering decision was to separate the processing logic from its orchestration. The processing pipeline — index computation, H3 zonal statistics, terrain tile generation, equipment export — lives in a single, standalone Python module that knows nothing about where it runs or which platforms contributed imagery.
pipeline/processing/scripts/process_task.py → run_pipeline(task_dir, config)
pipeline/processing/scripts/orchestrator.py → imports process_task, manages lifecycle
pipeline/processing/scripts/platform_detect.py → EXIF-based camera model identification
pipeline/processing/scripts/compute_ndvi.py → NDVI/NDRE/CIre (M3M) or VARI (Cine)
pipeline/processing/scripts/h3_zonal_stats.py → per-hex stats (merges multi-platform)
pipeline/processing/scripts/terrain_rgb.py → DSM → terrain-RGB PMTiles

process_task.py exports one function: run_pipeline(). It accepts a task directory containing WebODM outputs and a configuration object specifying H3 resolution, field boundary, available platforms, and export formats. It reads the platforms field from the configuration to determine which outputs are present — M3M multispectral orthomosaic, Cine RGB orthomosaic, or both — and adjusts its index computation accordingly.
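A minimal sketch of that dispatch pattern, assuming a config object shaped like the manifest described later in this article (the `PipelineConfig` fields and the returned step list are illustrative, not the module's actual internals):

```python
from dataclasses import dataclass, field

@dataclass
class PipelineConfig:
    # Hypothetical config shape mirroring the manifest fields in this article.
    h3_resolution: int
    field_boundary: str
    platforms: list = field(default_factory=list)  # any of "m3m", "cine"
    export_formats: list = field(default_factory=lambda: ["shapefile", "isoxml"])

def run_pipeline(task_dir: str, config: PipelineConfig) -> dict:
    """Pure processing entrypoint: no auth, no uploads, no cloud awareness."""
    steps = []
    if "m3m" in config.platforms:
        steps += ["ndvi", "ndre", "cire"]   # calibrated multispectral indices
    if config.platforms:
        steps += ["vari"]                   # RGB-only index, from either platform
    steps += ["h3_zonal_stats", "terrain_rgb"] + list(config.export_formats)
    return {"task_dir": task_dir, "steps": steps}
```

The point of the shape is testability: the same function runs unchanged under the local orchestrator, in CI, or inside a Cloud Batch container.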
When M3M data is present, compute_ndvi.py reads the five-band multispectral orthomosaic (Green, Red, Red Edge, NIR, Alpha) and computes three indices: NDVI from the red and NIR bands, NDRE from the red edge and NIR bands, and CIre (Chlorophyll Index red edge) as a linear cross-check against NDRE. The NDVI formulation — (NIR − Red) / (NIR + Red) — was defined by Rouse et al. [18] and experimentally validated by Tucker [19] as sensitive to green leaf area and photosynthetically active biomass. NDRE substitutes the red edge band for red, detecting nitrogen stress through chlorophyll loss before canopy architecture change becomes visible in NDVI — the mechanism that makes mid-season prescription possible.
When only Cine data is available, the pipeline falls back to VARI (Visible Atmospherically Resistant Index), which operates on RGB alone. VARI provides a coarser health signal — useful for visual scouting, insurance documentation, or fields where multispectral data is not available, but insufficient for calibrated nitrogen prescription. When both platforms are present, h3_zonal_stats.py populates the H3 grid with properties from each: multispectral indices from the M3M, VARI from either the Cine RGB or the M3M’s onboard RGB camera, point cloud density metrics from the Cine (whose full SfM reconstruction produces a denser cloud than the M3M’s fast-orthophoto mode), and camera coverage counts per platform.
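The index formulas reduce to a few lines of numpy over the orthomosaic bands. A sketch (the real compute_ndvi.py also masks the alpha band and guards against division by zero):

```python
import numpy as np

def ndvi(nir, red):
    """(NIR - Red) / (NIR + Red) — Rouse et al. [18], validated by Tucker [19]."""
    return (nir - red) / (nir + red)

def ndre(nir, red_edge):
    """Red-edge analogue of NDVI; responds to chlorophyll loss before NDVI does."""
    return (nir - red_edge) / (nir + red_edge)

def cire(nir, red_edge):
    """Chlorophyll Index red edge: linear cross-check against NDRE."""
    return nir / red_edge - 1.0

def vari(red, green, blue):
    """Visible Atmospherically Resistant Index: the RGB-only fallback."""
    return (green - red) / (green + red - blue)
```

All four operate element-wise, so they apply unchanged to a single pixel or a full orthomosaic array.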
The function does not authenticate to anything. It does not download or upload. It does not know whether it is running inside Docker on a laptop or on a GCP Spot VM. It processes.
Boettiger [1] established that Docker containers solve dependency isolation for scientific computing by capturing the full computational environment — OS, libraries, runtime — as a portable image. Nüst et al. [2] extended this into practical Dockerfile design principles for reproducible data science: pin base images, minimize layers, separate build-time from run-time dependencies. The pipeline’s Dockerfile follows both: it extends the osgeo/gdal:ubuntu-full-3.8.0 base image (a reproducible, versioned GDAL distribution), installs the Python geospatial stack via pinned requirements.txt, and compiles tippecanoe from source for PMTiles generation. The resulting image is identical whether built in CI, on a developer laptop, or on a Cloud Batch VM.
orchestrator.py is the local-development wrapper. It polls a local WebODM instance for completed tasks, delegates platform detection to platform_detect.py, and calls run_pipeline() when processing finishes. Platform detection reads EXIF Camera Model Name from a sample of images in the task directory — M3M identifies multispectral frames, L2D-20c identifies the Hasselblad sensor on the Cine. When both camera models appear in the same project, the orchestrator submits separate WebODM tasks per platform with the appropriate ODM parameters, waits for both to complete, then triggers post-processing with the merged output. The PLATFORM_DETECT environment variable can override auto-detection with a fixed value — m3m, cine, or dual — for cases where EXIF metadata is absent or ambiguous.
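The classification step itself is a small pure function. This sketch assumes the EXIF Camera Model Name strings have already been extracted from the sampled images (the actual platform_detect.py does that read itself; the function and constant names here are hypothetical):

```python
# EXIF "Camera Model Name" values per platform, as described above:
#   "M3M"     -> Mavic 3 Multispectral frames
#   "L2D-20c" -> Hasselblad sensor on the Cine
MODEL_TO_PLATFORM = {"M3M": "m3m", "L2D-20c": "cine"}

def classify(camera_models, override=None):
    """Map sampled camera-model strings to 'm3m', 'cine', or 'dual'.

    `override` mirrors the PLATFORM_DETECT environment variable: an explicit
    value wins when EXIF metadata is absent or ambiguous.
    """
    if override:
        return override
    found = {MODEL_TO_PLATFORM[m] for m in camera_models if m in MODEL_TO_PLATFORM}
    if found == {"m3m", "cine"}:
        return "dual"
    if len(found) == 1:
        return found.pop()
    return None  # unknown sensor: caller must set PLATFORM_DETECT
```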
Dragoni et al. [3] describe the microservice decomposition pattern where each service owns a single responsibility and communicates through well-defined interfaces. The pipeline applies this at the function level rather than the service level: process_task.py owns processing, orchestrator.py owns lifecycle, platform_detect.py owns sensor identification. When the pipeline moves to GCP, process_task.py does not change. Only the wrapper changes.
Phase 2 — GCP Cloud Batch
The cloud deployment replaces the local orchestrator with three GCP-native components: an event trigger, a right-sizing function, and a batch compute job. The manifest declares which platforms are present, and the orchestrator provisions compute for the most demanding one.
pipeline/cloud/orchestrator/main.py → Cloud Function (manifest → right-size → Batch job)
pipeline/cloud/batch_entrypoint.py → Container entrypoint (per-platform NodeODM tasks)
pipeline/cloud/Dockerfile.cloud → Extends local Dockerfile with gcloud CLI
pipeline/infra/bootstrap.sh → One-time GCP project setup (APIs, SA, buckets, deploy)
pipeline/infra/batch-job-template.json → Reference Cloud Batch job spec

The manifest as contract
The manifest is the single source of truth for what a job contains. It declares which platforms contributed imagery, how many images each produced, and what processing parameters apply.
{
"project_id": "field-042",
"platforms": {
"m3m": {
"image_count": 487,
"image_prefix": "m3m/",
"radiometric_calibration": "camera",
"has_calibration_panel": true
},
"cine": {
"image_count": 502,
"image_prefix": "cine/"
}
},
"h3_resolution": 10,
"field_boundary": "gs://bucket/field-042/boundary.geojson"
}

Omit a platform key to skip it entirely. An M3M-only manifest has no cine key. A Cine-only manifest has no m3m key. The Cloud Function reads the platforms object, determines the most demanding platform present, and sizes the VM accordingly.
Platform-aware VM sizing
M3M’s --fast-orthophoto mode skips dense point cloud reconstruction entirely — consuming roughly half the RAM and completing 3–5× faster than full SfM for the same image count. The Cine’s full reconstruction is the binding constraint. In dual mode, the Cloud Function sizes the VM for the Cine task, and the M3M task runs first on the same machine in a fraction of the time.
| Scenario | Images (per platform) | Machine | RAM | Est. Time |
|---|---|---|---|---|
| M3M only, 80 ac | 500 | n2-highmem-4 | 32 GB | 30–90 min |
| M3M only, 640 ac | 3000 | n2-highmem-8 | 64 GB | 1.5–4 hr |
| Cine only, 80 ac | 500 | n2-highmem-8 | 64 GB | 2–6 hr |
| Cine only, 640 ac | 3000 | n2-highmem-16 | 128 GB | 6–14 hr |
| Dual, 80 ac | 500 + 500 | n2-highmem-8 | 64 GB | 2.5–7 hr |
| Dual, 640 ac | 3000 + 3000 | n2-highmem-16 | 128 GB | 7.5–18 hr |
RAM, not CPU, is the binding constraint. Westoby et al. [20] establish that Structure-from-Motion involves computationally intensive stages — feature matching, bundle adjustment, dense reconstruction — that scale super-linearly with image count. WebODM’s bundle adjustment holds the full feature point cloud in memory. Insufficient RAM forces disk swap, which is 5–10× slower and produces unreliable results. The sizing table is deliberately conservative: a failed job that retries on a larger instance costs more than a correctly sized job that finishes on the first attempt.
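The right-sizing logic in the Cloud Function reduces to a lookup keyed on the most demanding platform present. A sketch under the assumption that the image-count threshold between the small and large tiers sits around 1,000 images (the table above only shows the 500 and 3,000 cases; the function name is hypothetical):

```python
def size_vm(platforms: dict) -> tuple:
    """Pick (machine_type, ram_gb) from the manifest's platforms object.

    Cine full SfM is the binding constraint; M3M fast-orthophoto needs
    roughly half the RAM. Dual mode is therefore sized for the Cine task.
    """
    max_images = max(p["image_count"] for p in platforms.values())
    demanding = "cine" if "cine" in platforms else "m3m"
    if demanding == "m3m":
        return ("n2-highmem-4", 32) if max_images <= 1000 else ("n2-highmem-8", 64)
    return ("n2-highmem-8", 64) if max_images <= 1000 else ("n2-highmem-16", 128)
```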
Yang et al. [4] established that geospatial processing is simultaneously data-intensive, compute-intensive, and access-intensive — the exact trifecta that justifies ephemeral, auto-scaling cloud batch execution over fixed-capacity VMs. Bebortta et al. [5] validate the serverless pattern specifically for geospatial workloads, covering cloud, edge, and fog computing paradigms. Cloud Batch embodies both: it provisions Compute Engine Spot VMs, runs Docker containers, retries on preemption (up to two retries), and terminates VMs on completion — with zero service fees beyond the underlying compute. Vacca [21] benchmarks OpenDroneMap against commercial photogrammetry software, validating that the open-source pipeline produces comparable point clouds and orthomosaics for SfM reconstruction — the empirical basis for choosing ODM as the containerized processing engine.
The dual-platform entrypoint
batch_entrypoint.py reads the PLATFORMS_JSON environment variable (the manifest’s platforms object, serialized) and processes each platform sequentially through NodeODM:
- Download all images from GCS to local SSD, organized by platform prefix (m3m/, cine/). Processing directly from a GCS FUSE mount adds per-access network latency that dramatically slows ODM’s heavy random I/O — always transfer first.
- If M3M present — submit to NodeODM with --fast-orthophoto --skip-3dmodel --radiometric-calibration camera --dsm --dtm --cog --orthophoto-resolution 5 --dem-resolution 10 --split 400 --split-overlap 100. The --radiometric-calibration camera flag applies per-image corrections from DJI’s EXIF metadata: black level subtraction, vignetting correction from the per-camera polynomial, and gain/exposure normalization [22]. The --split 400 flag enables split-merge processing for large datasets, dividing them into overlapping submodels of ~400 images that process independently and stitch together — trading 30–50% more wall time for dramatically lower peak memory. Wait for completion.
- If Cine present — submit to NodeODM with full SfM parameters: --dsm --dtm --cog --orthophoto-resolution 2 --dem-resolution 5 --feature-quality high --pc-classify --split 400 --split-overlap 100. The --pc-classify flag runs ground classification on the point cloud, enabling DTM extraction (bare-earth model vs. surface model). No radiometric calibration — the Cine captures natural color for visual validation, not absolute reflectance. Wait for completion.
- Post-process — call run_pipeline() from process_task.py with the combined output directory and a config indicating which platforms are present. The function detects platform outputs and adjusts index computation accordingly.
- Upload all artifacts to the GCS delivery bucket.
- Record completion status, output paths, and hex count in Firestore.
Processing is sequential because NodeODM handles one task at a time. In dual mode, M3M processes first (~30% of total wall time), then Cine (~70%). Peak memory is determined by the Cine task — the M3M task’s fast-orthophoto mode uses significantly less RAM because it skips dense point cloud reconstruction entirely. The two tasks never compete for memory on the same VM.
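The two NodeODM submissions differ only in their option sets and their order. A sketch of how the entrypoint might assemble them — the flag strings are taken verbatim from the steps above, but the helper names are hypothetical:

```python
def odm_options(platform: str) -> list:
    """ODM flags per platform, as listed in the entrypoint steps."""
    common = ["--dsm", "--dtm", "--cog", "--split", "400", "--split-overlap", "100"]
    if platform == "m3m":
        return ["--fast-orthophoto", "--skip-3dmodel",
                "--radiometric-calibration", "camera",
                "--orthophoto-resolution", "5", "--dem-resolution", "10"] + common
    if platform == "cine":
        return ["--orthophoto-resolution", "2", "--dem-resolution", "5",
                "--feature-quality", "high", "--pc-classify"] + common
    raise ValueError(f"unknown platform: {platform}")

def processing_order(platforms: dict) -> list:
    """Sequential order: M3M first (fast, low RAM), then Cine (full SfM)."""
    return [p for p in ("m3m", "cine") if p in platforms]
```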
Infrastructure bootstrap
bootstrap.sh is a one-time, idempotent setup script that enables GCP APIs, creates service accounts with appropriate roles, provisions ingest and delivery buckets, initializes Firestore, and deploys the Cloud Function with its Eventarc trigger. It contains no secrets; all credential binding happens through IAM role grants. The batch job template specifies provisioningModel: SPOT for 60–91% cost reduction, maxRetryCount: 2 for preemption resilience, and maxRunDuration: 21600s (6 hours) as a safety timeout.
Phase 3 — R2 Tile Proxy
Processed tiles need an endpoint. The design constraint: no running tile server, no per-request compute, no egress fees. PMTiles on Cloudflare R2 with a Workers proxy satisfies all three.
workers/tile-proxy/src/index.ts → CF Worker (ZXY → byte-range request against R2)
workers/tile-proxy/wrangler.toml → R2 binding to atlas-tiles bucket
pipeline/cloud/sync_to_r2.sh → rclone GCS-to-R2 sync
pipeline/cloud/rclone.conf.example → rclone config template for gcs + r2 remotes

The PMTiles format [6] packs an entire tile pyramid — all zoom levels, all tiles — into a single file. Tiles are ordered by a Hilbert space-filling curve, and Moon et al. [8] proved that this ordering achieves superior locality-preserving clustering compared to Z-order (Morton) curves — meaning spatially adjacent tiles receive nearby byte offsets in the archive. The consequence is that a MapLibre viewport covering a contiguous geographic area can be served by a small number of HTTP Range Requests against the single file, even when the archive contains millions of tiles. This is the same design principle underlying Cloud Optimized GeoTIFF [7]: internal tiling, pre-built overviews, and HTTP Range Request access from static object storage. COG established the pattern for raster data. PMTiles adapted it for map tiles. Both eliminate the need for a running tile server.
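The locality claim is easy to demonstrate. This self-contained sketch uses the textbook iterative Hilbert index (not the PMTiles implementation itself) and compares the average index gap between vertically adjacent tiles against row-major ordering, where that gap is always a full grid row:

```python
def hilbert_d(n: int, x: int, y: int) -> int:
    """Distance along the Hilbert curve for an n x n grid (n a power of two)."""
    d, s = 0, n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

# Average index gap between vertically adjacent tiles on a 16x16 level.
n = 16
pairs = [((x, y), (x, y + 1)) for x in range(n) for y in range(n - 1)]
mean_hilbert = sum(abs(hilbert_d(n, *p) - hilbert_d(n, *q)) for p, q in pairs) / len(pairs)
mean_rowmajor = float(n)  # a vertical neighbor is always one full row away
```

`mean_hilbert` comes out well under `mean_rowmajor`, which is why a contiguous viewport touches a short run of byte offsets rather than stripes scattered across the archive.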
For dual-platform outputs, three to four PMTiles files per field cover the full visualization stack:
- h3_stats_{field_id}.pmtiles — H3 vector tiles carrying per-hex properties from all available platforms. Generated by tippecanoe from the merged H3 GeoJSON. Always present.
- ndvi_{field_id}.pmtiles — M3M NDVI raster tiles, colorized with the RdYlGn diverging ramp. Present when M3M data available.
- rgb_{field_id}.pmtiles — Cine RGB orthomosaic raster tiles. Present when Cine data available.
- terrain_{field_id}.pmtiles — Terrain-RGB elevation tiles from the DSM. Present when either platform produces a DSM.
The Cloudflare Worker translates standard ZXY tile requests into PMTiles byte-range lookups against R2. The wrangler.toml binds the Worker to the atlas-tiles R2 bucket, and the Worker handles CORS headers, cache-control directives, and a metadata endpoint that returns tileset bounds and available zoom levels. The Worker must run on a custom domain — not *.workers.dev — for Cloudflare’s edge caching to activate.
After Cloud Batch processing completes and artifacts land in the GCS delivery bucket, sync_to_r2.sh runs rclone sync to copy PMTiles to R2. R2’s S3-compatible API means rclone works without modification — the rclone.conf.example provides the template for configuring both gcs and r2 remotes. R2 charges zero egress fees — the defining cost advantage over GCS with Cloud CDN, where per-GB egress charges dominate at any meaningful tile traffic volume.
https://tiles.penrose.dev/{field_id}/h3_stats/{z}/{x}/{y}.mvt
https://tiles.penrose.dev/{field_id}/ndvi/{z}/{x}/{y}.png
https://tiles.penrose.dev/{field_id}/rgb/{z}/{x}/{y}.png
https://tiles.penrose.dev/{field_id}/terrain/{z}/{x}/{y}.png
https://tiles.penrose.dev/{field_id}/metadata.json

In dual mode, both the M3M NDVI raster and Cine RGB raster are available as separate tile layers. The Atlas frontend toggles between them. In single-platform mode, only the available layers appear.
Phase 4 — Atlas Tile Integration
The Atlas application is a SvelteKit frontend with MapLibre GL JS. It consumes tiles from the R2 edge endpoint and renders H3 hexagonal prescription zones with data-driven NDVI color ramps. When both platforms are present, it offers a toggle between the M3M NDVI analytical layer and the Cine RGB visual validation layer — the same dual-platform cross-examination described in Article I, now navigable in a browser.
atlas/src/lib/map/pipeline-layers.ts → PipelineLayerManager (platform-aware)
atlas/src/lib/stores/pipelineStore.svelte.ts → Reactive pipeline overlay state
atlas/src/lib/maps/styles.ts → TILES_BASE_URL configuration
atlas/src/lib/map/MapView.svelte → Layer lifecycle (rebuild on style.load, cleanup)

The PipelineLayerManager reads the metadata.json endpoint for a given field to determine which tile layers are available. The metadata declares the platforms value ("m3m", "cine", or "dual"), the available tilesets, and the bounds. In dual mode, the manager loads both the M3M NDVI raster source and the Cine RGB raster source, defaulting to the NDVI layer with a toggle for visual validation. In single-platform mode, it loads only what exists. The H3 vector tile layer is always present regardless of platform — it carries the unified zonal statistics from whatever sensors contributed data.
The manager implements a three-level graceful fallback for tile resolution: R2 edge tiles (primary, via the Workers proxy), GCS delivery bucket (fallback, via the maplibre-cog-protocol plugin for direct COG range-request access), and local WebODM TMS (development). The fallback chain is transparent to the map layers — only the tile URL template changes. Layer lifecycle management handles style.load events (which fire when the user switches base map styles and would otherwise destroy all custom layers) and component destroy cleanup.
pipelineStore.svelte.ts manages which field’s pipeline data is active, which layers are visible, and which property drives the color ramp. Toggling between NDVI mean, NDRE mean, elevation, slope, and prescription rate view is a store update that swaps the fill-color expression on the H3 vector layer — no tile refetch required, because all properties travel in the same vector tile.
The H3 hexagonal grid is the critical bridge between raster imagery and vector visualization. Brodsky [9] describes the H3 system’s icosahedral gnomonic projection with aperture-7 hexagonal refinement. Sahr et al. [10] provide the theoretical foundation: hexagonal cells offer equidistant neighbors, reduced edge effects, and better isotropy than square grids for spatial coverage problems. Birch et al. [11] demonstrate this advantage experimentally — all six neighbors of a hexagonal cell are approximately equidistant from its center, eliminating the directional bias present in square grids where corner-neighbors sit at 1.41× the distance of edge-neighbors. For aggregating vegetation indices across agricultural fields, this isotropy means spatial statistics in each cell are not systematically biased by grid orientation relative to field features. Bondaruk et al. [12] evaluate H3, S2, and other DGGS implementations against the OGC specification, providing the comparative basis for choosing H3 over alternatives.
Singla and Eldawy [13] model zonal statistics — the raster-to-vector aggregation that populates each hex with index means and standard deviations — as a sort-merge join between raster pixels and vector zones, demonstrating scalability to trillion-pixel datasets. The pipeline’s h3_zonal_stats.py uses rasterstats (which implements windowed reads via rasterio) rather than distributed computing, because at H3 Resolution 10 a single hex covers ~15,000 m² and a 640-acre field produces only ~200 hexes. The entire zonal statistics pass — across NDVI, NDRE, DSM, slope, aspect, and point cloud metrics from both platforms — completes in under two minutes.
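The aggregation itself is a grouped reduction: label every pixel with its zone id, then reduce per label. A pure-numpy sketch of the idea (the real h3_zonal_stats.py delegates to rasterstats over windowed rasterio reads rather than materializing a label raster):

```python
import numpy as np

def zonal_mean_std(zone_ids, values):
    """Per-zone mean and std over a raster, given a same-shape zone-label raster.

    Zone ids are small non-negative ints (one per H3 hex); NaN pixels
    (nodata, outside the field boundary) are excluded from the reduction.
    """
    valid = ~np.isnan(values)
    z, v = zone_ids[valid].ravel(), values[valid].ravel()
    counts = np.bincount(z)
    sums = np.bincount(z, weights=v)
    sq = np.bincount(z, weights=v * v)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = sums / counts                     # NaN where a zone is empty
        std = np.sqrt(sq / counts - mean ** 2)   # population std per zone
    return mean, std
```

The same pass runs once per property (NDVI, NDRE, elevation, slope, point density), which is why the full statistics stage stays under two minutes even for a section-sized field.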
Per-hex properties in the merged grid
The H3 GeoJSON output carries a platforms property on each feature — "m3m", "cine", or "dual" — telling downstream consumers which data layers are available. Multispectral indices (ndvi_mean, ndre_mean, cire_mean) come from the M3M. VARI can come from either the Cine RGB or the M3M’s onboard RGB camera. Point cloud metrics (pc_count, pc_density, pc_z_mean, pc_z_std) come from the Cine when available — its full SfM reconstruction produces a substantially denser point cloud than the M3M’s fast-orthophoto mode. Camera coverage is tracked per platform (shot_count_m3m, shot_count_cine). Terrain derivatives (DSM, DTM, slope, aspect, canopy height model) come from whichever platform is present; when both are, the M3M’s is used for consistency with the spectral data.
Phase 5 — Equipment Export
The prescription is only useful if the farmer’s equipment can execute it.
pipeline/processing/scripts/export_shapefile.py → H3 GeoJSON → zipped Shapefile
pipeline/processing/scripts/export_isoxml.py → H3 GeoJSON → ISO 11783 TaskData
pipeline/cloud/export_function/main.py → Cloud Function HTTP endpoint (signed URL)
atlas/src/lib/export/equipment-export.ts → Client-side export + browser download

Two export formats cover the entire VRA controller market. Shapefile — read by every controller manufactured in the last twenty years: John Deere GreenStar, Trimble GFX and TMX, Raven Viper 4, AG Leader InCommand, Precision Planting 20|20, Climate FieldView. ISO-XML TaskData (ISO 11783 Part 10) — the ISOBUS-compliant standard for AGCO, Fendt, CLAAS, CNH (Case IH and New Holland), Kubota, and any ISO 11783-compliant controller. Paraforos et al. [14] provide the definitive peer-reviewed review of ISOBUS, covering all fourteen parts of the standard including Part 10’s role in task controller prescription map data interchange.
export_shapefile.py converts the H3 hexagonal GeoJSON to a zipped Shapefile with WGS84 projection and a TgtRate column in lbs N/acre. The hexagonal zone geometry exports directly — no rectangular gridding, no dissolving into irregular polygons. DBF column names stay under 10 characters (the dBASE format constraint). All four required components — .shp, .shx, .dbf, .prj — share the same base name inside the zip.
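The 10-character DBF constraint forces a deterministic renaming pass, because merged dual-platform properties like ndvi_mean_m3m and ndvi_mean_cine collide when naively truncated. A sketch of one way to handle it (the helper name is hypothetical; the real exporter writes the columns through its Shapefile library):

```python
def dbf_names(columns):
    """Truncate column names to dBASE's 10-character limit, de-duplicating
    collisions with a numeric suffix so every column stays addressable."""
    out, seen = {}, set()
    for col in columns:
        name = col[:10].upper()
        i = 1
        while name in seen:  # e.g. two columns both truncating to NDVI_MEAN_
            suffix = f"_{i}"
            name = (col[:10 - len(suffix)] + suffix).upper()
            i += 1
        seen.add(name)
        out[col] = name
    return out
```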
export_isoxml.py rasterizes the H3 hexagons onto a north-aligned rectangular grid and packages the result into the TASKDATA/ folder structure with TASKDATA.XML and binary grid files. The rasterization step is necessary because ISO-XML represents prescriptions as rectangular grids, not polygonal zones.
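A toy version of that rasterization step: sample each grid-cell center against the zone polygons and take the containing zone's rate. This is a geometric sketch only — the real exporter works in the field's projected CRS and writes the ISO-XML binary grid format, and both function names here are hypothetical:

```python
def point_in_poly(px, py, poly):
    """Even-odd ray casting; poly is a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            if px < x1 + (py - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

def rasterize(zones, origin, cell, nx, ny, nodata=0.0):
    """North-aligned rate grid. `zones` is [(polygon, rate), ...];
    `origin` is the grid's lower-left corner, `cell` the cell size."""
    ox, oy = origin
    grid = [[nodata] * nx for _ in range(ny)]
    for row in range(ny):
        for col in range(nx):
            cx, cy = ox + (col + 0.5) * cell, oy + (row + 0.5) * cell
            for poly, rate in zones:
                if point_in_poly(cx, cy, poly):
                    grid[row][col] = rate
                    break
    return grid
```

Cells whose centers fall outside every hexagon get the nodata rate, which is why the ISO-XML grid resolution is chosen fine enough that each hex spans many cells.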
The AgGateway ADAPT framework [15] provides the industry’s common data model for translating between precision agriculture formats — over 500,000 downloads. Campos et al. [16] validate the complete end-to-end pipeline from UAV multispectral imagery through prescription map generation to variable-rate field execution.
Both formats carry the same underlying data: zone polygons, nitrogen rate in lbs/acre (or kg/ha for metric terminals), field boundary metadata, and — when dual-platform data is available — the stress classification column that tells the agronomist whether each zone’s rate was driven by nitrogen sufficiency index from the M3M data or excluded due to water stress classification confirmed by cross-referencing the Cine’s visual layer.
export_function/main.py is a Cloud Function HTTP endpoint that generates a time-limited signed download URL for the requested export format. The Atlas frontend’s equipment-export.ts presents the available formats, triggers the Cloud Function, and initiates the browser download. The farmer selects their equipment brand, gets the right file format, and downloads it.
Phase 6 — CI/CD
Three GitHub Actions workflows keep the three deployable components in sync with their source directories.
.github/workflows/atlas-deploy.yml → Cloudflare Pages on atlas/ changes
.github/workflows/tile-proxy-deploy.yml → Cloudflare Workers on workers/tile-proxy/ changes
.github/workflows/pipeline-container.yml → Artifact Registry on pipeline/ changes

Each workflow is scoped to its directory path filter. A documentation change does not rebuild the container. A frontend color tweak does not redeploy the Worker. A processing script fix does not redeploy the Atlas app. Nuyujukian [17] presents the integrated DevOps framework for scientific computing — Git repositories, CI/CD engines, container registries — that enables reproducible, decentralized data workflows across local, HPC, and cloud environments.
The pipeline container workflow builds Dockerfile.cloud (which extends the local Dockerfile with gcloud CLI and cloud-specific dependencies), runs a smoke test against a small sample dataset from each platform (M3M and Cine), and pushes the tagged image to Google Artifact Registry. Cloud Batch jobs reference the image by tag, so deploying a new pipeline version is a container push followed by a tag update in the batch job template — no VM reprovisioning, no downtime, no migration.
The Time Window
The processing window — the time between imagery upload and prescription availability — is the product’s core performance metric. It is bounded almost entirely by photogrammetry, which is RAM-bound and irreducibly serial in its most expensive phases [20].
For an M3M-only 80-acre sugar beet field (~500 multispectral images): fast-orthophoto completes in 30–90 minutes. Post-processing — NDVI/NDRE computation, H3 zonal statistics, PMTiles generation, equipment export — adds under 10 minutes. Total: under 2 hours. The farmer flies in the morning. The prescription is on the endpoint by lunch.
For a dual-platform 80-acre field (~500 images per platform): M3M fast-orthophoto completes in 30–90 minutes, Cine full SfM adds 2–6 hours, post-processing adds under 10 minutes. Total: 2.5–7 hours. The Cine adds the visual validation layer at the cost of tripling the time window.
For a dual-platform 640-acre wheat section (~3000 images per platform): the window stretches to 7.5–18 hours on an n2-highmem-16. M3M accounts for ~30% of that time, Cine for ~70%. Post-processing remains under 10 minutes regardless of field size or platform count — it is the photogrammetry that scales, not the analysis.
Whether the Cine’s time cost is worth it depends on whether the agronomist needs natural-color confirmation of the spectral anomalies, or whether the M3M’s multispectral data alone is sufficient for the prescription decision. For routine mid-season nitrogen applications on well-characterized fields, M3M-only is the minimum viable prescription path. For first-year fields, anomaly investigation, or insurance documentation, the Cine’s visual layer justifies the additional processing time.
The equipment file is a derived artifact of the H3 grid, generated in the same post-processing pass. It is available the moment tiles are.
What This Pipeline Does Not Do
It does not make agronomic decisions. The NDVI-to-rate conversion — the function that maps a sufficiency index to a nitrogen application rate — is a lookup table calibrated by an agronomist for a specific crop, growth stage, and soil series. The pipeline computes the index. The agronomist signs off on the rate table. The farmer executes the prescription. The pipeline’s job is to make the time between flight and execution as short and as transparent as possible.
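What such a rate table looks like in code — with the emphatic caveat that every number below is a placeholder for illustration, not agronomic advice; the real table is calibrated and signed off per crop, growth stage, and soil series:

```python
import bisect

# Hypothetical agronomist-supplied table: sufficiency index (zone mean
# relative to the N-rich strip) -> nitrogen rate in lbs N/acre.
SUFFICIENCY_BREAKS = [0.70, 0.85, 0.95]   # upper edge of each stress class
RATES_LBS_N_AC = [60.0, 40.0, 20.0, 0.0]  # one more rate than breakpoints

def rate_for(sufficiency: float) -> float:
    """Map a sufficiency index to an application rate via the lookup table."""
    return RATES_LBS_N_AC[bisect.bisect_right(SUFFICIENCY_BREAKS, sufficiency)]
```

The pipeline's role ends at computing the index per hex; swapping in a different signed table changes the prescription without touching any processing code.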
It does not replace ground truth. Guo et al. [22] demonstrate that radiometric calibration converts raw sensor digital numbers into physically comparable reflectance values — but calibration corrects for sensor response, not for agronomic interpretation. The N-rich strip described in Article II remains the in-field reference that anchors the sufficiency calculation. The Cine’s visual layer confirms the spatial pattern. Neither replaces tissue sampling.
It does not lock anyone in. Every artifact is an open format. COGs are OGC-standardized GeoTIFFs [7]. PMTiles are an IANA-registered media type [6]. Shapefiles are a thirty-year-old de facto standard. ISO-XML is an ISO standard [14]. H3 is an open library [9]. The pipeline is containerized [1, 2], decomposed into independently replaceable modules [3], and deployed on commodity cloud infrastructure [4, 5]. If a better tool emerges for any phase, swap it in. The decisions are the product. The code is the implementation.
Next: Article III — Timing, Crop-Specific Thresholds, and the Full-Season Flight Calendar for Magic Valley Row Crops.
References:
1. Boettiger, C. (2015). An introduction to Docker for reproducible research. ACM SIGOPS Operating Systems Review, 49(1), 71–79. doi:10.1145/2723872.2723882
2. Nüst, D. et al. (2020). Ten simple rules for writing Dockerfiles for reproducible data science. PLOS Computational Biology, 16(11), e1008316. doi:10.1371/journal.pcbi.1008316
3. Dragoni, N. et al. (2017). Microservices: yesterday, today, and tomorrow. In Present and Ulterior Software Engineering, Springer, 195–216. doi:10.1007/978-3-319-67425-4_12
4. Yang, C. et al. (2011). Spatial cloud computing: how can the geospatial sciences use and help shape cloud computing? Int. J. Digital Earth, 4(4), 305–329. doi:10.1080/17538947.2011.587547
5. Bebortta, S. et al. (2020). Geospatial serverless computing: architectures, tools and future directions. ISPRS Int. J. Geo-Information, 9(5), 311. doi:10.3390/ijgi9050311
6. Liu, B. (2022). PMTiles specification, version 3. protomaps.com. github.com/protomaps/PMTiles/blob/main/spec/v3/spec.md
7. Open Geospatial Consortium (2023). OGC Cloud Optimized GeoTIFF standard, version 1.0. OGC Document 21-026.
8. Moon, B., Jagadish, H.V., Faloutsos, C., Saltz, J.H. (2001). Analysis of the clustering properties of the Hilbert space-filling curve. IEEE Trans. Knowledge and Data Engineering, 13(1), 124–141. doi:10.1109/69.908985
9. Brodsky, I. (2018). H3: Uber’s hexagonal hierarchical spatial index. Uber Engineering Blog. uber.com/blog/h3/
10. Sahr, K., White, D., Kimerling, A.J. (2003). Geodesic discrete global grid systems. Cartography and Geographic Information Science, 30(2), 121–134. doi:10.1559/152304003100011090
11. Birch, C.P.D., Oom, S.P., Beecham, J.A. (2007). Rectangular and hexagonal grids used for observation, experiment and simulation in ecology. Ecological Modelling, 206(3–4), 347–359. doi:10.1016/j.ecolmodel.2007.03.041
12. Bondaruk, B., Roberts, S.A., Robertson, C. (2020). Assessing the state of the art in discrete global grid systems: OGC criteria and present functionality. Geomatica, 74(1), 9–30. doi:10.1139/geomat-2019-0015
13. Singla, S., Eldawy, A. (2020). Raptor zonal statistics: fully distributed zonal statistics of big raster + vector data. IEEE Int. Conf. Big Data, 571–580. doi:10.1109/BigData50022.2020.9377907
14. Paraforos, D.S., Sharipov, G.M., Griepentrog, H.W. (2019). ISO 11783-compatible industrial sensor and control systems and related research: a review. Computers and Electronics in Agriculture, 163, 104863. doi:10.1016/j.compag.2019.104863
15. AgGateway (2024). ADAPT: Agricultural Data Application Programming Toolkit. adaptframework.org
16. Campos, J. et al. (2020). On-farm evaluation of prescription map-based variable rate application of pesticides in vineyards. Agronomy, 10(1), 102. doi:10.3390/agronomy10010102
17. Nuyujukian, P. (2023). Leveraging DevOps for scientific computing. arXiv:2310.08247.
18. Rouse, J.W. Jr., Haas, R.H., Schell, J.A., Deering, D.W. (1974). Monitoring vegetation systems in the Great Plains with ERTS. Third ERTS-1 Symposium, NASA SP-351, 309–317.
19. Tucker, C.J. (1979). Red and photographic infrared linear combinations for monitoring vegetation. Remote Sensing of Environment, 8(2), 127–150. doi:10.1016/0034-4257(79)90013-0
20. Westoby, M.J. et al. (2012). ‘Structure-from-Motion’ photogrammetry: a low-cost, effective tool for geoscience applications. Geomorphology, 179, 300–314. doi:10.1016/j.geomorph.2012.08.021
21. Vacca, G. (2019). Overview of open source software for close range photogrammetry. ISPRS Archives, XLII-4/W14, 239–245. doi:10.5194/isprs-archives-XLII-4-W14-239-2019
22. Guo, Y. et al. (2019). Radiometric calibration for multispectral camera of different imaging conditions mounted on a UAV platform. Sustainability, 11(4), 978. doi:10.3390/su11040978