Modern hurricane forecasting excels at track prediction once a tropical system is organized enough to name. What remains harder is the period before organization — the days when atmospheric conditions are evolving toward a storm that doesn't yet exist on any map. We explored whether that pre-formation period leaves a detectable fingerprint in the large-scale atmospheric state. It does, but finding it required three iterations and a key insight about spatial scale.
The data
We used ERA5 reanalysis data from ECMWF — geopotential height at 500 hPa, total column water vapor, and wind shear between 200 and 850 hPa — over regions centered on eventual landfall locations. For each event, we extracted a 30-day window preceding landfall plus 5 days after. The current analysis covers eight 2020 CONUS events: the Nashville and Easter tornado outbreaks, Tropical Storm Cristobal, the Iowa derecho, and Hurricanes Laura, Sally, Delta, and Zeta. For each event, we generated a control — the same geographic region and calendar date in a year with no comparable storm.
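As a minimal sketch of the windowing described above, the helper below computes the 30-days-before, 5-days-after date range for an event and the equivalent range anchored on the same calendar day in a control year. The names `Window`, `analysis_window`, and `control_window`, and the 2019 control year, are illustrative assumptions; the control years actually used are not stated here. The example date is Hurricane Laura's landfall on 27 August 2020.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Window:
    start: date
    end: date


def analysis_window(event_day: date, days_before: int = 30, days_after: int = 5) -> Window:
    """Date range from `days_before` days ahead of the event to `days_after` days after it."""
    return Window(event_day - timedelta(days=days_before),
                  event_day + timedelta(days=days_after))


def control_window(event_day: date, control_year: int) -> Window:
    """Window anchored on the same calendar day in another year (hypothetical control year)."""
    # date.replace raises ValueError for 29 February in a non-leap year;
    # none of the events discussed here falls on that date.
    return analysis_window(event_day.replace(year=control_year))


laura = date(2020, 8, 27)
event = analysis_window(laura)
control = control_window(laura, 2019)
print(event.start, event.end)
print(control.start, control.end)
```

Anchoring on the event day rather than on the window start keeps the lead time identical for event and control, even when a leap day falls inside one of the two windows.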
Three versions, one key insight
Version 1 used a single atmospheric field (500 hPa geopotential height) over a broad CONUS-scale window. Six of eight events showed a positive precursor signal, with a mean approach correlation of +0.271 versus −0.094 for controls. Two events — Hurricane Delta and the Iowa derecho — showed no signal at all. The pilot was promising but inconsistent.
Version 2 added total column water vapor and wind shear, building a multi-field representation. This improved mean correlation from +0.010 (z500 alone) to +0.283 (combined), and pushed positive detection from 4/8 to 7/8. The multi-field approach resolved events where moisture dynamics, not geopotential structure, carried the precursor signal. But Hurricane Delta remained negative.
Version 3 introduced adaptive spatial routing: instead of analyzing every event in the same broad CONUS window, the system selects a spatial frame matched to the event type. Hurricanes and tropical storms route to a Gulf of Mexico or Southeast Gulf window, where the tropical signal concentrates. Tornadoes and the derecho stay on the CONUS window, where the synoptic-scale reorganization is visible. The derecho was additionally analyzed on a high-resolution (0.25°, 6-hour) Midwest corridor that resolves its sub-daily convective signal.
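The routing idea can be sketched as a lookup from event type to candidate windows. This is an illustration under stated assumptions, not the system's actual logic: the `ROUTES` table and the `route` function are hypothetical, and the rule that picks Gulf versus Southeast Gulf for a given storm is not described in this post, so both are returned as candidates. The `REPORTED` entries are copied from the v3 results table.

```python
from typing import Dict, List, Tuple

# Hypothetical routing table: event type -> candidate window names.
# How a storm is assigned to Gulf vs. SE Gulf is not specified in the source.
ROUTES: Dict[str, List[str]] = {
    "hurricane": ["Gulf", "SE Gulf"],
    "tropical": ["Gulf", "SE Gulf"],
    "tornado": ["CONUS"],
    "derecho": ["CONUS"],
}


def route(event_type: str) -> List[str]:
    """Return candidate windows for an event type, falling back to CONUS."""
    return list(ROUTES.get(event_type.strip().lower(), ["CONUS"]))


# (event type, window used) as reported in the v3 results table.
REPORTED: Dict[str, Tuple[str, str]] = {
    "Nashville Tornadoes": ("Tornado", "CONUS"),
    "Easter Tornadoes": ("Tornado", "CONUS"),
    "TS Cristobal": ("Tropical", "SE Gulf"),
    "Iowa Derecho": ("Derecho", "CONUS"),
    "Hurricane Laura": ("Hurricane", "Gulf"),
    "Hurricane Sally": ("Hurricane", "Gulf"),
    "Hurricane Delta": ("Hurricane", "SE Gulf"),
    "Hurricane Zeta": ("Hurricane", "Gulf"),
}

# Every reported window should be among the candidates for its event type.
consistent = all(window in route(kind) for kind, window in REPORTED.values())
print(consistent)
```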
The result: all eight events show a positive event correlation, and six of the eight also have a positive event–control gap. The mean gap is +0.500, about 1.8× the v2 mean of +0.283. Hurricane Delta, which had been the worst event at −0.964 on the CONUS window, scored +0.920 on the Southeast Gulf window. Its precursor signal was always there, but it was diluted by the continental-scale analysis frame. The derecho scored +0.755 on the CONUS window and +1.049 at high resolution, the largest gap of any event in the study.
Adaptive Routing Results (v3)
| Event | Type | Window | Event Corr | Control Corr | Gap |
|---|---|---|---|---|---|
| Nashville Tornadoes | Tornado | CONUS | +0.373 | +0.505 | −0.132 |
| Easter Tornadoes | Tornado | CONUS | +0.279 | +0.429 | −0.150 |
| TS Cristobal | Tropical | SE Gulf | +0.554 | −0.360 | +0.914 |
| Iowa Derecho | Derecho | CONUS | +0.123 | −0.632 | +0.755 |
| Hurricane Laura | Hurricane | Gulf | +0.807 | +0.005 | +0.802 |
| Hurricane Sally | Hurricane | Gulf | +0.419 | +0.250 | +0.169 |
| Hurricane Delta | Hurricane | SE Gulf | +0.841 | −0.079 | +0.920 |
| Hurricane Zeta | Hurricane | Gulf | +0.494 | −0.229 | +0.723 |
Mean gap across all 8 events: +0.500 (adaptive routing). The two tornado events have negative gaps: their controls show stronger approach trends than the events, suggesting the synoptic signal is weaker or different for tornado outbreaks at this spatial scale. The five tropical systems and the derecho all have positive gaps, ranging from +0.169 (Sally) to +0.920 (Delta).
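The gap column is simply the event correlation minus the control correlation, and the summary figures can be re-derived from the table. The short check below uses only the values printed above; the variable names are illustrative.

```python
from statistics import mean

# (event, window, event correlation, control correlation) copied from the table.
ROWS = [
    ("Nashville Tornadoes", "CONUS", 0.373, 0.505),
    ("Easter Tornadoes", "CONUS", 0.279, 0.429),
    ("TS Cristobal", "SE Gulf", 0.554, -0.360),
    ("Iowa Derecho", "CONUS", 0.123, -0.632),
    ("Hurricane Laura", "Gulf", 0.807, 0.005),
    ("Hurricane Sally", "Gulf", 0.419, 0.250),
    ("Hurricane Delta", "SE Gulf", 0.841, -0.079),
    ("Hurricane Zeta", "Gulf", 0.494, -0.229),
]

# Gap = event correlation - control correlation, rounded to the table's precision.
gaps = {name: round(ev - ctl, 3) for name, _, ev, ctl in ROWS}
mean_gap = round(mean(gaps.values()), 3)
negative_gaps = [name for name, gap in gaps.items() if gap < 0]
positive_event_corr = sum(ev > 0 for _, _, ev, _ in ROWS)

print(f"mean gap: {mean_gap:+.3f}")
print("negative gaps:", negative_gaps)
print("events with positive correlation:", positive_event_corr)
```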
Storms decompose into modes
Each event produces a multi-field interaction profile across five atmospheric variables: moisture (total column water vapor), low-level shear, thermal structure (850 hPa temperature), flow gradient (500 hPa north–south), and spectral concentration (500 hPa). Decomposing this profile reveals distinct interaction modes — independent patterns of atmospheric coupling that characterize how the pre-storm environment organizes.
Hurricanes separate from other event types with a type separation score of 0.647. Every event shows three or more distinct interaction modes. Hurricane Delta's leading mode is dominated by moisture–thermal coupling, which explains the pre-formation signal. The Iowa derecho shows a different structure: its leading mode loads on flow gradient and spectral concentration, reflecting the large-scale steering flow that channeled the convective system across the Midwest.
These modes are not labels applied after the fact. They emerge from the atmospheric fields in the 14 days before each event. The modes correspond to physical processes that atmospheric scientists would recognize: formation (moisture and thermal coupling), steering (flow gradient dominance), and intensification (shear modulation). They were extracted without any weather knowledge built into the analysis.
The progression tells a story
Version Progression
| Version | What Changed | Positive Events | Mean Gap |
|---|---|---|---|
| v1 | Single field (z500), CONUS window | 4 / 8 | +0.010 |
| v2 | Multi-field (z500 + TCWV + shear) | 7 / 8 | +0.283 |
| v3 | Adaptive spatial routing | 8 / 8 | +0.500 |
Each version added exactly one capability. v1 established that the signal exists. v2 showed that multiple atmospheric fields carry complementary information. v3 showed that the spatial analysis frame must match the phenomenon: a hurricane's precursor concentrates in the Gulf, not across the continent. The derecho's precursor is sharpest in a narrow Midwest corridor at 6-hour temporal resolution (+1.049, versus +0.755 in daily CONUS data). Matching the frame to the physics was the key.
Honest limitations
Eight events is still a small sample. The two tornado outbreaks have negative gaps, meaning the precursor framework in its current form does not reliably detect tornado precursors at these spatial scales. The adaptive routing requires knowing the event type in advance to select the right window — operationally, this would need to be automated or replaced by a multi-scale approach that evaluates all windows simultaneously.
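One way to drop the dependence on a known event type, sketched here as an assumption and not as something this study implemented, is to score every window and keep the strongest. The function name `best_window` and the "high-res Midwest" label are illustrative; the numbers are the scores quoted in the text for Hurricane Delta and the Iowa derecho.

```python
from typing import Dict, Tuple


def best_window(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the (window, score) pair with the highest score."""
    if not scores:
        raise ValueError("no windows were scored")
    name = max(scores, key=scores.__getitem__)
    return name, scores[name]


# Scores quoted in the text for two events on two windows each.
delta = {"CONUS": -0.964, "SE Gulf": 0.920}
derecho = {"CONUS": 0.755, "high-res Midwest": 1.049}

print(best_window(delta))
print(best_window(derecho))
```

Taking a maximum over windows would also tend to raise control scores, so the same selection would need to be applied to the controls before comparing gaps.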
ERA5 is a reanalysis product, not a forecast. It incorporates observations that were available only after the fact. A true operational system would need to work with forecast or real-time analysis fields. Our results represent an upper bound on what's detectable. We also note that the Hurricane Delta result (−0.964 on CONUS, +0.920 on SE Gulf) illustrates both the power and the fragility of spatial windowing — the signal was always there, but the analysis frame determined whether it was visible.
Data and reproducibility
All data is public: ERA5 reanalysis (ECMWF, via the Copernicus Climate Data Store), using 500 hPa geopotential height, total column water vapor, and 200–850 hPa wind shear. Eight CONUS events from 2020, each paired with a control from the same region and calendar dates in a different year. Spatial windows: CONUS (25–50°N, 125–65°W), Gulf (25–32.5°N, 100–80°W), SE Gulf (25–37°N, 95–75°W). All computation performed on Apple Silicon with no cloud resources.
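For anyone reproducing the extraction, the windows above can be turned into bounding boxes as follows. This is a sketch under assumptions: the helper names are hypothetical, and the `[north, west, south, east]` ordering follows the `area` convention of Climate Data Store requests. ERA5 files may also store longitudes on a 0–360 grid, so a conversion helper is included.

```python
from typing import Dict, List, Tuple

# Bounds as listed above: (south °N, north °N, west °W, east °W).
WINDOWS_DEG_W: Dict[str, Tuple[float, float, float, float]] = {
    "CONUS": (25.0, 50.0, 125.0, 65.0),
    "Gulf": (25.0, 32.5, 100.0, 80.0),
    "SE Gulf": (25.0, 37.0, 95.0, 75.0),
}


def cds_area(south: float, north: float, west_w: float, east_w: float) -> List[float]:
    """Bounding box as [north, west, south, east], with west longitudes negative."""
    return [north, -west_w, south, -east_w]


def to_0_360(lon: float) -> float:
    """Convert a signed longitude (-180..180) to the 0-360 convention."""
    return lon % 360.0


for name, bounds in WINDOWS_DEG_W.items():
    area = cds_area(*bounds)
    print(name, area, [to_0_360(area[1]), to_0_360(area[3])])
```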
Working on atmospheric precursors, severe storm detection, or reanalysis-based forecasting? Reach us at trevin@lytelab.ai