Real-time UK rail data engine in Rust (v1.20.0). Subscribes directly to National Rail's Darwin Push Port STOMP firehose, tracks every train in a lock-free in-memory registry, and predicts delays with a LightGBM model — answering the vast majority of queries from local state and prediction without ever polling the live GBR API.
A UK rail data engine written in Rust. RailPredict subscribes directly to the Darwin Push Port — National Rail's STOMP-based firehose of every train movement in the country — and uses that stream to build an intelligent buffer between users and the Great British Railways API. The vast majority of queries are answered from local state, in-memory cache, and statistical prediction; the only call that ever hits GBR directly is the one that genuinely requires it: final ticket purchase.
The GBR API is slow, rate-limited, and expensive to call repeatedly. A naive rail app that hits the live API for every search, every page load, and every status check will feel sluggish and burn through its quota the moment traffic spikes.
Most rail apps are dumb mirrors: ask GBR, show the result, repeat. RailPredict is built on the premise that most of what a user needs during a journey search can be answered locally, without a single outbound request.
Rail data is separated into tiers based on how frequently it actually needs to change:
| Tier  | Data                                               | Staleness tolerance | Source                    |
|:------|:---------------------------------------------------|:--------------------|:--------------------------|
| **A** | Timetables, station names, base fares              | Days–weeks          | GTFS static feed          |
| **B** | Predicted delay, likely platform, fare range       | Minutes–hours       | Local history + inference |
| **C** | Live position, seat availability, final price lock | Seconds             | GBR Retail API (live)     |
The goal: by the time a user reaches checkout, all Tier A and Tier B data is already loaded. The single Tier C call happens only when they confirm payment.
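As a rough illustration of how the tiers could drive cache decisions, the sketch below attaches a maximum cache age to each tier. The `Tier` enum, the `max_age` and `is_fresh` names, and the specific durations are all assumptions picked from within the ranges in the table; they are not taken from RailPredict's code.

```rust
use std::time::Duration;

/// Data tiers from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    A,
    B,
    C,
}

impl Tier {
    /// Illustrative maximum cache age per tier (assumed values, not the project's).
    pub fn max_age(self) -> Duration {
        match self {
            Tier::A => Duration::from_secs(7 * 24 * 60 * 60), // one week
            Tier::B => Duration::from_secs(15 * 60),          // fifteen minutes
            Tier::C => Duration::from_secs(5),                // five seconds
        }
    }

    /// True when a cached value of this tier is still usable at the given age.
    pub fn is_fresh(self, age: Duration) -> bool {
        age <= self.max_age()
    }
}
```

Under these assumed limits, a day-old timetable entry (Tier A) would still be served from cache, while an hour-old delay prediction (Tier B) would be recomputed.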
Instead of polling GBR for train positions, RailPredict subscribes to the Darwin Push Port, National Rail's STOMP-based firehose of every train movement in the UK. Incoming XML messages are parsed, filtered, and written to an in-memory registry keyed by RID (Darwin's unique identifier for a single train service run). Read-only queries are served entirely from that cache.
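A minimal sketch of a registry keyed by RID might look like the following. To stay self-contained it uses a standard-library `RwLock<HashMap>` rather than a lock-free map, so it shows the shape of the data flow, not the concurrency design. `TrainStatus`, its fields, and the `upsert`/`get` methods are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical snapshot of what the registry holds for one train.
#[derive(Debug, Clone, PartialEq)]
pub struct TrainStatus {
    pub rid: String,
    pub delay_minutes: i32,
    pub platform: Option<String>,
}

/// In-memory registry keyed by RID.
/// Assumption: RwLock + HashMap stands in for whatever map the project uses.
#[derive(Default)]
pub struct Registry {
    trains: RwLock<HashMap<String, TrainStatus>>,
}

impl Registry {
    /// Called by the stream consumer after a message is parsed and filtered.
    pub fn upsert(&self, status: TrainStatus) {
        let mut trains = self.trains.write().expect("registry lock poisoned");
        trains.insert(status.rid.clone(), status);
    }

    /// Read-only query path: returns a copy and never touches the network.
    pub fn get(&self, rid: &str) -> Option<TrainStatus> {
        let trains = self.trains.read().expect("registry lock poisoned");
        trains.get(rid).cloned()
    }
}
```

The stream consumer would call `upsert` for each relevant update, and query handlers would only ever call `get`.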
Not every train deserves the same attention. Each train in the registry is assigned a state that controls how aggressively the system monitors it:
Dormant → Monitored → Active → Critical → Terminal
| State         | Condition                      | Behaviour                       |
|:--------------|:-------------------------------|:--------------------------------|
| **Dormant**   | Departure > 2 h                | Static data only, no live calls |
| **Monitored** | 30–120 min out                 | Darwin stream, history building |
| **Active**    | 0–30 min out                   | High-frequency updates          |
| **Critical**  | < 5 min or disruption detected | Real-time stream, push alerts   |
| **Terminal**  | Departed                       | Evicted from active monitoring  |
External events — signal failures, weather alerts, route-level disruption flags — can force emergency promotion regardless of departure time.
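The state table and the emergency-promotion rule can be sketched as a single classification function. This is an illustration under stated assumptions: the `classify` function and its parameters are hypothetical, `disrupted` stands in for any external disruption flag, and where the Active and Critical ranges overlap (under 5 minutes) the sketch lets Critical win. Boundary handling at exactly 30 and 120 minutes is also an assumption.

```rust
use std::time::Duration;

/// Monitoring states from the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TrainState {
    Dormant,
    Monitored,
    Active,
    Critical,
    Terminal,
}

/// Classify a train from its time until departure.
/// `until_departure` is `None` once the train has departed.
/// `disrupted` is a hypothetical flag for signal failures, weather alerts, etc.
pub fn classify(until_departure: Option<Duration>, disrupted: bool) -> TrainState {
    const FIVE_MIN: Duration = Duration::from_secs(5 * 60);
    const THIRTY_MIN: Duration = Duration::from_secs(30 * 60);
    const TWO_HOURS: Duration = Duration::from_secs(2 * 60 * 60);

    match until_departure {
        // Departed trains are Terminal regardless of disruption.
        None => TrainState::Terminal,
        // Emergency promotion: disruption overrides the departure-time bands.
        Some(_) if disrupted => TrainState::Critical,
        Some(d) if d < FIVE_MIN => TrainState::Critical,
        Some(d) if d <= THIRTY_MIN => TrainState::Active,
        Some(d) if d <= TWO_HOURS => TrainState::Monitored,
        Some(_) => TrainState::Dormant,
    }
}
```

Keeping classification as a pure function of time and flags means it can be re-run on every registry update without carrying transition history.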
If 50 users are watching the same train, one outbound call is made and the result is fanned out to all 50. The coalescer deduplicates concurrent in-flight requests by key and wires late arrivals directly onto the pending future.
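The coalescing idea can be sketched with the standard library alone. Note the swap: this version blocks threads on a `Condvar` instead of attaching callers to a pending async future, so it illustrates the deduplication logic and not RailPredict's actual async design. `Coalescer`, `Slot`, and `get_or_fetch` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};

/// One in-flight request; followers wait on `ready` until `value` is filled.
struct Slot<V> {
    value: Mutex<Option<V>>,
    ready: Condvar,
}

/// Deduplicates concurrent requests that share a key.
pub struct Coalescer<V> {
    in_flight: Mutex<HashMap<String, Arc<Slot<V>>>>,
}

impl<V: Clone> Coalescer<V> {
    pub fn new() -> Self {
        Coalescer { in_flight: Mutex::new(HashMap::new()) }
    }

    /// The first caller for `key` (the leader) runs `fetch`; callers that
    /// arrive while it is running wait and receive a clone of its result.
    pub fn get_or_fetch<F>(&self, key: &str, fetch: F) -> V
    where
        F: FnOnce() -> V,
    {
        // Join an existing slot or register a new one, under the map lock.
        let (slot, is_leader) = {
            let mut map = self.in_flight.lock().unwrap();
            let existing = map.get(key).cloned();
            match existing {
                Some(slot) => (slot, false),
                None => {
                    let slot = Arc::new(Slot {
                        value: Mutex::new(None),
                        ready: Condvar::new(),
                    });
                    map.insert(key.to_string(), Arc::clone(&slot));
                    (slot, true)
                }
            }
        };

        if is_leader {
            // The single outbound call happens here, outside the map lock.
            let value = fetch();
            *slot.value.lock().unwrap() = Some(value.clone());
            // Later callers start a fresh request instead of reusing this one.
            self.in_flight.lock().unwrap().remove(key);
            slot.ready.notify_all();
            value
        } else {
            let mut guard = slot.value.lock().unwrap();
            while guard.is_none() {
                guard = slot.ready.wait(guard).unwrap();
            }
            (*guard).clone().expect("value is set before waiters are woken")
        }
    }
}
```

One limitation of this sketch: if `fetch` panics, followers wait forever. A fuller version would store a `Result` in the slot and wake waiters on failure as well.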
If GBR starts returning errors, the system enters Cache Only mode and stops sending requests until GBR recovers. Callers see cached data; GBR sees no additional load.
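Cache Only mode resembles a circuit breaker, and one possible shape is sketched below. The source does not say how errors are counted or how recovery is detected, so the consecutive-failure threshold, the cooldown after which a probe request is allowed, and the names `Breaker` and `Mode` are all assumptions.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Live,
    CacheOnly,
}

/// Hypothetical breaker: trips after N consecutive failures, then allows
/// a probe once the cooldown has elapsed.
pub struct Breaker {
    failure_threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

impl Breaker {
    pub fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        Breaker { failure_threshold, cooldown, consecutive_failures: 0, opened_at: None }
    }

    /// Whether outbound calls are allowed at time `now`.
    pub fn mode(&self, now: Instant) -> Mode {
        match self.opened_at {
            Some(t) if now.saturating_duration_since(t) < self.cooldown => Mode::CacheOnly,
            _ => Mode::Live,
        }
    }

    /// A successful call closes the breaker and clears the failure count.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None;
    }

    /// A failed call counts toward the threshold; at or past it, the
    /// breaker (re)opens and the cooldown restarts from `now`.
    pub fn record_failure(&mut self, now: Instant) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold {
            self.opened_at = Some(now);
        }
    }
}
```

Callers would check `mode` before each outbound request and serve cached data while it returns `Mode::CacheOnly`. After the cooldown, one failed probe is enough to reopen the breaker because the failure count is only reset on success.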