Architecture · 7 min read
Why Helpway runs entirely on Cloudflare's edge
First-token latency from Tokyo: 620ms. From a single US-East region: closer to 2.1 seconds. Geography is the bottleneck.
Every Helpway component (AI pipeline, widget, WebSocket connections, vector search) runs on Cloudflare Workers at the edge location closest to the visitor.
The naive architecture is one region (say us-east-1) serving everyone. From Tokyo, that's roughly a 150ms round trip, and connection setup and the request itself each pay it before a single token renders. Helpway starts its reply from a Tokyo-adjacent Cloudflare city, so the first token reaches the visitor in 620ms end-to-end.
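To make the round-trip arithmetic concrete, here is a rough back-of-the-envelope model, not Helpway's actual measurement code. Only the 150ms Tokyo to us-east-1 figure comes from this post. The round-trip count, the 10ms edge ping, and the 500ms server time are illustrative assumptions, and the names `PathEstimate` and `firstTokenMs` are hypothetical. The model does not try to reproduce the 620ms or 2.1 second figures above; it only shows why distance is paid more than once.

```typescript
// Rough model: time to first token = (round trips before first byte) * RTT + server time.
// Every number except the 150 ms ping is an illustrative assumption.

interface PathEstimate {
  rttMs: number;      // network round-trip time between visitor and server
  roundTrips: number; // round trips spent before the first response byte arrives
  serverMs: number;   // time the server needs to produce the first token
}

function firstTokenMs(p: PathEstimate): number {
  return p.roundTrips * p.rttMs + p.serverMs;
}

// Assumed 3 round trips: TCP handshake, TLS 1.3 handshake, HTTP request/response.
const singleRegion: PathEstimate = { rttMs: 150, roundTrips: 3, serverMs: 500 };
const nearbyEdge: PathEstimate = { rttMs: 10, roundTrips: 3, serverMs: 500 };

const savedMs = firstTokenMs(singleRegion) - firstTokenMs(nearbyEdge);

console.log(`single region: ${firstTokenMs(singleRegion)} ms`);
console.log(`nearby edge:   ${firstTokenMs(nearbyEdge)} ms`);
console.log(`saved:         ${savedMs} ms`);
```

Under these assumptions the server does identical work in both cases; the whole difference comes from multiplying the round-trip time by the number of trips needed before the first byte.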
The secondary benefit is reliability. A failure in one region doesn't take the product down globally: traffic automatically flows to the next-closest city.
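The failover idea can be sketched as "pick the closest city that is still healthy". This is only a toy model: on Cloudflare the rerouting happens at the network layer (anycast), not in application code, and the `EdgeCity` type, the `pickCity` function, the city list, and the distances below are all hypothetical.

```typescript
// Toy model of failing over to the next-closest city.
// Hypothetical types and data; real routing is done by the network, not by this code.

interface EdgeCity {
  name: string;
  distanceKm: number; // distance from the visitor (illustrative values)
  healthy: boolean;
}

// Returns the nearest healthy city, or undefined if none is healthy.
function pickCity(cities: EdgeCity[]): EdgeCity | undefined {
  return cities
    .filter((c) => c.healthy) // filter() copies, so the input array is not reordered
    .sort((a, b) => a.distanceKm - b.distanceKm)[0];
}

const cities: EdgeCity[] = [
  { name: "Tokyo", distanceKm: 5, healthy: false }, // closest, but currently down
  { name: "Osaka", distanceKm: 400, healthy: true },
  { name: "Seoul", distanceKm: 1150, healthy: true },
];

console.log(pickCity(cities)?.name);
```

With Tokyo marked unhealthy, the visitor lands on the next-closest healthy city instead of seeing an outage; the cost is a somewhat longer round trip, not a failed request.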