Architecture · February 14, 2026 · 7 min read

Why Helpway runs entirely on Cloudflare's edge

First-token latency from Tokyo: 620ms. From a single US-East region: closer to 2.1 seconds. Geography is the bottleneck.

Every Helpway component — AI pipeline, widget, WebSocket, vector search — runs on Cloudflare Workers at the edge closest to the visitor.

The naive architecture is one region (say, us-east-1) serving everyone. From Tokyo, every round trip to that region costs about 150ms, and a visitor pays for several of them (connection setup, then the request itself) before a single token renders. Helpway starts its reply from a Cloudflare location near Tokyo, so the first token reaches the visitor in 620ms end-to-end.
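To see how distance compounds, it can help to count round trips. The sketch below is illustrative arithmetic, not a measurement of Helpway. It assumes a fresh HTTPS connection needs about three round trips before the first response byte (TCP handshake, TLS 1.3 handshake, then the request itself) and a hypothetical 10ms round trip to a nearby edge location. Only the 150ms figure comes from the paragraph above; the names `ConnectionProfile` and `networkOverheadMs` are made up for the example.

```typescript
// Illustrative only: the round-trip count and the edge RTT are assumptions.
interface ConnectionProfile {
  rttMs: number;      // one network round trip between visitor and server
  roundTrips: number; // round trips needed before the first byte arrives
}

// Time spent purely on the network before the first response byte can arrive.
function networkOverheadMs(profile: ConnectionProfile): number {
  return profile.rttMs * profile.roundTrips;
}

// Assumed: TCP handshake + TLS 1.3 handshake + request/response.
const ROUND_TRIPS_TO_FIRST_BYTE = 3;

// 150ms is the Tokyo-to-single-region round trip quoted in the article.
const singleRegion: ConnectionProfile = {
  rttMs: 150,
  roundTrips: ROUND_TRIPS_TO_FIRST_BYTE,
};

// 10ms is a hypothetical round trip to a nearby edge location.
const nearbyEdge: ConnectionProfile = {
  rttMs: 10,
  roundTrips: ROUND_TRIPS_TO_FIRST_BYTE,
};

const savedMs = networkOverheadMs(singleRegion) - networkOverheadMs(nearbyEdge);
console.log(`Network time saved before the first byte: ${savedMs}ms`);
```

Under these assumptions the network alone accounts for a few hundred milliseconds, and each additional sequential round trip widens the gap by the difference between the two round-trip times.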

The secondary benefit is reliability. A problem in one location doesn't take the product down globally; traffic automatically flows to the next-closest city.
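The rerouting described above happens automatically, so no application code is involved. Purely as a toy model of that behavior, the sketch below picks the closest location that is currently healthy. The location names, latencies, and the `pickLocation` helper are hypothetical and do not represent Cloudflare's actual routing logic.

```typescript
// Toy model: hypothetical locations and latencies, not Cloudflare's routing logic.
interface EdgeLocation {
  name: string;
  rttMs: number;    // round trip from the visitor to this location
  healthy: boolean; // whether the location can currently serve traffic
}

// Returns the healthy location with the lowest round-trip time,
// or undefined if no location is healthy.
function pickLocation(locations: EdgeLocation[]): EdgeLocation | undefined {
  return locations
    .filter((loc) => loc.healthy)
    .reduce<EdgeLocation | undefined>(
      (best, loc) => (best === undefined || loc.rttMs < best.rttMs ? loc : best),
      undefined,
    );
}

const locations: EdgeLocation[] = [
  { name: "Tokyo", rttMs: 8, healthy: false }, // simulated outage
  { name: "Osaka", rttMs: 15, healthy: true },
  { name: "Seoul", rttMs: 35, healthy: true },
];

const chosen = pickLocation(locations);
console.log(chosen?.name); // prints "Osaka"
```

In this model the visitor pays a few extra milliseconds during the outage rather than losing service altogether.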