Comment by pzo - Hacker Neue

pzo Dec 2, 2025 parent

They did explain a little bit:

> We’ll be able to do things like run fast models on the edge, run model pipelines on instantly-booting Workers, stream model inputs and outputs with WebRTC, etc.

The benefit to 3rd party developers is lower latency and a more robust AI pipeline. Instead of going back and forth with an HTTPS request at each stage of inference, you could do it all in one request, e.g. realtime, pipelined STT, text translation, some backend logic, TTS, and back to the user's mobile device.
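To make the round-trip argument concrete, here is a rough back-of-the-envelope sketch. Every number in it is an assumption picked for illustration (the 80 ms client round trip and the per-stage compute times are hypothetical, not measurements), and it ignores streaming and any hops between stages on the server side.

```python
# All values below are hypothetical, chosen only to illustrate the arithmetic.
CLIENT_RTT_MS = 80  # assumed client <-> server network round trip
STAGE_COMPUTE_MS = {"stt": 120, "translate": 60, "logic": 10, "tts": 150}


def separate_requests_ms(rtt_ms, stages):
    """Each stage is its own HTTPS request, so each pays a full round trip."""
    return sum(rtt_ms + compute_ms for compute_ms in stages.values())


def single_request_ms(rtt_ms, stages):
    """One request: a single round trip, stages run back to back server-side."""
    return rtt_ms + sum(stages.values())


separate = separate_requests_ms(CLIENT_RTT_MS, STAGE_COMPUTE_MS)
pipelined = single_request_ms(CLIENT_RTT_MS, STAGE_COMPUTE_MS)
saved = separate - pipelined

print(f"separate requests: {separate} ms")  # 660 ms
print(f"single request:    {pipelined} ms")  # 420 ms
print(f"saved:             {saved} ms")      # 240 ms
```

Under these assumptions the saving is (number of stages - 1) round trips, so it grows with pipeline length and with how far the client is from the server.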

weird-eye-issue Dec 2, 2025

You are seemingly answering something that they did not ask at all

badmonster Dec 2, 2025

Does edge inference really solve the latency issue for most use cases? How does cost compare at scale?

viraptor Dec 2, 2025

Depends on how much the latency matters to you and the customers. Most services realistically won't gain much at all. Even the latency of normal web requests is very rarely relevant. Only the business itself can answer that question though.

chrisweekly Dec 2, 2025

> "Even the latency of normal web requests is very rarely relevant."

Hard disagree. Performance is typically the most important feature for any website. User abandonment / bounce rate follows a predictable, steep, nonlinear curve based on latency.

viraptor Dec 2, 2025

I've changed the latency of actual services as well as core web vitals many times and... no. Turns out the line is not that steep. For the range 200ms-1s, it's pretty much flat. Sure, you can start seeing issues for multi-second requests, but that's terrible processing time. A change like eliminating intercontinental transfer latency is barely visible in ecommerce results.

There's this old meme of Amazon seeing a difference for every 100ms latency and I've never seen it actually reproduced in a controlled way. Even when CF tries to advertise lower latency https://www.cloudflare.com/en-au/learning/performance/more/w... their data is companies reducing it by whole seconds. "Walmart found that for every 1 second improvement in page load time, conversions increased by 2%" - that's not steep. When there's a claim about improvements per 100ms, it's still based on averaging multi-second data like in https://auditzy.com/blog/impact-of-fast-load-times-on-user-e...

In short - if you have something extremely interactive, I'm sure it matters for experience. For a typical website loading in under 1s, edge will barely matter. If you have data proving otherwise, I'd genuinely love to see that. For websites loading in over 1s, it's likely much easier to improve the core experience than to split things out to the edge.

chrisweekly Dec 3, 2025

Ok, I think we're actually in agreement -- given your all-important qualification "for the range 200ms-1s". Yes, ofc, that first part of the curve above the drop is quite flat; there's hardly time for the user to get impatient and bounce.

My point about the shape of the curve stands. 100ms can matter more on the steepest part of the slope than 2s does further to the right.
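A toy illustration of that shape argument, assuming a logistic bounce curve. The curve form, the 3 s midpoint, and the steepness are all made-up parameters, not fitted to any real data; the only point is that equal or even much larger latency changes produce very different effects depending on where on the curve they fall.

```python
import math


def bounce_rate(load_s, midpoint_s=3.0, steepness=2.0):
    """Hypothetical logistic curve: fraction of users who bounce at a load time.

    midpoint_s and steepness are illustrative assumptions, not measured values.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (load_s - midpoint_s)))


def bounce_delta(before_s, after_s):
    """Change in bounce rate when load time moves from before_s to after_s."""
    return bounce_rate(after_s) - bounce_rate(before_s)


flat_800ms = bounce_delta(0.2, 1.0)     # the "200ms-1s" range
steep_100ms = bounce_delta(2.95, 3.05)  # 100 ms around the steepest point
far_right_2s = bounce_delta(8.0, 10.0)  # 2 s where nearly everyone has left

print(f"200ms -> 1s:    {flat_800ms:.4f}")
print(f"2.95s -> 3.05s: {steep_100ms:.4f}")
print(f"8s -> 10s:      {far_right_2s:.6f}")
```

With these made-up parameters, the 100 ms step at the midpoint moves the bounce rate more than either the whole 200ms-1s range or the 2 s step far to the right, which is consistent with both comments above.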
