CloudFront WebSockets over VPC Origins: An Architecture Deep-Dive

For about a year and a half, the answer to "can I put my WebSocket backend in a private subnet behind CloudFront?" was a flat no. You could shield a REST API, a static site, or a server-rendered app behind CloudFront VPC origins and keep the load balancer entirely off the public internet. But the moment a client tried to upgrade an HTTP connection to a WebSocket, the handshake failed. People discovered this the hard way: a clean wss:// request would come back as a 502, and AWS Support would confirm the limitation in writing. The workaround was always the same retreat: make the load balancer public again, bolt on a WAF, and accept the larger attack surface as the cost of doing real-time business.

On May 1, 2026, that limitation quietly disappeared. Amazon CloudFront now supports WebSocket traffic through VPC origins, which means a chat backend, a collaborative editor, a live trading feed, or an IoT command channel can live entirely in a private subnet with no public ingress at all, and CloudFront becomes the single front door for both the HTTP and the WebSocket halves of the application. This article is a deep-dive on how that works, why it was hard before, and the architectures it unlocks.

The Problem This Solves

A WebSocket is a long-lived, full-duplex TCP connection. It starts life as an ordinary HTTP/1.1 request carrying an Upgrade: websocket header; if the server agrees, both sides stop speaking HTTP and start exchanging raw WebSocket frames over the same socket. That persistent, bidirectional nature is exactly what makes real-time features possible, and it is also exactly what made the connection awkward to route through a content delivery network designed around short, cacheable request/response cycles.

Before this launch, teams that wanted real-time features on AWS had three unappealing options.

The first was a public load balancer. Put an internet-facing Application Load Balancer in a public subnet, attach a WAF, lock the security group down to the CloudFront managed prefix list, and hope you remembered every restriction. This works, but every public endpoint is a public endpoint: it has a routable address, it shows up in scans, and the burden of keeping it locked down never goes away.

The second was API Gateway WebSocket APIs. This is a genuinely managed service, but it is a different programming model. You decompose your connection lifecycle into $connect, $disconnect, and route-keyed messages, you push outbound messages through a separate management API, and you store connection IDs somewhere durable. For a team that already has a perfectly good WebSocket server written in FastAPI, Socket.IO, or plain Node, rewriting it to fit API Gateway's event model is a tax, not a feature.

The third was VPC origins, until you hit the wall. CloudFront VPC origins shipped in late 2024 and were the obvious answer: private subnets, traffic on the AWS backbone, CloudFront as the only way in. Teams adopted them for their HTTP traffic, then tried to route their WebSocket path through the same private origin and got a 502 with an OriginDNSError during the upgrade. The documented reality was blunt: VPC origins did not support WebSocket connections. The only fix was to peel the WebSocket traffic back out to a public load balancer, which defeated the entire point of having gone private in the first place.

The May 2026 launch removes the wall. The same VPC origin that already served your HTTP traffic now carries WebSocket traffic too, with no architectural compromise and no separate public endpoint.

A Two-Minute Refresher on CloudFront VPC Origins

To understand why WebSockets over VPC origins is interesting, you need to understand what a VPC origin actually is, because it is not the same thing as pointing CloudFront at a public DNS name.

A traditional "custom origin" is just a hostname. CloudFront resolves it over the public internet and connects to whatever answers. You secure it by hoping nobody finds the origin's real address and by restricting the security group to CloudFront's published prefix list. The origin still has a public IP; you are just asking the world not to use it.

A VPC origin is different. When you create one, CloudFront provisions a service-managed elastic network interface (ENI) inside your VPC and wires up a private path from the CloudFront edge to that ENI across the AWS backbone network. Your origin (an internal ALB, an NLB, or an EC2 instance) never needs a public IP and never needs a route to an internet gateway. Traffic from the edge reaches it over AWS-internal networking, not the open internet.

[Figure: Public custom origin versus VPC origin: where the traffic actually flows]

Two pieces of plumbing make this safe by default. CloudFront automatically creates a security group named CloudFront-VPCOrigins-Service-SG, which is fully service-managed; you reference it from your origin's own security group instead of editing it. And because the connection rides the backbone, you can drop the public-subnet, internet-gateway, and NAT machinery that a public origin would have required. The result is an application that is genuinely unreachable except through your CloudFront distribution. Provisioning the ENI and reaching a Deployed state takes up to roughly fifteen minutes, which is worth knowing before you wire it into a deployment pipeline.
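Because the wait for Deployed can run to many minutes, a pipeline step usually needs to poll rather than assume. The following is a minimal sketch of such a wait loop. The function name, the timeout defaults, and the injected get_status callable are all illustrative assumptions; get_status stands in for whatever call in your tooling returns the VPC origin's current status string.

```python
import time
from typing import Callable


def wait_until_deployed(
    get_status: Callable[[], str],
    timeout_s: float = 20 * 60,      # assumed budget: a little above ~15 minutes
    poll_interval_s: float = 30.0,   # assumed polling cadence
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Poll get_status() until it returns "Deployed"; return seconds waited.

    Raises TimeoutError if the status is still something else once
    timeout_s has elapsed.
    """
    start = clock()
    while True:
        status = get_status()
        waited = clock() - start
        if status == "Deployed":
            return waited
        if waited >= timeout_s:
            raise TimeoutError(f"VPC origin still {status!r} after {waited:.0f}s")
        sleep(poll_interval_s)
```

Injecting sleep and clock keeps the loop testable without real waiting; in a pipeline you would leave both at their defaults.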

What VPC origins gave you in 2024 was every byte of HTTP traffic flowing through one private, controllable front door. What it withheld until 2026 was the WebSocket byte stream. Now you get both.

How CloudFront Handles a WebSocket

WebSocket support in CloudFront is not a toggle you enable; it is on for every distribution automatically. The subtlety is entirely in the request headers, because the WebSocket handshake is just an HTTP request with a specific set of headers that the origin must actually see.

Here is the handshake as RFC 6455 defines it. The client sends an ordinary GET that asks to switch protocols:

GET /chat HTTP/1.1
Host: app.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Sec-WebSocket-Protocol: chat, superchat
Origin: https://app.example.com

If the server agrees, it answers with a 101 Switching Protocols and proves it understood the handshake by hashing the client's key into the Sec-WebSocket-Accept header:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: chat

After that exchange, HTTP is done and the socket carries WebSocket frames in both directions until somebody closes it or the network drops.
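The Sec-WebSocket-Accept value in that response is not arbitrary: RFC 6455 defines it as the Base64-encoded SHA-1 hash of the client's key concatenated with a fixed GUID. A short sketch of the computation (the function name is illustrative) makes it easy to check a handshake by hand when debugging:

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 appends to the client's key before hashing.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# The key from the request above yields the accept value in the response above.
print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

If the key never reaches the origin, the server has nothing to hash, which is why header forwarding matters so much in the next step.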

The catch is that CloudFront, by default, does not forward every header to the origin. If the Sec-WebSocket-Key and Sec-WebSocket-Version headers never reach your server, the server cannot complete the handshake, and the upgrade silently fails. So the one mandatory configuration step is to make sure those headers pass through. You have two ways to do it:

Attach the AllViewer managed origin request policy to the cache behavior, which forwards every viewer header to the origin. This is the simplest choice and the one most real-time apps want anyway.
Or create a custom origin request policy that explicitly forwards Sec-WebSocket-Key and Sec-WebSocket-Version.

AWS also recommends forwarding Sec-WebSocket-Protocol, Sec-WebSocket-Accept, and Sec-WebSocket-Extensions to avoid compression-related surprises with subprotocols and extensions. A few protocol facts shape your design:

CloudFront speaks WebSockets over HTTP/1.1 only. The handshake cannot ride HTTP/2 or HTTP/3 connections, which matters if you assumed your whole site was h2/h3 end to end.
The standard ports apply: 80 for ws://, 443 for wss://. Your viewer protocol policy and origin protocol policy govern WebSocket connections exactly as they govern ordinary HTTP, so "redirect HTTP to HTTPS" applies to the handshake too.
If a connection drops for any reason, the client is responsible for reconnecting. CloudFront does not transparently re-establish a broken WebSocket.
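For the custom-policy route described above, the configuration might look something like the following sketch. It builds the OriginRequestPolicyConfig structure that the CloudFront API accepts, forwarding the two required headers plus the three recommended ones. The policy name, the comment text, and the choice to forward all cookies and query strings are assumptions made for illustration, not requirements.

```python
def websocket_origin_request_policy(name: str = "websocket-handshake-headers") -> dict:
    """Build an origin request policy config that forwards the WebSocket headers."""
    headers = [
        "Sec-WebSocket-Key",         # required for the handshake
        "Sec-WebSocket-Version",     # required for the handshake
        "Sec-WebSocket-Protocol",    # recommended
        "Sec-WebSocket-Accept",      # recommended
        "Sec-WebSocket-Extensions",  # recommended
    ]
    return {
        "Name": name,  # hypothetical policy name
        "Comment": "Forward WebSocket handshake headers to the origin",
        "HeadersConfig": {
            "HeaderBehavior": "whitelist",
            "Headers": {"Quantity": len(headers), "Items": headers},
        },
        # Assumption: the application also wants cookies and query strings.
        "CookiesConfig": {"CookieBehavior": "all"},
        "QueryStringsConfig": {"QueryStringBehavior": "all"},
    }


# With boto3 installed and credentials configured, this could be submitted as:
#   boto3.client("cloudfront").create_origin_request_policy(
#       OriginRequestPolicyConfig=websocket_origin_request_policy())
```

If you have no reason to restrict headers, attaching AllViewer remains the simpler choice and needs no custom policy at all.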

Putting Them Together: The End-to-End Path

With the WebSocket-over-VPC-origins launch, the two mechanisms compose cleanly. A single CloudFront distribution can route the static assets, the REST calls, and the WebSocket upgrade all to origins that live in private subnets, with cache behaviors deciding which path goes where.

[Figure: One distribution, one private VPC, both HTTP and WebSocket traffic]

Walk the WebSocket path one hop at a time:

The browser opens a path under /ws/, for example wss://app.example.com/ws/chat. TLS terminates at the CloudFront edge.
CloudFront matches the /ws/* cache behavior, sees the Upgrade: websocket header, and, because the behavior uses AllViewer (or forwards the Sec-WebSocket-* headers explicitly), prepares to relay the handshake intact.
CloudFront opens a connection to the VPC origin over the service-managed ENI, entirely on the AWS backbone, and forwards the upgrade request to the internal ALB.
The ALB routes to a WebSocket server in a private subnet, which returns 101 Switching Protocols.
The 101 flows back through CloudFront to the client, and the tunnel is established. From here, frames pass in both directions through the edge and the private origin, and the origin never had, and never needed, a public address.

The elegant part is that this is the same distribution serving the rest of the app. You are not running a parallel public endpoint for real-time traffic and a private one for everything else. There is one front door, one TLS configuration, one WAF, one set of logs.
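The path-based routing in that walk-through can be modelled in a few lines. This sketch approximates CloudFront's behavior matching (ordered patterns, first match wins, a default behavior as the fallback) using Python's fnmatch. The origin names and the /api/* pattern are hypothetical, and fnmatch is only an approximation: it also gives square brackets a special meaning that CloudFront path patterns do not.

```python
from fnmatch import fnmatchcase

# Ordered like cache behaviors in a distribution: the first match wins.
BEHAVIORS = [
    ("/ws/*", "websocket-origin"),  # hypothetical origin names
    ("/api/*", "api-origin"),
]
DEFAULT_ORIGIN = "static-origin"    # stands in for the default (*) behavior


def pick_origin(path: str) -> str:
    """Return the origin whose path pattern matches first, else the default."""
    for pattern, origin in BEHAVIORS:
        if fnmatchcase(path, pattern):  # case-sensitive, like CloudFront patterns
            return origin
    return DEFAULT_ORIGIN


print(pick_origin("/ws/chat"))     # websocket-origin
print(pick_origin("/api/orders"))  # api-origin
print(pick_origin("/ws"))          # static-origin: "/ws" alone does not match "/ws/*"
```

The last line is the practical lesson: a client that connects to the bare /ws path falls through to the default behavior, so either connect to a path under /ws/ or widen the pattern.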

Configuration: The Parts That Actually Matter

The console steps are short, but a few settings determine whether your real-time app is robust or flaky.

Origin request policy. As covered above, this is the non-negotiable one. Use AllViewer or forward the Sec-WebSocket-* headers. If WebSocket upgrades return 4xx/5xx errors that look like the server "not understanding" the request, this is the first thing to check.

Timeouts and heartbeats. This is where most production pain lives. CloudFront has an origin response (read) timeout (30 seconds by default, tunable up to 60 seconds in the console, higher via a quota increase) that governs how long it waits for the origin during the request phase. Your load balancer has its own idle timeout (60 seconds by default on an ALB, configurable up to 4,000 seconds) that closes a connection carrying no traffic. The practical consequence: a WebSocket that sits silent will get torn down somewhere in the chain. The fix is application-level heartbeats. Exchange ping/pong frames every 20 to 30 seconds so the connection is never idle long enough for any timer to fire, and raise the ALB idle timeout to comfortably exceed your heartbeat interval. Treat this as mandatory, not optional.
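A heartbeat loop is small enough to sketch in full. The version below is library-agnostic: send_ping is an injected coroutine that you would wire to your WebSocket library's ping call. The helper that checks an interval against idle timers uses a "two heartbeats must fit inside every timeout" margin, which is an assumption chosen for illustration rather than an AWS rule.

```python
import asyncio
from typing import Awaitable, Callable, List, Mapping


def timers_at_risk(heartbeat_s: float, idle_timeouts_s: Mapping[str, float]) -> List[str]:
    """Name the idle timers too short to fit two heartbeats (assumed margin)."""
    return [name for name, limit in idle_timeouts_s.items() if limit < 2 * heartbeat_s]


async def heartbeat(send_ping: Callable[[], Awaitable[None]], interval_s: float = 25.0) -> None:
    """Call send_ping every interval_s seconds until the task is cancelled."""
    while True:
        await asyncio.sleep(interval_s)
        await send_ping()


# A 25-second heartbeat fits a 60-second ALB idle timeout under this margin,
# but not a hypothetical 40-second timer elsewhere in the chain.
print(timers_at_risk(25, {"alb_idle": 60, "other_proxy": 40}))  # ['other_proxy']
```

Run heartbeat as a background task alongside the receive loop and cancel it when the socket closes. If send_ping raises, the exception ends the task, which is a convenient signal to start reconnecting.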

Reconnection logic. Because CloudFront will not rebuild a dropped socket, your client needs reconnect-with-backoff and, if your protocol is stateful, a way to resume: a session token, a last-seen message ID, or a full re-sync on reconnect. This is normal WebSocket hygiene, but the edge in the path makes it non-negotiable.
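One common shape for this is exponential backoff with jitter, passing a resume marker into each attempt. The sketch below assumes a hypothetical connect coroutine that takes the last-seen message ID and raises OSError (or a subclass such as ConnectionRefusedError) on failure. The base delay, cap, and attempt limit are illustrative defaults, not recommendations from AWS.

```python
import asyncio
import random
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Pick a random delay in [0, min(cap, base * 2**attempt)] ("full jitter")."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return (rng or random).uniform(0, ceiling)


async def connect_with_backoff(
    connect: Callable[[Optional[str]], Awaitable[T]],
    last_seen_id: Optional[str] = None,
    max_attempts: int = 8,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Retry connect(last_seen_id) with jittered exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            # The resume marker lets the server replay anything missed.
            return await connect(last_seen_id)
        except OSError as exc:
            if attempt == max_attempts - 1:
                raise ConnectionError(f"gave up after {max_attempts} attempts") from exc
            await sleep(backoff_delay(attempt, rng=rng))
    raise AssertionError("unreachable")
```

Jitter matters here because an edge or origin blip disconnects many clients at once; without it they all retry on the same schedule and arrive back as a synchronized wave.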

HTTP/1.1 awareness. If you have tuned your distribution for HTTP/3 everywhere, remember the handshake itself is HTTP/1.1. This does not break anything; it just means the WebSocket path will not benefit from h2/h3 multiplexing for the upgrade.

Security group wiring. Reference the CloudFront-VPCOrigins-Service-SG (or the CloudFront managed prefix list) in your origin's security group so the ENI can reach the ALB, and otherwise keep the origin closed. Do not edit the service-managed group itself.
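In API terms, that wiring is a single ingress rule on the load balancer's security group whose source is the service-managed group. The sketch below builds the arguments for EC2's AuthorizeSecurityGroupIngress call. Both security group IDs are hypothetical placeholders, and port 443 assumes the ALB listens on HTTPS.

```python
def origin_ingress_rule(alb_sg_id: str, cloudfront_service_sg_id: str, port: int = 443) -> dict:
    """Build kwargs allowing the CloudFront service-managed SG to reach the ALB."""
    return {
        "GroupId": alb_sg_id,  # the origin's own security group
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                # Source is a security group reference, not a CIDR range.
                "UserIdGroupPairs": [
                    {
                        "GroupId": cloudfront_service_sg_id,
                        "Description": "CloudFront VPC origin ENI",
                    }
                ],
            }
        ],
    }


# Hypothetical IDs for illustration only.
rule = origin_ingress_rule("sg-0123456789abcdef0", "sg-0fedcba9876543210")

# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("ec2").authorize_security_group_ingress(**rule)
# The service-managed group's ID can be found by filtering
# describe_security_groups on group-name CloudFront-VPCOrigins-Service-SG.
```

Note that the rule is added to your group and merely references the service-managed one, which stays untouched.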

Security Posture: One Front Door

The strongest argument for this architecture is what it does to your attack surface. Before, "real-time" meant "publicly reachable," and a publicly reachable endpoint is a standing liability. Now the real-time backend has the same security posture as the rest of a VPC-origin application:

No public ingress. The ALB, NLB, or EC2 instance has no public IP and no route to the internet. It cannot be scanned, fingerprinted, or hit directly. The only path in is through your distribution.
A single control point. AWS WAF, AWS Shield Standard (and Shield Advanced if you subscribe), geo-restrictions, signed URLs/cookies, and rate-based rules all attach at the CloudFront layer and now protect the WebSocket path as a side effect of protecting the distribution. You are not maintaining a second security stack for real-time traffic.
Built-in DDoS protection. Because the WebSocket handshake terminates at the edge, connection floods hit the always-on absorption capacity of the CloudFront/Shield layer rather than your origin.
Less to get wrong. The most common cloud security incidents are misconfigurations: an overly permissive security group, a forgotten public subnet, an ACL that drifted. Removing the public endpoint removes an entire category of mistakes.

It is worth stating the cost plainly because there isn't one: AWS charges no additional fee for WebSocket traffic through VPC origins beyond the normal CloudFront and data-transfer pricing you already pay. The security upgrade is free.

What This Opens Up

The headline feature is "private WebSockets," but the interesting part is the set of architectures that become clean instead of compromised. Here is where this actually changes designs.

Collapse Two Endpoints Into One

The most immediate win is consolidation. Plenty of teams today run a public ALB only for the WebSocket path and a private VPC origin for everything else, a split forced entirely by the old limitation. That split can now collapse. One distribution, path-routed cache behaviors, every origin private. Fewer DNS records, fewer certificates, fewer security groups, one WAF web ACL, one set of access logs to reason about. Architecturally simpler systems fail in fewer ways.

Retire API Gateway WebSocket Workarounds

If you adopted API Gateway WebSocket APIs only because they were the managed path to real-time, you can now reconsider. A conventional WebSocket server (FastAPI with websockets, Socket.IO, Phoenix Channels, a Go service, anything that speaks RFC 6455) can run in a private subnet behind CloudFront without rewriting your connection lifecycle into $connect/$disconnect Lambdas and an external connection store. You keep your existing code and your existing mental model, and you get the edge in front of it. API Gateway WebSocket APIs are still excellent when you want fully serverless fan-out and per-message billing; the point is that they are now a choice rather than the only private-friendly option.

[Figure: The decision shifts: real-time on AWS now has a private, server-based path]

Genuinely Private Real-Time Products

A long list of product features are, at heart, "a WebSocket server that needs to be reachable but should not be exposed." Each of these is now buildable with no public ingress:

Collaborative editing. Documents, whiteboards, design canvases, and code editors that sync cursors and operations in real time. The CRDT or operational-transform backend stays private.
Chat and presence. Support chat, team messaging, and live presence indicators where the message bus runs in a private subnet behind the same distribution that serves the app shell.
Live dashboards and telemetry. Operations consoles, trading screens, sports and betting feeds, and analytics dashboards that stream updates rather than poll. The fan-out service is no longer a public endpoint.
IoT command and control. Device management backends that maintain a persistent channel to each device for commands and status. Keeping that channel private is often a compliance requirement, not a nicety.
Multiplayer and interactive sessions. Game lobbies, live quizzes, auctions, and any "many clients, shared state, sub-second updates" workload.
Streaming AI responses. Token-by-token LLM output and live agent traces delivered over a WebSocket from a private inference gateway, so the model-serving tier never faces the internet.

In every case the shape is identical: viewers hit CloudFront, CloudFront forwards the upgrade to a private origin on the backbone, and the real-time tier enjoys the edge's TLS, WAF, and DDoS protection without owning a public address.

Compliance and Regulated Workloads

For workloads under PCI DSS, HIPAA, FedRAMP, or an internal "no public ingress" mandate, the old WebSocket limitation was a genuine blocker: you either accepted a public endpoint and wrote a lot of compensating-control documentation, or you didn't ship the real-time feature. A private origin with CloudFront as the sole entry point is a much easier control story to write and to audit. The data path is private, the entry point is single and well-instrumented, and the WebSocket traffic is covered by the same controls as everything else.

Cost and Operational Simplification

Going fully private also trims the bill and the toil. No public ALB means one fewer load balancer to pay for and patch. No public subnet for the real-time tier can mean no NAT gateway and no per-GB NAT data-processing charges for that path. Traffic on the AWS backbone avoids public-internet egress. And consolidating onto one distribution means one place to configure caching, security, and logging instead of maintaining a divergent public stack just for the socket.

Limitations and Gotchas

This is a clean feature, but go in with eyes open.

HTTP/1.1 only for the handshake. Confirm your clients and any intermediaries are fine with that.
Idle connections die. Without heartbeats, CloudFront, ALB, or both will close a quiet socket. Send pings every 20 to 30 seconds and tune the ALB idle timeout above your interval.
No automatic reconnection. Build reconnect-with-backoff and state resumption into the client. Always.
Header forwarding is mandatory. Forget the origin request policy and the upgrade fails in ways that look like an application bug.
Same-account and now cross-account. The distribution and the VPC origin historically had to be in the same account; cross-account VPC origins support exists now, which matters for multi-account landing zones, but verify the configuration for your topology.
Provisioning latency. A new VPC origin takes up to ~15 minutes to reach Deployed. Account for it in automation.
Sticky sessions still apply. If your WebSocket servers are stateful and you are load-balancing across several, configure ALB stickiness or, better, externalize connection state so any node can serve any client.
Regional availability. The feature is available in all AWS commercial regions where VPC origins is supported; check that your region qualifies before you design around it.

A Migration Sketch

If you are sitting on the old public-ALB-for-WebSockets workaround, the path off it is short:

Stand up an internal ALB (or NLB) in your private subnets pointing at your existing WebSocket servers. No application changes.
Create a VPC origin for that internal load balancer and wait for Deployed.
Add a cache behavior, for example /ws/*, to your existing distribution that targets the new VPC origin and attaches the AllViewer origin request policy.
Reference the CloudFront-VPCOrigins-Service-SG in the load balancer's security group; remove any public-internet ingress rules.
Cut traffic over, verify the handshake and your heartbeats, then decommission the public ALB and its WAF/security-group scaffolding.
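The cache behavior in the third step could be expressed as the following sketch, which builds one entry for a distribution's CacheBehaviors list. The origin ID is a hypothetical placeholder. Disabling caching on the WebSocket path is an assumption not stated in the steps above. The two policy IDs are believed to be the published IDs of the managed AllViewer and CachingDisabled policies; confirm them against the managed-policy documentation before relying on them.

```python
# Believed to be the AWS-managed policy IDs; verify before use.
ALL_VIEWER_ORIGIN_REQUEST_POLICY_ID = "216adef6-5c7f-47e4-b989-5492eafa07d3"
CACHING_DISABLED_CACHE_POLICY_ID = "4135ea2d-6df8-44a3-9df3-4b5a84be39ad"


def websocket_cache_behavior(target_origin_id: str, path_pattern: str = "/ws/*") -> dict:
    """Build a cache behavior that sends path_pattern to the VPC origin."""
    methods = ["GET", "HEAD"]  # the WebSocket handshake is a GET
    return {
        "PathPattern": path_pattern,
        "TargetOriginId": target_origin_id,   # ID of the new VPC origin entry
        "ViewerProtocolPolicy": "https-only",  # wss:// only; an assumption
        "AllowedMethods": {
            "Quantity": len(methods),
            "Items": methods,
            "CachedMethods": {"Quantity": len(methods), "Items": methods},
        },
        "Compress": False,
        "CachePolicyId": CACHING_DISABLED_CACHE_POLICY_ID,
        "OriginRequestPolicyId": ALL_VIEWER_ORIGIN_REQUEST_POLICY_ID,
    }


# "ws-vpc-origin" is a hypothetical origin ID within the distribution.
behavior = websocket_cache_behavior("ws-vpc-origin")
```

To apply it, you would fetch the current distribution config, append this entry to CacheBehaviors (updating its Quantity), and submit the result with an update call, or express the same fields in whatever infrastructure-as-code tool you already use.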

The application code does not change. What changes is that the real-time tier stops being a public endpoint, and your architecture diagram loses a box that was only ever there to work around a limitation that no longer exists.

Conclusion

CloudFront WebSockets over VPC origins is a small launch with an outsized effect on how real-time systems get designed on AWS. The capability itself is narrow (route a persistent, full-duplex connection to a private origin), but it erases the single most common reason real-time backends had to live on the public internet. The payoff is consolidation onto one private front door, the same edge security for your sockets as for your pages, no extra cost, and the freedom to keep the WebSocket server you already wrote. If you have ever made a load balancer public purely so a chat feature or a live dashboard could connect, this is the launch that lets you take that decision back.

For the broader picture of how CloudFront's edge, caching, and security layers fit together, see the companion Amazon CloudFront architecture deep-dive.