Self-Hosted VoIP/UCaaS on Kubernetes
Carrier-grade unified communications without a cloud PBX. How we run SIP, media, and per-tenant TLS on infrastructure we own — and why voice traffic does not belong behind a generic proxy.
Unified communications is usually sold as a cloud PBX: you rent extensions, someone else owns the media path, and your call quality is a line item in a contract you cannot inspect. Our Performance Cloud venture runs the other way — carrier-grade voice and UCaaS, self-hosted on Kubernetes, on infrastructure we own end to end.
Voice is an unforgiving workload to self-host well. Here is how the pieces fit.
Signaling and media are different problems
The single most important design decision in self-hosted voice is to stop treating it as “just another web service.” A SIP call has two distinct streams:
- Signaling (SIP) sets up, modifies, and tears down calls. It is control-plane traffic, and its message bodies carry the media addressing (as SDP): the IP address and port each side expects audio on.
- Media (RTP) is the actual audio. It is latency- and jitter-sensitive, and every extra hop or buffer degrades it.
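Treating these the same way is the classic mistake. A general-purpose proxy that is not SIP-aware changes the address packets arrive from but leaves the media addresses inside SIP bodies untouched, or mangles them if it tries to rewrite what it does not understand. Either way the two stop agreeing, and the usual symptom is missing or one-way audio. Routing real-time media through a layer designed for web requests also adds latency exactly where you can least afford it.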
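To make the addressing problem concrete, here is a minimal sketch that parses a trimmed, hypothetical SIP INVITE and extracts the address and port its SDP body advertises for audio. The message, the addresses, and the helper names `media_target` and `is_consistent` are illustrative assumptions, not part of our platform.

```python
from typing import Tuple

# Hypothetical, trimmed INVITE; every name and address here is made up.
INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 10.0.0.5:5060;branch=z9hG4bK776asdhds\r\n"
    "From: <sip:alice@example.com>;tag=1928301774\r\n"
    "To: <sip:bob@example.com>\r\n"
    "Call-ID: a84b4c76e66710@10.0.0.5\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
    "o=alice 2890844526 2890844526 IN IP4 10.0.0.5\r\n"
    "s=-\r\n"
    "c=IN IP4 10.0.0.5\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
)


def media_target(sip_message: str) -> Tuple[str, int]:
    """Return the (address, port) the SDP body asks the peer to send audio to."""
    _headers, _, body = sip_message.partition("\r\n\r\n")
    address, port = None, None
    for line in body.splitlines():
        if line.startswith("c="):
            # c=<nettype> <addrtype> <connection-address>
            address = line[2:].split()[2]
        elif line.startswith("m=audio"):
            # m=<media> <port> <proto> <format list>
            port = int(line.split()[1])
    if address is None or port is None:
        raise ValueError("no audio media description found")
    return address, port


def is_consistent(sip_message: str, observed_source_ip: str) -> bool:
    """True if the advertised media address matches where the message came from."""
    return media_target(sip_message)[0] == observed_source_ip
```

Here `media_target(INVITE)` returns `("10.0.0.5", 49170)`. `is_consistent` compares that advertised address with the address the message actually arrived from; any layer in between that alters one without the other makes the check fail, and the audio has nowhere valid to go.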
So we split them. Signaling and media are handled by purpose-built components inside the cluster, and they reach the outside world over dedicated network addresses on the cluster’s load-balancer layer rather than being funneled through the same proxy that serves web traffic. Each traffic class gets its own path: the web app and API, which are ordinary HTTPS, go through the normal web ingress; SIP, TURN (the NAT-traversal relay), and RTP do not.
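As a rough illustration of what a dedicated address can look like, the sketch below builds a Kubernetes Service manifest of type LoadBalancer for a small block of UDP media ports and prints it as JSON, which Kubernetes accepts alongside YAML. The service name, the `app` label, and the port numbers are assumptions made for the example; they are not our production values, and a real deployment may expose media differently.

```python
import json


def media_service(name: str, app_label: str, first_port: int, count: int) -> dict:
    """Build a Service manifest exposing `count` UDP ports on its own load-balancer address."""
    # A Service has no port-range syntax, so each port is listed explicitly.
    ports = [
        {"name": f"rtp-{p}", "protocol": "UDP", "port": p, "targetPort": p}
        for p in range(first_port, first_port + count)
    ]
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            # "Local" delivers traffic only to endpoints on the receiving node,
            # which avoids a second hop and preserves the client source address.
            "externalTrafficPolicy": "Local",
            "selector": {"app": app_label},
            "ports": ports,
        },
    }


# Hypothetical names and ports, chosen only for illustration.
manifest = media_service("media-rtp", "media-server", 30000, 4)
print(json.dumps(manifest, indent=2))
```

The point of the sketch is the shape, not the values: media gets a Service of its own instead of a route on the web ingress, so nothing that parses HTTP sits in front of it.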
Kubernetes for the control plane, not the call path
Kubernetes earns its place here for everything around the call: the web application, the API, provisioning, identity, and the lifecycle of the signaling and media servers themselves. Declarative deployment, rolling updates, and self-healing are genuinely useful for those.
What we are careful about is keeping the real-time path short. Media servers are exposed on stable, dedicated load-balancer addresses so that audio takes a direct route in and out of the cluster. The orchestration layer manages the pods; it does not sit in the middle of every audio packet.
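Whether the audio path really is short shows up in interarrival jitter. As a minimal sketch, the estimator below follows the running formula from RFC 3550 (section 6.4.1), where each new sample moves the estimate by one sixteenth of the difference. The function name and the `(send_time, arrival_time)` input shape are assumptions for the example; a real receiver works from RTP timestamps.

```python
from typing import Iterable, Optional, Tuple


def interarrival_jitter(packets: Iterable[Tuple[float, float]]) -> float:
    """Running jitter estimate in the style of RFC 3550, section 6.4.1.

    Each packet is (send_time, arrival_time) in the same unit, e.g. milliseconds.
    The two clocks need not be synchronized: only changes in transit time matter.
    """
    jitter = 0.0
    prev_transit: Optional[float] = None
    for sent, arrived in packets:
        transit = arrived - sent
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            # Exponential smoothing with gain 1/16, as in the RFC.
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter


# Packets sent every 20 ms with a steady 50 ms transit time.
steady = [(0.0, 50.0), (20.0, 70.0), (40.0, 90.0)]
# Same stream, but the third packet arrives 8 ms late.
one_late = [(0.0, 50.0), (20.0, 70.0), (40.0, 98.0)]
```

With these inputs `interarrival_jitter(steady)` is `0.0` and `interarrival_jitter(one_late)` is `0.5`: a single 8 ms delay contributes 8/16 to the estimate. Every extra buffering hop is another chance for transit time to vary, which is why the media path is kept direct.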
TLS without inbound exposure
A multi-tenant voice platform needs certificates — for the web app, the API, and customer-facing identity endpoints — and it needs them to renew themselves without a human in the loop.
We issue and renew certificates using a DNS-based challenge. The certificate authority validates control of the domain by checking a DNS record, which means we never have to open an inbound port just to prove we own a name. For internal staging we use an internal certificate authority; at go-live the same mechanism issues publicly trusted certificates. The renewal path is identical in both cases, so promoting an environment does not mean rebuilding how TLS works.
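The text above does not name the protocol. Assuming it is the ACME DNS-01 challenge (RFC 8555, section 8.4), the record the certificate authority looks up can be sketched as follows. The function names, the domain, and the token and thumbprint values are placeholders; in practice an ACME client computes and publishes this record for you.

```python
import base64
import hashlib
from typing import Tuple


def b64url(data: bytes) -> str:
    """Base64url without padding, as ACME uses."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def dns01_record(domain: str, token: str, account_thumbprint: str) -> Tuple[str, str]:
    """Return (record_name, txt_value) for an ACME DNS-01 challenge.

    `token` comes from the CA's challenge; `account_thumbprint` is the
    base64url JWK thumbprint of the ACME account key.
    """
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # A wildcard name is validated at its base domain.
    base = domain[2:] if domain.startswith("*.") else domain
    return f"_acme-challenge.{base}", b64url(digest)


# Placeholder inputs, not real challenge data.
name, value = dns01_record("*.tenant.example.com", "tok", "thumb")
```

Here `name` is `_acme-challenge.tenant.example.com` and `value` is a 43-character string to publish as a TXT record. The CA only reads DNS, so nothing has to listen for inbound connections during validation, and the same computation applies whether the CA is internal or public.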
Sovereignty as a feature, not a constraint
Running voice this way is more work than renting a cloud PBX. The payoff is that every layer is ours to inspect and tune: we can trace a call from signaling to media, we own the quality budget end to end, and there is no third party in the audio path who can reprice, throttle, or fail us silently.
For a UCaaS product, “we operate every hop your call takes” is not a slogan. It is the difference between a platform you can stand behind and one you can only apologize for.
Tags: voip, ucaas, kubernetes, sip, networking