Tag Archives: K8s

VoIP is an only child – ‘Gotchas’ on running VoIP applications inside Containers

It’s 2021, and everyone loves Containers; Docker & Kubernetes are changing how software is developed, deployed and scaled.

And yet so much of the Telco world still uses bare metal servers and dedicated hardware for processing.

So why not use Containers or VMs more for VoIP applications?

Disclaimer – When I’m talking VoIP about VoIP I mean the actual Voice over IP, that’s the Media Stream, RTP, the Audio, etc, not the Signaling (SIP). SIP is fine with Containers, it’s the media that has a bad time and that this post focuses on,

Virtualization Fundamentals

Once upon a time in Development land every application ran on it’s own server running in a DC / Central Office.

This was expensive to deploy (buying servers), operate (lots of power used) and maintain (lots of hardware to keep online).

Each server was actually sitting idle for a large part of the time, with the application running on it only using a some of the available resources some of the time.

One day Virtualization came and suddenly 10 physical servers could be virtualized into 10 VMs.

These VMs still need to run on servers but as each VM isn’t using 100% of it’s allocated resources all the time, instead of needing 10 servers to run it on you could run it on say 3 servers, and even do clever things like migrate VMs between servers if one were to fail.

VMs share the resources of the server it’s running on.

A server running VMs (Hypervisor) is able to run multiple VMs by splitting the resources between VMs.

If a VM A wants to run an operation at the same time a VM B & VM C, the operations can’t be run on each VM at the same time* so the hypervisor will queue up the requests and schedule them in, typically based on first-in-first out or based on a resource priority policy on the Hypervisor.

This is fine for a if VM A, B & C were all Web Servers.
A request coming into each of them at the same time would see the VM the Hypervisor schedules the resources to respond to the request slightly faster, with the other VMs responding to the request when the hypervisor has scheduled the resources to the respective VM.

VoIP is an only child

VoIP has grown up on dedicated hardware. It’s an only child that does not know how to share, because it’s never had to.

Having to wait for resources to be scheduled by the Hypervisor to to VM in order for it to execute an operation is fine and almost unnoticeable for web servers, it can have some pretty big impacts on call quality.

If we’re running RTPproxy or RTPengine in order to relay media, scheduling delays can mean that the media stream ends up “bursty”.

RTP packets needing relaying are queued in the buffer on the VM and only relayed when the hypervisor is able to schedule resources, this means there can be a lot of packet-delay-variation (PDV) and increased latency for services running on VMs.

VMs and Containers both have this same fate, DPDK and SR-IOV assist in throughput, but they don’t stop interrupt headaches.

VMs that deprive other VMs on the same host of resources are known as “Noisy neighbors”.

The simple fix for all these problems? There isn’t one.

Each of these issues can be overcome, dedicating resources, to a specific VM or container, cleverly distributing load, but it is costly in terms of resources and time to tweak and implement, and some of these options undermine the value of virtualization or containerization.

As technology marches forward we have scenarios where Kubernetes can expose FPGA resources to pass them through to Pods, but right now, if you need to transcode more than ~100 calls efficiently, you’re going to need a hardware device.

And while it can be done by throwing more x86 / ARM compute resources at the problem, hardware still wins out as cheaper in most instances.

Sorry, no easy answers here…