Category Archives: VoIP

Credit Control Request / Answer call flow in IMS Charging

Basics of EPC/LTE Online Charging (OCS)

Early on as subscriber trunk dialing and automated time-based charging was introduced to phone networks, engineers were faced with a problem from Payphones.

Previously a call had been a fixed price, once the caller put in their coins, if they put in enough coins, they could dial and stay on the line as long as they wanted.

But as the length of calls began to be metered, it means if I put $3 of coins into the payphone, and make a call to a destination that costs $1 per minute, then I should only be allowed to have a 3 minute long phone call, and the call should be cutoff before the 4th minute, as I would have used all my available credit.

Conversely if I put $3 into the Payphone and only call a $1 per minute destination for 2 minutes, I should get $1 refunded at the end of my call.

We see the exact same problem with prepaid subscribers on IMS Networks, and it’s solved in much the same way.

In LTE/EPC Networks, Diameter is used for all our credit control, with all online charging based on the Ro interface. So let’s take a look at how this works and what goes on.

Generic 3GPP Online Charging Architecture

3GPP defines a generic 3GPP Online charging architecture, that’s used by IMS for Credit Control of prepaid subscribers, but also for prepaid metering of data usage, other volume based flows, as well as event-based charging like SMS and MMS.

Network functions that handle chargeable services (like the data transferred through a P-GW or calls through a S-CSCF) contain a Charging Trigger Function (CTF) (While reading the specifications, you may be left thinking that the Charging Trigger Function is a separate entity, but more often than not, the CTF is built into the network element as an interface).

The CTF is a Diameter application that generates requests to the Online Charging Function (OCF) to be granted resources for the session / call / data flow, the subscriber wants to use, prior to granting them the service.

So network elements that need to charge for services in realtime contain a Charging Trigger Function (CTF) which in turn talks to an Online Charging Function (OCF) which typically is part of an Online Charging System (AKA OCS).

For example when a subscriber turns on their phone and a GTP session is setup on the P-GW/PCEF, but before data is allowed to flow through it, a Diameter “Credit Control Request” is generated by the Charging Trigger Function (CTF) in the P-GW/PCEF, which is sent to our Online Charging Server (OCS).

The “Credit Control Answer” back from the OCS indicates the subscriber has the balance needed to use data services, and specifies how much data up and down the subscriber has been granted to use.

The P-GW/PCEF grants service to the subscriber for the specified amount of units, and the subscriber can start using data.

This is a simplified example – Decentralized vs Centralized Rating and Unit Determination enter into this, session reservation, etc.

The interface between our Charging Trigger Functions (CTF) and the Online Charging Functions (OCF), is the Ro interface, which is a Diameter based interface, and is common not just for online charging for data usage, IMS Credit Control, MMS, value added services, etc.

3GPP define a reference online-charging interface, the Ro interface, and all the application-specific interfaces, like the Gy for billing data usage, build on top of the Ro interface spec.

Basic Credit Control Request / Credit Control Answer Process

This example will look at a VoLTE call over IMS.

When a subscriber sends an INVITE, the Charging Trigger Function baked in our S-CSCF sends a Diameter “Credit Control Request” (CCR) to our Online Charging Function, with the type INITIAL, meaning this is the first CCR for this session.

The CCR contains the Service Information AVP. It’s this little AVP that is where the majority of the magic happens, as it defines what the service the subscriber is requesting. The main difference between the multitude of online charging interfaces in EPC networks, is just what the service the customer is requesting, and the specifics of that service.

For this example it’s a voice call, so this Service Information AVP contains a “IMS-Information” AVP. This AVP defines all the parameters for a IMS phone call to be online charged, for a voice call, this is the called-party, calling party, SDP (for differentiating between voice / video, etc.).

It’s the contents of this Service Information AVP the OCS uses to make decision on if service should be granted or not, and how many service units to be granted. (If Centralized Rating and Unit Determination is used, we’ll cover that in another post)
The actual logic, relating to this decision is typically based on the the rating and tariffing, credit control profiles, etc, and is outside the scope of the interface, but in short, the OCS will make a yes/no decision about if the subscriber should be granted access to the particular service, and if yes, then how many minutes / Bytes / Events should be granted.

In the received Credit Control Answer is received back from our OCS, and the Granted-Service-Unit AVP is analysed by the S-CSCF.
For a voice call, the service units will be time. This tells the S-CSCF how long the call can go on before the S-CSCF will need to send another Credit Control Request, for the purposes of this example we’ll imagine the returned value is 600 seconds / 10 minutes.

The S-CSCF will then grant service, the subscriber can start their voice call, and start the countdown of the time granted by the OCS.

As our chatty subscriber stays on their call, the S-CSCF approaches the limit of the Granted Service units from the OCS (Say 500 seconds used of the 600 seconds granted).
Before this limit is reached the S-CSCF’s CTF function sends another Credit Control Request with the type UPDATE_REQUEST. This allows the OCS to analyse the remaining balance of the subscriber and policies to tell the S-CSCF how long the call can continue to proceed for in the form of granted service units returned in the Credit Control Answer, which for our example can be 300 seconds.

Eventually, and before the second lot of granted units runs out, our subscriber ends the call, for a total talk time of 700 seconds.

But wait, the subscriber been granted 600 seconds for our INITIAL request, and a further 300 seconds in our UPDATE_REQUEST, for a total of 900 seconds, but the subscriber only used 700 seconds?

The S-CSCF sends a final Credit Control Request, this time with type TERMINATION_REQUEST and lets the OCS know via the Used-Service-Unit AVP, how many units the subscriber actually used (700 seconds), meaning the OCS will refund the balance for the gap of 200 seconds the subscriber didn’t use.

If this were the interface for online charging of data, we’d have the PS-Information AVP, or for online charging of SMS we’d have the SMS-Information, and so on.

The architecture and framework for how the charging works doesn’t change between a voice call, data traffic or messaging, just the particulars for the type of service we need to bill, as defined in the Service Information AVP, and the OCS making a decision on that based on if the subscriber should be granted service, and if yes, how many units of whatever type.

HTTP PUT call recordings from FreeSWITCH

The mod_httapi in FreeSWITCH allows you to upload your call recordings to a HTTP server, in my case I’ve put together a Flask based Python server for a project I’m working on, which when the call ends, uploads to my web server. Presto!

<action application="record" data="http://(file=recording.wav,name=part1.PCMU,method=POST)your_domain.com/recording_upload"/>

Obviously you’ll need to replace the URL etc, but you can then just extract the POSTed file out and boom, you don’t need to store any recordings on each FreeSWITCH instance.

This is fantastic if you’re running multiple instances in a cluster or containerized, and want every FreeSWITCH instance to be dumb and with access to the same data as every other instance.

ENUM in Practice with Kamailio

In our last post we covered the theory behind ENUM and its use, and in this post we’ll cover setting up Kamailio to query an ENUM server.

Before we start, if you’re not familiar with ENUM, check out my primer on the topic here, and then for more in-depth info check out this post on configuring an ENUM server in Bind and the ways records can be configured,

So once we’ve got an ENUM server configured and confirmed we can query it and get the results we want using Dig, we can configure Kamailio.

But before we get to the Kamailio side, a word on how Kamailio handles DNS,
Kamailio doesn’t have the ability to set a DNS server, instead it uses the system DNS server details,
This means your system will need to use the DNS server we want to query for ENUM for all DNS traffic, not just for Kamailio. This means you may need to setup Recursion to still be able to query DNS records for the outside world.

To add support to Kamailio, we’ll need to load the enum module (enum.so),

In terms of parameters, all we’ll set is the doman_suffix, which is, as it sounds, the domain suffix used in the DNS queries. If you’re using a different domain for your ENUM it’d need to be reflected here.

 modparam("enum", "domain_suffix", "e164.arpa.")

Next up inside our minimalist dialplan we’ll just add enum_query(); to query the SIP URI,

if(is_method("INVITE")){
         enum_query();
         xlog("Ran ENUM query");
         xlog("To URI $tU");
         forward();
}

Obviously in production you’d want to add more sanity checks and error handling, but with this, sending a SIP INVITE to Kamailio with an E.164 number in the SIP URI user part, will lead to an ENUM query resolving this, and routing the traffic to it,

An example PCAP above, showing a call to +61355500912, the resulting ENUM query and routing to server1.enum.nickvsnetworking.com,

By using Kamailio’s transaction module ™ as a stateful proxy, we can run queries again if one were to fail,

Copy of example config is available here,

Docker & BIND as an ENUM Playground

In the last we covered what ENUM is and how it works, so to take this into a more practical example, I thought I’d share the details of the ENUM server I’ve setup in my lab, and the Docker container I’ve bundled it into.

Inside the Docker container we’ll be running Bind – this post won’t teach you much about Bind, there’s already lots of good information on it elsewhere, but we will cover the parameters involved in setting up ENUM records (NAPTR) for E.164 addresses.

Getting the Environment up and Running

First we’ll need to setup our environment, I’ve published the images for the container to Dockerhub, but we’ll build it from the Dockerfile so you can edit the files and rebuild as you play around:

git clone https://github.com/nickvsnetworking/ENUM_Playground
cd ENUM_Playground
docker build --pull --rm -f "Dockerfile" -t enum:latest "."

systemd-resolve on Ubuntu binds to port 53 by default, which can lead to some headaches, so we’ll create a new network in Docker for this to run in, so it doesn’t conflict with anything else you may be running:

sudo docker network create --subnet=172.30.0.0/26 enum_playground

And now we’ll run the ENUM container in the enum_playground network and with the IP 172.30.0.2,

docker run -d --rm --name=enum --net=enum_playground --ip=172.30.0.2 enum

Ok, that’s the environment setup, let’s run some queries!

E.164 to SIP URI Resolution with ENUM

In our last post we covered the basics of formatting an E.164 number and querying a DNS server to get it’s call routing information.

Again we’re going to use Dig to query this information. In reality ENUM queries would be run by an endpoint, or software like FreeSWITCH or Kamailio (Spoiler alert, posts on ENUM handling in those coming later), but as we’re just playing Dig will work fine.

So let’s start by querying a single E.164 address, +61355500911

First we’ll reverse it and put full stops / periods between the numbers, to get 1.1.9.0.0.5.5.5.3.1.6

Next we’ll add the e164.arpa prefix, which is the global prefix for ENUM addresses, and presto, that’s what we’ll query – 1.1.9.0.0.5.5.5.3.1.6.e164.arpa

Lastly we’ll feed this into a Dig query against the IP of our container and of type NAPTR,

dig @172.30.0.2 -t naptr 1.1.9.0.0.5.5.5.3.1.6.e164.arpa

So what did you get back?

Well, if everything is working your output should look something like the output I’ve got below,

NAPTR results for queried ENUM Address

So how do we interpret this? Well let’s break it down,

The first part is the domain we queried, simple enough in this case,

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Next up is the TTL or expiry, in this case it’s 3600 seconds (1 hour), shorter periods allow for changes to propagate / be reflected more quickly but at the expense of more load as results can’t be cached for as long. The class (IN) represents Internet, which is the only class commonly used, even on internal systems.

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Then we have the type of record returned, in our case it’s a NAPTR record,

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

After that is the Order, this defines the order in which the rules are to be parsed. Lower numbers are processed first, if no matches then the next lowest, and so on until the highest number is reached, we’ll touch on this in more detail later in this post,

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

The Pref is the processing preference. This is very handy for load balancing, as we can split traffic between hosts with different preferences. We’ll cover this later in this post too.

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

The Flags represent the type of record we’re going to get, for most ENUM traffic this is going to be set to U, to denote a SIP URI with Regex, while the Service value we’ll be looking for will be “E2U+sip” service to identify SIP URIs to route calls to, but could be other values like Email addresses, IM Addresses or PSTN numbers, to be parsed by other applications.

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Lastly we’ve got the Regex part. Again not going to cover Regex as a whole, just the DNS particulars.

Everything between the first and second ! denotes what we’re searching for, while everything from the second ! to the last ! denotes what we replace it with.

In the below example that means we’re matching ^.* which means starting with (^) any character (.) zero or more times (*), which gets replaced with sip:[email protected],

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

How should this be treated?

For the first example, a call to the E.164 address of 61355500912 will be first formatted into a domain as per the ENUM requirements (1.1.9.0.0.5.5.5.3.1.6.e164.arpa) and then queried as a NAPTR record against the DNS server,

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Only a single record has been returned so we don’t need to worry about the Order or Preference, and the Regex matches anything and replaces it with the resulting SIP URI of sip:[email protected], which is where we’ll send our INVITE.

Under the Hood

Inside the Repo we cloned earlier, if you open the e164.arpa.db file, things will look somewhat familiar,

The record we just queried is the first example in the Bind config file,

; E.164 Address +61355500911 - Simple no replacement (Resolves all traffic to sip:[email protected])
1.1.9.0.0.5.5.5.3.1.6 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

The config file is just the domain, class, type, order, preference, flags, service and regex.

Astute readers may have noticed the trailing . which where we can put a replacement domain if Regex is not used, but it cannot be used in conjunction with Regex, so for all our work it’ll just be a single trailing . on each line.

You can (and probably should) change the values in the e164.arpa.db file as we go along to try everything out, you’ll just need to rebuild the container and restart it each time you make a change.

This post is going to focus on Bind, but the majority of modern DNS servers support NAPTR records, so you can use them for ENUM as well, for example I manage the DNS for this site thorough Cloudflare, and I’ve put a screenshot below of an example private ENUM address I’ve added into it.

Setting up a NAPTR record in Cloudflare DNS

Preference to Split Traffic between Servers

So with a firm understanding of a single record being returned, let’s look at how we can use ENUM to cleverly route traffic to multiple hosts.

If we have a pool of servers we may wish to evenly distribute all traffic across them, so that’s how E.164 address +61355500912 is setup – to route traffic evenly (50/50) across two servers.

Querying it with Dig provides the following result:

dig @172.30.0.2 -t naptr 2.1.9.0.0.5.5.5.3.1.6.e164.arpa
;; ANSWER SECTION:
2.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" . 2.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

So as the order value (10) is the same for both records, we can ignore it – there isn’t one value lower than the other.

We can see both records have a preference of 100, in practice, this means they each get 50% of the traffic. The formula for traffic distribution is pretty simple, each server gets the value of it’s preference, divided by the total of all the preferences,

So for server1 it’s preference is 100 and the total of all the preferences combined is 200, so it gets 100/200, which is equivalent to one half aka 50%.

We might have a scenario where we have 3 servers, but one is significantly more powerful than the others, so let’s look at giving more traffic to one server and less to others, this example gets a little more complex but should cement your understanding of how the preference works;

dig @172.30.0.2 -t naptr 3.1.9.0.0.5.5.5.3.1.6.e164.arpa
3.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" . 3.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 200 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .
3.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

So now 3 servers, again none have a lower order than the other, it’s set to 10 for them all so we can ignore the order,

Next we can see the total of all the priority values is 400,

Server 2 has a priority of 100 so it gets 100/400 total priority, or a quarter of all traffic. Server 1 has the same value, so also gets a quarter of all traffic,

Server 3 however has a priority of 200 so it gets 200/400, or to simplify half of all traffic.

The Bind config for this is:

; E.164 Address +61355500913 - More complex load balance between 3 hosts (25% server1, 25% server2, 50% server3)
3.1.9.0.0.5.5.5.3.1.6 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" . 3.1.9.0.0.5.5.5.3.1.6 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .
3.1.9.0.0.5.5.5.3.1.6 IN NAPTR 10 200 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Order for Failover

Primarily the purpose of the order is to enable wildcard routes (as we’ll see later) to be overwritten by more specific routes, but a secondary use in some implementations use Order as a way to list the preferences of the SIP URIs to route to. For example we could have two servers, one a primary and the other a standby, with the standby only to be used only if the primary SIP URI was not responding.

E.164 number +61355500914 is setup to return two SIP URIs,

dig @172.30.0.2 -t naptr 4.1.9.0.0.5.5.5.3.1.6.e164.arpa
;; ANSWER SECTION:
4.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" . 4.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  20 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Our DNS client will first use the SIP URI sip:[email protected] as it has the lower order value (10), and if that fails, can try the entry with the next lowest order-value (20) which would be sip:[email protected].

The Bind config for this is:

; E.164 Address +61355500914 - Order example returning multiple SIP URIs to try for failover
4.1.9.0.0.5.5.5.3.1.6 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" . 4.1.9.0.0.5.5.5.3.1.6 IN NAPTR 20 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Wildcards

If we have a 1,000 number block, having to add 1000 individual records can be very tedious. Instead we can use wildcard matching (thanks to the fact we’ve reversed the E.164 address) to match ranges. For example if we have E.164 numbers from +61255501000 to +61255501999 we can add a wildcard entry to match the +61255501x prefix,

I’ve set this up already so let’s lookup the E.164 number +6125501234,

dig @172.30.0.2 -t naptr 4.3.2.1.0.5.5.5.2.1.6.e164.arpa
;; ANSWER SECTION:
4.3.2.1.0.5.5.5.2.1.6.e164.arpa. 3600 IN NAPTR  50 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

If you look up any other number starting with +6125501 you’ll get the same result, and here’s the Bind config for it:

; Wildcard E.164 Address +61255501* - Wildcard example for all destinations starting with E.164 prefix +61255501x to single destination (sip:[email protected])
; For example E.164 number +6125501234 will resolve to sip:[email protected]
*.1.0.5.5.5.2.1.6 IN NAPTR 100 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

The catch with this is they’re all pointing at the same SIP URI, so we can’t treat the calls differently based on the called number – This is where the Regex magic comes in.

We can use group matching to match a group and fill it in the dialed number into the SIP Request URI, for example:

!(^.*$)!sip:+1\[email protected]!

Will match the E.164 number requested and put it inside sip:[email protected]

The +61255502xxx prefix is setup for this, so if we query +61255502000 (or any other number between +61255502000 and +61255502999) we’ll get the regex query in the resulting record.

Keep in mind DNS doesn’t actually apply the Regex transformation, just shares it, and the client applies the transformation.

dig @172.30.0.2 -t naptr 0.0.0.2.0.5.5.5.2.1.6.e164.arpa
;; ANSWER SECTION:
0.0.0.2.0.5.5.5.2.1.6.e164.arpa. 3600 IN NAPTR  100 100 "u" "E2U+sip" "!(^.*$)!sip:+1\[email protected]!" .

And the corresponding Bind config:

; Wildcard example for all destinations starting with E.164 prefix +61255502x to regex filled destination
; For example a request to 61255502000 will return sip:[email protected])
*.2.0.5.5.5.2.1.6 IN NAPTR 100 100 "u" "E2U+sip" "!(^.*$)!sip:+1\\[email protected]!" .

One last thing to keep in mind, is that Wildcard priorities are of any length.
This means +612555021 would match as well as +6125550299999999999999. Typically terminating switches drop any superfluous digits, and NU those that are too short, but keep this in mind, that length is not taken into account.

Wildcard Priorities

So with our wildcards in place, what if we wanted to add an exception, for example one number in our 61255502xxx block of numbers gets ported to another carrier and needs to be routed elsewhere?

Easy, we just add another entry for that number being more specific and with a lower order than the wildcard, which is what’s setup for E.164 number +61255502345,

dig @172.30.0.2 -t naptr 5.4.3.2.0.5.5.5.2.1.6.e164.arpa
;; ANSWER SECTION:
5.4.3.2.0.5.5.5.2.1.6.e164.arpa. 3600 IN NAPTR  50 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Which does not return the same result as the others that match the wildcard,

Bind config:

; Wildcard example for all destinations starting with E.164 prefix +61255502x to regex filled destination
; For example a request to +61255502000 will return sip:[email protected])
*.2.0.5.5.5.2.1.6 IN NAPTR 100 100 "u" "E2U+sip" "!(^.*$)!sip:+1\\[email protected]!" .

; More specific example with lower order than +6125550x wildcard for E.164 address +61255502345 will return sip:[email protected]
5.4.3.2.0.5.5.5.2.1.6 IN NAPTR 50 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

We can combine all of the tricks we’ve covered here, from statically defined entries, wildcards, regex replacement, multiple entries with multiple orders and preferences, to create really complex routing, using only DNS.

Summary & Next Steps

So by now hopefully you’ve got a fair understanding of how NAPTR and DNS work together to translate E.164 addresses into SIP URIs,

Of course being able to do this manually with Dig and comprehend how it’ll route is only one part of the picture, in the next posts we’ll cover using Kamailio and FreeSWITCH to query ENUM routing information and route traffic to it,

ENUM – DNS based Call Routing

DNS is commonly used for resolving domain names to IP Addresses, and is often described as being like “the phone book of the Internet”.

So what’s the phone book of phone books?

The answer, is (kind of) DNS. With the aid of E.164 number to URI mapping (ENUM), DNS can be used to resolve phone numbers into SIP URIs to route the traffic to.

So what is ENUM?

ENUM allows us to bypass the need for a central switch for routing calls to numbers, and instead, through a DNS lookup, resolve a phone number into a reachable SIP URI that is the ultimate destination for the traffic.

Imagine you want to call a company, you dial the phone number for that company, your phone does a DNS query against the phone number, which returns the SIP URI of the company’s PBX, and your phone sends the SIP INVITE directly to the company’s PBX, with no intermediary party carrying the call.

3GPP have specified ENUM as the prefered mechanism for resolving phone numbers into SIP addresses, and while it’s widespread adoption on the public Internet is still in its early days (See my post on The Sad story of ENUM in Australia) it is increasingly common in IMS networks and inside operator networks.

IETF defined RFC 6116 for “The E.164 to Uniform Resource Identifiers (URI) Dynamic Delegation Discovery System (DDDS) Application (ENUM)”, which defines how the system works.

So how does ENUM actually work?

ENUM allow us to lookup a phone number on a DNS server and find the SIP URI a server that will handle traffic for the phone number, but it’s a bit more complicated than the A or AAAA records you’d use to resolve a website, ENUM relies on NAPTR records.

Let’s look at the steps involved in taking an E.164 number and knowing where to send it.

Step 1 – Reverse the Numbers

We read phone numbers from left to right.

This is because historically the switch needs to get all the long-distance routing sorted first. The switch has to route your call to the exchange that serves that subscriber, which is what all the area codes and prefixes assigned to areas are all about (Throwback to SZU for any old Telco buffs).

For an E.164 number you’ve got a Country Code, Area Code and then the Subscriber Number. The number gets more specific as it goes along.

But getting more specific as you go along is the opposite how how DNS works, millions of domains share the .com suffix, and the unique / specific part is the bits before that.

So the first step in the ENUM process is to reverse the phone number, so let’s take phone number (03) 5550 0912, which in E.164 is +61 3 5550 0912.

As the spaces in the phone numbers are there for the humans, we’ll drop all of them and reverse the number, as DNS is more specific right-to-left, so we end up with

2.1.9.0.0.5.5.5.3.1.6

Step 2 – Add the Suffix

The ITU ENUM specifies the suffix e164.arpa be assigned for public ENUM entries. Private ENUM deployments may use their own suffix, but to make life simple I’m going to use e164.arpa as if it were public.

So we’ll append the e164.arpa domain onto our reversed and formatted E.164 phone number:

2.1.9.0.0.5.5.5.3.1.6.e164.arpa

Step 3 – Query it

Next we’ll run a Naming Authority Pointer (NAPTR) query against the domain, to get back a list of records for that number.

DNS is a big topic, and NAPTR and SRV takes up a good chunk of it, but what you need to know is that by using NAPTR we’re not limited to just a single response, we could have a weighted pool of servers handling traffic for this phone number, and be able to control load through the use of NAPTR, amongst other things.

Of course, if our phone can query the public NAPTR records, then so can anyone else, so we can just use a tool like Dig to query the record ourselves,

dig @10.0.1.252 -t naptr 2.1.9.0.0.5.5.5.3.1.6.e164.arpa

In the answers section I’ve setup this DNS server to only return a single response, with the regex SIP URI to use, in my case that’s sip:[email protected]

You’ll obviously need to replace the DNS server with your DNS server, and the query with the reversed and formatted version of the E.164 number you wish to query.

Step 4 – Send SIP traffic

After looking at the NAPTR records returned and using the weight and priority to determine which server/s to send to first, our phone forwards an INVITE to the URI returned in the NAPTR record.

How to interpret the returned results?

The first thing to keep in mind when working with ENUM is multiple records being returned is supported, and even encouraged.

NAPTR results return 7 fields, which define how it should be handled.

The host part is fairly obvious, and defines the host / DNS entry we’re talking about.

The Service defines what type of service this is. ENUM can be expanded beyond just voice, for example you may want to also return an email address or IM address as well as a SIP Address on an ENUM query, which you can do. By default voice uses the “E2U+sip” service to identify SIP URIs to route calls to, so in this context that’s what we’re interested in, but keep in mind there are other types out there,

Example ENUM query against a phone number showing other types of services (Email & Web)

The Order simply defines the order in which the rules are to be parsed. Lower numbers are processed first, if no matches then the next lowest, and so on until the highest number is reached.

The Pref is the processing preference. For load balancing 50/50 between two sites say a Melbourne and Sydney site, we’d return two results, with the same Order, and the same Pref, would see traffic split 50/50 between the two sites.
We could split this further, a Pref value of 10 for Melbourne, 10 for Sydney, 5 for Brisbane and 5 for Perth would see 33% of calls route to Melbourne, 33% of calls route to Sydney, 16.5% of calls route to Brisbane and 16.5% of calls route to Perth.
This is because we’d have a total preference value of 30, and the individual preference for each entry would work out as the fraction of the total (ie Pref 10 out of 30 = 10/30 or 33.3%).

The Flags denote the type of record we’re going to get, for most ENUM traffic this is going to be set to U, to denote a SIP URI with Regex.

The regexp field contains our SIP URI in the form of a Regular expression, which can include pattern matching and replacement. This is most commonly used to fill in the phone number into the SIP URI, for example instead of hardcoding the phone number into the response, we could use a Regular expression to fill in the requested number into the SIP URI.

!(^.*$)!sip:+1\[email protected]!

ENUM sounds great, how do I get it?

Here’s the tricky part.

If you’re looking to implement ENUM for an internal network, great, I’ll have some more posts here over the next few weeks covering off configuration of a DNS server to support ENUM lookups, and using Kamailio to lookup ENUM routes.

In terms of public ENUM, while many carriers are using ENUM inside their networks, public adoption of ENUM in most markets has been slow, for a number of reasons.

Many incumbent operators have been reluctant to embrace public ENUM as their role as an operator would be relegated to that of a Domain registrar.
Additionally, there’s real security risks involved in moving to ENUM – opening your phone system up to the world to accept inbound calls from anywhere. This could lead to DOS-style attacks of flooding phone numbers with automatically generated traffic, privacy risks and even less validation in terms of caller ID trust.

RIPE maintains the EnumData.org website listing the status of ENUM for each country / region.

IMS Routing with iFCs

SIP routing is complicated, there’s edge cases, traffic that can be switched locally and other traffic that needs to be proxied off to another Proxy or Application server. How can you define these rules and logic in a flexible way, that allows these rules to be distributed out to multiple different network elements and adjusted on a per-subscriber basis?

Enter iFCs – The Initial Filter Criteria.

iFCs are XML encoded rules to define which servers should handle traffic matching a set of rules.

Let’s look at some example rules we might want to handle through iFCs:

  • Send all SIP NOTIFY, SUBSCRIBE and PUBLISH requests to a presence server
  • Any Mobile Originated SMS to an SMSc
  • Calls to a specific destination to a MGC
  • Route any SIP INVITE requests with video codecs present to a VC bridge
  • Send calls to Subscribers who aren’t registered to a Voicemail server
  • Use 3rd party registration to alert a server that a Subscriber has registered

All of these can be defined and executed through iFCs, so let’s take a look,

iFC Structure

iFCs are encoded in XML and typically contained in the Cx-user-data AVP presented in a Cx Server Assignment Answer response.

Let’s take a look at an example iFC and then break down the details as to what we’re specifying.

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeCNF>1</ConditionTypeCNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

Each rule in an iFC is made up of a Priority, TriggerPoint and ApplicationServer.

So for starters we’ll look at the Priority tag.
The Priority tag allows us to have multiple-tiers of priority and multiple levels of matching,
For example if we had traffic matching the conditions outlined in this rule (TriggerPoint) but also matching another rule with a lower priority, the lower priority rule would take precedence.

Inside our <TriggerPoint> tag contains the specifics of the rules and how the rules will be joined / matched, which is what we’ll focus on predominantly, and is followed by the <ApplicationServer> which is where we will route the traffic to if the TriggerPoint is matched / triggered.

So let’s look a bit more about what’s going on inside the TriggerPoint.

Each TriggerPoint is made up of Service Point Trigger (SPTs) which are individual rules that are matched or not matched, that are either combined as logical AND or logical OR statements when evaluated.

By using fairly simple building blocks of SPTs we can create a complex set of rules by joining them together.

Service Point Triggers (SPTs)

Let’s take a closer look at what goes on in an SPT.
Below is a simple SPT that will match all SIP requests using the SIP MESSAGE method request type:

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>

So as you may have guessed, the <Method> tag inside the SPT defines what SIP request method we’re going to match.

But Method is only one example of the matching mechanism we can use, but we can also match on other attributes, such as Request URI, SIP Header, Session Case (Mobile Originated vs Mobile Terminated) and Session Description such as SDP.

Or an example of a SPT for anything Originating from the Subscriber utilizing the <SessionCase> tag inside the SPT.

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <SessionCase>0</SessionCase>
        </SPT>

Below is another SPT that’s matching any requests where the request URI is sip:[email protected] by setting the <RequestURI> tag inside the SPT:

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <RequestURI>sip:[email protected]</RequestURI>
        </SPT>

We can match SIP headers, either looking for the existence of a header or the value it is set too,

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <SIPHeader>
              <Header>To</Header>
              <Content>"Nick"</Content>
            </SIPHeader>
        </SPT>

Having <Header> will match if the header is present, while the optional Content tag can be used to match

In terms of the Content this is matched using Regular Expressions, but in this case, not so regular regular expressions. 3GPP selected Extended Regular Expressions (ERE) to be used (IEEE POSIX) which are similar to the de facto standard PCRE Regex, but with a few fewer parameters.

Condition Negated

The <ConditionNegated> tag inside the SPT allows us to do an inverse match.

In short it will match anything other than what is specified in the SPT.

For example if we wanted to match any SIP Methods other than MESSAGE, setting <ConditionNegated>1</ConditionNegated> would do just that, as shown below:

        <SPT>
            <ConditionNegated>1</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>

And another example of ConditionNegated in use, this time we’re matching anything where the Request URI is not sip:[email protected]:

        <SPT>
            <ConditionNegated>1</ConditionNegated>
            <Group>0</Group>
            <RequestURI>sip:[email protected]</RequestURI>
        </SPT>

Finally the <Group> tag allows us to group together a group of rules for the purpose of evaluating.
We’ll go into it more in in the below section.

ConditionTypeCNF / ConditionTypeDNF

As we touched on earlier, <TriggerPoints> contain all the SPTs, but also, very importantly, specify how they will be interpreted.

SPTs can be joined in AND or OR conditions.

For some scenarios we may want to match where METHOD is MESSAGE and RequestURI is sip:[email protected], which is different to matching where the METHOD is MESSAGE or RequestURI is sip:[email protected].

This behaviour is set by the presence of one of the ConditionTypeCNF (Conjunctive Normal Form) or ConditionTypeDNF (Disjunctive Normal Form) tags.

If each SPT has a unique number in the GroupTag and ConditionTypeCNF is set then we evaluate as AND.

If each SPT has a unique number in the GroupTag and ConditionTypeDNF is set then we evaluate as OR.

Let’s look at how the below rule is evaluated as AND as ConditionTypeCNF is set:

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeCNF>1</ConditionTypeCNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

This means we will match if the method is MESSAGE and Session Case is 0 (Mobile Originated) as each SPT is in a different Group which leads to “and” behaviour.

If we were to flip to ConditionTypeDNF each of the SPTs are evaluated as OR.

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeDNF>1</ConditionTypeDNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

This means we will match if the method is MESSAGE and Session Case is 0 (Mobile Originated).

Where this gets a little bit more complex is when we have multiple entries in the same Group tag.

Let’s say we have a trigger point made up of:

<SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
<SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

<SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

How would this be evaluated?

If we use ConditionTypeDNF every SPT inside the same Group are matched as AND, and SPTs with distinct are matched as OR.

Let’s look at our example rule evaluated as ConditionTypeDNF:

<ConditionTypeDNF>1</ConditionTypeDNF>
  <SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
  <SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

  <SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

This means the two entries in Group 1 are evaluated as AND – So Method is message and Session Case is 0, OR the header “P-Some-Header” is present.

Let’s do another one, this time as ConditionTypeCNF:

<ConditionTypeCNF>1</ConditionTypeCNF>
  <SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
  <SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

  <SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

This means the two entries in Group 1 are evaluated as OR – So Method is message OR Session Case is 0, AND the header “P-Some-Header” is present.

Adding support for AMR Codec in FreeSWITCH

If you’re building IMS Networks, the AMR config is a must, but FreeSWITCH does not ship with AMR due to licencing constraints, but has all the hard work done, you just need to add the headers for AMR support and compile.

LibOpenCore has support for AMR which we build, and then with a few minor tweaks to copy the C++ header files over to the FreeSWITCH source directory, and enable support in modules.conf.

Then when building FreeSWITCH you’ve got the AMR Codec to enable you to manage IMS / VoLTE media streams from mobile devices.

Instead of copying and pasting a list of commands to do this, I’ve published a Dockerfile here you can use to build a Docker image, or on a straight Debian Buster machine if you’re working on VMs or Bare Metal if you run the commands from the Dockerfile on the VM / bare metal.

You can find the Dockerfile on my Github here,

A very unstable Diameter Routing Agent (DRA) with Kamailio

I’d been trying for some time to get Kamailio acting as a Diameter Routing Agent with mixed success, and eventually got it working, after a few changes to the codebase of the ims_diameter_server module.

It is rather unstable, in that if it fails to dispatch to a Diameter peer, the whole thing comes crumbling down, but incoming Diameter traffic is proxied off to another Diameter peer, and Kamailio even adds an extra AVP.

Having used Kamailio for so long I was really hoping I could work with Kamailio as a DRA as easily as I do for SIP traffic, but it seems the Diameter module still needs a lot more love before it’ll be stable enough and simple enough for everyone to use.

I created a branch containing the fixes I made to make it work, and with an example config for use, but use with caution. It’s a long way from being production-ready, but hopefully in time will evolve.

https://github.com/nickvsnetworking/kamailio/tree/Diameter_Fix

VoIP is an only child – ‘Gotchas’ on running VoIP applications inside Containers

It’s 2021, and everyone loves Containers; Docker & Kubernetes are changing how software is developed, deployed and scaled.

And yet so much of the Telco world still uses bare metal servers and dedicated hardware for processing.

So why not use Containers or VMs more for VoIP applications?

Disclaimer – When I’m talking VoIP about VoIP I mean the actual Voice over IP, that’s the Media Stream, RTP, the Audio, etc, not the Signaling (SIP). SIP is fine with Containers, it’s the media that has a bad time and that this post focuses on,

Virtualization Fundamentals

Once upon a time in Development land every application ran on it’s own server running in a DC / Central Office.

This was expensive to deploy (buying servers), operate (lots of power used) and maintain (lots of hardware to keep online).

Each server was actually sitting idle for a large part of the time, with the application running on it only using a some of the available resources some of the time.

One day Virtualization came and suddenly 10 physical servers could be virtualized into 10 VMs.

These VMs still need to run on servers but as each VM isn’t using 100% of it’s allocated resources all the time, instead of needing 10 servers to run it on you could run it on say 3 servers, and even do clever things like migrate VMs between servers if one were to fail.

VMs share the resources of the server it’s running on.

A server running VMs (Hypervisor) is able to run multiple VMs by splitting the resources between VMs.

If a VM A wants to run an operation at the same time a VM B & VM C, the operations can’t be run on each VM at the same time* so the hypervisor will queue up the requests and schedule them in, typically based on first-in-first out or based on a resource priority policy on the Hypervisor.

This is fine for a if VM A, B & C were all Web Servers.
A request coming into each of them at the same time would see the VM the Hypervisor schedules the resources to respond to the request slightly faster, with the other VMs responding to the request when the hypervisor has scheduled the resources to the respective VM.

VoIP is an only child

VoIP has grown up on dedicated hardware. It’s an only child that does not know how to share, because it’s never had to.

Having to wait for resources to be scheduled by the Hypervisor to to VM in order for it to execute an operation is fine and almost unnoticeable for web servers, it can have some pretty big impacts on call quality.

If we’re running RTPproxy or RTPengine in order to relay media, scheduling delays can mean that the media stream ends up “bursty”.

RTP packets needing relaying are queued in the buffer on the VM and only relayed when the hypervisor is able to schedule resources, this means there can be a lot of packet-delay-variation (PDV) and increased latency for services running on VMs.

VMs and Containers both have this same fate, DPDK and SR-IOV assist in throughput, but they don’t stop interrupt headaches.

VMs that deprive other VMs on the same host of resources are known as “Noisy neighbors”.

The simple fix for all these problems? There isn’t one.

Each of these issues can be overcome, dedicating resources, to a specific VM or container, cleverly distributing load, but it is costly in terms of resources and time to tweak and implement, and some of these options undermine the value of virtualization or containerization.

As technology marches forward we have scenarios where Kubernetes can expose FPGA resources to pass them through to Pods, but right now, if you need to transcode more than ~100 calls efficiently, you’re going to need a hardware device.

And while it can be done by throwing more x86 / ARM compute resources at the problem, hardware still wins out as cheaper in most instances.

Sorry, no easy answers here…

Using Docker to develop SIP solutions with Kamailio

VoIP networks are often large beasts, with lots of moving parts; carrier links, SBCs, application servers, etc.

While writing the other Kamailio tutorials on this blog I often find I’m spinning up several VMs to act as different parts of the network, for example in the posts on the Dispatcher module I needed to have 3 VMs to show it in action.

Instead of going to all the effort of creating VMs (or running Ansible playbooks) we can use Docker and docker-compose to create a test environment with multiple Asterisk instances to dispatch traffic to from Kamailio.

I covered the basics of using Kamailio with Docker in this post, which runs a single Kamailio instance inside Docker with a provided config file, but in this post we’ll use docker-compose to run multiple Asterisk instances and setup Kamailio to dispatch traffic to them.

I am a big Kubernetes fan, and yes, all this can be done in Kubernetes, and would be a better fit for a production environment, but for a development environment it’s probably overkill.

Like in the first post that covered Kamailio and Docker we’ll start with a Dockerfile to create the config we want.
The Dockerfile is largely unchanged from my original post on Docker and Kamailio, for the Kamailio config in this example I’m using Dispatcher with a flat text file of the targets to dispatch to (dispatcher.list), so I’ll include a command to copy the two config files into the Container when the image is created in the Dockerfile:

#Copy the config file onto the Filesystem of the Docker instance
COPY dispatcher.list /etc/kamailio/
COPY kamailio.cfg /etc/kamailio/

The Kamailio config we’re using is very similar to the Dispatcher example but with a few minor changes to the timers and setting it to use the Dispatcher data from a text file instead of a database. If you have a look at the contents of dispatcher.list you’ll see three entries; dispatcher_w_docker_asterisk_1, dispatcher_w_docker_asterisk_2 & dispatcher_w_docker_asterisk_3. These will be the hostnames of the 3 Asterisk instances we’ll create.

Next up we’ll take a look at the docker-compose file, which defines how our environment will be composed, and defines which containers will be run

The docker-compose file contains definitions about the Containers we want to run, for this example we’ll run several Asterisk instances and a single Kamailio instance.

I’m using a Docker image for Asterisk from Andrius Kairiukstis to deploy Asterisk, running a default config, so we’ll start our docker-compose file by defining a service called “asterisk” using Andrius’ image:

services:
   asterisk:
     image: andrius/asterisk
     deploy:
       mode: replicated
       replicas: 6

The replicas: 6 parameter is ignored by standard docker-compose up command, but will be used if you’re using Docker swarm, otherwise we’ll manually set the number of replicas when we run the command.

So with that defined let’s define our Kamailio service;

services:
   asterisk:
     image: andrius/asterisk
     deploy:
       mode: replicated
       replicas: 6
   kamailio_dispatcher:
     depends_on:
       - asterisk
     build:
      context: .
      dockerfile: Dockerfile
     ports:
       - "5060:5060/udp"
     restart: always

That will build Kamailio from our Dockerfile and expose port 5060,

We can run the docker-compose file with 3 replicas by running:

nick@oldfaithful:kamailio-101-tutorial/Docker_examples/Dispatcher_w_Docker$ docker-compose up --force-recreate --build --scale asterisk=3

Next if we run docker-ps we can see the running instances, the single Kamailio and the 3 Asterisk instances:

nick@oldfaithful:kamailio-101-tutorial/Docker_examples/Dispatcher_w_Docker$ docker ps

We can also query Kamailio to see how it’s Dispatcher table is looking and confirm it can reach each of our 3 instances:

nick@oldfaithful:kamailio-101-tutorial/Docker_examples/Dispatcher_w_Docker$ docker exec -it dispatcher_w_docker_kamailio_dispatcher_1 kamcmd dispatcher.list

Kamailio Bytes – OnReply Route

So far with most of our discussions about Kamailio we’ve talked about routing the initial SIP request (INVITE, REGISTER, SUBSCRIBE, etc), but SIP is not a one-message protocol, there’s a whole series of SIP messages that go into a SIP Dialog.

Sure the call may start with an INVITE, but there’s the 180 RINGING, the 200 OK and the ACK that go into getting the call actually established, and routing these in-dialog messages is just as important as routing the first INVITE.

When we’ve talked about SIP routing it’s all happened in the request_route {} block:

request_route {
        xlog("Received $rm to $ru - Forwarding");
        append_hf("X-Proxied: You betcha\r\n");
        #Forward to new IP
        forward("192.168.1.110");
}

In the example above we statelessly forward any initial requests to the IP 192.168.1.110.

All the routing from that point on happens using the standard RFC3261 in-dialog routing using the Route headers.

We can add an onreply_route{} block to handle any replies from 192.168.1.110 back to the originator.

But why would we want to?

Some simple answers would be to do some kind of manipulation to the message – say to strip a Caller ID if CLIP is turned off, or to add a custom SIP header containing important information, etc.

onreply_route{
        xlog("Got a reply $rs");
        append_hf("X-Proxied: For the reply\r\n");
}

Let’s imagine a scenario where the destination our SIP proxy is relaying traffic to (192.168.1.110) starts responding with 404 error.

We could detect this in our onreply_route{} and do something about it.

onreply_route{
        xlog("Got a reply $rs");
        if($rs == 404) {
                #If remote destination returns 404
                xlog("Got a 404 for $rU");
                #Do something about it
        }
}

In the 404 example if we were using Dispatcher it’s got easily accessed logic to handle these scenarios a bit better than us writing it out here, but you get the idea.

There are a few other special routes like onreply_route{}, failure routes and event routes, etc.

Hopefully now you’ll have a better idea of how and when to use onreply_route{} in Kamailio.

I like big Butts and I cannot lie.

In other parts of the world it’s known as a telephone test set, lineman’s handset, test phone, etc, but to me it’s a butt / butt set / buttinski.

They’re essentially ruggedized, portable telephones, often with an ability to monitor a line without looping it / going off hook, and used by techs and lineys throughout phone networks everywhere.

I carry Fluke TS52 Pro in my toolbox, it’s got a built in voltmeter, waterproof, backlit display and lots of memory storage options.

It’s a really beautiful bit of kit, (only thing missing is a TDR which the next model up has) but I very rarely use it.

The butt set is in my mind the quintessential piece of test gear, and I’ve got a few from various time periods.

The Telecom Ruggabut was launched in 1994/1995 and was standard issue prior to privatization, and was designed in Australia for the Australian market,

As such it features some uniquely Australian features such as detection for 12kHz Subscriber Pulse Metering (used in Payphones), while the “TLB” Button is tone-loop-break, a 100ms pause in dialling,

Prior to the Ruggabutt there was the Versadial and Versadial Mk2. Lightweight, tough and with a handy RJ12 jack for testing subscriber handpieces, these were made in huge numbers, by PMG and then Telecom.

And as far back as my “collection” goes is the Australian Post Office (APO) Telephone Test Handset No. 4, which lives on the Step exchange in my office, and is a simple rotary dial plus speaker, mic, switch and Neon light to denote ringing.

SIP Hold – With RFC6337

I had a discussion with a friend the other day about if hold is signified with a=sendonly or a=recvonly, which led me to revisiting the RFC to confirm, so here’s an overview of how “Call Hold” works in SIP.

By the Book

According to RFC 6337 a user can hold calls by sending a new SDP offer in an established session (Re-INVITE on active call), with an SDP payload of a=sendonly for each media stream the user want’s to hold.

The SIP Switch / PBX / UAS replies with an updated SDP where each media stream’s SDP contains a=recvonly.

So it’s both, depending on which leg you’re looking at.

In Common Practice

When a UAC puts a call in hold, it does so by sending a SIP re-INVITE, updating the SDP to include the attribute line “sendonly”

See the bottom line of the SDP is a=sendonly ? That’s denoting the call is to be put on hold,

If the call hold was sucesful the UAS sends back a 200 Ok, with the SDP attribute set to recvonly

The a=recvonly denotes the call has been held.

To retrieve the call another SIP re-invite is sent by the UAC, this time setting the media attribute back to sendrecv

If sucesful a 200 OK is sent by the UAS with the a=sendrecv also set.

FreeSWITCH WebRTC with sipML5

Most people think of SIP when it comes to FreeSWITCH, Asterisk and Kamailio, but all three support WebRTC.

FreeSWITCH makes WebRTC fairly easy to use and treats it much the same way as any SIP endpoint, in terms of registration and diaplan.

Setting up the SIP Profile

On the SIP profile we’ll need to activate WebRTC you’ll need to ensure a few lines of config are present:

    <!-- for sip over secure websocket support -->
    <!-- You need wss.pem in $${certs_dir} for wss or one will be created for you -->
    <param name="wss-binding" value=":7443"/>

Next you’ll need to restart FreeSWITCH and a self-signed certificate should get loaded,

Once you’ve restarted FreeSWITCH will fail to detect any WebSocket certificate and generate a self signed certificate for you. This means that you can verify FreeSWITCH is listening as expected using Curl:

curl https://yourhostname:7443 -vvv

You should see an error regarding the connection failing due to an invalid certificate, if so, great! Let’s put in a valid certificate.

If not double check the firewall on your server allow traffic to port TCP 7443,

Loading your TLS Certificate

WebRTC & websocket are recent standards – this means a valid TLS certificate is mandatory. So to get this to work you’ll need a valid SSL certificate.

Let’sEncrypt should work fine, if you’ve got your own CA that’s in the trusted CA list on your machine that’ll do, or I’m using a cert I generated with Mkcert.

When we restarted FreeSWITCH after adding the wss-binding config a certificate was automatically generated in the $${certs_dir} of FreeSWITCH,

You can verify where the certs_dir is by echoing out the variable in FreeSWITCH:

fs_cli -x 'eval $${certs_dir}'

Unless you’ve changed it you’ll probably find your certs in /etc/freeswitch/tls/

The certificate and private key are stored in a single file, with the Certificate and the Private Key appended to the end,

In my case the certificate is called “webrtc.pem” and the private key file is “webrtc-key.pem”,

I’ll need to start by replacing the contents of the current certificate/ key file wss.pem with the certificate I’ve got webrtc.pem, and then appending the private key – webrtc-key.pem to the end of wss.pem,

cat /home/nick/webrtc.pem > /etc/freeswitch/tls/wss.pem
cat /home/nick/webrtc-key.pem >> /etc/freeswitch/tls/wss.pem

Next up I’ll restart FreeSWITCH, and run Curl again to verify this time the certificate is valid:

curl https://yourhostname:7443 -vvv

All going well no certificate error will be reported and we can setup our WebRTC client.

Configuring sipML5

Dubango Telecom’s sipML5 is a BSD licenced HTML5 SIP client,

I’ll use the demo version on their website to connect to my FreeSWITCH WebRTC server, which you can run in your browser from here,

We’ll start by clicking the “Export Mode” button to set our wss:// URL;

Change the WebSocket Server URL to the URL of your FreeSWITCH instance (you must use a domain, not an IP Address)

If you’re running behind a NAT adding ICE servers is probably a good idea, although this will slow down connection times, you can use Google’s public STUN server by pasting in the below value:

[{ url: 'stun:stun.l.google.com:19302'}]

Finally we’ll save those settings and return back to the main tab,

You’ll need to register with a username and password that’s valid on the FreeSWITCH box, in my case I’m using 1000 with the password 1000 (exists by default),

Replace webrtc with the domain name of your FreeSWITCH instance,

Finally you should be able to click Login and see Connected above,

Then we can make calls to endpoints on FreeSWITCH using the dial box;

The Debug console in your browser will provide all the info you need to debug any issues, and you can trace WebSocket traffic using Sofia like any other SIP traffic.

Hopefully this was useful to you – I’ll cover more of WebRTC on Asterisk and also Kamailio in later posts!

Kamailio Bytes – Gotchas with Kamailio as an Asterisk Load Balancer

How do I make Kamailio work with Asterisk?

It’s a seemingly simple question, the answer to which is – however you want, sorry if that’s not a simple answer.

I’ve talked about the strengths and weaknesses of Kamailio and Asterisk in my post Kamailio vs Asterisk, so how about we use them to work together?

The State of Play

So before we go into the nitty gritty, let’s imagine we’ve got an Asterisk box with a call queue with Alice and Bob in it, set to ring those users if they’re not already on a call.

Each time a call comes in, Asterisk looks at who in the queue is not already on a call, and rings their phone.

Now let’s imagine we’re facing a scenario where the single Asterisk box we’ve got is struggling, and we want to add a second to share the load.

We do it, and now we’ve got two Asterisk boxes and a Kamailio load balancer to split the traffic between the two boxes.

Now each time a call comes in, Kamailio sends the SIP INVITE to one of the two Asterisk boxes, and when it does, that Asterisk box looks at who is in the queue and not already on a call, and then rings their phone.

Let’s imagine a scenario where a Alice & Bob are both on calls on Asterisk box A, and another call comes in this call is routed to Asterisk box B. Asterisk box B looks at who is in the queue and who is already on a call, the problem is Alice and Bob are on calls on Asterisk box A, so Asterisk box B doesn’t know they’re both on a call and tries to ring them.

We have a problem.

Scaling stateful apps is a real headache,

So have a good long hard think about how to handle these issues before going down this path!

FreeSWITCH + ESL = Programmable Voice

No great secret, I’m a big Python fan.

Recently I’ve been working on a few projects with FreeSWITCH, and looking at options for programmatically generating dialplans, instead of static XML files.

Why not Static XML?

So let’s say I define a static XML dialplan.

It works great, but if I want to change the way a call routes I need to do it from the dialplan,

That’s not ideal if you’re using a distributed cluster of FreeSWITCH instances, or if you want to update on the fly.

Static XML means we have to define our dialplan when setting up the server, and would have to reconfigure the server to change it.

So what about mod_xml_curl?

When I’ve done this in the past I’ve relied on the mod_xml_curl module.

mod_xml_curl gets the XML dialplan using Curl from a web server, and so you setup a web server using Flask/PHP/etc, and dynamically generate the dialplan when the call comes in.

This sounds good, except you can’t update it on the fly.

mod_xml_curl means call routing decisions are made at the start of the call, and can’t be changed midway through the call.

So what’s ESL?

ESL is the Event Socket Library, essentially a call comes in, an ESL request is made to an external server.

For each step in the dialplan, an ESL request will be sent to the external server which tells it to do,

ESL allows us to use all FreeSWITCH’s fantastic modules, without being limited as to having to perform the call routing logic in FreeSWITCH.

So how do I use ESL?

You’ll need two create an ESL server,

Luckily there’s premade examples for popular languages;

Once you’ve got a basic server defined we’ll need to put some config in our XML Dialplan to say “transfer all your thinking to ESL!”;

<!-- Send everything that's numbers to ESL for Processing --> 
    <extension name="esl_route">
      <condition field="destination_number" expression="^\d*$">
        <action application="socket" data="10.0.1.252:5000"/>
        <action application="log" data="ERR Made it past ESL - Play error."/>
        <action application="playback" data="ivr/ivr-call_cannot_be_completed_as_dialed"/>
      </condition>
    </extension>

In the above example my ESL server is on 10.0.1.252 / port 5000.

Now any calls coming in will be transfered to ESL and where it goes from there, is something you define in your prefered programming language.

In the next post I’ll cover how I’ve been addressing this using Python and Greenswitch.

Kamailio Bytes – UAC – Authenticate Outbound Calls

Sometimes you need Kamailio to serve as a User Agent Client, we covered using UAC to send SIP REGISTER messages and respond with the authentication info, but if you find you’re getting 401 or 407 responses back when sending an INVITE, you’ll need to use the UAC module, specifically the uac_auth() to authenticate the INVITE,

When Kamailio relays an INVITE to a destination, typically any replies / responses that are part of that dialog will go back to the originator using the Via headers.

This would be fine except if the originator doesn’t know the user name and password requested by the carrier, but Kamailio does,

Instead what we need Kamailio to do is if the response to the INVITE is a 401 Unauthorised Response, or a 407 Proxy Authentication required, intercept the request, generate the response to the authentication challenge, and send it to the carrier.

To do this we’ll need to use the UAC module in Kamailio and set some basic params:

loadmodule "uac.so"
modparam("uac", "reg_contact_addr", "10.0.1.252:5060")
modparam("uac", "reg_db_url", DBURL)
modparam("uac","auth_username_avp","$avp(auser)")
modparam("uac","auth_password_avp","$avp(apass)")
modparam("uac","auth_realm_avp","$avp(arealm)")

Next up when we relay the INVITE (using the Transaction module because we need the response to be transaction stateful).

Before we can call the t_relay() command, we need to specify a failure route, to be called if a negative response code comes back, we’ll use one called TRUNKAUTH and tell the transaction module that’s the one we’ll use by adding t_on_failure(“TRUNKAUTH”);

	$du = "sip:sip.nickvsnetworking.com:5060";
	if(is_method("INVITE")) {
		t_on_failure("TRUNKAUTH");
		t_relay();
		exit;
	   }

What we’ve done is specified to rewrite the destination URI to sip.nickvsnetworking.com, if the request type is an INVITE, it’ll load a failure route called TRUNKAUTH and proxy the request with the transaction module to sip.nickvsnetworking.com.

What we get is a 401 response back from our imaginary carrier, and included in it is a www-auth header for authentication.

To catch this we’ll create an on failure route named “TRUNKAUTH”

failure_route[TRUNKAUTH] {
    xlog("trunk auth");
    }

We’ll make sure the transaction hasn’t been cancelled, and if it has bail out (no point processing subsequent requests on a cancelled dialog).

failure_route[TRUNKAUTH] {
    xlog("trunk auth");
    if (t_is_canceled()) {
        exit;
    }

And determine if the response code is a 401 Unauthorised Response, or a 407 Proxy Authentication required (Authentication requests from our upstream carrier):

failure_route[TRUNKAUTH] {
    xlog("trunk auth");
    if (t_is_canceled()) {
        exit;
    }
	xlog("Checking status code");
    if(t_check_status("401|407")) {
	xlog("status code is valid auth challenge");
    }
}

Next we’ll define the username and password we want to call upon for this challenge, and generate an authentication response based on these values using the uac_auth() command,

failure_route[TRUNKAUTH] {
    xlog("trunk auth");
    if (t_is_canceled()) {
        exit;
    }
	xlog("Checking status code");
    if(t_check_status("401|407")) {
	xlog("status code is valid auth challenge");
        $avp(auser) = "test";
        $avp(apass) = "test";
        uac_auth();

    }
}

Then finally we’ll relay that back to the carrier with our www-auth header populated with the challenge response;

failure_route[TRUNKAUTH] {
    xlog("trunk auth");
    if (t_is_canceled()) {
        exit;
    }
	xlog("Checking status code");
    if(t_check_status("401|407")) {
	xlog("status code is valid auth challenge");
        $avp(auser) = "test";
        $avp(apass) = "test";
        uac_auth();
	xlog("after uac_auth");
        t_relay();
        exit;
    }
}

And done!

We can get this data from the UAC database so we don’t need to load these values directly into our config file too using the SQLops module.

As always I’ve put the running code example on my GitHub.

Diameter and SIP: Registration-Termination-Request / Answer

These posts focus on the use of Diameter and SIP in an IMS / VoLTE context, however these practices can be equally applied to other networks.

The Registration-Termination Request / Answer allow a Diameter Client (S-CSCF) to indicate to the HSS (Diameter Server) that it is no longer serving that user and the registration has been terminated.

Basics:

The RFC’s definition is actually pretty succinct as to the function of the Server-Assignment Request/Answer:

The Registration-Termination-Request is sent by a Diameter Multimedia server to a Diameter Multimedia client in order to request the de-registration of a user.

Reference: TS 29.229

The Registration-Termination-Request commands are sent by a S-CSCF to indicate to the Diameter server that it is no longer serving a specific subscriber, and therefore this subscriber is now unregistered.

There are a variety of reasons for this, such as PERMANENT_TERMINATION, NEW_SIP_SERVER_ASSIGNED and SIP_SERVER_CHANGE.

The Diameter Server (HSS) will typically send the Diameter Client (S-CSCF) a Registration-Termination-Answer in response to indicate it has updated it’s internal database and will no longer consider the user to be registered at that S-CSCF.

Packet Capture

I’ve included a packet capture of these Diameter Commands from my lab network which you can find below.

Other Diameter Cx (IMS) Calls

User-Authorization-Request / User-Authorization-Answer
Server-Assignment-Request / Server-Assignment-Answer
Location-Info-Request / Location-Info-Answer
Multimedia-Auth-Request / Multimedia-Auth-Answer
Registration-Termination-Request / Registration-Termination-Answer
Push-Profile-Request / Push-Profile-Answer

References:

3GPP Specification #: 29.229

RFC 4740 – Diameter Session Initiation Protocol (SIP) Application

Diameter-User-Authorization-Request-Command-Code-300-Packet-Capture

Diameter and SIP: User-Authorization-Request/Answer

These posts focus on the use of Diameter and SIP in an IMS / VoLTE context, however these practices can be equally applied to other networks.

The Diameter User-Authorization-Request and User-Authorization-Answer commands are used as the first line of authorization of a user and to determine which Serving-CSCF to forward a request to.

Basics

When a SIP Proxy (I-CSCF) receives an incoming SIP REGISTER request, it sends a User-Authorization-Request to a Diameter server to confirm if the user exists on the network, and which S-CSCF to forward the request to.

When the Diameter server receives the User-Authorization-Request it looks at the User-Name (1) AVP to determine if the Domain / Realm is served by the Diameter server and the User specified exists.

Assuming the user & domain are valid, the Diameter server sends back a User-Authorization-Answer, containing a Server-Capabilities (603) AVP with the Server-Name of the S-CSCF the user will be served by.

I always find looking at the packets puts everything in context, so here’s a packet capture of both the User-Authorization-Request and the User-Authorization-Answer.

First Registration

If this is the first time this Username / Domain combination (Referred to in the RFC as an AOR – Address of Record) is seen by the Diameter server in the User-Authorization-Request it will allocate a S-CSCF address for the subscriber to use from it’s pool / internal logic.

The Diameter server will store the S-CSCF it allocated to that Username / Domain combination (AoR) for subsequent requests to ensure they’re routed to the same S-CSCF.

The Diameter server indicates this is the first time it’s seen it by adding the DIAMETER_FIRST_REGISTRATION (2001) AVP to the User-Authorization-Answer.

Subsequent Registration

If the Diameter server receives another User-Authorization-Request for the same Username / Domain (AoR) it has served before, the Diameter server returns the same S-CSCF address as it did in the first User-Authorization-Answer.

It indicates this is a subsequent registration in much the same way the first registration is indicated, by adding an DIAMETER_SUBSEQUENT_REGISTRATION (2002) AVP to the User-Authorization-Answer.

User-Authorization-Type (623) AVP

An optional User-Authorization-Type (623) AVP is available to indicate the reason for the User-Authorization-Request. The possible values / reasons are:

  • Creating / Updating / Renewing a SIP Registration (REGISTRATION (0))
  • Establishing Server Capabilities & Registering (CAPABILITIES (2))
  • Terminating a SIP Registration (DEREGISTRATION (1))

If the User-Authorization-Type is set to DEREGISTRATION (1) then the Diameter server returns the S-CSCF address in the User-Authorization-Answer and then removes the S-SCSF address it had associated with the AoR from it’s own records.

Other Diameter Cx (IMS) Calls

User-Authorization-Request / User-Authorization-Answer
Server-Assignment-Request / Server-Assignment-Answer
Location-Info-Request / Location-Info-Answer
Multimedia-Auth-Request / Multimedia-Auth-Answer
Registration-Termination-Request / Registration-Termination-Answer
Push-Profile-Request / Push-Profile-Answer

References:

3GPP Specification #: 29.229

RFC 4740 – Diameter Session Initiation Protocol (SIP) Application