All posts by Nick

About Nick

Dialtone.

Open5GS without NAT

While most users of Open5GS EPC will use NAT on the UPF / P-GW-U but you don’t have to.

While you can do NAT on the machine that hosts the PGW-U / UPF, you may find you want to do the NAT somewhere else in the network, like on a router, or something specifically for CG-NAT, or you may want to provide public addresses to your UEs, either way the default config assumes you want NAT, and in this post, we’ll cover setting up Open5GS EPC / 5GC without NAT on the P-GW-U / UPF.

Before we get started on that, let’s keep in mind what’s going to happen if we don’t have NAT in place,

Traffic originating from users on our network (UEs / Subscribers) will have the from IP Address set to that of the UE IP Pool set on the SMF / P-GW-C, or statically in our HSS.

This will be the IP address that’s sent as the IP Source for all traffic from the UE if we don’t have NAT enabled in our Core, so all external networks will see that as the IP Address for our UEs / Subscribers.

The above example shows the flow of a packet from UE with IP Address 10.145.0.1 sending something to 1.1.1.1.

This is all well and good for traffic originating from our 4G/5G network, but what about traffic destined to our 4G/5G core?

Well, the traffic path is backwards. This means that our router, and external networks, need to know how to reach the subnet containing our UEs. This means we’ve got to add static routes to point to the IP Address of the UPF / P-GW-U, so it can encapsulate the traffic and get the GTP encapsulated traffic to the UE / Subscriber.

For our example packet destined for 1.1.1.1, as that is a globally routable IP (Not an internal IP) the router will need to perform NAT Translation, but for internal traffic within the network (On the router) the static route on the router should be able to route traffic to the UE Subnets to the UPF / P-GW-U’s IP Address, so it can encapsulate the traffic and get the GTP encapsulated traffic to the UE / Subscriber.

Setting up static routes on your router is going to be different on what you use, in my case I’m using a Mikrotik in my lab, so here’s a screenshot from that showing the static route point at my UPF/P-GW-U. I’ve got BGP setup to share routes around, so all the neighboring routers will also have this information about how to reach the subscriber.

Next up we’ve got to setup IPtables on the server itself running our UPF/P-GW-U, to route traffic addressed to the UE and encapsulate it.

sudo ip route add 10.145.0.0/24 dev ogstun
sudo echo 1 > /proc/sys/net/ipv4/ip_forward
sudo iptables -A FORWARD -i ogstun -o osgtun -s 10.145.0.0/24 -d 0.0.0.0/0 -j ACCEPT

And that’s it, now traffic coming from UEs on our UPF/P-GW will leave the NIC with their source address set to the UE Address, and so long as your router is happily configured with those static routes, you’ll be set.

If you want access to the Internet, it then just becomes a matter of configuring traffic from that subnet on the router to be NATed out your external interface on the router, rather than performing the NAT on the machine.

In an upcoming post we’ll look at doing this with OSPF and BGP, so you don’t need to statically assign routes in your routers.

ENUM in Practice with Kamailio

In our last post we covered the theory behind ENUM and its use, and in this post we’ll cover setting up Kamailio to query an ENUM server.

Before we start, if you’re not familiar with ENUM, check out my primer on the topic here, and then for more in-depth info check out this post on configuring an ENUM server in Bind and the ways records can be configured,

So once we’ve got an ENUM server configured and confirmed we can query it and get the results we want using Dig, we can configure Kamailio.

But before we get to the Kamailio side, a word on how Kamailio handles DNS,
Kamailio doesn’t have the ability to set a DNS server, instead it uses the system DNS server details,
This means your system will need to use the DNS server we want to query for ENUM for all DNS traffic, not just for Kamailio. This means you may need to setup Recursion to still be able to query DNS records for the outside world.

To add support to Kamailio, we’ll need to load the enum module (enum.so),

In terms of parameters, all we’ll set is the doman_suffix, which is, as it sounds, the domain suffix used in the DNS queries. If you’re using a different domain for your ENUM it’d need to be reflected here.

 modparam("enum", "domain_suffix", "e164.arpa.")

Next up inside our minimalist dialplan we’ll just add enum_query(); to query the SIP URI,

if(is_method("INVITE")){
         enum_query();
         xlog("Ran ENUM query");
         xlog("To URI $tU");
         forward();
}

Obviously in production you’d want to add more sanity checks and error handling, but with this, sending a SIP INVITE to Kamailio with an E.164 number in the SIP URI user part, will lead to an ENUM query resolving this, and routing the traffic to it,

An example PCAP above, showing a call to +61355500912, the resulting ENUM query and routing to server1.enum.nickvsnetworking.com,

By using Kamailio’s transaction module ™ as a stateful proxy, we can run queries again if one were to fail,

Copy of example config is available here,

Diameter – Insert Subscriber Data Request / Response

While we’ve covered the Update Location Request / Response, where an MME is able to request subscriber data from the HSS, what about updating a subscriber’s profile when they’re already attached? If we’re just relying on the Update Location Request / Response dialog, the update to the subscriber’s profile would only happen when they re-attach.

We need a mechanism where the HSS can send the Request and the MME can send the response.

This is what the Insert Subscriber Data Request/Response is used for.

Let's imagine we want to allow a subscriber to access an additional APN, or change an AMBR values of an existing APN;

We'd send an Insert Subscriber Data Request from the HSS, to the MME, with the Subscription Data AVP populated with the additional APN the subscriber can now access.

Beyond just updating the Subscription Data, the Insert Subscriber Data Request/Response has a few other funky uses.

Through it the HSS can request the EPS Location information of a Subscriber, down to the TAC / eNB ID serving that subscriber. It’s not the same thing as the GMLC interfaces used for locating subscribers, but will wake Idle UEs to get their current serving eNB, if the Current Location Request is set in the IDR Flags.

But the most common use for the Insert-Subscriber-Data request is to modify the Subscription Profile, contained in the Subscription-Data AVP,

If the All-APN-Configurations-Included-Indicator is set in the AVP info, then all the existing AVPs will be replaced, if it’s not then everything specified is just updated.

The Insert Subscriber Data Request/Response is a bit novel compared to other S6a requests, in this case it’s initiated by the HSS to the MME (Like the Cancel Location Request), and used to update an existing value.

PS Data Off

Imagine a not-too distant future, one without flying cars – just one where 2G and 3G networks have been switched off.

And the imagine a teenage phone user, who has almost run out of their prepaid mobile data allocation, and so has switched mobile data off, or a roaming scenario where the user doesn’t want to get stung by an unexpectedly large bill.

In 2G/3G networks the Circuit Switched (Voice & SMS) traffic was separate to the Packet Switched (Mobile Data).

This allowed users to turn of mobile data (GPRS/HSDPA), etc, but still be able to receive phone calls and send SMS, etc.

With LTE, everything is packet switched, so turning off Mobile Data would cut off VoLTE connectivity, meaning users wouldn’t be able to make/recieve calls or SMS.

In 3GPP Release 14 (2017) 3GPP introduced the PS Data Off feature.

This feature is primarily implemented on the UE side, and simply blocks uplink user traffic from the UE, while leaving other background IP services, such as IMS/VoLTE and MMS, to continue working, even if mobile data is switched off.

The UE can signal to the core it is turning off PS Data, but it’s not required to, so as such from a core perspective you may not know if your subscriber has PS Data off or not – The default APN is still active and in the implementations I’ve tried, it still responds to ICMP Pings.

IMS Registration stays in place, SMS and MMS still work, just the UE just drops the requests from the applications on the device (In this case I’m testing with an Android device).

What’s interesting about this is that a user may still find themselves consuming data, even if data services are turned off. A good example of this would be push notifications, which are sent to the phone (Downlink data). The push notification will make it to the UE (or at least the TCP SYN), after all downlink services are not blocked, however the response (for example the SYN-ACK for TCP) will not be sent. Most TCP stacks when ignored, try again, so you’ll find that even if you have PS Data off, you may still use some of your downlink data allowance, although not much.

The SIM EF 3GPPPSDATAOFF defines the services allowed to continue flowing when PS Data is off, and the 3GPPPSDATAOFFservicelist EF lists which IMS services are allowed when PS Data is off.

Usually at this point, I’d include a packet capture and break down the flow of how this all looks in signaling, but when I run this in my lab, I can’t differentiate between a PS Data Off on the UE and just a regular bearer idle timeout… So have an irritating blinking screenshot instead…

Remember Bash History Forever

History in Bash is a huge time saver.

That beautifully crafted sed command you put together to replace the contents of something a few months ago? Just search through your Bash history and there it is.

Previously I’d been grepping the output of history to find what I was looking for, and now I’ve fallen in love with the search feature, but by default, many Linux distros limit the number of lines in the Bash history to 2,000. If you’re a regular Linux user, this isn’t cutting the mustard.

By default the bashrc file that ships with Ubuntu is limited to 2,000 lines or 1MB,

We can change all this very easily, by editing the ~/.bashrc file (Bash shell script), upping the limit of entries we keep. While you’re at it adding HISTTIMEFORMAT allows you to timestamp the commands you’re running, and the PROMPT_COMMAND below also writes immediately, so you won’t get lost data or missing stuff that you’ve just run in another terminal.

Example contents of ~/.bashrc:

export HISTSIZE=100000
export HISTFILESIZE=200000
export HISTTIMEFORMAT='%d/%m/%y %T '
PROMPT_COMMAND="history -a; $PROMPT_COMMAND"

And you can apply the changes with:

source ~/.bashrc

The Anti-Digit Dialing League

In 1962 Pacific Telephone and Telegraph announced that it would remove exchange names,

Up until this point in the US telephone numbers had been prefixed with the exchange name, as a one or two letter code, which would be used in place of the digits when you dialed,

For example if you were on the GArfield exchange (GA) you’d give your number as GA 1234 or GArfield 1234, to dial this the GA would just be converted into numbers based on the dial, so GA = 42 1234.

The Bell system had wanted to do away with this for a long time – it’s inflexibility meant digits that spelled out the prefix of common place names were filled up, while others were almost unused, and was not conducive to the growth patterns of telephone systems. Letters alone limited the dialing plan to 540 combinations for the area code, for 186 million Americans at the time, while moving to all-numbers opened up for use the 0 and 1 positions on the dial (which don’t have letters associated with them), expanding the pool.

The North American Numbering Plan (NANP) had been divided by AT&T in the 1940s and from 1951 onwards was being rolled out across the bell system, so it shouldn’t have come as any great surprise that in May of 1962 Pacific Telephone and Telegraph, like many other Bell system companies, made the announcement instead of exchange names, there would be a 3 digit exchange code / area code, followed by 4 more digits for the local subscriber, what it called “All-number dialing”.

This is where our story would end if it weren’t for some outcry of locals regarding the loss of their beloved exchange codes. Letters to the editor of local newspapers led to polling by the San Francisco Chronicle revealing two-thirds of their readers opposed to all-number dialing, which led to one man – Carl V May, taking out an advertisement in the the local newspapers with a simple one line statement and address,

Join the Anti-Digit Dialing League

P.O. Box 996, Sausalito, Calif

The ad received over 3,500 responses, and a sizable following for the group sprang up practically overnight, united in their opposition to the loss of the exchange letters and the “creeping numeralism” being pushed upon them.

These people are systematically trying to destroy the use of memory. They tell you to ‘write it down,’ not memorize it. Try writing a telephone number down in a dark booth while groping for a pencil, searching in an obsolete phone book and gasping for breath. And all this in the name of efficiency ! Engineers have a terrible intellectual weakness. ‘If it fits the machine,’ they say, ‘then it ought to fit people.’ This is something that bothers me very much: absentmindedness about people.

S. I. Hayakawa

To be clear, automation and the removal of switchboard operators for local calls (Direct Digit Dialing (DDD)) (“Subscriber Trunk Dialing” or “STD” as it’s known in the UK and Australia) had happened already, so this wasn’t about people losing their jobs, but rather Citizens wanting to keep the letters of the places their dialing.
Nor were phone numbers themselves changing due to All-Digit-Dialing, if your number was GA 1234 you’d still dial 42 1234 to get there, it would just be printed as 42 1234 instead of GA 1234 in the phone books.

A steady stream of telephone customers–“mainly from the Valley,” said a Times account of the local hearings–complained that ANC was dehumanizing, violated tradition, eliminated a sense of community, increased dialing errors, made phone numbers more difficult to remember and ran up phone bills, because people no longer knew where they were calling.

ADDL’s support continued to grow, badges appeared and a legal challenge was mounted against the phone company to prevent this, and a restraining order was issued to halt the project, and the Public Utilities Commission had to go through 3,200 pages of testimony from hearings in Los Angeles and San Francisco on the impact of the All-Number-Calling system.

The 25 cent lapel pin available for members of the ADDL

Comedian Alan Sherman wrote a song called “The Let’s All Call Up A.T & T And Protest To The President March” on his 1963 album “My Son, The Celebrity”, which hasn’t aged well…

But progress marched on, the restraining order was quashed and by 1964 NANP rolled on, and all-digit dialing continued to be rolled out across the rest of North America.

And as quickly as it appeared, the ADDL was gone.

NANP continued and phone numbers were changed and expanded several times since then, but never with resistance as strong as that of the ADDL.

While researching this it reminded me of Reply All episode #104 The Case of the Phantom Caller.

Further Reading

The Farmville herald., January 04, 1963, Page 1B, Image 9

The Chronicle., August 28, 1963, Page ELEVEN, Image 11

The Day of the Digits: Postscript to the Prefix War

Docker & BIND as an ENUM Playground

In the last we covered what ENUM is and how it works, so to take this into a more practical example, I thought I’d share the details of the ENUM server I’ve setup in my lab, and the Docker container I’ve bundled it into.

Inside the Docker container we’ll be running Bind – this post won’t teach you much about Bind, there’s already lots of good information on it elsewhere, but we will cover the parameters involved in setting up ENUM records (NAPTR) for E.164 addresses.

Getting the Environment up and Running

First we’ll need to setup our environment, I’ve published the images for the container to Dockerhub, but we’ll build it from the Dockerfile so you can edit the files and rebuild as you play around:

git clone https://github.com/nickvsnetworking/ENUM_Playground
cd ENUM_Playground
docker build --pull --rm -f "Dockerfile" -t enum:latest "."

systemd-resolve on Ubuntu binds to port 53 by default, which can lead to some headaches, so we’ll create a new network in Docker for this to run in, so it doesn’t conflict with anything else you may be running:

sudo docker network create --subnet=172.30.0.0/26 enum_playground

And now we’ll run the ENUM container in the enum_playground network and with the IP 172.30.0.2,

docker run -d --rm --name=enum --net=enum_playground --ip=172.30.0.2 enum

Ok, that’s the environment setup, let’s run some queries!

E.164 to SIP URI Resolution with ENUM

In our last post we covered the basics of formatting an E.164 number and querying a DNS server to get it’s call routing information.

Again we’re going to use Dig to query this information. In reality ENUM queries would be run by an endpoint, or software like FreeSWITCH or Kamailio (Spoiler alert, posts on ENUM handling in those coming later), but as we’re just playing Dig will work fine.

So let’s start by querying a single E.164 address, +61355500911

First we’ll reverse it and put full stops / periods between the numbers, to get 1.1.9.0.0.5.5.5.3.1.6

Next we’ll add the e164.arpa prefix, which is the global prefix for ENUM addresses, and presto, that’s what we’ll query – 1.1.9.0.0.5.5.5.3.1.6.e164.arpa

Lastly we’ll feed this into a Dig query against the IP of our container and of type NAPTR,

dig @172.30.0.2 -t naptr 1.1.9.0.0.5.5.5.3.1.6.e164.arpa

So what did you get back?

Well, if everything is working your output should look something like the output I’ve got below,

NAPTR results for queried ENUM Address

So how do we interpret this? Well let’s break it down,

The first part is the domain we queried, simple enough in this case,

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Next up is the TTL or expiry, in this case it’s 3600 seconds (1 hour), shorter periods allow for changes to propagate / be reflected more quickly but at the expense of more load as results can’t be cached for as long. The class (IN) represents Internet, which is the only class commonly used, even on internal systems.

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Then we have the type of record returned, in our case it’s a NAPTR record,

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

After that is the Order, this defines the order in which the rules are to be parsed. Lower numbers are processed first, if no matches then the next lowest, and so on until the highest number is reached, we’ll touch on this in more detail later in this post,

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

The Pref is the processing preference. This is very handy for load balancing, as we can split traffic between hosts with different preferences. We’ll cover this later in this post too.

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

The Flags represent the type of record we’re going to get, for most ENUM traffic this is going to be set to U, to denote a SIP URI with Regex, while the Service value we’ll be looking for will be “E2U+sip” service to identify SIP URIs to route calls to, but could be other values like Email addresses, IM Addresses or PSTN numbers, to be parsed by other applications.

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Lastly we’ve got the Regex part. Again not going to cover Regex as a whole, just the DNS particulars.

Everything between the first and second ! denotes what we’re searching for, while everything from the second ! to the last ! denotes what we replace it with.

In the below example that means we’re matching ^.* which means starting with (^) any character (.) zero or more times (*), which gets replaced with sip:[email protected],

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

How should this be treated?

For the first example, a call to the E.164 address of 61355500912 will be first formatted into a domain as per the ENUM requirements (1.1.9.0.0.5.5.5.3.1.6.e164.arpa) and then queried as a NAPTR record against the DNS server,

1.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Only a single record has been returned so we don’t need to worry about the Order or Preference, and the Regex matches anything and replaces it with the resulting SIP URI of sip:[email protected], which is where we’ll send our INVITE.

Under the Hood

Inside the Repo we cloned earlier, if you open the e164.arpa.db file, things will look somewhat familiar,

The record we just queried is the first example in the Bind config file,

; E.164 Address +61355500911 - Simple no replacement (Resolves all traffic to sip:[email protected])
1.1.9.0.0.5.5.5.3.1.6 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

The config file is just the domain, class, type, order, preference, flags, service and regex.

Astute readers may have noticed the trailing . which where we can put a replacement domain if Regex is not used, but it cannot be used in conjunction with Regex, so for all our work it’ll just be a single trailing . on each line.

You can (and probably should) change the values in the e164.arpa.db file as we go along to try everything out, you’ll just need to rebuild the container and restart it each time you make a change.

This post is going to focus on Bind, but the majority of modern DNS servers support NAPTR records, so you can use them for ENUM as well, for example I manage the DNS for this site thorough Cloudflare, and I’ve put a screenshot below of an example private ENUM address I’ve added into it.

Setting up a NAPTR record in Cloudflare DNS

Preference to Split Traffic between Servers

So with a firm understanding of a single record being returned, let’s look at how we can use ENUM to cleverly route traffic to multiple hosts.

If we have a pool of servers we may wish to evenly distribute all traffic across them, so that’s how E.164 address +61355500912 is setup – to route traffic evenly (50/50) across two servers.

Querying it with Dig provides the following result:

dig @172.30.0.2 -t naptr 2.1.9.0.0.5.5.5.3.1.6.e164.arpa
;; ANSWER SECTION:
2.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" . 2.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

So as the order value (10) is the same for both records, we can ignore it – there isn’t one value lower than the other.

We can see both records have a preference of 100, in practice, this means they each get 50% of the traffic. The formula for traffic distribution is pretty simple, each server gets the value of it’s preference, divided by the total of all the preferences,

So for server1 it’s preference is 100 and the total of all the preferences combined is 200, so it gets 100/200, which is equivalent to one half aka 50%.

We might have a scenario where we have 3 servers, but one is significantly more powerful than the others, so let’s look at giving more traffic to one server and less to others, this example gets a little more complex but should cement your understanding of how the preference works;

dig @172.30.0.2 -t naptr 3.1.9.0.0.5.5.5.3.1.6.e164.arpa
3.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" . 3.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 200 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .
3.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

So now 3 servers, again none have a lower order than the other, it’s set to 10 for them all so we can ignore the order,

Next we can see the total of all the priority values is 400,

Server 2 has a priority of 100 so it gets 100/400 total priority, or a quarter of all traffic. Server 1 has the same value, so also gets a quarter of all traffic,

Server 3 however has a priority of 200 so it gets 200/400, or to simplify half of all traffic.

The Bind config for this is:

; E.164 Address +61355500913 - More complex load balance between 3 hosts (25% server1, 25% server2, 50% server3)
3.1.9.0.0.5.5.5.3.1.6 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" . 3.1.9.0.0.5.5.5.3.1.6 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .
3.1.9.0.0.5.5.5.3.1.6 IN NAPTR 10 200 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Order for Failover

Primarily the purpose of the order is to enable wildcard routes (as we’ll see later) to be overwritten by more specific routes, but a secondary use in some implementations use Order as a way to list the preferences of the SIP URIs to route to. For example we could have two servers, one a primary and the other a standby, with the standby only to be used only if the primary SIP URI was not responding.

E.164 number +61355500914 is setup to return two SIP URIs,

dig @172.30.0.2 -t naptr 4.1.9.0.0.5.5.5.3.1.6.e164.arpa
;; ANSWER SECTION:
4.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" . 4.1.9.0.0.5.5.5.3.1.6.e164.arpa. 3600 IN NAPTR  20 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Our DNS client will first use the SIP URI sip:[email protected] as it has the lower order value (10), and if that fails, can try the entry with the next lowest order-value (20) which would be sip:[email protected].

The Bind config for this is:

; E.164 Address +61355500914 - Order example returning multiple SIP URIs to try for failover
4.1.9.0.0.5.5.5.3.1.6 IN NAPTR 10 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" . 4.1.9.0.0.5.5.5.3.1.6 IN NAPTR 20 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Wildcards

If we have a 1,000 number block, having to add 1000 individual records can be very tedious. Instead we can use wildcard matching (thanks to the fact we’ve reversed the E.164 address) to match ranges. For example if we have E.164 numbers from +61255501000 to +61255501999 we can add a wildcard entry to match the +61255501x prefix,

I’ve set this up already so let’s lookup the E.164 number +6125501234,

dig @172.30.0.2 -t naptr 4.3.2.1.0.5.5.5.2.1.6.e164.arpa
;; ANSWER SECTION:
4.3.2.1.0.5.5.5.2.1.6.e164.arpa. 3600 IN NAPTR  50 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

If you look up any other number starting with +6125501 you’ll get the same result, and here’s the Bind config for it:

; Wildcard E.164 Address +61255501* - Wildcard example for all destinations starting with E.164 prefix +61255501x to single destination (sip:[email protected])
; For example E.164 number +6125501234 will resolve to sip:[email protected]
*.1.0.5.5.5.2.1.6 IN NAPTR 100 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

The catch with this is they’re all pointing at the same SIP URI, so we can’t treat the calls differently based on the called number – This is where the Regex magic comes in.

We can use group matching to match a group and fill it in the dialed number into the SIP Request URI, for example:

!(^.*$)!sip:+1\[email protected]!

Will match the E.164 number requested and put it inside sip:[email protected]

The +61255502xxx prefix is setup for this, so if we query +61255502000 (or any other number between +61255502000 and +61255502999) we’ll get the regex query in the resulting record.

Keep in mind DNS doesn’t actually apply the Regex transformation, just shares it, and the client applies the transformation.

dig @172.30.0.2 -t naptr 0.0.0.2.0.5.5.5.2.1.6.e164.arpa
;; ANSWER SECTION:
0.0.0.2.0.5.5.5.2.1.6.e164.arpa. 3600 IN NAPTR  100 100 "u" "E2U+sip" "!(^.*$)!sip:+1\[email protected]!" .

And the corresponding Bind config:

; Wildcard example for all destinations starting with E.164 prefix +61255502x to regex filled destination
; For example a request to 61255502000 will return sip:[email protected])
*.2.0.5.5.5.2.1.6 IN NAPTR 100 100 "u" "E2U+sip" "!(^.*$)!sip:+1\\[email protected]!" .

One last thing to keep in mind, is that Wildcard priorities are of any length.
This means +612555021 would match as well as +6125550299999999999999. Typically terminating switches drop any superfluous digits, and NU those that are too short, but keep this in mind, that length is not taken into account.

Wildcard Priorities

So with our wildcards in place, what if we wanted to add an exception, for example one number in our 61255502xxx block of numbers gets ported to another carrier and needs to be routed elsewhere?

Easy, we just add another entry for that number being more specific and with a lower order than the wildcard, which is what’s setup for E.164 number +61255502345,

dig @172.30.0.2 -t naptr 5.4.3.2.0.5.5.5.2.1.6.e164.arpa
;; ANSWER SECTION:
5.4.3.2.0.5.5.5.2.1.6.e164.arpa. 3600 IN NAPTR  50 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

Which does not return the same result as the others that match the wildcard,

Bind config:

; Wildcard example for all destinations starting with E.164 prefix +61255502x to regex filled destination
; For example a request to +61255502000 will return sip:[email protected])
*.2.0.5.5.5.2.1.6 IN NAPTR 100 100 "u" "E2U+sip" "!(^.*$)!sip:+1\\[email protected]!" .

; More specific example with lower order than +6125550x wildcard for E.164 address +61255502345 will return sip:[email protected]
5.4.3.2.0.5.5.5.2.1.6 IN NAPTR 50 100 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .

We can combine all of the tricks we’ve covered here, from statically defined entries, wildcards, regex replacement, multiple entries with multiple orders and preferences, to create really complex routing, using only DNS.

Summary & Next Steps

So by now hopefully you’ve got a fair understanding of how NAPTR and DNS work together to translate E.164 addresses into SIP URIs,

Of course being able to do this manually with Dig and comprehend how it’ll route is only one part of the picture, in the next posts we’ll cover using Kamailio and FreeSWITCH to query ENUM routing information and route traffic to it,

ENUM – DNS based Call Routing

DNS is commonly used for resolving domain names to IP Addresses, and is often described as being like “the phone book of the Internet”.

So what’s the phone book of phone books?

The answer, is (kind of) DNS. With the aid of E.164 number to URI mapping (ENUM), DNS can be used to resolve phone numbers into SIP URIs to route the traffic to.

So what is ENUM?

ENUM allows us to bypass the need for a central switch for routing calls to numbers, and instead, through a DNS lookup, resolve a phone number into a reachable SIP URI that is the ultimate destination for the traffic.

Imagine you want to call a company, you dial the phone number for that company, your phone does a DNS query against the phone number, which returns the SIP URI of the company’s PBX, and your phone sends the SIP INVITE directly to the company’s PBX, with no intermediary party carrying the call.

3GPP have specified ENUM as the prefered mechanism for resolving phone numbers into SIP addresses, and while it’s widespread adoption on the public Internet is still in its early days (See my post on The Sad story of ENUM in Australia) it is increasingly common in IMS networks and inside operator networks.

IETF defined RFC 6116 for “The E.164 to Uniform Resource Identifiers (URI) Dynamic Delegation Discovery System (DDDS) Application (ENUM)”, which defines how the system works.

So how does ENUM actually work?

ENUM allow us to lookup a phone number on a DNS server and find the SIP URI a server that will handle traffic for the phone number, but it’s a bit more complicated than the A or AAAA records you’d use to resolve a website, ENUM relies on NAPTR records.

Let’s look at the steps involved in taking an E.164 number and knowing where to send it.

Step 1 – Reverse the Numbers

We read phone numbers from left to right.

This is because historically the switch needs to get all the long-distance routing sorted first. The switch has to route your call to the exchange that serves that subscriber, which is what all the area codes and prefixes assigned to areas are all about (Throwback to SZU for any old Telco buffs).

For an E.164 number you’ve got a Country Code, Area Code and then the Subscriber Number. The number gets more specific as it goes along.

But getting more specific as you go along is the opposite how how DNS works, millions of domains share the .com suffix, and the unique / specific part is the bits before that.

So the first step in the ENUM process is to reverse the phone number, so let’s take phone number (03) 5550 0912, which in E.164 is +61 3 5550 0912.

As the spaces in the phone numbers are there for the humans, we’ll drop all of them and reverse the number, as DNS is more specific right-to-left, so we end up with

2.1.9.0.0.5.5.5.3.1.6

Step 2 – Add the Suffix

The ITU ENUM specifies the suffix e164.arpa be assigned for public ENUM entries. Private ENUM deployments may use their own suffix, but to make life simple I’m going to use e164.arpa as if it were public.

So we’ll append the e164.arpa domain onto our reversed and formatted E.164 phone number:

2.1.9.0.0.5.5.5.3.1.6.e164.arpa

Step 3 – Query it

Next we’ll run a Naming Authority Pointer (NAPTR) query against the domain, to get back a list of records for that number.

DNS is a big topic, and NAPTR and SRV takes up a good chunk of it, but what you need to know is that by using NAPTR we’re not limited to just a single response, we could have a weighted pool of servers handling traffic for this phone number, and be able to control load through the use of NAPTR, amongst other things.

Of course, if our phone can query the public NAPTR records, then so can anyone else, so we can just use a tool like Dig to query the record ourselves,

dig @10.0.1.252 -t naptr 2.1.9.0.0.5.5.5.3.1.6.e164.arpa

In the answers section I’ve setup this DNS server to only return a single response, with the regex SIP URI to use, in my case that’s sip:[email protected]

You’ll obviously need to replace the DNS server with your DNS server, and the query with the reversed and formatted version of the E.164 number you wish to query.

Step 4 – Send SIP traffic

After looking at the NAPTR records returned and using the weight and priority to determine which server/s to send to first, our phone forwards an INVITE to the URI returned in the NAPTR record.

How to interpret the returned results?

The first thing to keep in mind when working with ENUM is multiple records being returned is supported, and even encouraged.

NAPTR results return 7 fields, which define how it should be handled.

The host part is fairly obvious, and defines the host / DNS entry we’re talking about.

The Service defines what type of service this is. ENUM can be expanded beyond just voice, for example you may want to also return an email address or IM address as well as a SIP Address on an ENUM query, which you can do. By default voice uses the “E2U+sip” service to identify SIP URIs to route calls to, so in this context that’s what we’re interested in, but keep in mind there are other types out there,

Example ENUM query against a phone number showing other types of services (Email & Web)

The Order simply defines the order in which the rules are to be parsed. Lower numbers are processed first, if no matches then the next lowest, and so on until the highest number is reached.

The Pref is the processing preference. For load balancing 50/50 between two sites say a Melbourne and Sydney site, we’d return two results, with the same Order, and the same Pref, would see traffic split 50/50 between the two sites.
We could split this further, a Pref value of 10 for Melbourne, 10 for Sydney, 5 for Brisbane and 5 for Perth would see 33% of calls route to Melbourne, 33% of calls route to Sydney, 16.5% of calls route to Brisbane and 16.5% of calls route to Perth.
This is because we’d have a total preference value of 30, and the individual preference for each entry would work out as the fraction of the total (ie Pref 10 out of 30 = 10/30 or 33.3%).

The Flags denote the type of record we’re going to get, for most ENUM traffic this is going to be set to U, to denote a SIP URI with Regex.

The regexp field contains our SIP URI in the form of a Regular expression, which can include pattern matching and replacement. This is most commonly used to fill in the phone number into the SIP URI, for example instead of hardcoding the phone number into the response, we could use a Regular expression to fill in the requested number into the SIP URI.

!(^.*$)!sip:+1\[email protected]!

ENUM sounds great, how do I get it?

Here’s the tricky part.

If you’re looking to implement ENUM for an internal network, great, I’ll have some more posts here over the next few weeks covering off configuration of a DNS server to support ENUM lookups, and using Kamailio to lookup ENUM routes.

In terms of public ENUM, while many carriers are using ENUM inside their networks, public adoption of ENUM in most markets has been slow, for a number of reasons.

Many incumbent operators have been reluctant to embrace public ENUM as their role as an operator would be relegated to that of a Domain registrar.
Additionally, there’s real security risks involved in moving to ENUM – opening your phone system up to the world to accept inbound calls from anywhere. This could lead to DOS-style attacks of flooding phone numbers with automatically generated traffic, privacy risks and even less validation in terms of caller ID trust.

RIPE maintains the EnumData.org website listing the status of ENUM for each country / region.

Looking inside the MMS Exchange (With call flow and PCAP)

So you want to send an MMS?

We’ve covered SMS in the past, but MMS is a different kettle of fish.

Let’s look at how the call flow goes, when Bob wants to send a picture to Alice.

Before Bob sends the MMS, his phone will have to be setup with the correct settings to send MMS.
Sometimes this is done manually, for others it’s done through the Carrier provisioning SMS that preloads the settings, and for others it’s baked in based on the Android Carrier settings XML,

APN settings for Telstra in Australia for MMS

It’s made up of the APN to send MMS traffic over, the MMSC address (Multimedia Message Switching Center) and often an MMS proxy and port combination for where the traffic will actually go.

Message Flow – Bob to MMSC (Mobile Originated MMS)

Bob opens his phone, creates a new message to Alice, selects the picture (or other multimedia filetype) to send to her and hits the send button.

For starters, MMS has a file size limit, like MTU it’s not advertised, so you don’t know if you’ve hit it, so rather like MTU is a “lowest has the highest success of getting through” rule. So Bob’s phone will most likely scale the image down to fit inside 300K.

Next Bob’s phone knows it has an MMS to send, for this is opens up a new bearer on the MMS APN, typically called MMS, but configured in the phone by Bob.

Why use a separate APN for sending 300K of MMS traffic?
Once upon a time mobile data was expensive.
By having a separate APN just for MMS traffic (An APN that could do nothing except send / receive MMS) allowed easier billing / tariffing of data, as MMS traffic was sent over a APN which was unmetered.

After the bearer is setup on the MMS APN, Bob’s phone begins crafting a HTTP 1.1 Post to be sent to the MMSC.
The content type of this request will be application/vnd.wap.mms-message and the body of the HTTP post will be made up of MMS Message Encapsulation, with the body containing the picture he wants to send to Alice.

Note: Historically Wireless Session Protocol (WSP) was used in lieu of HTTP. These clients would now need a WAP gateway to translate into HTTP.

This HTTP Post is then sent to the MMSC Address, or, if present, the MMSC Proxy address.
This traffic is sent over the MMS APN that we just brought up.

HTTP POST Headers for the MO MMS Message

MMS Message Encapsulation from MO MMS Message

The MMSC receives this information, and then, if all was successful, responds with a 200 OK,

200 OK response to MO MMS Message

So now the MMSC has the information from Bob, let’s flip over to Alice.

Message Flow – MMSC to Alice (Mobile Terminated MMS)

For the purposes of simplicity, we’re going to rule out the MMSC from doing clever things like converting the media, accepting email (SMPP) as MMS, etc, etc. Instead we’re going to assume Alice and Bob are on the same Network, and our MMSC is just doing store-and-forward.

The MMSC will look at the To address in the MMS Message Encapsulation of the request Bob sent, to determine that this message is destined for Alice.

The MMSC will load the media content (photo) sent by Bob destined for Alice and serve it via HTTP. The MMSC generates a random URL to serve it this particular file on, with each MMS the MMSC handles being assigned a random URL containing the media content.

Next the MMSC will need to tell Alice’s phone, that she has an MMS waiting for her. This is done by generating an SMS to send to Alice’s phone,

The user-data of this SMS is the Wireless Session Protocol with the method PUSH – Aka WAP Push.

SMS alerting the user of an MMS waiting for delivery

This specially encoded SMS is parsed by the Alice’s phone, which tells the her there is an MMS message waiting for her.

On some operating systems this is pulled automatically, on others, users need to select “Download” to actually get the file.

The UE then just runs an HTTP get to the address in the X-Mms-Content-Location: Header to pull the multimedia content that Bob sent.

HTTP GET from Alice’s Phone / UE to retrieve MMS sent by Bob (MT-MMS)

All going well the URL is valid and Alice’s phone retrieves the message, getting a 200 OK back from the server with the message content.

HTTP Response (200 OK) for MT-MMS, sent by the MMSC to Alice’s phone with the MMS Body

So now Alice’s phone has the MMS content and renders it on the screen, Alice can see the Photo Bob sent her.

Lastly Alice’s phone sends a HTTP POST again to the MMSC, this time indicating the message status is “Retrieved”,

And to close everything off the MMSC confirms receipt of the Retrieved status with a 200 OK, and we are done.

What didn’t we cover?

So that’s a basic MMS message flow, but there’s a few parts we didn’t cover.

The overall architecture beyond just the store-and forward behaviour, charging and authentication we didn’t cover. So let’s look at each of these points.

Overall Architecture

What we just covered what what’s defined as the MM1 interface.

There’s obviously a stack of other interfaces, such as for charging, messaging between MMSC/Carriers, subscriber locating / user database, etc.

Charging

MMSCs would typically have a connection to trigger charging events / credit-control events prior to processing the message.

For online charging the Ro interface can be used, as you would for IMS charging events.

3GPP 3GPP TS 32.270 covers the charging architecture for online/offline charging for MMS.

Authentication

Unfortunately authentication was a bit of an afterthought for the MMS standard, and can be done several different ways.

The most common is to correlate the IP Address on the MMS APN against a subscriber.

Telephony binary-coded decimal (TBCD) in Python with Examples

Chances are if you’re reading this, you’re trying to work out what Telephony Binary-Coded Decimal encoding is. I got you.

Again I found myself staring at encoding trying to guess how it worked, reading references that looped into other references, in this case I was encoding MSISDN AVPs in Diameter.

How to Encode a number using Telephony Binary-Coded Decimal encoding?

First, Group all the numbers into pairs, and reverse each pair.

So a phone number of 123456, becomes:

214365

Because 1 & 2 are swapped to become 21, 3 & 4 are swapped to become 34, 5 & 6 become 65, that’s how we get that result.

TBCD Encoding of numbers with an Odd Length?

If we’ve got an odd-number of digits, we add an F on the end and still flip the digits,

For example 789, we add the F to the end to pad it to an even length, and then flip each pair of digits, so it becomes:

87F9

That’s the abbreviated version of it. If you’re only encoding numbers that’s all you’ll need to know.

Detail Overload

Because the numbers 0-9 can be encoded using only 4 bits, the need for a whole 8 bit byte to store this information is considered excessive.

For example 1 represented as a binary 8-bit byte would be 00000001, while 9 would be 00001001, so even with our largest number, the first 4 bits would always going to be 0000 – we’d only use half the available space.

So TBCD encoding stores two numbers in each Byte (1 number in the first 4 bits, one number in the second 4 bits).

To go back to our previous example, 1 represented as a binary 4-bit word would be 0001, while 9 would be 1001. These are then swapped and concatenated, so the number 19 becomes 1001 0001 which is hex 0x91.

Let’s do another example, 82, so 8 represented as a 4-bit word is 1000 and 2 as a 4-bit word is 0010. We then swap the order and concatenate to get 00101000 which is hex 0x28 from our inputted 82.

Final example will be a 3 digit number, 123. As we saw earlier we’ll add an F to the end for padding, and then encode as we would any other number,

F is encoded as 1111.

1 becomes 0001, 2 becomes 0010, 3 becomes 0011 and F becomes 1111. Reverse each pair and concatenate 00100001 11110011 or hex 0x21 0xF3.

Special Symbols (#, * and friends)

Because TBCD Encoding was designed for use in Telephony networks, the # and * symbols are also present, as they are on a telephone keypad.

Astute readers may have noticed that so far we’ve covered 0-9 and F, which still doesn’t use all the available space in the 4 bit area.

The extended DTMF keys of A, B & C are also valid in TBCD (The D key was sacrificed to get the F in).

Symbol4 Bit Word
*1 0 1 0
#1 0 1 1
a1 1 0 0
b1 1 0 1
c1 1 1 0

So let’s run through some more examples,

*21 is an odd length, so we’ll slap an F on the end (*21F), and then encoded each pair of values into bytes, so * becomes 1010, 2 becomes 0010. Swap them and concatenate for our first byte of 00101010 (Hex 0x2A). F our second byte 1F, 1 becomes 0001 and F becomes 1111. Swap and concatenate to get 11110001 (Hex 0xF1). So *21 becomes 0x2A 0xF1.

And as promised, some Python code from PyHSS that does it for you:

    def TBCD_special_chars(self, input):
        if input == "*":
            return "1010"
        elif input == "#":
            return "1011"
        elif input == "a":
            return "1100"
        elif input == "b":
            return "1101"
        elif input == "c":
            return "1100"      
        else:
            print("input " + str(input) + " is not a special char, converting to bin ")
            return ("{:04b}".format(int(input)))


    def TBCD_encode(self, input):
        print("TBCD_encode input value is " + str(input))
        offset = 0
        output = ''
        matches = ['*', '#', 'a', 'b', 'c']
        while offset < len(input):
            if len(input[offset:offset+2]) == 2:
                bit = input[offset:offset+2]    #Get two digits at a time
                bit = bit[::-1]                 #Reverse them
                #Check if *, #, a, b or c
                if any(x in bit for x in matches):
                    new_bit = ''
                    new_bit = new_bit + str(TBCD_special_chars(bit[0]))
                    new_bit = new_bit + str(TBCD_special_chars(bit[1]))    
                    bit = str(int(new_bit, 2))
                output = output + bit
                offset = offset + 2
            else:
                bit = "f" + str(input[offset:offset+2])
                output = output + bit
                print("TBCD_encode output value is " + str(output))
                return output
    

    def TBCD_decode(self, input):
        print("TBCD_decode Input value is " + str(input))
        offset = 0
        output = ''
        while offset < len(input):
            if "f" not in input[offset:offset+2]:
                bit = input[offset:offset+2]    #Get two digits at a time
                bit = bit[::-1]                 #Reverse them
                output = output + bit
                offset = offset + 2
            else:   #If f in bit strip it
                bit = input[offset:offset+2]
                output = output + bit[1]
                print("TBCD_decode output value is " + str(output))
                return output

The PLMN Problem for Private LTE / 5G

So it’s the not to distant future and the pundits vision of private LTE and 5G Networks was proved correct, and private networks are plentiful.

But what PLMN do they use?

The PLMN (Public Land Mobile Network) ID is made up of a Mobile Country Code + Mobile Network Code. MCCs are 3 digits and MNCs are 2-3 digits. It’s how your phone knows to connect to a tower belonging to your carrier, and not one of their competitors.

For example in Australia (Mobile Country Code 505) the three operators each have their own MCC. Telstra as the first licenced Mobile Network were assigned 505/01, Optus got 505/02 and VHA / TPG got 505/03.

Each carrier was assigned a PLMN when they started operating their network. But the problem is, there’s not much space in this range.

The PLMN can be thought of as the SSID in WiFi terms, but with a restriction as to the size of the pool available for PLMNs, we’re facing an IPv4 exhaustion problem from the start if we’re facing an explosion of growth in the space.

Let’s look at some ways this could be approached.

Everyone gets a PLMN

If every private network were to be assigned a PLMN, we’d very quickly run out of space in the range. Best case you’ve got 3 digits, so only space for 1,000 networks.

In certain countries this might work, but in other areas these PLMNs may get gobbled up fast, and when they do, there’s no more. New operators will be locked out of the market.

Loaner PLMNs

Carriers already have their own PLMNs, they’ve been using for years, some kit vendors have been assigned their own as well.

If you’re buying a private network from an existing carrier, they may permit you to use their PLMN,

Or if you’re buying kit from an existing vendor you may be able to use their PLMN too.

But what happens then if you want to move to a different kit vendor or another service provider? Do you have to rebuild your towers, reconfigure your SIMs?

Are you contractually allowed to continue using the PLMN of a third party like a hardware vendor, even if you’re no longer purchasing hardware from them? What happens if they change their mind and no longer want others to use their PLMN?

Everyone uses 999 / 99

The ITU have tried to preempt this problem by reallocating 999/99 for use in Private Networks.

The problem here is if you’ve got multiple private networks in close proximity, especially if you’re using CBRS or in close proximity to other networks, you may find your devices attempting to attach to another network with the same PLMN but that isn’t part of your network,

Mobile Country or Geographical Area Codes
Note from TSB
Following the agreement on the Appendix to Recommendation ITU-T E.212 on “shared E.212 MCC 999 for internal use within a private network” at the closing plenary of ITU-T SG2 meeting of 4 to 13 July 2018, upon the advice of ITU-T Study Group 2, the Director of TSB has assigned the Mobile Country Code (MCC) “999” for internal use within a private network. 

Mobile Network Codes (MNCs) under this MCC are not subject to assignment and therefore may not be globally unique. No interaction with ITU is required for using a MNC value under this MCC for internal use within a private network. Any MNC value under this MCC used in a network has
significance only within that network. 

The MNCs under this MCC are not routable between networks. The MNCs under this MCC shall not be used for roaming. For purposes of testing and examples using this MCC, it is encouraged to use MNC value 99 or 999. MNCs under this MCC cannot be used outside of the network for which they apply. MNCs under this MCC may be 2- or 3-digit.

(Recommendation ITU-T E.212 (09/2016))

The Crystal Ball?

My bet is we’ll see the ITU allocate an MCC – or a range of MCCs – for private networks, allowing for a pool of PLMNs to use.

When deploying networks, Private network operators can try and pick something that’s not in use at the area from a pool of a few thousand options.

The major problem here is that there still won’t be an easy way to identify the operator of a particular network; the SPN is local only to the SIM and the Network Name is only present in the NAS messaging on an attach, and only after authentication.

If you’ve got a problem network, there’s no easy way to identify who’s operating it.

But as eSIMs become more prevalent and BIP / RFM on SIMs will hopefully allow operators to shift PLMNs without too much headache.

How UEs get Time in LTE

You may have noticed in the settings on your phone the time source can be set to “Network”, but what does this actually entail and how is this information transferred?

The answer is actually quite simple,

In the NAS PDU of the Downlink NAS Transport message from the MME to the UE, is the Time Zone & Time field, which contains (unsuprisingly) the Timezone and Time.

Time is provided in UTC form with the current Timezone to show the offset.

This means that in the configuration for each TAC on your MME, you have to make sure that the eNBs in that TAC have the Timezone set for the location of the cells in that TAC, which is especially important when working across timezones.

There is no parameter for the date/time when Daylight savings time may change. But as soon as a UE goes Idle and then comes out of Idle mode, it’ll be given the updated timezone information, and during handovers the network time is also provided.
This means if you were using your phone at the moment when DST begins / ends you’d only see the updated time once the UE toggles into/out of Idle mode, or when performing a tracking-area update.

Diameter Agents

Let’s take a look at each of the common Diameter agent variants in use today:

Diameter Relay Agent / Diameter Routing Agent (DRA)

This is the simplest of the Diameter agents, but also probably the most common. The Diameter Relay agent does not look at the contents of the AVPs, it just routes messages based on the Application ID or Destination realm.

A Diameter Relay Agent does not change any AVPs except routing AVPs.

DRAs are transaction aware, but not dialog aware. This means they know if the Diameter request made it to the destination, but have no tracking of getting a response.

DRAs are common as a central hub for all Diameter hub in a network. This allows for a star topology where every Diameter service connects to a central DRA (typically two DRAs for redundancy) for a central place to manage Diameter routing, instead of having to do a full-mesh topology, which would be a nightmare on larger networks.

I recently wrote about creating a simple but unstable DRA with Kamailio.

Diameter Edge Agent

A Diameter Edge Agent is a special DRA that sits on the border between two networks and acts as a gateway between them.

Imagine a roaming exchange scenario, where each operator has to expose their core Diameter servers or DRAs to all the other operators they have roaming agreements with. Like we saw with the DRA to do a full-mesh style connection arrangement would be a mess, and wouldn’t allow internal changes inside the network without significant headaches.

Instead by putting a Diameter Edge Agent at the edge of the network, the operators who wish to access our Diameter information for roaming, only need to connect to a single point, and we can change whatever we like on the inside of the network, adding and removing servers, without having to update our roaming information (IR 21).

We can also strictly enforce security policies on rate limits and admission control, centrally, for all connections in from other operators.

Diameter Proxy Agent

The Diameter Proxy Agent does everything a DRA does, and more!

The Diameter Proxy Agent is application aware, meaning it can decode the AVPs and make decisions based upon the contents of the AVPs. It’s also able to edit / add / delete AVPs and Sub-AVPs.

These are useful for interconnect scenarios where you might need to re-write the value of an AVP, or translate a realm etc, on a Diameter request/response journey.

Diameter Translation Agent

Diameter Translation agents are used for translating between protocols, for example Diameter into MAP for GSM authentication, or into HTTP for 5G authentication.

For 5GC a new network element – the “Binding Support Function” (BSF) is introduced to translate between HTTP for 5G and Diameter for LTE, however this can be thought of as another Diameter Translation Agent.

SCTP Parameter Tuning

There’s a lot to like about SCTP. No head of line blocking, MTU issues, sequenced, acknowledged delivery of messages, not to mention Multi-Homing and message bundling.

But if you really want to get the most bang for your buck, you’ll need to tune your SCTP parameters to match the network conditions.

While tuning the parameters per-association would be time consuming, most SCTP stacks allow you to set templates for SCTP parameters, for example you would have a different set of parameters for the SCTP stacks inside your network, compared to SCTP stacks for say a roaming scenario or across microwave links.

IETF kindly provides a table with their recommended starting values for SCTP parameter tuning:

RTO.Initial3 seconds
RTO.Min1 second
RTO.Max60 seconds
Max.Burst4
RTO.Alpha1/8
RTO.Beta1/4
Valid.Cookie.Life60 seconds
Association.Max.Retrans10 attempts
Path.Max.Retrans5 attempts (per destination address)
Max.Init.Retransmits8 attempts
HB.interval30 seconds
HB.Max.Burst1
IETF – RFC4960: SCTP – Suggested Protocol Parameter Values

But by adjusting the Max Retrans and Retransmission Timeout (RTO) values, we can detect failures on the network more quickly, and reduce the number of packets we’ll loose should we have a failure.

We begin with the engineered round-trip time (RTT) – that is made up of the time it takes to traverse the link, processing time for the remote SCTP stack and time for the response to traverse the link again. For the examples below we’ll take an imaginary engineered RTT of 200ms.

RTO.min is the minimum retransmission timeout.
If this value is set too low then before the other side has had time to receive the request, process it and send a response, we’ve already retransmitted it.

This should be set to the round trip delay plus processing needed to send and acknowledge a packet plus some allowance for variability due to jitter; a value of 1.15 times the Engineered RTT is often chosen

So for us, 200 * 1.15 = 230ms RTO.min value.

RTO.max is the maximum amount of time we should wait before transmitting a request.
Typically three times the Engineered RTT.

So for us, 200 * 3 = 600ms RTO.min value.

Path.Max.Retransmissions is the maximum number of retransmissions to be sent down a path before the path is considered to be failed.
For example if we loose a transmission path on a multi-homed server, how many retransmissions along that path should we send until we consider it to be down?

Values set are dependant on if you’re multi-homing or not (you can be more picky if you are) and the level of acceptable packet loss in your transmission link.

Typical values are 4 Retransmissions (per destination address) for a Single-Homed association, and 2 Retransmissions (per destination address) for a Multi-Homed association.

Association.Max.Retransmissions is the maximum number of retransmissions for an association. If a transmission link in a multi-homed SCTP scenario were to go down, we would pass the Path.Max.Retransmissions value and the SCTP stack would stop sending traffic out that path, and try another, but what if the remote side is down? In that scenario all our paths would fail, so we need another counter – Path.Max.Retransmissions to count the total number of retransmissions to an association / destination. When the Association.Max.Retransmissions is reached the association is considered down.

In practice this value would be the number of paths, multiplied by the Path.Max.Retransmissions.

Australia’s fake Phone Numbers

For TV, Books, Movies, etc, ACMA have allocated a series of fake phone numbers that can be used, that will not connect,

Check out this great Tom Scott video on the topic of fake phone numbers, for the why.

In Australia they are:

  • Premium Rate Number 1900 654 321
  • Central East (covering NSW and ACT) / (02) 5550 xxxx and (02) 7010 xxxx
  • South East (covering VIC and TAS) / (03) 5550 xxxx and (03) 7010 xxxx
  • North East (covering QLD) / (07) 5550 xxxx and (07) 7010 xxxx
  • Central West (covering SA, WA and NT) / (08) 5550 xxxx and (08) 7010 xxxx
  • Some (but not all) mobile starting with 0491 570 xxx
  • Some (but not all) Freephone / Local rate starting with 1800 975 xxx

More info on ACMA Website

IMS Routing with iFCs

SIP routing is complicated, there’s edge cases, traffic that can be switched locally and other traffic that needs to be proxied off to another Proxy or Application server. How can you define these rules and logic in a flexible way, that allows these rules to be distributed out to multiple different network elements and adjusted on a per-subscriber basis?

Enter iFCs – The Initial Filter Criteria.

iFCs are XML encoded rules to define which servers should handle traffic matching a set of rules.

Let’s look at some example rules we might want to handle through iFCs:

  • Send all SIP NOTIFY, SUBSCRIBE and PUBLISH requests to a presence server
  • Any Mobile Originated SMS to an SMSc
  • Calls to a specific destination to a MGC
  • Route any SIP INVITE requests with video codecs present to a VC bridge
  • Send calls to Subscribers who aren’t registered to a Voicemail server
  • Use 3rd party registration to alert a server that a Subscriber has registered

All of these can be defined and executed through iFCs, so let’s take a look,

iFC Structure

iFCs are encoded in XML and typically contained in the Cx-user-data AVP presented in a Cx Server Assignment Answer response.

Let’s take a look at an example iFC and then break down the details as to what we’re specifying.

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeCNF>1</ConditionTypeCNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

Each rule in an iFC is made up of a Priority, TriggerPoint and ApplicationServer.

So for starters we’ll look at the Priority tag.
The Priority tag allows us to have multiple-tiers of priority and multiple levels of matching,
For example if we had traffic matching the conditions outlined in this rule (TriggerPoint) but also matching another rule with a lower priority, the lower priority rule would take precedence.

Inside our <TriggerPoint> tag contains the specifics of the rules and how the rules will be joined / matched, which is what we’ll focus on predominantly, and is followed by the <ApplicationServer> which is where we will route the traffic to if the TriggerPoint is matched / triggered.

So let’s look a bit more about what’s going on inside the TriggerPoint.

Each TriggerPoint is made up of Service Point Trigger (SPTs) which are individual rules that are matched or not matched, that are either combined as logical AND or logical OR statements when evaluated.

By using fairly simple building blocks of SPTs we can create a complex set of rules by joining them together.

Service Point Triggers (SPTs)

Let’s take a closer look at what goes on in an SPT.
Below is a simple SPT that will match all SIP requests using the SIP MESSAGE method request type:

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>

So as you may have guessed, the <Method> tag inside the SPT defines what SIP request method we’re going to match.

But Method is only one example of the matching mechanism we can use, but we can also match on other attributes, such as Request URI, SIP Header, Session Case (Mobile Originated vs Mobile Terminated) and Session Description such as SDP.

Or an example of a SPT for anything Originating from the Subscriber utilizing the <SessionCase> tag inside the SPT.

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <SessionCase>0</SessionCase>
        </SPT>

Below is another SPT that’s matching any requests where the request URI is sip:[email protected] by setting the <RequestURI> tag inside the SPT:

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <RequestURI>sip:[email protected]</RequestURI>
        </SPT>

We can match SIP headers, either looking for the existence of a header or the value it is set too,

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <SIPHeader>
              <Header>To</Header>
              <Content>"Nick"</Content>
            </SIPHeader>
        </SPT>

Having <Header> will match if the header is present, while the optional Content tag can be used to match

In terms of the Content this is matched using Regular Expressions, but in this case, not so regular regular expressions. 3GPP selected Extended Regular Expressions (ERE) to be used (IEEE POSIX) which are similar to the de facto standard PCRE Regex, but with a few fewer parameters.

Condition Negated

The <ConditionNegated> tag inside the SPT allows us to do an inverse match.

In short it will match anything other than what is specified in the SPT.

For example if we wanted to match any SIP Methods other than MESSAGE, setting <ConditionNegated>1</ConditionNegated> would do just that, as shown below:

        <SPT>
            <ConditionNegated>1</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>

And another example of ConditionNegated in use, this time we’re matching anything where the Request URI is not sip:[email protected]:

        <SPT>
            <ConditionNegated>1</ConditionNegated>
            <Group>0</Group>
            <RequestURI>sip:[email protected]</RequestURI>
        </SPT>

Finally the <Group> tag allows us to group together a group of rules for the purpose of evaluating.
We’ll go into it more in in the below section.

ConditionTypeCNF / ConditionTypeDNF

As we touched on earlier, <TriggerPoints> contain all the SPTs, but also, very importantly, specify how they will be interpreted.

SPTs can be joined in AND or OR conditions.

For some scenarios we may want to match where METHOD is MESSAGE and RequestURI is sip:[email protected], which is different to matching where the METHOD is MESSAGE or RequestURI is sip:[email protected].

This behaviour is set by the presence of one of the ConditionTypeCNF (Conjunctive Normal Form) or ConditionTypeDNF (Disjunctive Normal Form) tags.

If each SPT has a unique number in the GroupTag and ConditionTypeCNF is set then we evaluate as AND.

If each SPT has a unique number in the GroupTag and ConditionTypeDNF is set then we evaluate as OR.

Let’s look at how the below rule is evaluated as AND as ConditionTypeCNF is set:

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeCNF>1</ConditionTypeCNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

This means we will match if the method is MESSAGE and Session Case is 0 (Mobile Originated) as each SPT is in a different Group which leads to “and” behaviour.

If we were to flip to ConditionTypeDNF each of the SPTs are evaluated as OR.

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeDNF>1</ConditionTypeDNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

This means we will match if the method is MESSAGE and Session Case is 0 (Mobile Originated).

Where this gets a little bit more complex is when we have multiple entries in the same Group tag.

Let’s say we have a trigger point made up of:

<SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
<SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

<SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

How would this be evaluated?

If we use ConditionTypeDNF every SPT inside the same Group are matched as AND, and SPTs with distinct are matched as OR.

Let’s look at our example rule evaluated as ConditionTypeDNF:

<ConditionTypeDNF>1</ConditionTypeDNF>
  <SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
  <SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

  <SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

This means the two entries in Group 1 are evaluated as AND – So Method is message and Session Case is 0, OR the header “P-Some-Header” is present.

Let’s do another one, this time as ConditionTypeCNF:

<ConditionTypeCNF>1</ConditionTypeCNF>
  <SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
  <SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

  <SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

This means the two entries in Group 1 are evaluated as OR – So Method is message OR Session Case is 0, AND the header “P-Some-Header” is present.

Pre-5G Network Slicing

Network Slicing, is a new 5G Technology. Or is it?

Pre 3GPP Release 16 the capability to “Slice” a network already existed, in fact the functionality was introduced way back at the advent of GPRS, so what is so new about 5G’s Network Slicing?

Network Slice: A logical network that provides specific network capabilities and network characteristics

3GPP TS 123 501 / 3 Definitions and Abbreviations

Let’s look at the old and the new ways, of slicing up networks, pre release 16, on LTE, UMTS and GSM.

Old Ways: APN Separation

The APN or “Access Point Name” is used so the SGSN / MME knows which gateway to that subscriber’s traffic should be terminated on when setting up the session.

APN separation is used heavily by MVNOs where the MVNO operates their own P-GW / GGSN.
This allows the MNVO can handle their own rating / billing / subscriber management when it comes to data.
A network operator just needs to setup their SGSN / MME to point all requests to setup a bearer on the MVNO’s APN to the MNVO’s gateways, and presoto, it’s no longer their problem.

Later as customers wanted MPLS solutions extended over mobile (Typically LTE), MNOs were able to offer “private APNs”.
An enterprise could be allocated an APN by the MNO that would ensure traffic on that APN would be routed into the enterprise’s MPLS VRF.
The MNO handles the P-GW / GGSN side of things, adding the APN configuration onto it and ensuring the traffic on that APN is routed into the enterprise’s VRF.

Different QCI values can be assigned to each APN, to allow some to have higher priority than others, but by slicing at an APN level you lock all traffic to those QoS characteristics (Typically mobile devices only support one primary APN used for routing all traffic), and don’t have the flexibility to steer which networks which traffic from a subscriber goes to.

It’s not really practical for everyone to have their own APNs, due in part to the namespace limitations, the architecture of how this is usually done limits this, and the simple fact of everyone having to populate an APN unique to them would be a real headache.

5G replaces APNs with “DNNs” – Data Network Names, but the functionality is otherwise the same.

In Summary:
APN separation slices all traffic from a subscriber using a special APN and provide a bearer with QoS/QCI values set for that APN, but does not allow granular slicing of individual traffic flows, it’s an all-or-nothing approach and all traffic in the APN is treated equally.

The old Ways: Dedicated Bearers

Dedicated bearers allow traffic matching a set rule to be provided a lower QCI value than the default bearer. This allows certain traffic to/from a UE to use GBR or Non-GBR bearers for traffic matching the rule.

The rule itself is known as a “TFT” (Traffic Flow Template) and is made up of a 5 value Tuple consisting of IP Source, IP Destination, Source Port, Destination Port & Protocol Number. Both the UE and core network need to be aware of these TFTs, so the traffic matching the TFT can get the QCI allocated to it.

This can be done a variety of different ways, in LTE this ranges from rules defined in a PCRF or an external interface like those of an IMS network using the Rx interface to request a dedicated bearers matching the specified TFTs via the PCRF.

Unlike with 5G network slicing, dedicated bearers still traverse the same network elements, the same MME, S-GW & P-GW is used for this traffic. This means you can’t “locally break out” certain traffic.

In Summary:
Dedicated bearers allow you to treat certain traffic to/from subscribers with different precedence & priority, but the traffic still takes the same path to it’s ultimate destination.

Old Ways: MOCN

Multi-Operator Core Network (MOCN) allows multiple MNOs to share the same active (tower) infrastructure.

This means one eNodeB can broadcast more than one PLMN and server more than one mobile network.

This slicing is very coarse – it allows two operators to share the same eNodeBs, but going beyond a handful of PLMNs on one eNB isn’t practical, and the PLMN space is quite limited (1000 PLMNs per country code max).

In Summary:
MOCN allows slicing of the RAN on a very coarse level, to slice traffic from different operators/PLMNs sharing the same RAN.

Its use is focused on sharing RAN rather than slicing traffic for users.

Adding Vlans to VMware Workstation

Just discovered you can add VLANs to Realtek NICs on Windows PCs,

I have a fairly grunty desktop I use for running anything that needs Windows, running VMware Workstation and occasional gaming,

I do have a big Dell machine running ESXi which supports VLAN tagging and trunking, but I try and avoid using it as it’s deafeningly loud and very power hungry.

Recently as the lab network I use grows and grows I’ve been struggling to run all the VMs running in Workstation as I’ve been running out of IP space and wanting some more separation between networks.

Now I can add VLANs onto the existing NIC using the Realtek Ethernet Diagnostic Utility, and then bridge each of these NICs to the respective VM in Workstation, and the port to the Mikrotik CRS is now a trunk with all the VLANs on it.

Perfect!

Adding support for AMR Codec in FreeSWITCH

If you’re building IMS Networks, the AMR config is a must, but FreeSWITCH does not ship with AMR due to licencing constraints, but has all the hard work done, you just need to add the headers for AMR support and compile.

LibOpenCore has support for AMR which we build, and then with a few minor tweaks to copy the C++ header files over to the FreeSWITCH source directory, and enable support in modules.conf.

Then when building FreeSWITCH you’ve got the AMR Codec to enable you to manage IMS / VoLTE media streams from mobile devices.

Instead of copying and pasting a list of commands to do this, I’ve published a Dockerfile here you can use to build a Docker image, or on a straight Debian Buster machine if you’re working on VMs or Bare Metal if you run the commands from the Dockerfile on the VM / bare metal.

You can find the Dockerfile on my Github here,