Category Archives: Mobile Networks

Importing Global Clutter data into Forsk Atoll

11/07/20255G SA, EUTRAN, GSM, LTE, Mobile Networks, Notes, RF, SoftwareAtoll, Forsk, RF, RF PlanningNick

Clutter data describes real world things on the planet’s surface that attenuate signals, for example trees, shrubs, buildings, bodies of water, etc, etc. There’s also different types of trees, some types of trees attenuate signals more than others, different types of buildings are the same.

Getting clutter data used to be crazy expensive, and done on a per country or even per region basis, until the European Space Agency dropped a global dataset free of charge for anyone to use, that covered the entire planet in a single source of data.

So we can use this inside Forsk Atoll for making our predictions.

First things first we’ll need to create an account with the ESA (This is not where they take astronaut applications unfortunately, it just gives you access to the datasets).

Then you can select the areas (tiles) you want to download after clicking the “Download” tab on the right.

We get a confirmation of the tiles we’re download and we’ll get a ZIP file containing the data.

We can load the whole ZIP file (Without needing to extract anything) into GlobalMapper which loads all the layers.

I found the _Map.tif files the highest resolution, so I’m only exporting these.

Then we need to export the data to GeoTiff for use in Atoll (The specific GeoTiff format ESA produces them in is not compatible with Atoll hence the need to convert), so we export the layers as Raster / Image format.

Atoll requires square pixels, and we need them in meters, so we select “Calculate Spacing in Other Units”.

Then set the spacing to meters (I use 1m to match everything else, but the data is actually only 10m accurate, so you could set this to 10m).

You probably want to set the Export Bounds to just the areas you’re interested in, otherwise the data gets really big, really quickly and takes forever to crunch.

Now for the fancy part, we need to import it into Atoll.

When we import the data we import it as Raster data (Clutter Classes) with a pixel size of 1m.

Alas when we exported the data we’ve lost the positioning information, so while we’ve got the clutter data, it’s just there somewhere on the planet, which with the planet being the size it is, is probably not where you need it.

So I cheat, I start put putting the West and North values to match the values from a Cell Site I’ve already got on the map (I put one in the top left and bottom right corners of the map) and use that as the initial value.

Then – and stick with me, this is very technical – I mess with the values until the maps line up into the correct position. Increase X, decrease Y, dialing it it in until the clutter map lines up with the other maps I’ve got.

Right, now we’ve got the data but we don’t have any values.

Each color represents a clutter class, but we haven’t set any actual height or losses for that material.

To know what each colour means we need to RTFM – ESA WorldCover 2020 User Manual.

Which has a table:

Alas the Map Code does not match with the table in the manual, but the colours do, here’s what mine map to:

Which means when hovering over a layer of clutter I can see the type:

Next we need to populate the heights, indoor and outdoor losses for that given clutter. This is a little more tricky as it’s going to vary geography to geography, but there’s indicative loss numbers available online pretty easily.

Once you’ve got that plugged in you can run your predictions and off you go!

Legacy BTS Site manager on Linux

27/06/2025EUTRAN, LTE, Mobile Networks, Notes, SoftwareBTS Site Manager, EUTRAN, Java, LTE, Nokia, RANNick

Another post in the “vendors thought Java would last forever but the web would just a fad” series, this one on getting Nokia BTS Site Manager (which is used to administer the pre-Airscale Nokia base stations) running on a modern Linux distro.

For starters we get the installers (you’ll need to get these from Nokia), and install openjdk-8-jre using whichever package manager your distro supports.

Once that’s installed, then extract the installer folder (Like BTS Site Manager FL18_BTSSM_0000_000434_000000-20250323T000206Z-001.zip).

Inside the extracted folder we’ve got a path like:

BTS Site Manager FL18_BTSSM_0000_000434_000000-20250323T000206Z-001/BTS Site Manager FL18_BTSSM_0000_000434_000000/C_Element/SE_UICA/Setup

The Setup folder contains a bunch of binaries.

We make these executable:

chmod +x BTSSiteEM-FL18-0000_000434_000000*

Then run the binary:

sudo ./BTSSiteEM-FL18-0000_000434_000000_x64.bin

By default it installs to /opt/Nokia/Managers/BTS\ Site/BTS\ Site\ Manager

And we’re done. Your OS may or may not have built a link to the app in your “start menu” / launcher.

You can use one BTS manager to manage several different versions of software, but you need the definitions for those software loaded.

If you want to load the Releases for other versions (Like other FLF or FL releases) the simplest way is just to install the BTS site manager for those versions and just use the latest, then you’ll get the table of installed versions in the “About” section that you can administer.

Kamailio Bytes: KEMI and UAC Module – Event Route

20/06/2025IMS / VoLTE, Kamailio, Mobile Networks, Notes, Python, VoIPKamailio, Kamailio Bytes, KEMI, Python, UACNick

The UAC module is super handy for creating and sending SIP requests from Kamailio. It could be triggered via HTTP requests using xHTTP, other SIP messages or on a scheduled basis like with Rtimer.

More and more I’ve been using KEMI to allow me to write Python based Kamailio dialplans, do to all sorts of funky stuff.

The UAC module can handle the replies to requests it originated, it’s handled through the event route blocks in the native Kamailio diaplan with

event_route[uac:reply] {}

But that doesn’t exist inside KEMI.

But inside our kamailio.cfg we can specify the event callback route:

loadmodule "uac"
modparam("uac", "event_callback", "ksr_uac_event")

Then in our Kemi code (mine is in Python) we can pick it up with:

    def ksr_uac_event(self, msg, evname):
        KSR.info("===== uac module triggered event: " + evname + "\n")
        return 1

And that’s it!

Presenting the caller Name in IMS

06/06/2025IMS / VoLTE, Mobile Networks, RFCs & Standards, VoIPCaller ID, IMS, VoIP, VoLTENick

SIP has got a multitude of ways of showing Caller ID, PAI, R-PAI, From, even Contact, but the other day I got a tip (Thanks John!) that you can set a name as the Caller ID in the “Username field “display name” part of the P-Asserted-Identity for the leg from the TAS to the UE, and it’ll show up on the phone, and they’re right.

For example I put:

P-Asserted-Identity: "Nick Jones" <sip:[email protected]>

And lo and behold when I called a test phone on my desk (A Samsung IMS debug phone) here’s what I saw:

There are no contacts defined in this phone, that name is just coming from the SIP INVITE that goes to the phone.

Support for this feature is hit-and-miss on different IMS stacks on different phones, and of course is Carrier Bundle dependent, but it does work.

One thing that it doesn’t do is show the name in the call history, and if you go to “Add as Contact” it still makes you enter the name, clearly that’s not linked in, but it’s a kinda neat feature.

MBR & GBR Values in Bearer Level QoS

23/05/20255G SA, EUTRAN, LTE, Mobile Networks, UncategorizedEPC, EUTRAN, GTP, GTP-C, GTPv2, GTPv2C, LTENick

The other day I had a query about a roaming network that was sending Bearer Level QoS parameters in the Create Session Request to 0Kbps, up and down rather than populating the MBR values.

I knew for Guaranteed Bit Rate bearers that this was of course set, but for non GBR bearers (QCI 5 to 9) I figured this would be set the to MBR, but that’s not the case.

So what gives?

Well, according to TS 29.274:

For non-GBR bearers, both the UL/DL MBR and GBR should be set to zero.

So there you have it, if it’s not a QCI 1-4 bearer then these values are always 0.

RAN Builds – Can we just get the same connectors thanks?

09/05/2025EUTRAN, LTE, Mobile Networks, RFBuild, RAN, RFNick

Concrete, steel and labor are some of the biggest costs in building a cell site, and yet all the focus on cost savings for cell sites seems to focus on the RAN, but the actual RAN equipment isn’t all that much when you put it into context.

I think this is mostly because there aren’t folks at MWC promoting concrete each year.

But while I can’t provide any fancy tricks to make towers stronger or need less concrete for foundations, there’s some potential low-hanging fruit in terms of installation of sites that could save time (and therefor cost) during network refreshes.

I don’t think many folks managing the RAN roll-outs for MNOs have actually spent a week with a tower crew rolling this stuff out. It’s hard work but a lot of it could be done more efficiently if those writing the MOPs and deciding on the processes had more experience in the field.

Disclaimer: I’m primarily a core networks person, this is the job done from a comfy chair. This is just some observations from the bits of work I’ve done in the field building RAN.

Standardize Power Connectors

Currently radio units from the biggest RAN vendors (Ericsson, Nokia, Huawei, ZTE & Samsung) each use different DC power connectors.

This means if you’re swapping from one of these vendors to another as part of a refresh, you need new power connectors.

If you’re lucky you’re able to reuse the existing DC power cables on the tower, but that means you’re up on a tower trying to re-terminate a cable which is a fiddly job to do on the ground, and far worse in the air. Or if you’re unlucky you don’t have enough spare distance on the DC cables to do the job, then you’re hauling new DC cables up a tower (and using more cables too).

The Nokia and Ericsson connectors are very similar, and with a pair of side cutters you can mangle an Ericsson RRU connector to work on a Nokia RRU and visa-versa.

While Huawei and ZTE have adopted for push connectors with the raw cables behind a little waterproof door.

If we could just settle on one approach (either is fine) this could save hours of install time on each cell site, extrapolate that across thousands of cell sites for each network, and this is a potentially large saving.

Standardize Fiber Cables

The same goes for waterproofing fibre, Ericsson has a boot kit that gets assembled inline over the connectors, Nokia has this too, as well as a rubber slide over cover boot on pre-term cables.

Again, the cost is fairly minimal, but the time to swap is not. If we could standardize a break out box format on the top of the tower and a LC waterproofing standard, we could save significant time during installs, and as long as you over-provision the breakout (The cost difference between a 6 core fiber vs a 48 core fibre is a few dollars), you can save significant time having to rerun cables.

Yes, we’ve all got horror stories about someone over-bending fiber, and if you reused fibre between hardware refresh cycles, but modern fiber is crazy tough so the chances of damaging the reused fiber is pretty slim, and spare pairs are always a good thing.

Preterm DC Cables

Every cell site install features some poor person squatting on the floor (if they’re savvy they’ve got a camping stool or gardening kneeling mat) with a “gut buster” crimping tool swaging on connectors for the DC lugs.

If we just used the same lugs / connectors for all the DC kit inside the cell sites, we could have premade DC cables in various lengths (like everyone does with Ethernet cables now), rather than making each and every cable off a spool (even if it is a good ab workout).

I dunno, I’m just some Core network person who looks at how long all this takes and wonders if there’s a way it could be done better, am I crazy?

What’s the point of Subscribe in IMS – Does it do anything useful?

02/05/20255G SA, IMS / VoLTE, Kamailio, LTE, Mobile Networks, VoIPIMS, LTE, SIP, Subscribe, VoLTENick

Nope – it doesn’t do anything useful. So why is it there?

The SUBSCRIBE method in SIP allows a SIP UAC to subscribe to events, and then get NOTIFY messages when that event happens.

In a plain SIP scenario (RFC 3261), we can imagine an IP Phone and a PBX scenario. I might have “Busy Lamp Field” aka BLF buttons on the screen of my phone, that change colour when the people I call often are themselves on calls or on DND, so I know not to transfer calls to them – This is often called the “presence” scenario as it allows us to monitor the presence of another user.

At a SIP level, this is done by sending a SUBSCRIBE to the PBX with the information about what I’m interested in being told about (State changes for specific users) and then the PBX will send NOTIFY messages when the state changes.

But in IMS you’ll see SUBSCRIBE messages every time the subscriber registers, so what are they subscribing for?

Well, you’re just subscribing to your own registration status, but your phone knows your own registration status, because it’s, well, the registration status of the phone.

So what does it achieve? Nothing.

The idea was in a fixed-mobile-convergence scenario (keeping in mind that’s one of the key goals from the 2008 IMS spec) you could have the BLF / presence functionality for fixed subscribers, but this rareley happens.

For the past few years we’ve just been sending a 200 OK to SUBSCRIBE messages to the IMS, with a super long expiry, just to avoid wasting clock cycles.

GTPv2 Instance IDs

18/04/20255G SA, EPC, LTE, Mobile Networks, Notes, RFCs & StandardsEPC, GTP, GTP-C, GTPv2, GTPv2C, LTENick

I was diffing two PCAPs the other day trying to work out what’s up, and noticed the Instance ID on a GTPv2 IE was different between the working and failing examples.

So what does it denote, well from TS 129.274:

If more than one grouped information elements of the same type, but for a different purpose are sent with a message,
these IEs shall have different Instance values.

So if we’ve got two IEs of the same IE type (As we often do; F-TEIDs with IE Type 87 may have multiple instances in the same message each with different F-TEID interface types), then we differentiate between them by Instance ID.

The only exception to this rule is where we’ve got the same data, so if you’ve got one IE with the exact same values and purpose that exists twice inside the message.

It’s not Rocket Science – Tracking performance of OneWeb terminals

11/04/2025Mobile Networks, Notes, Python, RF, SoftwareEutelsat, HL1120W, Hughes, OneWeb, satelliteNick

Last year we deployed some Hughes HL1120W OneWeb terminals in one of the remote cellular networks we support.

Unfortunately it was failing to meet our expectations in terms of performance and reliability – We were seeing multiple dropouts every few hours, for between 30 seconds and ~3 minutes at a time, and while our reseller was great, we weren’t really getting anywhere with Eutelsat in terms of understanding why it wasn’t working.

Luckily for us, Hughes (who manufacture the OneWeb terminals) have an unprotected API (*facepalm*) from which we can scrape all the information about what the terminal sees.

As that data is in an API we have to query, I knocked up a quick Python script to poll the API and convert the data from the API into Prometheus data so we could put it into Grafana and visualise what’s going on with the terminals and the constellation.

After getting all this into Grafana and combining it with the ICMP Blackbox exporter (we configured Blackbox to send HTTP requests and ICMP pings out of each of the different satellite terminals we had (a mix of OneWeb and others)) we could see a pattern emerging where certain “birds” (satellites) that passed overhead would come with packet loss and dropouts.

It was the same satellites each time that led to the drops, which allowed us to pinpoint to say when we see this satellite coming over the horizon, we know there’s going to be some packet loss.

In the end Eutelsat acknowledged they had two faulty satellites in the orbit we are using, hence seeing the dropouts, and they are currently working on resolving this (but that actually does require rockets, so we’re left without a usable service for the time being) but it was a fun problem to diagnose and a good chance to learn more about space.

Packet loss on the two OneWeb terminals (Not seen on other constellation) correlated with a given satellite pass

I’ve put the source code for the Hughes terminal Prometheus Exporter onto Github for anyone to use.

The repo has instructions for use and the Grafana templates we used.

At one point I started playing with the OneWeb Ephemeris data so I could calculate the azimuth and elevation of each of the birds from our relative position, and work out distances and angles from the terminal. The maths was kinda fun, but oddly the datetimes in the OneWeb ephemeris data set seems to be about 10 years and 10 days behind the current datetime – Possibly this gives an insight into OneWeb’s two day outage at the start of the year due to their software not handling leap years.

Despite all these teething issues I’m still optimistic about OneWeb, Kupler and Qianfan (Thousand Sails) opening up the LEO market and covering more people in more places.

Update: Thanks to Scott via email who sent this:
One note, there’s a difference between GPS time and Unix time of about 10 years 5 days. This is due to a) the Unix epoch starting 1970-01-01 and the gps epoch starting 1980-01-05 and b) gps time is not adjusted for leap seconds, and ends up being offset by an integer number of seconds.

Update: clarkzjw has published an open source tool for visualizing the pass data https://github.com/clarkzjw/LEOViz

Demystifying SS7 & Sigtran – Part 8 – M3UA

04/04/2025Mobile Networks, RFCs & StandardsM3UA, SIGTRAN, SS7Nick

This is part of a series of posts looking into SS7 and Sigtran networks. We cover some basic theory and then get into the weeds with GNS3 based labs where we will build real SS7/Sigtran based networks and use them to carry traffic.

In our last post we talked about moving MTP2 onto IP and the options available.

When we split the SS7 stack onto IP we don’t need to do this at the Data Link Layer, we can instead do it higher up the stack. This is where we introduce M3UA.

MTP Level 3 User Adaptation Layer – M3UA replaces MTP3 with an IP based equivilent.

This is different to how we’d handle it with M2UA or M2PA where MTP3 remained unchanged, when you deploy M3UA links, there is no MTP3 anymore – it’s replaced with an IP based protocol transported via SCTP designed to do the same role as MTP3 but over IP – That protocol is M3UA.

This means the roles handled in MTP3 such as managing which available point codes are reachable over which linksets, failover, load sharing and reporting are all now handled by the M3UA protocol, because we loose the ability to just rely on MTP3 to do those things like we did when using lower layer protocols like M2PA or MTP2.

So what do you need to know to use M3UA?

Well, the first concept we need to wrap our head around is that we no longer have linksets or pointcode routes (We do, but they’re different) but instead have Application Servers, Application Server Processes and Routing Contexts.

If you’re following along at home and you want to hook your M3UA compatible AS into the Cisco ITP STP, I’ll be including the commands as we go along. The first step on the Cisco (assuming you’ve already defined the basic SS7 config) is to create a local M3UA instance:

cs7 m3ua 2905
 local-ip 10.179.2.154

With that out of the way, let’s cover ASPs & ASs (hehe – Ass).

You can think of the Application Server Process (ASP) as the client end of the “link set” of our virtual SS7 stack, it handles getting the SCTP association up, what IPs, ports and SCTP parameters are needed, and listens and communicates based on that, here’s an example on the Cisco ITP:

cs7 asp NickLab_ASP 2905 2905 m3ua
 remote-ip 10.0.1.252
 remote-ip 172.30.1.12

The ASP connects to a Signaling Gateway (In practical terms this is an STP).

That’s simple enough and now we can do our SCTP handshake, but nothing is going to get routed without introducing the Application Server (AS) itself, which is where we configure the routing and link to 1 or more ASPs and how we want to share traffic among them.

Point codes are still used in M3UA for sending traffic from an M3UA AS but it’s not what controls the routing to an AS.

That probably sounds confusing, I send traffic based on point code, but the traffic does’t get to the M3UA AS via point code? What gives?

Well, first we’ve got to introduce the Routing Context in M3UA.

Routing Contexts define what destinations are served by this AS.
As an example, on our STP we’ll define a Routing Context inside the ITP inside the AS section, in this example we’re creating Routing Key 1 which will handle traffic to the point code 5.123.2, but we could equally define a routing-key for a given Global Title address too.

cs7 instance 0 as NickPC m3ua
 routing-key 1 5.123.2 
 asp NickLab_ASP
 traffic-mode broadcast

Notice we didn’t define Routing Key X -> Point Code Y -> ASP Z ? That’s because we may have one or more ASPs associated with this (remember ASPs are kinda like Linksets).

For example the Point Code for an HLR might have multiple ASPs behind it, with traffic-mode loadshare to load balance the requests among all the HLRs.

So what does it look like to bring this up? Let’s take a look at a link coming up.

Under the hood we’ve got the SCTP connection / handshake like normal, then our ASP sends an ASPUP (ASP is in state “up”) message to the Signaling Gateway (STP).

Now our ASP has told the Signaling Gateway it’s there, so our Signaling Gateway returns an ASPUP_ACK to confirm it’s got the message and the current AS state is inactive.

And with that our ASP is in “an up state, “inactive” state; it’s connected to the STP, but without any ASes associated with our ASP, it’s akin to having link layer but nothing else.

State in the STP showing an ASP without an active AS

So next our ASP will send an ASPAC (ASP Active) message for the given routing contexts the AS serves, in this case, Routing Context 1.

And with that, the Signaling Gateway (STP) send back an an ASPAC_ACK (ASP Active Ack) to confirm it’s got it, and the state changes.

ASP Active Ack Message from SG (STP) to ASP

Because of how MTP3 worked advertising available point codes, the SG (STP) needs to tell the AS/ASP how it sees the world and the state of the connection.

This is done with a NTFY (Notify) message from the STP/SG to indicate the state has changed to active, and what destinations are reachable, and at this point, we’re good to start handling traffic for that Routing Context.

And with that, we can start handling M3UA traffic.

There’s only one more key dialog to wrap your heads around that’s the DAVA and DUNA messages.

DAVA is Destination Available, and DUNA is Destination Unavailable. The SG (STP) will send these messages to ASP/AS every time the reachability of a neighboring point code changes.

That’s the basics covered, I’m in the process of developing an HLR (Running with MAP/TCAP/SCCP/M3UA) extension for PyHSS, which in the future will allow us to experiment with more M3UA endpoints.

Automatic Cell Planning with Atoll: Site Selection

28/03/2025Mobile Networks, Notes, SoftwareAtoll, Cell Planning, Forsk, RF PlanningNick

One of the really neat features about using automated RF planning tools like Forsk Atoll is you’re able to get it to automatically try out tweaks and look at how that impacts performance.

In the past you’d adjust something, run the simulation again, look at the results and compare to what you had before,

Atoll’s ACP (Automatic Cell Planning) module allows you to automate this, and in most cases, it does a better job than I would!

Today we’ll look at Cell Site Selection in Atoll.

To begin with we’ll limit the computation area down to a polygon we draw around the area in question,

In the Geo tab we’ll select Zones -> Computation Zone and select Edit

We’ll create a new Polygon and draw around the area we are going to analyze. You can automate this step based on population levels, etc, if you’ve got that data present.

So now we’ve set our computation area to the selection, but if we didn’t do this, we’d be computing for the whole world, and that might take a while…

Generating Candidate Sites

Atoll sucks at this, I’ve found if your computation zone is set, and it’s not a rectangle, bad things happen, so I’ve written a little script to generate candidates for me.

Creating an new ACP Job

From the Network tab, right click on ACP Automatic Cell Planning and select New

Optimization Tab

Before we can define all the specifics of what we’re looking to plan / improve, we need to set some limits on the software itself and tell it what we’re looking to improve.

The resolution defines how precise the results should be, and the iterations defines how many changes the software should run through.

The higher the number of iterations, the better the results, but it’s not linear – The improvement between 1000 iterations and 1,000,000,000 iterations is typically pretty minor, and this is because ACP works kind of a “getting warmer” philosophy, where it changes a value up or down, looks at the overall result and then if the result was better, changes the value again until it stops getting better.

As I’m working in a fairly small area I’m going to set 100 iterations and a 50m resolution.

In the optimization tab we can also set constraints, for example we’re looking at where to place cell sites in an area, and as far as Atoll is concerned if we just throw hundreds of sites at an area we’ll have pretty good results, but the economics of that doesn’t work, so we can set constraints, for example for site selection we may want to set the max number of cell sites. As we are importing ~5k candidate locations, we probably don’t want to build 5k cell sites 20m apart, so set this to be a reasonable number for your geography.

When using ACP for Optimization as we can see later on, we can also set cost constraints regarding the cost to make changes, but for now this is just going to pick best cell sites locations for us.

Objectives Tab

Next up we’ll need to setup Automatic Cell Plannings’ objectives.

For ACP to be an effective tool we need to define what we’re looking for in terms of success, you can’t just throw it some values and say “Make it better” – we need to define what parameters we’re looking to improve. We do this by setting Objectives.

Your objectives are going to be based on your needs and wants, but for this example we’re building greenfield networks, so want to offer coverage over an area, as well as good RSRP and RSRQ, so we will set the objectives to Coverage of 95% of the Computation Zone for this post, with a secondary objective of increasing RSRP and RSRQ.

But today I’m modeling for coverage, so let’s set that:

As we’re planning for LTE we need to set the UE parameters, as I’m planning for a mobile network, I’ll need to set the service type and terminal.

Reconfiguration

Now we’ve defined the Objectives, it’s now time to define what values ACP can mess with to try and achieve these objectives, for some ACP runs you may be adjusting tilts or azimuths, swapping out antennas, etc, but today we’re looking for where we can put cell sites to be the most effective to serve our target area.

Now we import our candidate list. This might be a list of potential towers you can use, or in my case, for something greenfield, I’m just importing a list of points on a map every X meters to find the best locations to place towers.

From the “Reconfiguration”, we’ll select “Setup” to add the sites we want to evalute.

Atoll has “Automatic Candidate Positioning” which allows it to generate pins on the map, but I’ve not had any luck with it, instead I’m importing a list of candidates I’ve generated via a little Python script, so I’ll select “Import from File”.

Pick my file and set the parameters for importing the data like so.

Now we’ve got candidates for cell sites defined, we set the station template to populate and then we’re good to go.

Running ACP

Once you’ve tweaked all your ACP values as required, we can run the ACP job,

As ACP runs you’ll see a graph showing the objectives and the levels it needs to reach to satisfy them, this step can take a super dooper long time – Especially if your computation zone is large or your number of candidates is large.

But eventually we’ll be a lot older and wearier, but ACP will have completed, and we can checkout the Optimization it’s created.

In my case the objectives failed to be met, but that’s OK for me,

One it’s completed the Changes tab outlines the recommended changes, and the Objectives outlines how this has performed against the criteria we outlined at the start, and if we’re happy with the result, we can Commit the changes to put them on the map from the commit tab.

With that done I weed out the sites in impractical locations, the the ones in the sea…

Now we’ve got the sites plugged in, the next thing we’ll start doing is optimizing them.

When we’re dealing with greenfield builds like we are today, the “Move to highest location with X Meters” function is super useful. If you’ve got a high point on a property, we want to build our tower on the highest point, so the tower is moved to the highest point.

One thing to note is this just plans our grid. It won’t adjust azimuths, downtilts, etc, in one operation. We need to use another ACP operation to achieve that, and that’s the content of a different post!

Call forwarding in SS7/ISUP

14/03/2025GSM, History, Mobile Networks, Notes, RFCs & Standards, VoIPForwarding, ISUP, Redirect, SS7Nick

Had an interesting fault come across my desk the other day; calls were failing when the called party (an SSP we talk to via SS7/ISUP) had an exchange based call forward in place.

In SIP, we can do call forwarding one of two ways, we can send a 302 Redirect or we can generate a new SIP invite.

But in ISUP how is it done?

We’re a SIP based network, but we do talk some SS7/ISUP on the edges, and it was important that we handled this correctly.

I could see in the Address Complete Message (ACM) sent back to our network that there was redirection information here:

We would see the B party SSP release the call as soon as it sent this.

This made me wonder if we, as the originating network, were supposed to redirect to the new B party and send a new Initial Address Message?

After a lot of digging in the ITU Q.7xx docs (I’m not where near as fast at finding information in specs written prior to my birth, than I am with the 3GPP specs) I found my answer – These headers are informational only, the B party SSP is meant to re-target the message, and send us an Alerting or Answer message when it’s done so.

Basic CAMEL Charging Flow

14/02/2025GSM, Mobile Networks, Notes, RFCs & StandardsCAMEL, Charging, GSM, MAP, OCS, Roaming, SS7Nick

CAMEL handles charging in 2G and 3G networks, much like Diameter handles charging in LTE.

CAMEL runs on top of SS7, specifically it sits on top of TCAP, which sits on top of SCCP, which can ride on M3UA or MTP3 (so it sits at the same layer as MAP).

CAMEL is primarily focused on charging for Voice & SMS services, as data generally uses Diameter, so it’s voice and SMS we’ll focus on.

CAMEL is spoken between the MSC (gsmSSF) and the OCS (gsmSCF).

Basic Call State Model

CAMEL is closely related to the Intelligent Network stuff on the 1980s, and steals a lot of it’s ideas from there, unfortunately if you’re to read the CAMEL standard it also implies you were involved in IN stuff and had been born at that point, alas I was neither.

So the key to understanding CAMEL is the Basic Call State Model (BCSM) which is a model of all the different states a call can be in, such as ringing, answered, abandoned, call failed, etc, etc.

Over CAMEL, our OCS can be told by the MSC when a certain event happens; the MSC can tell the OCS, that the call has changed state. For example a BCSM event might indicate the call has hung up, is ringing, cancelled, etc.

Below is the list of all the valid BCSM states:

Basic MO Call with CAMEL

Our subscriber makes an outbound call.

Based on the data the MSC has in it from the HLR, it knows that we should use CAMEL for this call, and it has the SCCP Address of the OCS (gsmSCF) it needs to send the CAMEL messages to.

So the MSC sends an InitialDP message to the OCS (via it’s Global Title Address) to Authorize the call that the user is trying to make.

This is like any other Authorization step for an OCS, which allows the OCS to authorize the call by checking the subscriber is valid, check if they’re allowed to call that destination and they’ve got the balance to do so, etc.

The initialDP (Initial Detection Point) is telling our OCS all about the call event that’s being requested, who’s calling, what number they’ve dialed, where they are in the network (of note especially if they’re roaming), etc, etc.

The OCS runs through it’s own checks to see if it wants to allow the call to proceed by checking if the subscriber has got enough balance, unit reservation, etc, etc, and if it does, the OCS sends back a Continue message to the MSC to allow the call to continue.

Generally the OCS also uses this message as a chance to subscribe to BCSM Events using RequestReportBCSMEventArg so the OCS will get notified by the MSC when the state of the call changes. This means the MSC will tell us when the state of the call changes; events like the call getting answered, disconnected, etc. This is critical so we know when the call gets answered and hung-up, so we can charge correctly.

In the below example, as well as sending the Continue and RequestReportBCSMEventArg the OCS is also setting the ChargingArgs for this call, so the MSC knows who to charge (the caller) set via sendingSide and that the MSC must send an Apply Charging Report (ACR) messages every 300 units (1 unit = 100 ms, so a value of 300 = 300 x 100 milliseconds = 30 seconds) so the OCS keeps track of what’s going on.

`continue` sent by the OCS to the MSC, also including `reportBCSMEvent` and `applyCharging` messages

At this point the call can start to proceed – In ISUP terms the InitialDP is triggered between the Initial Address Message and the Address Complete message is sent after the continue is sent back.

Or in a slightly less appropriate analogy but easier to understand for SIP folks, the InitialDP is sent for INVITE and the 180 RINGING is sent once the continue message is received.

Call is Answered

So at this stage our call can start to ring.

As we’ve subscribed to BCSM events in our last message, the MSC is going to tell us when the call gets answered or the call times out, is abandoned or the sun burns out.

The MSC provides this info a eventReportBCSM, which is very simple and just tells us the event that’s been triggered, in the example below, the call was answered.

These eventReportBCSM are informational from the MSC to the OCS, so the OCS doesn’t need to send anything back, but the OCS does need to mark the call as answered so it can start timing the call.

At this stage, the call is connected and our two parties are talking, but our MSC has been told it needs to send us applyChargingReports every 30 seconds (due to the value of 300 in maxCallPeriodDuration) after the call was connected, so the MSC sends the OCS it’s first applyChargingReport 30 seconds after the call was answered:

applyChargingReport sent by the MSC to the OCS every reporting period

We can calculate the duration of the call so far based on the time of the eventReportBCSM, then the OCS must make a decision of if it should allow the call to continue or not.

For simplicity’s sake, let’s imagine we’re still got a balance in the OCS and the OCS wants the call to continue, the OCS send back an applyCharging message to the MSC in response, and includes the current allowed maxCallPeriodDuration, keeping in mind the value is x100 and in nanoseconds (so this is 30 seconds).

`applyCharging` from the OCS back to the MSC

Perfect, our call is good to go for another 30 more seconds, son in 30 seconds we’ll get another ACR messages from MSC to the OCS to keep it abreast of what’s going on.

Now one of two things is going to happen, either subscriber is going to burn through all of their minutes, and get their call cutoff, or the call will end while they’ve still got balance, let’s look at both scenarios.

Normal Hangup Scenario

When the call ends, we get an applyChargingReport from the MSC to the OCS.

As we’ve subscribed to reportBCSMEvent we get both the applyChargingReport with legActive: False` so we know the call has hungup, and we’ve got an event report to tell us more about the event, in this case a hangup from the Originating Side.

`reportBCSMEvent` and `applyChargingReport` Sent by the MSC to the OCS to indicate the call has ended, note the `legActive` flag is now false

Lastly the OCS confirms by sending a releaseCall to the MSC, to indicate all legs should now terminate.

`releaseCall` Sent by OCS to MSC at the very end

So that’s it!

Obviously there are other flows, such as running out of balance mid-call, rejecting a call, SMS and PBX / VPN services that rely on CAMEL, but hopefully you now understand the basics of how CAMEL based charging looks and works.

If you’re looking for a CAMEL capable OCS or a CAMEL to Diameter or API gateway, get in touch!

Enabling logging on Cisco ITP Signaling Transfer Point

31/01/2025GSM, Mobile Networks, NotesCisco, ITP, SS7, STPNick

Mostly just for my own notes, but when debugging SCCP translation on a Cisco ITP STP, this is probably obvious for folks who are more Cisco focused:

Enabling debug:

debug cs7 m3ua packet
debug cs7 m3ua all
debug cs7 sccp event ALL
debug cs7 sccp gtt-accounting
terminal monitor

Disabling debug:

no debug cs7 m3ua packet
no debug cs7 m3ua all
no debug cs7 sccp event ALL
no debug cs7 sccp gtt-accounting

GTPv2 Source Ports

24/01/2025EPC, LTE, Mobile Networks, Notes, RFCs & StandardsGTP, GTP-C, GTPv2CNick

Ask anyone in the industry and they’ll tell you that GTPv2-C (aka GTP-C) uses port 2123, and they’re right, kinda.

Per TS 129.274 the Create Session Request should be sent to port 2123, but the source port can be any port:

The UDP Source Port for a GTPv2 Initial message is a locally allocated port number at the sending GTP entity.

So this means that while the Destination Port is 2123, the source port is not always 2123.

So what about a response to this? Our Create Session Response must go where?

Create Session request coming from 166.x.y.z from a random port 36225
Going to the PGW on 172.x.y.z port 2123

The response goes to the same port the request came on, so for the example above, as the source port was 36225, the Create Session Response must be sent to port 36225.

Because:

The UDP Destination Port value of a GTPv2 Triggered message and for a Triggered Reply message shall be the value of the UDP Source Port of the corresponding message to which this GTPv2 entity is replying, except in the case of the SGSN pool scenario.

But that’s where the association ends.

So if our PGW wants to send a Create Bearer Request to the SGW, that’s an initial message, so must go to port 2123, even if the Create Session Request came from a random different port.

A tale of two CPRIs

09/01/20255G SA, EUTRAN, GSM, History, LTE, Mobile Networks, RFCs & StandardsOBSAI, OpenRAN, PLMN, RAN, StandardsNick

It was the best of times, it was the worst of times. It was the age of wisdom, it was the age of foolishness. It was the epoch of belief, it was the epoch of incredulity. It was the season of Light, it was the season of Darkness. It was the spring of hope, it was the winter of despair.
A tale of two Cities

When Dickens wrote of Doctor Manette in the 1859, I doubt his intention was to write about the repeating history of RAN fronthaul standards – but I can’t really say for sure.

Setting the Scene

Our story starts with introducing CPRI (Common Public Radio Interface) interface, having been imprisoned in the Bastille of vendor lock in for the better part of twenty years.

Think of CPRI is less of a hard interoperable standard and more like how the Italian and French languages are both derived from Latin; it doesn’t mean that the two languages are the same, but they’ve got the same root and may share some common words and structures.

In practice this means that taking an Ericsson Radio and plugging it into a Huawei Baseband simply won’t work – With CPRI you must use the same vendor for the Baseband and the Radios.

Huawei BBU 3900 Architecture — Image from my post on setting up Huawei Base stations, showing the Huawei Baseband (BBU) connecting to the Huawei Radios (RRUs) via CPRI (in Yellow)

The Unexpected Plot Twist

“Nuts to this” the industry said after being stuck locked between the same radios and baseband for years; we should create a standard so we can mix and match between radio vendors, and even standardize some other stuff that’s been bothering us, so we’ll have a happy world of interoperability.

With kit created that followed this standard, we’d be able to take components from vendor A, B & C, and fit them together like Lego, saving you some money along the way and giving you’ve got a working solution made of “best of breed” components, where everything is interoperable.

*Omnitouch Lego base stations, which also fit together like Lego – Part of the Omnitouch Network Services “swag” from 2024*

So the industry created a group to chart a path for a better tomorrow by standardizing these interfaces.

The group had many industry heavyweights like Nokia, NEC, LG, ZTE and Samsung joining.

The key benefits espoused on their website:

An open market will substantially reduce the development effort and costs that have been traditionally associated with creating new base station product ranges. The availability of off-the-shelf base station modules will enable manufacturers to focus their development efforts on creating further added value within the base station, encouraging greater innovation and more cost-effective products. Furthermore, as product development cycles will be reduced, new base station functions will become available on the market more quickly.
Mission statement of the group

In addition to being able to mix and match radios and basebands from different vendors, the group defined standards for centralized baseband, and interoperable standards, to allow a multi-vendor ecosystem to flourish.

And here’s the plot twist – The text above, was not written about OpenRAN, and it was not written about the benefits of eCPRI.

It was written about Open Base Station Architecture Initiative (OBSAI) and it was written 22 years ago.

*record screech sound*

This image was called "Confused Ernie" but it's clearly Bert...

Standards War you’ve never heard of: OBSAI vs CPRI

When OBSAI was defined it was not without competition; there was another competing fronthaul standard; that’s right, the mustache twirling lowlife from earlier in the story – CPRI.

Supported by Huawei, Nortel, NEC & Ericsson (among others), CPRI took a “gentle parenting” approach to the standards world, in contrast to OBSAI.
Instead of telling all the vendors to agree on an interoperable front haul standard, CPRI just encouraged everyone to implement what their heart told them and what felt right to them.

As it happened, the industry favored the CPRI approach.

If a vendor wanted to add a new “word” in their CPRI “language” to add a new feature, they just went ahead and added it – It didn’t require anyone else to agree with them or changes to a common standard used by the industry, vendors could talk to the kit they made how they wanted.

CPRI has been the defacto non-standard used by all the big kit vendors for the past ~10 years.

The Death of OBSAI & the Birth of OpenRAN’s eCPRI

Why haven’t you heard of OBSAI? Why didn’t the OBSAI standard just serve as the basis for eCPRI – After all the last OBSAI release was less than 5 years before TIP started working on eCPRI publicly.

Did a schism over “uplink performance improvement” options lead to “irreconcilable differences” between parties leading to the breakup of the OBSAI group?

Nope.

Customers (MNOs) didn’t buy OBSAI based equipment in measurably larger quantities than CPRI kit. That’s it.

This meant the vendors invested less in paying teams to further develop the standards, the OBSAI group met less frequently, and in the end, member vendors didn’t bother adding support for OBSAI to new equipment and just used the easier and more flexible CPRI option instead.

At some point someone just stopped paying for the domain renewal and that was it, OBSAI was no more.

This is how the standards body ends, not with a bang, but with a whimper.
T.S. Elliot’s writings on the death of obsai

Those who do not learn from history…

The goals of the OBSAI Group and OpenRAN working groups are almost identical, so what lessons did Marconi, Motorola and Alcatel learn as members of OBSAI that other vendors could learn about OpenRAN strategy?

There are no mentions of OBSAI in any of the information published by OpenRAN advocates, and I’m wondering if folks aren’t aware that history tends to repeat and are ignorant to what came before it, or they’re just not learning lessons from the past?

So what can the OpenRAN industry learn from OBSAI?

Being a nerd, I started detailing the technical challenges, but that’s all window dressing; The biggest hurdle facing CPRI vs eCPRI are the same challenges OBSAI vs CPRI faced a decade prior:

To be relevant, OpenRAN kit has to be demonstrably better than what we have today AND provide a tangible cost saving.

OBSAI failed at achieving this, and so failed to meet it’s other more noble goals.

[At the time of writing this at least] I’d contend that neither of those two criteria have been met by OpenRAN.

What does the future hold for OpenRAN?

Looking into the crystal ball, will OpenRAN and eCPRI go the way of OBSAI, or will someone keep the OpenRAN dream alive?

Today, we’re still seeing the MNOs continue to provide tokenistic investment in OpenRAN. But being a cynic, I’d say the MNOs are feigning interest in OpenRAN products because it’s advantageous for them to do so.

The threat of OpenRAN has proven to be a great stick to beat the traditional vendors with to force them to lower their prices.

Think about the $14 billion USD Ericsson deal with AT&T, if chucking a few million at OpenRAN pilots / trials lead to AT&T getting even a 0.1% reduction in what they’re paying Ericsson, then the numbers would have worked out well in AT&Ts favor.

From the MNOs perspective, the cost to throw the odd pilot or trial to a hungry OpenRAN vendor to keep them on the hook is negligible, but the threat of OpenRAN provides leverage and bargaining power every time it’s contract renewal time with the big RAN vendors.

Already we’ve seen all the traditional RAN vendors move to neutralize this threat by introducing “OpenRAN compatible” equipment and talking up their commitment to openness.

This move by the RAN vendors takes this sting out of the OpenRAN threat, and means MNOs won’t have much reason to continue supporting OpenRAN.

This leaves the remaining OpenRAN vendors like Miss Havisham, forever waiting in their proverbial wedding dresses, having being left at the altar.

Okay, I’m mixing my Dickens’ references here, but it was too good not to.

Appendix

I’ve been enjoying writing more analysis than just technical content, let me know if this is something you’re interested in seeing more of.

I’ve been involved in two big OpenRAN integration projects, both of which went poorly and probably tainted my perspective. Enough time has passed to probably write up how it all went with the vendor names removed, but that’s a post for another time!

If you wanted to learn more about OBSAI Archive.org has their old website available for reading.

TFTs & Create Bearer Requests

27/12/20245G SA, EPC, IMS / VoLTE, LTE, Mobile Networks, Notes, RFCs & StandardsCharging Rule, Diameter, EPC, Gx, TFT, Traffic Flow TemplateNick

What is included in the Charging Rule on Gx ultimately turns into a Create Bearer Request on GTPv2-C.

But the mapping it’s always obvious, today I got stuck on the difference between a Single remote port type, and a single local port type, thinking that the Packet Filter Direction in the TFT controlled this – It doesn’t – It’s controlled by the order of your Traffic Flow Template rule.

Input TFT:

"permit out 17 from any 50000 to any"

Leads to Packet filter component type identifier: Single remote port type

Whereas a TFT of:

permit out 17 from any to any 50000

Leads to Packet filter component type identifier: Single local port type (64)

Flash SMS Messages

13/12/2024IMS / VoLTE, Mobile Networks, Notes, RFCs & StandardsIMS, LTE, SIP, SMS, VoIP, VoLTENick

Stumbled across these the other day, while messing around with some values on our SMSc.

Setting the Data Coding Scheme to 16 with GSM7 encoding flags the SMS as “Flash message”, which means it pops up on the screen of the phone on top of whatever the user is doing.

While reading a quality telecom blog bam! There’s the flash SMS popping over whatever I was reading.

Oddly while there’s plenty of info online about Flash SMS, it does not appear in the 3GPP specifications for SMS.

Turns out they still work, move over RCS and A2P, it’s all about Flash messages!

There’s no real secret to this other than to set the Data Coding Scheme to 16, which is GSM7 with Flash class set. That’s it.

If you’re interested in the internal machinations of how SMS works, I’ve got a few posts on the topic – You can find a list of them here.

Obviously to take advantage of this you’d need to be a network operator, or have access to the network you wish to deliver to. Recently more A2P providers are filtering non vanilla SMS traffic to filter out stuff like SMS OTA message or SIM specific messages, so there’s a good chance this may not work through A2P providers.

Background to the “VoLTE Mess”

06/12/20245G SA, IMS / VoLTE, LTE, Mobile Networks, RFCs & Standards, VoIPIMS, LTE, SIP, VoIP, VoLTENick

I’ve been writing a fair bit recently about the “VoLTE Mess” – It’s something that’s been around for a long time, mostly impacting greenfield players rolling out LTE only, but now the big carriers are starting to feel it as they shut off their 2G and 3G networks, so I figured a brief history was in order to understand how we got here.

Note: I use the terms 4G or LTE interchangeably

The Introduction of LTE

LTE (4G) is more “spectrally efficient” than the technologies that came before it. In simple terms, 1 “chunk” of spectrum will get you more speed (capacity) on LTE than the same size chunk of spectrum would on 2G or 3G.

From my post on 5G being a bit overhyped

So imagine it’s 2008 and you’re the CTO of a mobile network operator.
Your network is congested thanks to carrying more data traffic than it was ever designed for (the first iPhone had launched the year before) and the network is struggling under the weight of all this new data traffic.
You have two options here, to build more cell sites for more density (very expensive) or buy more spectrum (extremely expensive) – Both options see you going cap in hand to the finance team and asking for eye-wateringly large amounts of capital for either option.

But then the answer to your prayers arrives in the form of 3GPP’s Release 8 specification with the introduction of LTE. Now by taking some 2G or 3G spectrum, and by using it on 4G, you can get ~5x more capacity from the same spectrum. So just by changing spectrum you own from 2G or 3G to 4G, you’ve got 5x more capacity. Hallelujah!

So you go to Nortel and buy a packet core, and Alcatel and Siemens provide 4G RAN (eNodeBs) which you selectively deploy on the cell sites that are the most congested.
The finance team and the board are happy and your marketing team runs amok with claims of 4G data speeds.
You’ve dodged the crisis, phew.

This is the path that all established mobile operators took; throw LTE at the congested cell sites, to cheaply and easily free up capacity, and as the natural hardware replacement cycle kicked in, or cell sites reached capacity, swap out the hardware to kit that supports LTE in addition to the 2G and 3G tech.

Circuit Switched Fallback

But it’s hard to talk about the machinations of late 2000s telecom executives, without at least mentioning Hitler.

This video below from 15 years ago is pretty obscure and fairly technical, but the crux of it it is that Hitler is livid because LTE does not have a “CS Domain” aka circuit switched voice (the way 2G and 3G had handled voice calls).

It was optional to include support for voice calls in the LTE network (Voice over LTE) when you launched LTE services. So if you already had a 2G or 3G network (CS Network) you could just keep using 2G and 3G for your voice calls, while getting that sweet capacity relief.

So our hypothetical CTO, strapped for cash and data capacity, just didn’t bother to support VoLTE when they launched LTE – Doing so would have taken more time to launch, during which time the capacity problem would become worse, so “don’t worry about VoLTE for now” was the mantra.

All the operators who still had 2G and 3G networks, opted to just “Fallback” to using the 2G / 3G network for calling. This is called “Circuit Switched Fallback” aka CSFB.

Operators loved this as they got the capacity relief provided by shifting to 4G/LTE (more capacity in the network is always good) and could all rant about how their network was the fastest and had 4G first, this however was what could be described as a “Foot gun” – Something you can shoot yourself in the foot with in the future.

Operators eventually introduce VoLTE

Time ticked on an operators built out their 4G networks, and many in the past 10 years or so have launched VoLTE in their own networks.

For phones that support it, in areas with blanket 4G coverage, they can use VoLTE for all their calls.

But that’s the sticking point right there – If the phones support it.

But if the phones don’t support it, they’re roaming or making emergency calls, there is always been the safety blanket of 2G or 3G and Circuit Switched fallback to well, fall back to.

There’s no driver for operators who plan to (or are required to) operate a 2G or 3G network for the foreseeable future, to ensure a high level of VoLTE support in their devices.

For an operator today with 2G or 3G, Voice over LTE is still optional.
Many operators still rely exclusively on Circuit Switched Fallback, and there are only a handful of countries that have turned off 2G and 3G and rely solely on VoLTE.

VoLTE Handset Support

For the past 16 years phone manufacturers have been making LTE capable phones.

But that does not mean they’ve been making phones that support Voice over LTE.

But it’s never been an issue up until this point, as there’s always been a circuit switched (2G/3G) network to fall back to, so the fact that these chips may not support VoLTE was not a big problem.

Many of the cheaper chipsets that power phones simply don’t support VoLTE – These chips do support LTE for data connections but rely on Circuit Switched Fallback for voice calls. This is in part due to the increased complexity, but also because some of the technologies for VoLTE (like AMR) required intellectual property deals to licence to use, so would add to the component cost to manufacture, and in the chips game, keeping down component cost is critical.

Even for chips that do support Voice over LTE, it’s “special”. Unlike calling in 2G or 3G that worked the same for every operator, phone manufacturers require a “Carrier Bundle” for each operator, containing that specific operators’ special flavor of VoLTE, that operator uses in their network.

This is because while VoLTE is standardized (Despite some claims to the contrary) a lot of “optional” bits have existed, and different operators built networks with subtle differences in the “flavor” of their Voice over LTE (IMS) stack they used. The OEMs (Phone / Chip manufacturers) had to handle these changes in the devices they made, for in order to sell their phones through that operator.

This means I can have a phone from vendor X that works with VoLTE on Network Y, but does not support VoLTE on Network Z.

Worse still, knowing which phones are supported is a bit of a guessing game.

Most operators sell phones directly to their customer base, so buying an Network Y branded phone from Vendor X, you know it’s going to support Network Y’s VoLTE settings, but if you change carriers, who knows if it’ll still support it?

When you’ve still got a Circuit Switched network it’s not the end of the world, you’ll just use CSFB and probably not realize it, until operators go to shut down 2G / 3G networks…

IMS Profile selection on an engineering mode MTK based Android handset

Navigating the Maze of VoLTE Compatibility

Here are some simple checklist you can ask your elderly family members if they ask if their phone is VoLTE compatible:

Does the underlying chipset the phone is based on support VoLTE? (you can find this out by disassembling the phone and checking the datasheets for the components from the OEMs after signing NDAs for each)
Does the underlying chipset require a “carrier bundle” of settings to have been loaded for this operator in order to support VoLTE (See Qualcomm MBM as an example)?
What version of this list am I currently on (generally set in the factory) and does it support this operator? (You can check by decapping the ICs and dumping their NVRAM and then running it through a decompiler)
Does my phones OS (Android / iOS) require a “carrier bundle” of it’s own to enable VoLTE? Is my operator in the version of the database on the phone? (See Android’s Carrier Database for example) (You can find the answer by rooting the phone and running some privileged commands to poke around the internal file system)
Does my operator / MNO support VoLTE – Does my plan / package support VoLTE? (You can easily find the answer by visiting the store and asking questions that don’t appear on the script)

If you managed to answer yes to all of the above, congratulations! You have conditional VoLTE support on your phone, although you probably don’t have a working phone anymore.

Wait, conditional VoLTE support?

That’s right folks, VoLTE will work in some scenarios with your operator!

If you plan on traveling, well your phone may support VoLTE at home, but does the phone have VoLTE roaming enabled?
Many phones support VoLTE in the home network, but resort to CSFB when roaming.

If it does support VoLTE roaming, does the network you’re visiting support VoLTE roaming? Has the roaming agreement (IRA) between the operator you’re using while traveling and your home operator been updated to include VoLTE Roaming? These IRAs (AA.12 / AA.13 docs) also indicate if the network must turn off IPsec encryption for the VoLTE traffic when roaming, which is controlled by the phone anyway.

Phew, all this talk of VoLTE roaming while traveling scares me, I think I’ll stay home in the safety of the Australian bush with all these great friendly animals around a phone that supports VoLTE on my home network.

Ah – After spending some time in the Australian bush one of our many deadly animals bit me. Time to call for help! Wait, what about emergency calls over VoLTE? Again, many phones support VoLTE for normal calls, fall back to 2G or 3G for the emergency call, so if you have one of those phones (You’ll only find out if you try to make an emergency call and it fails) and try to make an emergency call in a country without 2G or 3G, you’d better find a payphone.

There’s many real world examples of this, our friends at OptimERA have been lobbying the FCC since 2019 on this.

Sarcasm aside, there’s no dataset or compatibility matrix here – No simple way to see if your phone will work for VoLTE on a given operator, even if the underlying chip does support VoLTE.

Operators in Australia which recently shut down their 3G network, were mandated to block devices that didn’t support VoLTE for emergency calling. They did this using an Equipment Identity Register, and blocking devices based on the Type Allocation Code, but this scattergun approach just blocked non-carrier issued devices, regardless of it they supported VoLTE or VoLTE emergency calling.

Blame Game

So who’s to blame here?

There’s no one group to blame here, the industry has created a shitty cycle here:

Standards orgs for having too many “flavors” available
Operators deploying their own “Flavors” of VoLTE then mandating OEMs / Chip manufacturers comply with their “flavor”.
OEMs / Chip manufactures respond by adding “Carrier Bundles” to account for this per-operator customization

I’ve got some ideas on a way to unscramble this egg, and it’s going to take a push from the industry.

If you’re in the industry and keen to push for a fix, get in touch!

It’s time to get a long term solution to this problem, and we as an industry need to lead the change.

Tales from the Trenches – IMS TCP Socket Handling

28/11/2024IMS / VoLTE, Mobile Networks, RFCs & Standards, VoIPIMS, LTE, SIP, VoIP, VoLTENick

Oh boy this has been a pain in the backside with IMS / VoLTE devices using TCP and how they handle the underlying TCP sockets.

A mobile phone from manufacturer A, wants every SIP dialog to be in it’s own TCP session, while a phone from manufacturer B wants a unique TCP session per transaction, while manufacturer C thinks that every SIP message should reuse the same transaction.

So an MT call to manufacturer A, who wants every SIP dialog in it’s own transaction would look something like this:

PCSCF:44738 -> UE:5060; TCP SYN
UE:5060 -> PCSCF:44738; TCP SYN/ACK
PCSCF:44738 -> UE:5060; TCP ACK
--- TCP connection is now open to UE from P-CSCF---
--- Start of new SIP Transaction 1 & Dialog ---
PCSCF:44738 -> UE:5060; TCP PSH - SIP INVITE....
UE:5060 -> PCSCF:44738; TCP ACK


UE:5060 -> PCSCF:44738; TCP PSH - SIP 200....
PCSCF:44738 -> UE:5060; TCP ACK, PSH - SIP ACK....
UE:5060 -> PCSCF:44738; TCP ACK
--- End of SIP Transaction 1 ---

--- Start of SIP Transaction 2 ---
PCSCF:44738 -> UE:5060; TCP PSH - SIP BYE....
UE:5060 -> PCSCF:44738; TCP ACK, PSH - SIP 200....
--- End of SIP Transaction 2 & SIP Dialog ---
PCSCF:44738 -> UE:5060; TCP FIN
UE:5060 -> PCSCF:44738; TCP ACK
--- End of TCP Connection ---

Where UE:5060 – is the IP & port of the UE, as advertised in the Contact: header, while PCSCF:44738 is the PCSCF IP and a random TCP port used for this connection.

But for manufacturer B, who wants a unique TCP session per transaction, they want it to look like this:

PCSCF:44738 -> UE:5060; TCP SYN
UE:5060 -> PCSCF:44738; TCP SYN/ACK
PCSCF:44738 -> UE:5060; TCP ACK
--- TCP connection is now open to UE from P-CSCF---
--- Start of new SIP Transaction 1 & Dialog ---
PCSCF:44738 -> UE:5060; TCP PSH - SIP INVITE....
UE:5060 -> PCSCF:44738; TCP ACK


UE:5060 -> PCSCF:44738; TCP PSH - SIP 200....
PCSCF:44738 -> UE:5060; TCP ACK, PSH - SIP ACK....
UE:5060 -> PCSCF:44738; TCP ACK
PCSCF:44738 -> UE:5060; TCP FIN
UE:5060 -> PCSCF:44738; TCP ACK
--- End of SIP Transaction 1 & TCP Session 1 ---

--- Start of TCP Session 2 ----
PCSCF:32627 -> UE:5060; TCP SYN
UE:5060 -> PCSCF:32627; TCP SYN/ACK
PCSCF:32627 -> UE:5060; TCP ACK
--- Start of SIP Transaction 2 ---
PCSCF:32627 -> UE:5060; TCP PSH - SIP BYE....
UE:5060 -> PCSCF:32627; TCP ACK, PSH - SIP 200....
--- End of SIP Transaction 2 & SIP Dialog ---
PCSCF:32627 -> UE:5060; TCP FIN
UE:5060 -> PCSCF:32627; TCP ACK
--- End of TCP Connection 2 ---

And then manufacturer C wants just the one TCP session to be used for everything, so they open the TCP connection when they register, and that’s all we use for everything.

Is there any logic to this? Nope, seems to be tied to the underlying chipset (Qualcomm vs Mediatek vs Unisoc) and the SIP stack used (Qualcomm, MTK, Unisoc, Samsung, Apple).

We’ve profiled devices into one of 3 behaviors, and then we tag them based on user agent as to what “persona” they demand from the network.

I can’t believe I’m still talking about VoLTE / IMS handset support and it’s almost 2025…. For context IMS was “standardized” 17 years ago.