Technology is constantly evolving, and new research papers are published every day.
But recently I was shocked to discover I'd missed a critical development in communications, one that upended Shannon's "A Mathematical Theory of Communication".
I’m talking of course, about the GENERATION X PLUS SP-11 PRO CELL ANTENNA.
I've been doing telecom work for a long time. While I mostly write here about Core & IMS, I'm a licenced rigger too – I've bolted a few things to towers and built my fair share of mobile coverage over the years – which is why I found this development so astounding.
With this, existing antennas can be extended: mobile phone antennas, walkie talkies and cordless phones can all benefit from the improvement of this small adhesive sticker, which is "Like having a four foot antenna on your phone".
So for the bargain price of $32.95 (or $2 on AliExpress) I secured myself this amazing technology and couldn't wait to quantify its performance.
Think of the applications – we could put these stickers on 6 ft panel antennas and they'd become 10 ft panels. This would have a huge effect on new site builds: lower wind loading, less need for tower strengthening, and more room for collocation on the towers thanks to the smaller equipment footprint.
Luckily I have access to some fancy test equipment to really understand exactly how revolutionary this is.
The packaging says it's like having a 4 foot antenna on your phone, so let's do some very simple calculations: let's assume the antenna in the phone is currently 10cm, and that with this sticker it will improve to 121cm (four feet).
Projected Gain (Post Sticker) – Formulas Used
According to some basic projections we should see ~21dB of gain by adding the sticker – that's a 146x increase in performance!
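If you want to check my working, here's the back-of-the-envelope maths in Python (naively treating gain as scaling with the antenna length ratio – the same generous assumption the packaging seems to make):

import math

current_length_cm = 10    # assumed existing antenna length in the phone
sticker_length_cm = 121   # "like having a four foot antenna"

projected_gain_db = 20 * math.log10(sticker_length_cm / current_length_cm)
linear_factor = 10 ** (projected_gain_db / 10)

print(f"Projected gain: {projected_gain_db:.1f} dB")   # ~21.7 dB
print(f"Linear increase: {linear_factor:.0f}x")         # ~146x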
Man am I excited to see this in action.
Fortunately I have access to some fun cellular test equipment, including the Viavi CellAdvisor, and an environmentally controlled lab (my kitchen bench).
I put up an 1800MHz (Band 3) LTE carrier in my office in the other room as a reference and placed the test equipment into the test jig (between the sink and the kettle).
We then took baseline readings from the omni shown in the pictures, to get a reading on the power levels before adding the sticker.
We are reading exactly -80dBm without the sticker in place, so we expertly put some masking tape on the omni (so we could peel it off) and applied the sticker antenna to the tape on the omni antenna.
At -80dBm before, adding the 21dB of gain should put us at around -59dBm. These Viavi units are solid, but I was fearful of potentially overloading the receive end with that much gain; after a long discussion we agreed that at these levels it was unlikely to blow the unit, so no in-line attenuation was used.
Okay, </sarcasm> I was genuinely a little surprised by what we found; there was some gain, as shown in the screenshot below.
Marker 1 was our reference reading without the sticker, while marker 2 was our reading with the sticker – that's a 1.12dB gain with the sticker in place. In linear terms that's a ~30% increase in signal strength.
Screenshot
So does this magic sticker work? Well, kinda, in as much as holding onto the omni changes its characteristics, as would wrapping a few turns of wire around it, putting it in the kettle or wrapping it in aluminum foil. Anything you do to an antenna to change it is going to cause minor changes in its characteristic behavior, and generally if you're getting better at one frequency, you get worse at another – so the small gain on Band 3 may also lead to a small loss on Band 1, or something similar.
So what to make of all this? Maybe this difference is an artifact from moving the unit to make a cup of tea, the tape we applied or just a jump in the LTE carrier, or maybe the performance of this sticker is amazing after all…
Recently we were on a project and our RAN guy was seeing UEs hand between one layer and another over and over. The hysteresis and handover parameters seemed correct, but we needed a way to see what was going on, what the eNB was actually advertising and what the UE was sending back.
In a past life I had access to expensive complicated dedicated tooling that could view this information transmitted by the eNB, but now, all I need is a cellphone or a modem with a Qualcomm chip.
Namespace(k='11111111111111111111111111111111', op='22222222222222222222222222222222', opc=None)
Generating OPc key from OP & K
Generating Multimedia Authentication Vector
Input K: b'11111111111111111111111111111111'
Input OPc: b'2f3466bd1bea1ac9a8e1ab05f6f43245'
Input AMF: b'\x80\x00'
Of course, being open source, you can grab the functions out of this and make a little script to convert everything in a CSV or whatever format your key data is in.
So what about OPc to OP? Well, this is a one-way transaction – we can't get the OP key back from an OPc & Ki.
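If you're curious what the derivation actually looks like under the hood, here's a minimal sketch of the OPc calculation (OPc = AES-128_K(OP) XOR OP, per TS 35.206) using pycryptodome – the inputs are just the dummy values from above, not real key material:

from Crypto.Cipher import AES

def derive_opc(k_hex: str, op_hex: str) -> str:
    k = bytes.fromhex(k_hex)
    op = bytes.fromhex(op_hex)
    ciphered = AES.new(k, AES.MODE_ECB).encrypt(op)    # E_K(OP)
    opc = bytes(a ^ b for a, b in zip(ciphered, op))   # XOR the result with OP
    return opc.hex()

print(derive_opc('11111111111111111111111111111111',
                 '22222222222222222222222222222222'))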
I’ve written about Milenage and SIM based security in the past on this blog, and the component that prevents replay attacks in cellular network authentication is the Sequence Number (Aka SQN) stored on the SIM.
Think of the SQN as an incrementing odometer of authentication vectors. Odometers can go forward, but never backwards. So if a challenge comes in with an SQN behind the odometer (a lower number), it’s no good.
Why the SQN is important for Milenage Security
Every time the SIM authenticates it ticks up the SQN value, and when authenticating it checks the challenge from the network doesn’t have an SQN that’s behind (lower than) the SQN on the SIM.
Let’s take a practical example of this:
The HSS in the network has the SQN for the SIM as 8232, and generates an authentication challenge vector for the SIM which includes the SQN of 8232. The SIM receives this challenge, and makes sure that the SQN stored in the SIM is equal to or less than 8232. If the authentication passes, the new SQN stored in the SIM is 8232 + 1, as that's the next valid SQN we'd be expecting, and the HSS increments the counters it has in the same way.
Constantly increasing the SQN and never allowing it to go backwards means that even if we pre-generated a valid authentication vector for the SIM, it'd only be valid for as long as that SQN hasn't already been passed on the SIM by another authentication request.
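As a toy illustration of that odometer logic (just the check described above – not the real MILENAGE / resync procedure):

class Sim:
    def __init__(self, sqn=8232):
        self.sqn = sqn

    def check_challenge(self, challenge_sqn):
        if challenge_sqn < self.sqn:
            return "Rejected - SQN is behind the odometer (possible replay)"
        self.sqn = challenge_sqn + 1   # tick the odometer forward
        return "OK"

sim = Sim()
print(sim.check_challenge(8232))   # OK - SIM SQN is now 8233
print(sim.check_challenge(8000))   # Rejected - behind the odometer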
Imagine for example that I get sneaky access to an operator’s HSS/AuC, I could get it to generate a stack of authentication challenges that I could use for my nefarious moustache-twirling purposes whenever I wanted.
This attack would work, but this all comes crumbling down if the SIM was to attach to the real network after I’ve generated my stack of authentication challenges.
If the SQN on the SIM passes where it was when the vectors were generated, those vectors would become unusable.
It's worth pointing out that it's not just evil purposes that lead your SQN to get out of sync; this happens when you've got subscriber data split across multiple HSSes, for example. There's a mechanism to securely catch the HSS's SQN counter up with the SQN counter in the SIM, without exposing any secrets, but it only ever ticks the HSS's SQN up – it never rolls back the SQN in the SIM.
The Flaw – Draining the Pool
The Authentication Information Request is used by a cellular network to authenticate a subscriber, and the Authentication Information Answer is sent back by the HSS containing the challenges (vectors).
When we send this request, we can specify how many authentication challenges (vectors) we want the HSS to generate for us, so how many vectors can you generate?
TS 129 272 says the Number-of-Requested-Vectors AVP is an Unsigned32, which gives us a possible pool of 4,294,967,295 combinations. This means it would be legal / valid to send an Authentication Information Request asking for 4.2 billion vectors.
It's worth noting that even that won't give us the whole pool.
Sequence numbers (SQN) shall have a length of 48 bits.
TS 133 102
The SQN in the SIM is 48 bits, which gives us a maximum of 281,474,976,710,656 values before we "tick over" the odometer.
If we were to send 65,536 Authentication-Information-Requests asking for 4,294,967,295 vectors apiece, we'd have enough vectors to serve the sub for life.
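For anyone wanting to sanity check those numbers, the back-of-the-envelope maths looks like this:

number_of_requested_vectors_max = 2**32 - 1   # Unsigned32 max - 4,294,967,295
sqn_space = 2**48                             # 48 bit SQN - 281,474,976,710,656 values

print(f"{number_of_requested_vectors_max:,} vectors per request")
print(f"{sqn_space:,} possible SQN values")
print(f"{sqn_space // number_of_requested_vectors_max:,} requests to cover the whole SQN space")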
The standard doesn't put any practical limit on the number of vectors that can be requested, which would allow us to "drain the pool" from an HSS – capturing every combination of SQN – to provide a high degree of certainty that a vector we hold is far enough ahead of the SIM's current SQN that the SIM does not reject the challenge.
Can we do this?
Our lab has access to HSSes from several major vendors.
Out of the gate, the Oracle HSS does not allow more than 32 vectors to be requested at the same time, so props to them, but the same is not true of the others, all from major HSS vendors (I won’t name them publicly here).
The other three HSSes we tried, all from big vendors, eventually timed out when asked for 4.2 billion vectors (can't imagine why that would be *shrug*), but the request didn't get rejected.
This is a lab, so monitoring isn't great, but I did see a CPU spike on at least one of the HSSes, which suggests it may actually have been trying to generate them.
Of course, we’ve got PyHSS, the greatest open source HSS out there, and how did this handle the request?
Well, being standards compliant, it did what it was asked – I'll admit I tested with 1024 vectors, and on my little laptop it did take a while. But lo, it worked, spewing forth 1024 vectors to use.
So with that working, I tried with 4,294,967,295…
And I waited. And waited.
And after pegging my CPU for a good while, I had to get back to real life work, and killed the request on the HSS.
In part that's because PyHSS writes back to a database each time the SQN is incremented, which is costly in terms of resources, but also because generating Milenage vectors for LTE involves some pretty heavy cryptographic lifting.
The Risk
Dumping a complete set of vectors with every possible SQN would allow an attacker to spoof base stations, and the subscriber would attach without issue.
Historically this has been very difficult to do for LTE, due to the mutual network authentication, however this would be bypassed in this scenario.
The UE would try for a resync if the SQN is too far forward, which mitigates this somewhat.
Cryptographically, I don’t know enough about the Milenage auth to know if a complete set of possible vectors would widen the attack surface to try and learn something about the keys.
Mitigations / Protections
So how can operators protect ourselves against this kind of attack?
Different commercial HSS vendors handle this differently – Oracle limits this to 32 vectors, and that's what I've updated PyHSS to do, but another big HSS vendor (who I won't publicly shame) accepts the full 4,294,967,295 vectors, and it crashes that thread, or at least times it out after a period.
If you’ve got a decent Diameter Routing Agent in place you can set your DRA to check to see if someone is using this exploit against your network, and to rewrite the number of requested vectors to a lower number, alert you, or drop the request entirely.
Having common OP keys is dumb. I advocate that all our operator customers use OP keys that are unique to each SIM, and use the derived OPc key anyway. This means if one SIM spills its keys, the blast radius doesn't extend beyond that card.
In the long term, it’d be good to see 3GPP limit the practical size of the Number-of-Requested-Vectors AVP.
2G/3G Impact
Full disclosure – I don’t really work with 2G/3G stacks much these days, and have not tested this.
MAP is generally pretty bandwidth constrained, and transferring ~281 trillion vectors might raise some eyebrows, burn out some STPs and take a very long time…
But our “Send Authentication Info” message functions much the same as the Authentication Information Request in Diameter, 3GPP TS 29.002 shows we can set the number of vectors we want:
5GC Impact
This only impacts LTE and 5G NSA subscribers.
TS 29.509 outlines the schema for the Nausf reference point, used for requesting vectors, and there is no option to request multiple vectors.
Summary
If you’ve got baddies with access to your HSS / HLR, you’ve got some problems.
But, with enough time, your pool could get drained for one subscriber at a time.
This isn’t going to get the master OP Key or plaintext Ki values, but this could potentially weaken the Milenage security of your system.
One of the new features of 5GC is the introduction of Service Based Interfaces (SBI) which is part of 5GC’s Service Based Architecture (SBA).
Let’s start with the description from the specs:
3GPP TS 23.501 [3] defines the 5G System Architecture as a Service Based Architecture, i.e. a system architecture in which the system functionality is achieved by a set of NFs providing services to other authorized NFs to access their services.
3GPP TS 29.500 – 4.1 NF Services
For that we have two key concepts: service discovery and service consumption.
Service Consumers / Producers
Those are some nice words, but let's break down what this actually means. For starters, let's talk about services.
In previous generations of core network we had interfaces instead of services. An interface was the reference point between two network elements, describing how the two would talk – the interface defined the protocol the two elements used to communicate.
For example, in EPC / LTE S6a is the interface between the MME and the HSS, S5 is the interface between the S-GW and P-GW. You could lookup the 3GPP spec for each interface to understand exactly how it works, or decode it in Wireshark to see it in action.
5GC moves from interfaces to services. Interfaces are strictly between two network elements, the S6a interface is only used between the MME and the HSS, while a service is designed to be reusable.
This means the Service Based Interface N5g-eir can be used by the AMF, but it could equally be used by anyone else who wants access to that information.
3GPP defines each service in terms of a service producer (the EIR produces the N5g-eir service) and a service consumer (the client connecting to the N5g-eir service), but doesn't restrict which network elements can consume a given service.
This gets away from the soup of interfaces, and instead just defines the services being offered, rather than locking a service to the two network elements on either end of an interface.
"Service consumers" (which can be thought of as clients in a client/server model) can discover "service producers" (like servers in a client/server model).
Our AMF, for example, acts as a "service consumer", consuming services from the UDM/UDR and SMF.
Service Discovery – Automated Discovery of NF Services
Service-Based Architecture enables 5G Core Network Function service discovery.
In simple terms, this means rather than your MME being told about your SGW, the nodes all talk to a "Network Repository Function" that returns a list of available nodes.
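As a rough sketch of what that discovery looks like on the wire, the NRF's Nnrf_NFDiscovery service (TS 29.510) is just a REST API the service consumer queries – the NRF address below is made up for illustration:

import requests

NRF = "http://nrf.5gc.mnc001.mcc001.3gppnetwork.org"

# e.g. an AMF (service consumer) asking the NRF for available UDMs (service producers)
response = requests.get(
    f"{NRF}/nnrf-disc/v1/nf-instances",
    params={"target-nf-type": "UDM", "requester-nf-type": "AMF"},
)
for nf in response.json().get("nfInstances", []):
    print(nf.get("nfInstanceId"), nf.get("ipv4Addresses"))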
Mobility management and connection management in 5GC are handled through two processes: Connection Management (CM) and Registration Management (RM).
Registration Management (RM)
The Registration Management state (RM) of a UE can either be RM-Registered or RM-Deregistered. This is akin to the EMM state used in LTE.
RM-Deregistered Mode
From the Core Network’s perspective (Our AMF) a UE that is in RM-Deregistered state has no valid location information in the AMF for that UE. The AMF can’t page it, it doesn’t know where the UE is or if it’s even turned on.
From the UE’s perspective, being in RM-Deregistered state could mean one of a few things:
UE is in an area without coverage
UE is turned off
SIM Card in the UE is not permitted to access the network
In short, RM-Deregistered means the UE cannot be reached, and cannot get any services.
RM-Registered Mode
From the Core Network's perspective (the AMF) a UE in RM-Registered state has successfully registered onto the network.
The UE can perform tracking area updates, periodic registration updates and registration updates.
There is a location stored in the AMF for the UE (The AMF knows at least down to a Tracking Area Code/List level where the UE is).
The UE can request services.
Connection Management (CM)
Connection Management (CM) focuses on the NAS signaling connection between the UE and the AMF.
To have a Connection Management state, the Registration Management procedure must have successfully completed (the UE being in RM-Registered) state.
A UE in CM-Connected state has an active signaling connection on the N1 interface between the UE and the AMF.
CM-Idle Mode
In CM-Idle mode the UE has no active NAS connection to the AMF.
UEs typically enter this state when they have no data to send / receive for a period of time; this conserves battery on the UE and saves network resources.
If the UE wants to send some data, it performs a Service Request procedure to bring itself back into CM-Connected mode.
If the network wants to send some data to the UE, the AMF sends a paging request for the UE, and upon hearing its identifier (5G-S-TMSI) on the paging channel, the UE performs the Service Request procedure to bring itself back into CM-Connected mode.
CM-Connected Mode
In CM-Connected mode the UE has an active NAS connection with the AMF over the N1 interface from the UE to the AMF.
When the access network (The gNodeB) determines this state should change (typically based on the UE being idle for longer than a set period of time) the gNodeB releases the connection and the UE transitions to CM-Idle Mode.
So let's roll up our sleeves and get a lab scenario happening.
To keep things (relatively) simple, I’ve put the eNodeB on the same subnet as the MME and Serving/Packet-Gateway.
So the traffic will flow from the eNodeB to the S/P-GW, via a simple Network Switch (I’m using a Mikrotik).
While life is complicated, I’ll try and keep this lab easy.
Experiment 1: MTU of 1500 everywhere
Network Element          MTU
Advertised MTU in PCO    1500
eNodeB                   1500
Switch                   1500
Core Network (S/P-GW)    1500
So everything attaches and traffic flows fine. There is no problem right?
Well, not a problem that is immediately visible.
While the PCO advertises an MTU of 1500, if we look at the maximum payload we can actually get through the network, we find that's not the case.
This means if our end user on a mobile device tried to send a 1500 byte payload, it'd never get through.
DNS would work and most TCP traffic would flow fine, but certain UDP applications would start to fail if they were sending payloads nearing 1500 bytes.
So why is this?
Well GTP adds overhead.
8 bytes for the GTP header
8 bytes for the transport UDP header
20 bytes for the transport IPv4 header
14 bytes if our transport is using Ethernet
For a total of 50 bytes of overhead, assuming we’re not using MPLS, QinQ or anything else funky on our transport network and just using Ethernet.
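To put numbers on that, here's the quick arithmetic (ignoring GTP extension headers and assuming plain IPv4 / Ethernet transport as above):

# Overhead added to each user packet when carried over GTP-U
GTP_HEADER = 8
UDP_HEADER = 8
IPV4_HEADER = 20
ETHERNET_HEADER = 14

inner_mtu = 1500   # what we advertise to the UE in the PCO

transport_ip_mtu_needed = inner_mtu + GTP_HEADER + UDP_HEADER + IPV4_HEADER
frame_size_needed = transport_ip_mtu_needed + ETHERNET_HEADER

print(f"Transport IP MTU needed: {transport_ip_mtu_needed}")   # 1536 - more than 1500
print(f"On-the-wire frame size:  {frame_size_needed}")         # 1550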
So we have two options here – We can either lower the MTU advertised in our Protocol Configuration Options, or we can increase the MTU across our transport network. Let’s look at each.
Experiment 2: Lower Advertised MTU in PCO to 1300
Well this works, and looks the same as our previous example, except now we know we can’t handle payloads larger than 1300 without fragmentation.
Experiment 3: Increase MTU across transmission Network
We only need to account for the 50 bytes of overhead added by GTP and the transport, but I've gone the safer option and upped the MTU across the transport to 1600 bytes.
With this, we can transport a full 1500 byte MTU on the UE layer, and we’ve got the extra space by enabling jumbo frames.
Obviously this requires a change across all of the transmission layer – and if you have any hops without support for this, you'll lose packets.
Conclusions?
Well, fragmentation is bad, and we want to avoid it.
For this we up the MTU across the transmission network to support jumbo frames (greater than 1500 bytes) so we can handle the 1500 byte payloads that users want.
I generally do this with Python or via the Swagger UI for the Web UI, but here’s how we can create a fixed-line IMS subscriber in PyHSS, so we can register it with a softphone, without using EAP-AKA.
Firstly we create the AuC object for this password combo.
If you're working with the larger SIM vendors, there's a good chance the key material they send you won't actually contain the raw Ki values for each card – if it fell into the wrong hands you'd be in big trouble.
Instead, what is more likely is that the SIM vendor shares the Ki ciphered with a transport key – so what you receive is not the plaintext version of the Ki data, but rather a ciphered version of it.
But as long as you and the SIM vendor have agreed beforehand on the ciphering to use, and the secret to protect it with, you can read the data as needed.
This is a tricky topic to broach, as transport key implementation is not covered by 3GPP; instead it's a quasi-standard commonly used by SIM vendors and HSS vendors alike – the A4 / K4 Transport Encryption Algorithm.
It’s made up of a few components:
K2 is our plaintext key data (Ki or OP)
K4 is the secret key used to cipher the Ki value.
K7 is the algorithm used (Usually AES128 or AES256).
It's important when defining your electrical profile and the required parameters to make sure the operator, HSS vendor and SIM vendor are all on the same page regarding whether transport keys will be used, which cipher will be used, and the keys for each batch of SIMs.
Here’s an example from a Huawei HSS with SIMs from G&D:
We’re using AES128, and any SIMs produced by G&D for this batch will use that transport key (transport key ID 1 in the HSS), so when adding new SIMs we’ll need to specify what transport key to use.
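As a rough sketch of what deciphering one of these vendor output files involves (assuming AES-128 in ECB mode – confirm the exact cipher and mode agreed with your SIM vendor; the values below are placeholders, not real key material):

from Crypto.Cipher import AES

# K4 transport key agreed with the SIM vendor (placeholder value only)
k4 = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
# Ciphered Ki as it appears in the vendor's output file (placeholder value only)
ciphered_ki = bytes.fromhex("00112233445566778899aabbccddeeff")

ki = AES.new(k4, AES.MODE_ECB).decrypt(ciphered_ki)
print(ki.hex())   # the plaintext Ki to load into the HSS / AuC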
In our last post we covered the basics of NB-IoT Non-IP Data Deliver (NIDD), and if that acronym soup wasn’t enough for you, we’re going to take a deep dive into the flows for attaching, sending, receiving and closing a NIDD session.
The attach for NIDD is very similar to the standard attach for wideband LTE, except the MME establishes a connection on the T6a Diameter interface toward the SCEF, to indicate the sub is online and available.
The NIDD Attach
The SCEF is now able to send/receive NIDD traffic for the subscriber on the T6a interface, but in reality developers don't / won't interact with Diameter, so the SCEF exposes the T8 API – an abstraction layer developers can interact with to reach the SCEF, and through it, the UE.
If you’re wondering what the status of Open Source SCEF implementations are, then you may have already guessed we’re working on one! PyHSS should have support for NB-IoT SCEF features in the future.
NB-IoT provides support for Non-IP Data Delivery (NIDD) over 3GPP networks, but to handle this, some new network elements are introduced; in a home network scenario that's the SCEF and the SCS/AS.
On the 3GPP side, the SCEF communicates with the MME via the T6a interface, which is based upon Diameter.
On the side towards our IoT service consumers (referred to in the standards as the "SCS/AS" or "Service Capability Server / Application Server" – catchy names as always), it communicates via the RESTful HTTP based T8 interface.
The start of the S1 Attach procedure is very similar to a regular S1 attach.
The initial S1 PDU Connectivity Request indicates in the ESM Message Container that the PDN Type is Non IP.
S1 PDU Connectivity Request from attach procedure
Other than that, the initial attach procedure looks very similar to the regular S1 attach procedure.
On the S6a interface the Update Location Request from the MME to the HSS indicates that this is an EUTRAN-NB-IoT Radio Access Type.
And the Update Location Answer APN Configuration contains some additional AVPs on the APN to indicate that the APN supports Non-IP-PDN-Type and that the SCEF is used for Data Delivery.
The SCEF-ID (Diameter Host) and SCEF-Realm (Diameter realm) to serve this user is also specified in the APN Configuration in the Update Location Answer.
This is how our MME determines where to send the T6a traffic.
With this, the MME sends a Connection Management Request (CMR) towards the SCEF specified in the SCEF-ID returned by the HSS.
The Connection Management Request / Response
The MME now sends a Diameter T6a Connection Management Request to the SCEF specified in the Update Location Answer.
In it we have a Session-Id, which continues for the life of our NIDD session, the service-selection which contains our APN (In our case “non-ip”) and the User-Identifier AVP which contains the MSISDN and/or IMSI of the subscriber.
To accept this, the SCEF sends back a Connection-Management-Answer to confirm we’re all good to go:
At this point our SCEF now knows about the subscriber who’s just attached to our network, and correlates it with the APN and the session-ID.
On the S1 side the connection is confirmed and we’re ready to roll.
Mobile Originated Data Request / Response
When the UE wants to send NIDD it’s carried in NAS messaging, so we see an Uplink NAS transport from the UE and inside the NAS payload itself is our HEX data.
Our MME grabs this out and sends it in the form of a Mobile-Originated-Data-Request (MODR) to the SCEF, along with the same Session-ID that was set up earlier:
At this stage our Non-IP Data is exposed over the T8 RESTful API, which we won’t cover in this post.
Note: I’m lazily posting this as its been in my drafts folder for an exceedingly long time – Before going too much further, it’s worth pointing out that eMBMS never really made it anywhere – no production networks of note use eMBMS. I started researching it and my interest petered out once I discovered I couldn’t get any UEs or hardware that supported eMBMS.
Mobile networks are designed as point to point – all traffic is unicast.
But multicast and broadcast traffic is real, and becoming more common in some applications.
In areas where users stream the same radio program, or TV show, live, each of them is consuming the same data stream, but each one gets sent a unique copy of the data, on a resource block allocated to them for reception of the data.
If we have 10 users on a cell, each streaming a 5Mbps live video, that’s 50Mbps of capacity taken up on the radio / air interface. If that stream was moved onto a eMBMS service, only 5Mbps of capacity would be used, regardless of how many people on the cell are consuming it.
For Mission Critical Push to Talk applications, the lack of broadcast/multicast support was highlighted again. For a PTT app with 10 users in a talk group, you'd need to schedule resource blocks for 10 users, allocate radio resources 10 times and send GTP packets 10 times, all to send the same data to 10 people.
So enter eMBMS – The Evolved Multimedia Broadcast and Multicast Service, providing multicast service for LTE.
Overall Architecture
eMBMS introduces a few changes to the RAN side to handle support for a shared data channel, which is sent by the eNodeB and that UEs can listen on to get data. (More on admission control later)
From a core perspective two new network elements are introduced, the Broadcast/Multicast Service Centre (BM-SC) and the Multimedia Broadcast Multicast Services Gateway (MBMS-GW); these elements function in much the same way as the P-GW and S-GW respectively, but for multicast services.
Like so many 3GPP specs before it, MBMS relies on GTP for transporting the data to be distributed, and relies on GTPv2-C for control plane data.
BM-SC – Broadcast/Multicast Service Centre
The Broadcast Multicast Service Centre acts as the gateway between content providers (providing streams of data to be distributed) and the EPC.
The BM-SC sets up eMBMS sessions and pulls broadcast data from the content providers and collects receipts from subscribers of some streams to charge / track consumption of the services.
In this regard the BM-SC is akin to the P-GW, which acts as the border between the EPC and external networks, except it's largely unidirectional.
MBMS Gateway
The MBMS Gateway (MBMS-GW) takes the broadcast data stream from the BM-SC and encapsulates it into GTP packets to be distributed to eNBs across the network.
The MBMS-GW allocates a multicast transport address for each broadcast data stream.
MME Interaction
For this a new interface is introduced on the MME – the Sm interface, which interconnects the MME and the MBMS-Gateways assigned to it.
This post follows on from Part 1 and Part 2 of this 3 part series.
We are forced to move to 5G-SA
Claim: We must use 5G-SA with this spectrum (It’s a condition of the license)
I’ll concede that if it is a requirement for a license or funding, that 5G-SA be used, then that’s a pretty ironclad reason to introduce 5G-SA.
Claim: Users will Leave if you don’t have 5G-SA
We could argue the opposite effect will happen; Shifting to SA will reduce your user base. Here’s why:
Users experiencing 5G-NSA (Non-Standalone) today, are already getting the speed boost from “5G”.
From a user perspective, while 5G-NSA support has been becoming common on mid-to-high priced handsets, handsets supporting 5G-SA are far less common.
Dish's Project Genesis is one of the only examples of a 5G SA network deployed on a large scale. It launched with only a single supported phone (a Motorola branded handset) and today the supported phone list is very short, limited to expensive flagships. This lack of handset support means users must purchase a handset through Dish rather than being able to bring their own phones, as the only way compatibility can be guaranteed is by controlling the whole ecosystem.
Unless you are in a highly developed market with 2G and 3G turned off, where the majority of your user base has recent generation flagship phones capable of supporting these features, you’re shrinking your addressable market with 5G-SA, rather than expanding it.
Conclusion – 5G-SA doesn’t stack up, what do I do?
SA doesn’t make sense for a lot of operators and markets – for now. I’m sure this post will look pretty dated in a few years time as many of these factors change and as operators sunset 2G and 3G networks.
I’m not advocating for 5G-SA never, I’m advocating not 5G-SA today.
There are simply better options out there for spending that operations budget to make network improvements.
Off the bat, some ideas to explore:
Optimize your existing network.
Roll out NSA to an even larger area.
Shutdown 2G/3G layers.
Simplify your operations.
Cut down the number of vendors and moving parts.
Simplify again.
Automate.
Simplify more.
Doing this will mean you can enjoy cost savings from reduced headcount thanks to a simpler network. Simpler networks have better up-time, thanks to operating a network that's less frankensteiny – less cobbled together from disparate legacy parts. You'll also enjoy reduced opex from all the systems you've shut down, and cheaper roaming from all the bilaterals you've moved to VoLTE.
All of these tasks will keep project teams busy for years and put the MNO in a stronger position moving forward, without getting distracted by slick marketing and shiny brochures.
PyHSS is our open source Home Subscriber Server; it's written in Python, has a variety of different backends, is highly performant (we benchmark to 10K transactions per second) and infinitely scalable.
In this post I'll cover the basics of setting up PyHSS in your environment and getting some Diameter peers connected.
For starters, we'll need a database (we'll use MySQL for this demo) and a MySQL user account on that database for PyHSS to use.
So let’s get that rolling (I’m using Ubuntu 24.04):
sudo apt update
sudo apt install mysql-server
Next we’ll create the MySQL user for PyHSS to use:
CREATE USER 'pyhss_user'@'%' IDENTIFIED BY 'pyhss_password';
GRANT ALL PRIVILEGES ON *.* TO 'pyhss_user'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
We'll also need Redis (PyHSS uses Redis for inter-service communications and for caching), so go ahead and install that for your distro:
sudo apt install redis-server
So that’s our prerequisites sorted, let’s clone the PyHSS repo:
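(At the time of writing the repo lives on GitHub at github.com/nickvsnetworking/pyhss.)

git clone https://github.com/nickvsnetworking/pyhss.git
cd pyhss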
And install the requirements with pip from the PyHSS repo:
pip3 install -r requirements.txt
Next we’ll need to configure PyHSS, for that we update the config file (config.yaml) with the settings we want to use.
We’ll start by setting the bind_ip to a list of IPs you want to listen on, and your transport – We can use either TCP or SCTP.
For Diameter, we will set OriginHost and OriginRealm to match the Diameter hostname you want to use for this peer, and the Realm of your Diameter network.
Lastly we'll need to set the database parameters, updating the database: section to populate your credentials, setting the username, password and database to match the SQL installation we set up at the start.
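For reference, the relevant sections of config.yaml end up looking something like the trimmed sketch below – key names and nesting may differ between PyHSS versions, so check the example config shipped in the repo for the authoritative structure:

hss:
  transport: "TCP"
  bind_ip: ["10.0.1.5"]
  bind_port: 3868
  OriginHost: "hss01.epc.mnc001.mcc001.3gppnetwork.org"
  OriginRealm: "epc.mnc001.mcc001.3gppnetwork.org"

database:
  db_type: mysql
  server: localhost
  username: pyhss_user
  password: pyhss_password
  database: hss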
With that done, we can start PyHSS, which we do using systemctl.
Because there are multiple microservices that make up PyHSS, there are multiple systemd unit files used to run PyHSS as a service – they're all in the /systemd folder.
The Equipment Identity Register (EIR) is a pretty handy function in 3GPP networks.
Via the Diameter based S13 interface, the MME is able to query the EIR to ask if a given IMEI & IMSI combination should be allowed to attach.
This allows stolen / grey market / unauthorized devices (IMEIs) to be rejected from the network – the EIR can have a list of "bad" IMEIs, and if one of those is seen, the request is rejected.
It also allows us to lock a SIM (IMSI) to a given device (IMEI) or type of device – We can use this for say a Fixed Wireless service, to lock the SIMs (IMSIs) to a range of modems (IMEI Prefixes).
Lastly it gives us insight and analytics into the devices used on the network, by mapping the IMEI to a device, we can say that IMEI 1234567890 is an Apple iPhone 12 Pro Max, or a Nokia Fastmile 5G-24W-A.
PyHSS supports all these capabilities, so let’s have a look at how we’d manage / access them.
Setting up EIR Rules
These rules are set via the RESTful API in PyHSS.
The Equipment Identity Register built into PyHSS supports matching in one of two modes, set by regex_mode.
In Exact Mode (regex_mode: 0) matches are based on an exact matching IMEI, and matching the IMSI if set (If IMSI is set to nothing (”), then only the IMEI is evaluated).
Exact Mode is suited for IMEI/IMSI locking, to ensure a SIM is locked to a particular device, or to blacklist stolen devices.
Regex Mode (regex_mode: 1) matches based on Regex, this is suited for whitelisting IMEI prefixes for say, specific validated vendors.
The match_response_code maps to the Equipment-Status AVP output, so the specified values are:
0 : ‘Whitelist’
1: ‘Blacklist’
2: ‘Greylist’
Some end to end examples of this provisioned into the API:
If the IMEI starts with 777 and the IMSI is 1234123412341234 then return 2 (Greylist).
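As a rough example of provisioning that rule over the RESTful API (the endpoint path and port here are assumptions – check the Swagger UI on your PyHSS instance for the exact schema):

import requests

eir_rule = {
    "imei": "^777.*",                # Regex matching IMEIs starting with 777
    "imsi": "1234123412341234",
    "regex_mode": 1,                 # Regex Mode
    "match_response_code": 2,        # 2 = Greylist
}
print(requests.put("http://pyhss-api:8080/eir/", json=eir_rule).json())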
No Match Behaviour
If there is no match from the backend, then the config parameter no_match_response dictates the response code returned (Blacklist/Whitelist/Greylist).
Mapping Type Allocation Codes (TACs) to IMEIs
There are several data feeds of the Type Allocation Codes (TACs) which map a given IMEI prefix to a model number.
TAC database extract
Unfortunately, this data is not freely available, so we can’t bundle it with PyHSS, but if you have the IMEI Database, you can load it into PyHSS using Redis, to allow us to report on this data.
In your config.yaml you’ll just need to set the tac_database parameter, which will read the data on startup.
PyHSS YAML Config extract
Triggering on SIM Swap
If we keep track of the current IMSI/IMEI combination used for each SIM/Device, we can get notified every time it changes.
You might want to use this to trigger OTA provisioning or clear old data in your IMS.
For that we can use the sim_swap_notify_webhook in the config to send an HTTP POST to a given endpoint, informing it that a SIM is now in a different device.
We also have to have imsi_imei_logging set to true in the config in order to log the history.
Reporting on IMEIs
We can also log/capture historical data about IMSI/IMEI combinations.
We use this from a customer support perspective to be able to see if a customer has recently changed phones, so if they call support, our staff can ask the customer about it to help troubleshoot.
“I can see you were connected previously on a Samsung Galaxy S22, but now you’re using a Nokia 3310, did the issues happen before you moved phones?”
This is super handy.
We can get a general log of IMSI vs IMEI like this:
Feed of IMSI vs IMEI along with a timestamp and the response that was sent back
But what’s more useful is searching for a IMSI or an IMEI and then getting back a full list of devices / SIMs that have been used.
Searching for an IMSI I can see it’s only ever been used in this Samsung Galaxy
Lastly via Grafana we export all this data, which allows us to visualize this data and build dashboards showing the devices on the network.
Visualizing EIR Data in Grafana
PyHSS includes a Prometheus exporter; the prom_eir_devices_total metric lists each Type Allocation Code / UE seen in the network, along with the count of each.
Raw it looks like this:
But visualized in Grafana we can get a dashboard to give us a breakdown per vendor:
Hello Nick, thank you for the article. What is the use of the OPc key to be derived from OP key? Why can't it just be a random key like Ki?
It’s a super good question, and something I see a lot of operators get “wrong” from a security best practices perspective.
Refresher on OP vs OPc Keys
The “OP Key” is the “operator” key, and was (historically) common for an operator.
This meant all SIMs in the network had a common OP Key, and each SIM had a unique Ki/K key.
The SIM knew both, and the HSS only needed to know what the Ki was for the SIM, as they shared a common OP Key (Generally you associate an index which translates to the OP Key for that batch of SIMs but you get the idea).
But having common key material is probably not the best idea – I'm sure there was some reason why using a common key across all the SIMs seemed like a good option, and the K / Ki key has always been unique, so there was one unique key per SIM, but previously, OP was common.
Over time, the issues with this became clear, so the OPc key was introduced. OPc is derived from mushing the K & OP key together. This means we don’t need to expose / store the original OP key in the SIM or the HSS just the derived OPc key output.
This adds additional security: if the Ki for a SIM were to be exposed along with the OP for that operator, that's half the entropy lost. Whereas by storing the Ki and OPc, you limit the blast radius if, say, a single SIM's data were exposed, to only the data for that particular SIM.
This is how most operators achieve this today; there is still a common OP Key, locked away in a vault alongside the recipe for Coca-cola and the moon landing set.
But this OP Key is no longer written to the SIMs or stored in the HSS.
Instead, during the personalization process (the bit in manufacturing where the unique data – the IMSI & keys – gets written to each SIM), a derived OPc key is written to the card itself, and to the output files the operator then loads into their HSS/HLR/AuC.
This is not my preferred method for handling key material however; today we get our SIM manufacturers to randomize the OP key for every card and then derive an OPc from that.
This means we have two unique keys for each SIM, and even if the Ki and OP were to become exposed for a SIM, there is nothing common between that SIM, and the other SIMs in the network.
Do we want our Ki to leak? No. Do we want an OP Key to leak? No. But if we’ve got unique keys for everything we minimize the blast radius if something were to happen – Just minimizes the risk.
S8 Home Routing is a really simple concept, the traffic goes from the SGW in the visited PLMN to the PGW in the home PLMN, so the PCRF, OCS/OFCS, IMS, IP Addresses, etc, etc, are all in the home network, and this avoids huge amounts of complexity.
But in order for this to work, the visited network MME needs to find the PGW of the home network, and with over 700 roaming networks in commercial use, each one with potentially hundreds of unique APNs each routing to a different PGW, this is a tricky proposition.
If you've configured your PGW peers statically on your MME, that's fine, but it doesn't scale very well – and if you add an MVNO who wants their own PGW for serving their APN, well, you'll be adding some complexity there too. So what to do?
Well, the answer is DNS.
By taking the APN to be served, the home PLMN and the interface type desired, with some funky DNS queries, our MME can determine which PGW should be selected for a request.
Let’s take a look, for a UE from MNC XXX MCC YYY roaming into our network, trying to access the “IMS” APN.
Our MME knows the network code of the roaming subscriber from the IMSI is MNC XXX, MCC YYY, and that the UE is requesting the IMS APN.
So our MME crafts a DNS request for the NAPTR query for ims.apn.epc.mncXXX.mccYYY.3gppnetwork.org:
Because the domain is epc.mncXXX.mccYYY.3gppnetwork.org it’s routed to the authoritative DNS server in the home network, which sends back the response:
We’ve got a few peers to pick from, so we need to filter this list of Answers to only those that are relevant to us.
First we filter by the Service tag, which for each listed peer shows what services that peer supports.
Since we're looking for S8, we need to find a peer whose "Service" tag string contains:
x-3gpp-pgw:x-s8-gtp
We’re looking for two bits of info here, the presence of x-3gpp-pgw in the Service to indicate that this peer is a PGW and x-s8-gtp to indicate that this peer supports the S8 interface.
A service string like this:
x-3gpp-pgw:x-s5-gtp
Would be excluded as it only supports S5 not S8 (Even though they are largely the same interface, S8 is used in roaming).
It’s also not uncommon to see both services indicated as supported, in which case that peer could be selected too:
x-3gpp-pgw:x-s5-gtp:x-s8-gtp
(The answers in the screenshot include :x-gp which means the PGWs advertised are also co-located with a GGSN)
So with our answers whittled down to only those that meet our needs, we next use the Order and the Preference to pick our best candidate, this is the same as regular DNS selection logic.
From our candidate, we’ve also got the Regex Replacement, which allows our original DNS request to be re-written, which allows us to point at a single peer.
In our answer, we see the original request ims.apn.epc.mncXXX.mccYYY.3gppnetwork.org is to be re-written to topon.lb1.pgw01.epc.mncXXX.mccYYY.3gppnetwork.org.
This is the FQDN of the PGW we should use.
Now we know the FQDN we should use, we just do an A-record lookup (or AAAA record lookup if it is IPv6) for that peer we are targeting, to turn that FQDN into an IP address we can use.
And then in comes the response:
So now our MME knows the IP of the PGW, it can craft a Create Session request where the F-TEID for the S8 interface has the PGW IP set on it that we selected.
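To pull the whole selection flow together, here's a rough sketch of that logic in Python using dnspython – the FQDN and tags are the illustrative ones from above, and a real MME obviously does all of this internally:

import dns.resolver

apn_fqdn = "ims.apn.epc.mncXXX.mccYYY.3gppnetwork.org"

# NAPTR lookup, keeping only PGWs that advertise S8 support
candidates = []
for naptr in dns.resolver.resolve(apn_fqdn, "NAPTR"):
    service = naptr.service.decode()
    if "x-3gpp-pgw" in service and "x-s8-gtp" in service:
        candidates.append(naptr)

# Pick the best candidate by Order, then Preference (lower wins)
best = sorted(candidates, key=lambda n: (n.order, n.preference))[0]
pgw_fqdn = best.replacement.to_text().rstrip(".")

# A-record lookup to turn the PGW FQDN into an IP for the Create Session Request
pgw_ips = [a.to_text() for a in dns.resolver.resolve(pgw_fqdn, "A")]
print(pgw_fqdn, pgw_ips)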
For more info on this TS 129.303 (Domain Name System Procedures) is the definitive doc, but the GSMA’s IR.88 “LTE and EPC Roaming Guidelines” provides a handy reference.
How does one encode / interpret the value of this AVP / IE? That was the question I set out to answer.
TS 29.274 says:
For the encoding of this information element see 3GPP TS 32.298
TS 32.298 says:
The functional requirements for the Charging Characteristics as well as the profile and behaviour bits are further defined in normative Annex A of TS 32.251
TS 32.251 Annex A says:
The Charging Characteristics parameter consists of a string of 16 bits designated as Behaviours (B), freely defined by Operators, as shown in TS 32.298 [51]. Each bit corresponds to a specific charging behaviour which is defined on a per operator basis, configured within the PCN and pointed when bit is set to “1” value.
After a few circular references I found this is imported from 32.298.
Finally we find some solid answers hidden away in TS 132 215, under the Charging Characteristics Profile index.
Charging Characteristics consists of a string of 16 bits designated as Profile (P) and Behaviour (B), shown in Figure 4. The first four bits (P) shall be used to select different charging trigger profiles, where each profile consists of the following trigger sets:
S-CDR: activate/deactivate CDRs, time limit, volume limit, maximum number of charging conditions, tariff times;
G-CDR: same as SGSN, plus maximum number of SGSN changes;
M-CDR: activate/deactivate CDRs, time limit, and maximum number of mobility changes;
SMS-MO-CDR: activate/deactivate CDRs;
SMS-MT-CDR: active/deactivate CDRs.
The Charging Characteristics field allows the operator to apply different kind of charging methods in the CDRs. A subscriber may have Charging Characteristics assigned to his subscription. These characteristics can be supplied by the HLR to the SGSN as part of the subscription information, and, upon activation of a PDP context, the SGSN forwards the charging characteristics to the GGSN on the Gn / Gp reference point according to the rules specified in Annex A of TS 32.251 [11].
This information can be used by the GSNs to activate CDR generation and control the closure of the CDR or the traffic volume containers (see clause 5.1.2.2.23) and is included in CDRs transmitted to nodes handling the CDRs via the Ga reference point. It can also be used in nodes handling the CDRs (e.g., the CGF or the billing system) to influence the CDR processing priority and routing.
These functions are accomplished by specifying the charging characteristics as sets of charging profiles and the expected behaviour associated with each profile.
The interpretations of the profiles and their associated behaviours can be different for each PLMN operator and are not subject to standardisation. In the present document only the charging characteristic formats and selection modes are specified.
The functional requirements for the Charging Characteristics as well as the profile and behaviour bits are further defined in normative Annex A of TS 32.251 [11], including the definitions of the trigger profiles associated with each CDR type.
The format of charging characteristics field is depicted in Figure 4. Px (x =0..3) refers to the Charging Characteristics Profile index. Bits classified with a “B” may be used by the operator for non-standardised behaviour (see Annex A of TS 32.251 [11]).
Right, well hopefully next time someone goes looking for this info you’ll find it a bit more easily than I did!
The S8 Home Routing approach for LTE roaming works really well, and as more and more operators switch off their legacy circuit switched 2G/3G networks and shift to LTE & VoLTE for roaming, we're seeing more and more S8-HR deployments.
When LTE was being standardised in 2008, Local Breakout (LBO) and S8 Home Routing were both considered options for how roaming may look. Fast forward to today, and S8 Home routing is the only way roaming is done for modern deployments.
In light of this, there are some "best practices" we've developed for an "all S8 Home Routed" world that I thought I'd share.
The Basics
When roaming, the SGW in the Visited Network, sends user traffic back to the PGW in the Home Network.
This means Online/Offline charging, IMS, PCRF, etc, is all done in the Home PLMN. As long as data packets can get from the SGW in the Visited PLMN to the PGW in the Home PLMN, and authentication flows from the Visited MME to the HSS in the Home PLMN, you’re golden.
The Constraints
Of course real networks don't look as simple as this; in reality a roaming scenario for a visited network has a lot more nodes, which all need to be taken into account.
Building Distributed Packet Core & IMS
Virtualization (VNF / CNF) has led operators away from “big iron” hardware for Packet Core & IMS nodes, towards software based solutions, which in turn offer a lot more flexibility.
Best practice for user plane design is to keep the latency down by bringing the user plane closer to the user (the idea of "Edge" UPFs in 5GC is a great example of this), and the move away from "big iron" in central locations for SGW and PGW nodes has been the trend for the past decade.
So to achieve these goals in the networks we build, we geographically distribute the core network.
This means we’ve got quite a few S-GW, P-GW, MME & HSS instances across the network. There’s some real advantages to this approach:
From a redundancy perspective this allows us to “spread the load” and build far more resilient networks. A network with 20 smaller HSS instances spread around the country, is far more resilient than 2 massive ones, regardless of how many power feeds or redundant disks it may have.
This allows us to be more resource efficient. MNOs have always provisioned excess capacity to cater for the loss of a node. If we have 2 MMEs serving a country, then each node has to have at least 50% capacity free, so if one MME were to fail, the other MME could handle the additional load from its dead friend. This is costly in resources. Having 20 MMEs means each MME only has to keep 5% capacity free to handle the loss of one MME in the pool.
It also forces our infrastructure teams to manage infrastructure “as cattle” rather than pets. These boxes don’t get names or lovingly crafted, they’re automatically spun up and destroyed without thinking about it.
For security, we only use internal IP addresses for the nodes in our packet core, this provides another layer of protection for the “crown jewels” of our network, so no one messing with BGP filtering can accidentally open the flood gates to our core, as one US operator learned leaving a GGSN open to the world leading to the private information for 100 million customers being leaked.
What this all adds to, is of course, the end user experience. For the end subscriber / customer, they get a better experience thanks to the reduced latency the connection provides, better uptime and faster call setup / SMS delivery, and less cost to deliver services.
I love this approach and could proselytise about it all day, but in a roaming context it presents some challenges.
The distributed networks we build are in a constant state of flux: new capacity is being provisioned in some areas, nodes decommissioned in others, and our core nodes are only reachable on internal IPs, so they wouldn't be reachable by roaming networks.
Our Distributed-Core Roaming Solution
To resolve this we’ve taken a novel approach, we’ve deployed a pair of S-GWs we call the “Roaming SGWs”, and a pair of P-GWs we call the “Roaming PGWs”, these do have public IPs, and are dedicated for use only by roaming traffic.
We really like this approach for a few reasons:
It allows us to be really flexible and do what we want inside the network, without impacting roaming customers or operators who use our network for roaming. All the benefits I described from the distributed architecture can still be realised.
From a security standpoint, only these SGW/PGW pairs have public IPs; all the others are on internal IPs. This is good for security – our core network is the 'crown jewels' of the network and we only expose an edge to other providers. Even though IPX networks are supposed to be secure, one of the largest IPX providers had their systems breached for 5 years before it was detected, so being almost as distrustful of IPX traffic as Internet traffic is a good thing. This allows us to put these PGWs / SGWs at the "edge" of our network, and keep all our MMEs, as well as our on-net PGWs and SGWs, on internal IPs, safe and secure inside our network.
For charging on the SGWs, we only need to worry about collecting CDRs from one set of SGWs (to go into the TAP files we use to bill the other operators), rather than running around hoovering up SGW CDRs from large numbers of Serving Gateways, which may get blown away and replaced without warning.
Of course, there is a latency angle to this: for international roaming, the traffic has to cross the sea / international borders to get to us. By putting these gateways at the edge we're seeing increased MOS on our calls, as the traffic is as close to the edge of the network as can be.
Caveat: Increased S11 Latency on Core Network sites over Satellite
This is probably not relevant to most operators, but some of our core network sites are fed only by satellite, and the move to this architecture shifted something: Rather than having latency on the S8 interface from the SGW to the PGW due to the satellite hop, we’ve got latency between the MME and the SGW due to the satellite hop.
It just shifts where in the chain the latency lies, but it did lead to us having to boost some timers in the MME and tune out-of-sequence delivery detection, on what had always been an internal interface previously.
Evolution to 5G Standalone Roaming
This approach aligns with the Home Routed options for 5G-SA roaming; UPF chaining means the roaming traffic can still be routed this way, which seems to be the direction the industry is going.
SA roaming is in its infancy, without widely deployed SA networks, we’re not going to see common roaming using SA for a good long while, but I’ll be curious to see if this approach becomes the de facto standard going forward.
Where to from here?
We’re pretty happy with this approach in the networks we’ve been building.
So far it’s made IREG testing easier as we’ve got two fixed points the IPX needs to hit (The DRAs and the SGWs) rather than a wide range of networks.
Operators with a vast number of APNs they need to drop into different VRFs may have to do some traffic engineering here – Our operations are generally pretty flat, but I can see where this may present some challenges for established operators shifting their traffic.
I’d be keen to hear if other operators are taking this approach and if they’ve run into any issues, or any issues others can see in this, feel free to drop a comment below.
SGs-AP, which is used for CSFB & SMS, doesn't span network borders (you can't roam with SGs-AP), and with SMSoIP out of the question, that left us with the option of MAP or Diameter – so we picked Diameter.
This introduces the S6c and SGd Diameter interfaces, in the diagrams below Orange is the Home Network (HPMN) and the Green is the Visited Network (VPMN).
The S6c interface is used between the SMSc and the HSS, in order to retrieve the routing information. This is like the SRI-for-SM in MAP.
The SGd interface is used between the MME serving the UE and the SMSc, and is used for actual delivery of the MO/MT messages.
I haven’t shown the Diameter Routing Agents in these diagrams, but in reality there would be a DRA on the VPLMN and a DRA on the HPMN, and probably a DRA in the IPX between them too.
The Attach
The attach looks like a regular roaming attach – the MME in the Visited PLMN sends an Update Location Request to the HSS, so the HSS knows which MME is serving the subscriber.
S6a Update Location Request to indicate the MME serving the Subscriber
The Mobile Terminated SMS Flow
Now we introduce the S6c interface and the SGd interfaces.
When the Home SMSc has a message to send to the subscriber (Mobile Terminated SMS) it runs a Send-Routing-Info-for-SM-Request (SRR) dialog to the HSS.
The Send-Routing-Info-for-SM-Answer (SRA) back from the HSS contains the info on the MME Diameter Host name and Diameter Realm serving the subscriber.
S6c – Send-Routing-Info-for-SM request to get the MME serving the subscriber
With this info, we can now craft a Diameter Request that will get sent to the MME serving the subscriber, containing the SMS PDU to send to the UE.
SGd MT-Forward-Short-Message to deliver Mobile Terminated SMS to the serving MME
We make sure it’s sent to the correct MME by setting the Destination-Host and Destination-Realm in the Diameter request.
Here’s how the request looks from the SMSc towards our DRA:
As you can see, the Destination-Realm and Destination-Host are set, as is the User-Name, which is set to the IMSI of the UE we want to send the message to.
And down the bottom you can see the SMS-TPDU, the same as it’s been all the way back since GSM days.
The Mobile Originated SMS Flow
The Mobile Originated flow is even simpler, because we don’t need to look up where to route it to.
The MME receives the MO SMS from the UE, and shoves it into a Diameter message with Application ID set to SGd and Destination-Realm set to the HPMN Realm.
When the message reaches the DRA in the HPMN it forwards the request to an SMSc and then the Home SMSc has the message ready to roll.
Having rated CDRs in CGrateS is great, but in reality, you probably want to get them into a billing system, CSV file, S3 bucket, CRM, invoice, Grafana, SQL table, etc, etc.
The Event Exporter Service (EES (previously called CDRe)) handles exporting CDRs from CGrateS.
Like everything in CGrateS, it’s highly configurable, and, again, like everything in CGrateS, supports every combination of services you can think of, plus a stack you haven’t thought of.
CDRs can be exported one of two ways, in real time, as the CDR is generated (online), or after the fact, exporting from the database containing the CDRs (offline).
Exporting in realtime (online) is a great option if you don't want (or need) to store the CDRs in CGrateS; if you're just using CGrateS to rate calls and spit them into a separate system, this is a fantastic option, as it allows your CGrateS instances to remain light and not get clogged up with lots of old CDRs. That said, you can of course export the CDRs in realtime and still store them in CGrateS – that's also a totally valid approach.
The more traditional approach is offline CDR export, where periodically or when an event is triggered, you scrape up a pile of CDRs and send them to your external systems.
For both options, we’ll need to define at least one exporter in our cgrates.json config file. For this example we’ll define a HTTP POST that we will trigger for realtime (online) CDR exporting, and a CSV file we dump to periodically when called from the API.
So first things first, we enable the EES module in the config:
"ees": {
"enabled": true,
"exporters": [
]
}
We’ll start with defining one exporter, named CSVExporter, that will output files to a folder named “testCSV” in the /tmp/ directory, but you can plonk these files wherever you like:
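Something along these lines should do it – this is a sketch based on the CGrateS docs, so double-check the field names against the version you're running:

"ees": {
    "enabled": true,
    "exporters": [
        {
            "id": "CSVExporter",
            "type": "*file_csv",
            "export_path": "/tmp/testCSV",
            "synchronous": true,
            "attempts": 1,
            "field_separator": ",",
            "flags": ["*log"]
        }
    ]
}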
We’ve got a lot of different types of export available to us, but type *file_csv is the easiest, so that’s where we’ll start.
Setting synchronous to true will mean we'll only run one export job at a time, but it also means we'll get the result back via the API, which allows us to keep track of the ID of the last record we exported, so we don't export the same record multiple times – more on this later.
Flags allows us to, if we wanted, bounce the event through AttributeS, for example, by adding *attributes to the flags, but in this case, it’s just logging to syslog.
Of course, just enabling ees won’t actually send calls to it, we’ll need to add “ees_conns“: [“*localhost”], to “apiers”: and “cdrs” so they know to bounce the events through it:
If you’ve already got CDRs on your system from our previous tutorial, fantastic, but if not, let’s get up and running with a quick and dirty script to define some destinations, a charger, an account balance and then use some of the balance to generate a CDR:
import cgrateshttpapi
import pprint
import uuid
import datetime
now = datetime.datetime.now()
CGRateS_Obj = cgrateshttpapi.CGRateS('localhost', 2080)
#Define Destinations
CGRateS_Obj.SendData({'method':'ApierV2.SetTPDestination','params':[{"TPid":'cgrates.org',"ID":"Dest_AU_Mobile","Prefixes":["614"]}]})
#Load TariffPlan we just defined from StorDB to DataDB
CGRateS_Obj.SendData({"method":"APIerSv1.LoadTariffPlanFromStorDb","params":[{"TPid":'cgrates.org',"DryRun":False,"Validate":True,"APIOpts":None,"Caching":None}],"id":0})
#Define default Charger
print(CGRateS_Obj.SendData({"method": "APIerSv1.SetChargerProfile","params": [{"Tenant": "cgrates.org","ID": "DEFAULT",'FilterIDs': [],'AttributeIDs' : ['*none'],'Weight': 0,}]}))
account = "Nick_Test_123"
#Add a balance to the account with type *sms with 100 sms events
pprint.pprint(CGRateS_Obj.SendData({"method": "ApierV1.SetBalance","params": [{"Tenant": "cgrates.org","Account": account,"BalanceType": "*sms","DestinationIDs": 'Dest_NZ_Mobile;Dest_AU_Mobile',"Categories": "*any","Balance": {"ID": "100_SMS_Bundle_AU_NZ_Mobile","Value": 100,"Weight": 25}}]}))
#Process CDR Event for a single SMS
pprint.pprint(CGRateS_Obj.SendData({"method": "CDRsV2.ProcessExternalCDR","params": [{"OriginID": str(uuid.uuid1()),"ToR": "*sms","RequestType": "*pseudoprepaid","AnswerTime": now.strftime("%Y-%m-%d %H:%M:%S"),"SetupTime": now.strftime("%Y-%m-%d %H:%M:%S"),"Tenant": "cgrates.org","Account": account,"Destination" : "61412345678","Usage": "1",}]}))
Right, with that out of the way, we should now have something in our CDRs table, a quick SQL query confirms this is the case:
So, as you may have guessed, we’ve called the ExportCDRs API endpoint, we’ve specified which ExporterIDs we want to reference (these link back to the objects in the config, and the one we have defined currently is named CSVExporter).
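If you're playing along at home, the call looks something like this (using the same Python helper as the earlier examples in this series):

result = CGRateS_Obj.SendData({"method": "APIerSv1.ExportCDRs", "params": [
    {"ExporterIDs": ["CSVExporter"],   # which exporter(s) from the config to run
     "Verbose": True,
     "Accounts": [account]}
]})
pprint.pprint(result)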
Setting Verbose: True means that CGrateS gives us back a lot of info from the API call, here’s what we get back:
Now that looks pretty positive, we got 12 events of SMS usage exported, which we can see in the file /tmp/testCSV/CSVExporter_21e9bc2.csv – and if we cat out the file, yeap, there’s all the CDRs.
But it’s a bit of a mess, there’s a lot of fields in there, so let’s adjust what goes into the CSV.
Let’s start by filtering what goes into the exporter, to only give us SMS events, of course you could adjust the filters here to target exporting only the records you want, based on anything you can define with Filters (and there’s a lot you can define with filters).
Now we’re only exporting SMS records, so let’s clean up the output of the CSV to just give us the data we want, which is the CDR ID, time, account, destination and usage.
Now after a restart of CGrateS, our exports look like this:
Stunning, truly beautiful, look at that output!
Right, well you may at this point have noticed a problem if you've run this more than once. The problem is that every time we run this, we get all the CDRs since the beginning of time.
Where filtering by date/time falls down is that if an offline CDR for a call on Monday only got ingested on Tuesday, it would be missed by the export.
But setting Verbose: True on the ExportCDRs API call gives us a handy trick: we're told the highest ID in the CDRs table we just exported, in the LastExpOrderID field of the API response.
If we jump over to the SQL database we use for StorDB, we can see that 33 is the ID of the highest CDR in the system.
So let’s try something, let’s run the exporter again, but this time let’s get all the CDRs where the ID is higher than 33:
#Process CDR Event for a single SMS
pprint.pprint(CGRateS_Obj.SendData({"method": "CDRsV2.ProcessExternalCDR","params": [{"OriginID": str(uuid.uuid1()),"ToR": "*sms","RequestType": "*pseudoprepaid","AnswerTime": now.strftime("%Y-%m-%d %H:%M:%S"),"SetupTime": now.strftime("%Y-%m-%d %H:%M:%S"),"Tenant": "cgrates.org","Account": account,"Destination" : "61412345678","Usage": "1",}]}))
#Trigger export where the OrderID is above 33
result = CGRateS_Obj.SendData({"method":"APIerSv1.ExportCDRs","params":[
{"ExporterIDs": ["CSVExporter"],
"Verbose" : True,
"ExtraArgs" : {
"OrderIDStart" : int(33),
},
"Accounts" : [account]}
]})
pprint.pprint(result)
Boom, now if we have a look at the output we can see the export covered two records, and the last ID was 35.
So as long as we keep track of the LastExpOrderID value, and feed that as in input every time we run ExportCDRs, we can ensure we never miss a CDR, and never get the same CDR twice.