Category Archives: 5G SA

Importing building footprints into Forsk Atoll from OpenStreetMap data

01/08/20255G SA, EUTRAN, GSM, LTE, Mobile Networks, RF, SoftwareAtoll, Forsk, RF, RF PlanningNick

Having building footprints inside Atoll is super-duper valuable, this means you can calculate your percentage of homes / buildings covered, after all geographic coverage and population coverage are two very different things.

Download the data from OSM data – If you only need a small are you can use the Export OSM page, or if you need a wider area Geofabrik provides country level exports of the data, or if you’re really keen you can download all the OSM data.

Once you’ve got the export, we’ll load the .gpkg file (or files) into GlobalMapper

Select one layer at a time that you want to export into Atoll. (This also works for roads, geographic boundaries, POIs, etc)

Export the selected layer from Export -> Export Vector / Lidar Format

Set output type to “Shapefile”

Set output filename in “Export Areas” (This will be the output file). If you want to limit the export to a given area you can do that in Export Bounds.

Now we can import this data into Atoll.

File -> Import

Select the exported Shapefile we just created.

Set the projection and import

Bingo now we’ve got our building footprints,

We can change the style of the layer and the labels as needed.

Now we can use the buildings as the Focus Zone / Compute Zone and then run reports and predictions based on those areas.

For example I can run Automatic Cell Planning with the building layers as the Focus zones, to optimize azimuths, tilts and powers to provide coverage to where people live, not just vacant land.

Importing Global Clutter data into Forsk Atoll

11/07/20255G SA, EUTRAN, GSM, LTE, Mobile Networks, Notes, RF, SoftwareAtoll, Forsk, RF, RF PlanningNick

Clutter data describes real world things on the planet’s surface that attenuate signals, for example trees, shrubs, buildings, bodies of water, etc, etc. There’s also different types of trees, some types of trees attenuate signals more than others, different types of buildings are the same.

Getting clutter data used to be crazy expensive, and done on a per country or even per region basis, until the European Space Agency dropped a global dataset free of charge for anyone to use, that covered the entire planet in a single source of data.

So we can use this inside Forsk Atoll for making our predictions.

First things first we’ll need to create an account with the ESA (This is not where they take astronaut applications unfortunately, it just gives you access to the datasets).

Then you can select the areas (tiles) you want to download after clicking the “Download” tab on the right.

We get a confirmation of the tiles we’re download and we’ll get a ZIP file containing the data.

We can load the whole ZIP file (Without needing to extract anything) into GlobalMapper which loads all the layers.

I found the _Map.tif files the highest resolution, so I’m only exporting these.

Then we need to export the data to GeoTiff for use in Atoll (The specific GeoTiff format ESA produces them in is not compatible with Atoll hence the need to convert), so we export the layers as Raster / Image format.

Atoll requires square pixels, and we need them in meters, so we select “Calculate Spacing in Other Units”.

Then set the spacing to meters (I use 1m to match everything else, but the data is actually only 10m accurate, so you could set this to 10m).

You probably want to set the Export Bounds to just the areas you’re interested in, otherwise the data gets really big, really quickly and takes forever to crunch.

Now for the fancy part, we need to import it into Atoll.

When we import the data we import it as Raster data (Clutter Classes) with a pixel size of 1m.

Alas when we exported the data we’ve lost the positioning information, so while we’ve got the clutter data, it’s just there somewhere on the planet, which with the planet being the size it is, is probably not where you need it.

So I cheat, I start put putting the West and North values to match the values from a Cell Site I’ve already got on the map (I put one in the top left and bottom right corners of the map) and use that as the initial value.

Then – and stick with me, this is very technical – I mess with the values until the maps line up into the correct position. Increase X, decrease Y, dialing it it in until the clutter map lines up with the other maps I’ve got.

Right, now we’ve got the data but we don’t have any values.

Each color represents a clutter class, but we haven’t set any actual height or losses for that material.

To know what each colour means we need to RTFM – ESA WorldCover 2020 User Manual.

Which has a table:

Alas the Map Code does not match with the table in the manual, but the colours do, here’s what mine map to:

Which means when hovering over a layer of clutter I can see the type:

Next we need to populate the heights, indoor and outdoor losses for that given clutter. This is a little more tricky as it’s going to vary geography to geography, but there’s indicative loss numbers available online pretty easily.

Once you’ve got that plugged in you can run your predictions and off you go!

MBR & GBR Values in Bearer Level QoS

23/05/20255G SA, EUTRAN, LTE, Mobile Networks, UncategorizedEPC, EUTRAN, GTP, GTP-C, GTPv2, GTPv2C, LTENick

The other day I had a query about a roaming network that was sending Bearer Level QoS parameters in the Create Session Request to 0Kbps, up and down rather than populating the MBR values.

I knew for Guaranteed Bit Rate bearers that this was of course set, but for non GBR bearers (QCI 5 to 9) I figured this would be set the to MBR, but that’s not the case.

So what gives?

Well, according to TS 29.274:

For non-GBR bearers, both the UL/DL MBR and GBR should be set to zero.

So there you have it, if it’s not a QCI 1-4 bearer then these values are always 0.

What’s the point of Subscribe in IMS – Does it do anything useful?

02/05/20255G SA, IMS / VoLTE, Kamailio, LTE, Mobile Networks, VoIPIMS, LTE, SIP, Subscribe, VoLTENick

Nope – it doesn’t do anything useful. So why is it there?

The SUBSCRIBE method in SIP allows a SIP UAC to subscribe to events, and then get NOTIFY messages when that event happens.

In a plain SIP scenario (RFC 3261), we can imagine an IP Phone and a PBX scenario. I might have “Busy Lamp Field” aka BLF buttons on the screen of my phone, that change colour when the people I call often are themselves on calls or on DND, so I know not to transfer calls to them – This is often called the “presence” scenario as it allows us to monitor the presence of another user.

At a SIP level, this is done by sending a SUBSCRIBE to the PBX with the information about what I’m interested in being told about (State changes for specific users) and then the PBX will send NOTIFY messages when the state changes.

But in IMS you’ll see SUBSCRIBE messages every time the subscriber registers, so what are they subscribing for?

Well, you’re just subscribing to your own registration status, but your phone knows your own registration status, because it’s, well, the registration status of the phone.

So what does it achieve? Nothing.

The idea was in a fixed-mobile-convergence scenario (keeping in mind that’s one of the key goals from the 2008 IMS spec) you could have the BLF / presence functionality for fixed subscribers, but this rareley happens.

For the past few years we’ve just been sending a 200 OK to SUBSCRIBE messages to the IMS, with a super long expiry, just to avoid wasting clock cycles.

GTPv2 Instance IDs

18/04/20255G SA, EPC, LTE, Mobile Networks, Notes, RFCs & StandardsEPC, GTP, GTP-C, GTPv2, GTPv2C, LTENick

I was diffing two PCAPs the other day trying to work out what’s up, and noticed the Instance ID on a GTPv2 IE was different between the working and failing examples.

So what does it denote, well from TS 129.274:

If more than one grouped information elements of the same type, but for a different purpose are sent with a message,
these IEs shall have different Instance values.

So if we’ve got two IEs of the same IE type (As we often do; F-TEIDs with IE Type 87 may have multiple instances in the same message each with different F-TEID interface types), then we differentiate between them by Instance ID.

The only exception to this rule is where we’ve got the same data, so if you’ve got one IE with the exact same values and purpose that exists twice inside the message.

A tale of two CPRIs

09/01/20255G SA, EUTRAN, GSM, History, LTE, Mobile Networks, RFCs & StandardsOBSAI, OpenRAN, PLMN, RAN, StandardsNick

It was the best of times, it was the worst of times. It was the age of wisdom, it was the age of foolishness. It was the epoch of belief, it was the epoch of incredulity. It was the season of Light, it was the season of Darkness. It was the spring of hope, it was the winter of despair.
A tale of two Cities

When Dickens wrote of Doctor Manette in the 1859, I doubt his intention was to write about the repeating history of RAN fronthaul standards – but I can’t really say for sure.

Setting the Scene

Our story starts with introducing CPRI (Common Public Radio Interface) interface, having been imprisoned in the Bastille of vendor lock in for the better part of twenty years.

Think of CPRI is less of a hard interoperable standard and more like how the Italian and French languages are both derived from Latin; it doesn’t mean that the two languages are the same, but they’ve got the same root and may share some common words and structures.

In practice this means that taking an Ericsson Radio and plugging it into a Huawei Baseband simply won’t work – With CPRI you must use the same vendor for the Baseband and the Radios.

Huawei BBU 3900 Architecture — Image from my post on setting up Huawei Base stations, showing the Huawei Baseband (BBU) connecting to the Huawei Radios (RRUs) via CPRI (in Yellow)

The Unexpected Plot Twist

“Nuts to this” the industry said after being stuck locked between the same radios and baseband for years; we should create a standard so we can mix and match between radio vendors, and even standardize some other stuff that’s been bothering us, so we’ll have a happy world of interoperability.

With kit created that followed this standard, we’d be able to take components from vendor A, B & C, and fit them together like Lego, saving you some money along the way and giving you’ve got a working solution made of “best of breed” components, where everything is interoperable.

*Omnitouch Lego base stations, which also fit together like Lego – Part of the Omnitouch Network Services “swag” from 2024*

So the industry created a group to chart a path for a better tomorrow by standardizing these interfaces.

The group had many industry heavyweights like Nokia, NEC, LG, ZTE and Samsung joining.

The key benefits espoused on their website:

An open market will substantially reduce the development effort and costs that have been traditionally associated with creating new base station product ranges. The availability of off-the-shelf base station modules will enable manufacturers to focus their development efforts on creating further added value within the base station, encouraging greater innovation and more cost-effective products. Furthermore, as product development cycles will be reduced, new base station functions will become available on the market more quickly.
Mission statement of the group

In addition to being able to mix and match radios and basebands from different vendors, the group defined standards for centralized baseband, and interoperable standards, to allow a multi-vendor ecosystem to flourish.

And here’s the plot twist – The text above, was not written about OpenRAN, and it was not written about the benefits of eCPRI.

It was written about Open Base Station Architecture Initiative (OBSAI) and it was written 22 years ago.

*record screech sound*

This image was called "Confused Ernie" but it's clearly Bert...

Standards War you’ve never heard of: OBSAI vs CPRI

When OBSAI was defined it was not without competition; there was another competing fronthaul standard; that’s right, the mustache twirling lowlife from earlier in the story – CPRI.

Supported by Huawei, Nortel, NEC & Ericsson (among others), CPRI took a “gentle parenting” approach to the standards world, in contrast to OBSAI.
Instead of telling all the vendors to agree on an interoperable front haul standard, CPRI just encouraged everyone to implement what their heart told them and what felt right to them.

As it happened, the industry favored the CPRI approach.

If a vendor wanted to add a new “word” in their CPRI “language” to add a new feature, they just went ahead and added it – It didn’t require anyone else to agree with them or changes to a common standard used by the industry, vendors could talk to the kit they made how they wanted.

CPRI has been the defacto non-standard used by all the big kit vendors for the past ~10 years.

The Death of OBSAI & the Birth of OpenRAN’s eCPRI

Why haven’t you heard of OBSAI? Why didn’t the OBSAI standard just serve as the basis for eCPRI – After all the last OBSAI release was less than 5 years before TIP started working on eCPRI publicly.

Did a schism over “uplink performance improvement” options lead to “irreconcilable differences” between parties leading to the breakup of the OBSAI group?

Nope.

Customers (MNOs) didn’t buy OBSAI based equipment in measurably larger quantities than CPRI kit. That’s it.

This meant the vendors invested less in paying teams to further develop the standards, the OBSAI group met less frequently, and in the end, member vendors didn’t bother adding support for OBSAI to new equipment and just used the easier and more flexible CPRI option instead.

At some point someone just stopped paying for the domain renewal and that was it, OBSAI was no more.

This is how the standards body ends, not with a bang, but with a whimper.
T.S. Elliot’s writings on the death of obsai

Those who do not learn from history…

The goals of the OBSAI Group and OpenRAN working groups are almost identical, so what lessons did Marconi, Motorola and Alcatel learn as members of OBSAI that other vendors could learn about OpenRAN strategy?

There are no mentions of OBSAI in any of the information published by OpenRAN advocates, and I’m wondering if folks aren’t aware that history tends to repeat and are ignorant to what came before it, or they’re just not learning lessons from the past?

So what can the OpenRAN industry learn from OBSAI?

Being a nerd, I started detailing the technical challenges, but that’s all window dressing; The biggest hurdle facing CPRI vs eCPRI are the same challenges OBSAI vs CPRI faced a decade prior:

To be relevant, OpenRAN kit has to be demonstrably better than what we have today AND provide a tangible cost saving.

OBSAI failed at achieving this, and so failed to meet it’s other more noble goals.

[At the time of writing this at least] I’d contend that neither of those two criteria have been met by OpenRAN.

What does the future hold for OpenRAN?

Looking into the crystal ball, will OpenRAN and eCPRI go the way of OBSAI, or will someone keep the OpenRAN dream alive?

Today, we’re still seeing the MNOs continue to provide tokenistic investment in OpenRAN. But being a cynic, I’d say the MNOs are feigning interest in OpenRAN products because it’s advantageous for them to do so.

The threat of OpenRAN has proven to be a great stick to beat the traditional vendors with to force them to lower their prices.

Think about the $14 billion USD Ericsson deal with AT&T, if chucking a few million at OpenRAN pilots / trials lead to AT&T getting even a 0.1% reduction in what they’re paying Ericsson, then the numbers would have worked out well in AT&Ts favor.

From the MNOs perspective, the cost to throw the odd pilot or trial to a hungry OpenRAN vendor to keep them on the hook is negligible, but the threat of OpenRAN provides leverage and bargaining power every time it’s contract renewal time with the big RAN vendors.

Already we’ve seen all the traditional RAN vendors move to neutralize this threat by introducing “OpenRAN compatible” equipment and talking up their commitment to openness.

This move by the RAN vendors takes this sting out of the OpenRAN threat, and means MNOs won’t have much reason to continue supporting OpenRAN.

This leaves the remaining OpenRAN vendors like Miss Havisham, forever waiting in their proverbial wedding dresses, having being left at the altar.

Okay, I’m mixing my Dickens’ references here, but it was too good not to.

Appendix

I’ve been enjoying writing more analysis than just technical content, let me know if this is something you’re interested in seeing more of.

I’ve been involved in two big OpenRAN integration projects, both of which went poorly and probably tainted my perspective. Enough time has passed to probably write up how it all went with the vendor names removed, but that’s a post for another time!

If you wanted to learn more about OBSAI Archive.org has their old website available for reading.

TFTs & Create Bearer Requests

27/12/20245G SA, EPC, IMS / VoLTE, LTE, Mobile Networks, Notes, RFCs & StandardsCharging Rule, Diameter, EPC, Gx, TFT, Traffic Flow TemplateNick

What is included in the Charging Rule on Gx ultimately turns into a Create Bearer Request on GTPv2-C.

But the mapping it’s always obvious, today I got stuck on the difference between a Single remote port type, and a single local port type, thinking that the Packet Filter Direction in the TFT controlled this – It doesn’t – It’s controlled by the order of your Traffic Flow Template rule.

Input TFT:

"permit out 17 from any 50000 to any"

Leads to Packet filter component type identifier: Single remote port type

Whereas a TFT of:

permit out 17 from any to any 50000

Leads to Packet filter component type identifier: Single local port type (64)

Background to the “VoLTE Mess”

06/12/20245G SA, IMS / VoLTE, LTE, Mobile Networks, RFCs & Standards, VoIPIMS, LTE, SIP, VoIP, VoLTENick

I’ve been writing a fair bit recently about the “VoLTE Mess” – It’s something that’s been around for a long time, mostly impacting greenfield players rolling out LTE only, but now the big carriers are starting to feel it as they shut off their 2G and 3G networks, so I figured a brief history was in order to understand how we got here.

Note: I use the terms 4G or LTE interchangeably

The Introduction of LTE

LTE (4G) is more “spectrally efficient” than the technologies that came before it. In simple terms, 1 “chunk” of spectrum will get you more speed (capacity) on LTE than the same size chunk of spectrum would on 2G or 3G.

From my post on 5G being a bit overhyped

So imagine it’s 2008 and you’re the CTO of a mobile network operator.
Your network is congested thanks to carrying more data traffic than it was ever designed for (the first iPhone had launched the year before) and the network is struggling under the weight of all this new data traffic.
You have two options here, to build more cell sites for more density (very expensive) or buy more spectrum (extremely expensive) – Both options see you going cap in hand to the finance team and asking for eye-wateringly large amounts of capital for either option.

But then the answer to your prayers arrives in the form of 3GPP’s Release 8 specification with the introduction of LTE. Now by taking some 2G or 3G spectrum, and by using it on 4G, you can get ~5x more capacity from the same spectrum. So just by changing spectrum you own from 2G or 3G to 4G, you’ve got 5x more capacity. Hallelujah!

So you go to Nortel and buy a packet core, and Alcatel and Siemens provide 4G RAN (eNodeBs) which you selectively deploy on the cell sites that are the most congested.
The finance team and the board are happy and your marketing team runs amok with claims of 4G data speeds.
You’ve dodged the crisis, phew.

This is the path that all established mobile operators took; throw LTE at the congested cell sites, to cheaply and easily free up capacity, and as the natural hardware replacement cycle kicked in, or cell sites reached capacity, swap out the hardware to kit that supports LTE in addition to the 2G and 3G tech.

Circuit Switched Fallback

But it’s hard to talk about the machinations of late 2000s telecom executives, without at least mentioning Hitler.

This video below from 15 years ago is pretty obscure and fairly technical, but the crux of it it is that Hitler is livid because LTE does not have a “CS Domain” aka circuit switched voice (the way 2G and 3G had handled voice calls).

It was optional to include support for voice calls in the LTE network (Voice over LTE) when you launched LTE services. So if you already had a 2G or 3G network (CS Network) you could just keep using 2G and 3G for your voice calls, while getting that sweet capacity relief.

So our hypothetical CTO, strapped for cash and data capacity, just didn’t bother to support VoLTE when they launched LTE – Doing so would have taken more time to launch, during which time the capacity problem would become worse, so “don’t worry about VoLTE for now” was the mantra.

All the operators who still had 2G and 3G networks, opted to just “Fallback” to using the 2G / 3G network for calling. This is called “Circuit Switched Fallback” aka CSFB.

Operators loved this as they got the capacity relief provided by shifting to 4G/LTE (more capacity in the network is always good) and could all rant about how their network was the fastest and had 4G first, this however was what could be described as a “Foot gun” – Something you can shoot yourself in the foot with in the future.

Operators eventually introduce VoLTE

Time ticked on an operators built out their 4G networks, and many in the past 10 years or so have launched VoLTE in their own networks.

For phones that support it, in areas with blanket 4G coverage, they can use VoLTE for all their calls.

But that’s the sticking point right there – If the phones support it.

But if the phones don’t support it, they’re roaming or making emergency calls, there is always been the safety blanket of 2G or 3G and Circuit Switched fallback to well, fall back to.

There’s no driver for operators who plan to (or are required to) operate a 2G or 3G network for the foreseeable future, to ensure a high level of VoLTE support in their devices.

For an operator today with 2G or 3G, Voice over LTE is still optional.
Many operators still rely exclusively on Circuit Switched Fallback, and there are only a handful of countries that have turned off 2G and 3G and rely solely on VoLTE.

VoLTE Handset Support

For the past 16 years phone manufacturers have been making LTE capable phones.

But that does not mean they’ve been making phones that support Voice over LTE.

But it’s never been an issue up until this point, as there’s always been a circuit switched (2G/3G) network to fall back to, so the fact that these chips may not support VoLTE was not a big problem.

Many of the cheaper chipsets that power phones simply don’t support VoLTE – These chips do support LTE for data connections but rely on Circuit Switched Fallback for voice calls. This is in part due to the increased complexity, but also because some of the technologies for VoLTE (like AMR) required intellectual property deals to licence to use, so would add to the component cost to manufacture, and in the chips game, keeping down component cost is critical.

Even for chips that do support Voice over LTE, it’s “special”. Unlike calling in 2G or 3G that worked the same for every operator, phone manufacturers require a “Carrier Bundle” for each operator, containing that specific operators’ special flavor of VoLTE, that operator uses in their network.

This is because while VoLTE is standardized (Despite some claims to the contrary) a lot of “optional” bits have existed, and different operators built networks with subtle differences in the “flavor” of their Voice over LTE (IMS) stack they used. The OEMs (Phone / Chip manufacturers) had to handle these changes in the devices they made, for in order to sell their phones through that operator.

This means I can have a phone from vendor X that works with VoLTE on Network Y, but does not support VoLTE on Network Z.

Worse still, knowing which phones are supported is a bit of a guessing game.

Most operators sell phones directly to their customer base, so buying an Network Y branded phone from Vendor X, you know it’s going to support Network Y’s VoLTE settings, but if you change carriers, who knows if it’ll still support it?

When you’ve still got a Circuit Switched network it’s not the end of the world, you’ll just use CSFB and probably not realize it, until operators go to shut down 2G / 3G networks…

IMS Profile selection on an engineering mode MTK based Android handset

Navigating the Maze of VoLTE Compatibility

Here are some simple checklist you can ask your elderly family members if they ask if their phone is VoLTE compatible:

Does the underlying chipset the phone is based on support VoLTE? (you can find this out by disassembling the phone and checking the datasheets for the components from the OEMs after signing NDAs for each)
Does the underlying chipset require a “carrier bundle” of settings to have been loaded for this operator in order to support VoLTE (See Qualcomm MBM as an example)?
What version of this list am I currently on (generally set in the factory) and does it support this operator? (You can check by decapping the ICs and dumping their NVRAM and then running it through a decompiler)
Does my phones OS (Android / iOS) require a “carrier bundle” of it’s own to enable VoLTE? Is my operator in the version of the database on the phone? (See Android’s Carrier Database for example) (You can find the answer by rooting the phone and running some privileged commands to poke around the internal file system)
Does my operator / MNO support VoLTE – Does my plan / package support VoLTE? (You can easily find the answer by visiting the store and asking questions that don’t appear on the script)

If you managed to answer yes to all of the above, congratulations! You have conditional VoLTE support on your phone, although you probably don’t have a working phone anymore.

Wait, conditional VoLTE support?

That’s right folks, VoLTE will work in some scenarios with your operator!

If you plan on traveling, well your phone may support VoLTE at home, but does the phone have VoLTE roaming enabled?
Many phones support VoLTE in the home network, but resort to CSFB when roaming.

If it does support VoLTE roaming, does the network you’re visiting support VoLTE roaming? Has the roaming agreement (IRA) between the operator you’re using while traveling and your home operator been updated to include VoLTE Roaming? These IRAs (AA.12 / AA.13 docs) also indicate if the network must turn off IPsec encryption for the VoLTE traffic when roaming, which is controlled by the phone anyway.

Phew, all this talk of VoLTE roaming while traveling scares me, I think I’ll stay home in the safety of the Australian bush with all these great friendly animals around a phone that supports VoLTE on my home network.

Ah – After spending some time in the Australian bush one of our many deadly animals bit me. Time to call for help! Wait, what about emergency calls over VoLTE? Again, many phones support VoLTE for normal calls, fall back to 2G or 3G for the emergency call, so if you have one of those phones (You’ll only find out if you try to make an emergency call and it fails) and try to make an emergency call in a country without 2G or 3G, you’d better find a payphone.

There’s many real world examples of this, our friends at OptimERA have been lobbying the FCC since 2019 on this.

Sarcasm aside, there’s no dataset or compatibility matrix here – No simple way to see if your phone will work for VoLTE on a given operator, even if the underlying chip does support VoLTE.

Operators in Australia which recently shut down their 3G network, were mandated to block devices that didn’t support VoLTE for emergency calling. They did this using an Equipment Identity Register, and blocking devices based on the Type Allocation Code, but this scattergun approach just blocked non-carrier issued devices, regardless of it they supported VoLTE or VoLTE emergency calling.

Blame Game

So who’s to blame here?

There’s no one group to blame here, the industry has created a shitty cycle here:

Standards orgs for having too many “flavors” available
Operators deploying their own “Flavors” of VoLTE then mandating OEMs / Chip manufacturers comply with their “flavor”.
OEMs / Chip manufactures respond by adding “Carrier Bundles” to account for this per-operator customization

I’ve got some ideas on a way to unscramble this egg, and it’s going to take a push from the industry.

If you’re in the industry and keen to push for a fix, get in touch!

It’s time to get a long term solution to this problem, and we as an industry need to lead the change.

Using fancy test kit to measure”Cell Antenna” stickers

27/09/20245G SA, EUTRAN, LTE, Mobile Networks, RF5G, Antennas, LTE, RAN, TestingNick

Technology is constantly evolving, new research papers are published every day.

But recently I was shocked to discover I’d missed a critical development in communications, that upended Shannon’s “A mathematical theory of communication”.

I’m talking of course, about the GENERATION X PLUS SP-11 PRO CELL ANTENNA.

I’ve been doing telecom work for a long time, while I mostly write here about Core & IMS, I am a licenced rigger, I’ve bolted a few things to towers and built my fair share of mobile coverage over the years, which is why I found this development so astounding.

With this, existing antennas can be extended, mobile phone antennas, walkie talkies and cordless phones can all benefit from the improvement of this small adhesive sticker, which is “Like having a four foot antenna on your phone”.

So for the bargain price of $32.95 (Or $2 on AliExpress) I secured myself this amazing technology and couldn’t wait to quantify it’s performance.

Think of the applications – We could put these stickers on 6 ft panel antennas and they’d become 10ft panels. This would have a huge effect on new site builds, minimize wind loading, less need for tower strengthening, more room for collocation on the towers due to smaller equipment footprint.

Luckily I have access to some fancy test equipment to really understand exactly how revolutionary this is.

The packaging says it’s like having a 4 foot antenna on your phone, let’s do some very simple calculations, let’s assume the antenna in the phone is currently 10cm, and that with this it will improve to be 121cm (four feet).

According to some basic projections we should see ~21dB gain by adding the sticker, that’s a 146x increase in performance!

Man am I excited to see this in action.

Fortunately I have access to some fun cellular test equipment, including the Viavi CellAdvisor and ~~an environmentally controlled lab~~ my kitchen bench.

I put up a 1800Mhz (band 3) LTE carrier in my office in the other room as a reference and placed the test equipment into the test jig (between the sink and the kettle).

We then took baseline readings from the omni shown in the pictures, to get a reading on the power levels before adding the sticker.

We are reading exactly -80dBm without the sticker in place, so we expertly put some masking tape on the omni (so we could peel it off) and applied the sticker antenna to the tape on the omni antenna.

At -80dBm before, by adding the 21dB of gain, we should be put just under -60dBm, these Viavi units are solid, but I was fearful of potentially overloading the receive end from the gain, after a long discussion we agreed at these levels it was unlikely to blow the unit, so no in-line attenuation was used.

Okay, </sarcasm> I was genuinely a little surprised by what we found; there was some gain, as shown in the screenshot below.

Marker 1 was our reference without the sticker, while reference 2 was our marker with the sticker, that’s a 1.12dB gain with the sticker in place. In linear terms that’s a ~30% increase in signal strength.

So does this magic sticker work? Well, kinda, in as much that holding onto the Omni changes the characteristics, as would wrapping a few turns of wire around it, putting it in the kettle or wrapping it in aluminum foil. Anything you do to an antenna to change it is going to cause minor changes in characteristic behavior, and generally if you’re getting better at one frequency, you get worse at another, so the small gain on band 3 may also lead to a small loss on band 1, or something similar.

So what to make of all this? Maybe this difference is an artifact from moving the unit to make a cup of tea, the tape we applied or just a jump in the LTE carrier, or maybe the performance of this sticker is amazing after all…

Converting OP Key to OPc

30/08/20245G SA, LTE, Mobile Networks, Python, SDM, SecurityAuC, HSS, OPc, Security, SIM, SIM Ca, SIM CardNick

I wrote recently about the difference between OP and OPc keys in SIM cards, and why you should avoid using OP keys for best-practice.

PyHSS only supports storing OPc keys for this reason, but plenty of folks still use OP, so how can you convert OP keys to OPc keys?

Well, inside PyHSS we have a handy utility for this, CryptoTool.py

https://github.com/nickvsnetworking/pyhss/blob/master/lib/CryptoTool.py

Usage is really simple, just plug in the Ki and OP Key and it’ll generate you a full set of vectors, including the OPc key.

nick@amanaki:~/Documents/pyhss/lib$ python3 CryptoTool.py --k 11111111111111111111111111111111 --op 22222222222222222222222222222222

Namespace(k='11111111111111111111111111111111', op='22222222222222222222222222222222', opc=None)
Generating OPc key from OP & K
Generating Multimedia Authentication Vector
Input K:    b'11111111111111111111111111111111'
Input OPc:  b'2f3466bd1bea1ac9a8e1ab05f6f43245'
Input AMF:  b'\x80\x00'

Of course, being open source, you can grab the functions out of this and make a little script to convert everything in a CSV or whatever format your key data is in.

So what about OPc to OP?
Well, this is a one-way transaction, we can’t get the OP Key from an OPc & Ki.

5G / LTE Milenage Security Exploit – Dumping the Vectors

23/08/20245G SA, EPC, LTE, Mobile Networks, SDM, Security, SIM Cards5G, Diameter, HSS, LTE, SecurityNick

I’ve written about Milenage and SIM based security in the past on this blog, and the component that prevents replay attacks in cellular network authentication is the Sequence Number (Aka SQN) stored on the SIM.

Think of the SQN as an incrementing odometer of authentication vectors. Odometers can go forward, but never backwards. So if a challenge comes in with an SQN behind the odometer (a lower number), it’s no good.

Why the SQN is important for Milenage Security

Every time the SIM authenticates it ticks up the SQN value, and when authenticating it checks the challenge from the network doesn’t have an SQN that’s behind (lower than) the SQN on the SIM.

Let’s take a practical example of this:

The HSS in the network has SQN for the SIM as 8232, and generates an authentication challenge vector for the SIM which includes the SQN of 8232.
The SIM receives this challenge, and makes sure that the SQN in the SIM, is equal to or less than 8232.
If the authentication passes, the new SQN stored in the SIM is equal to 8232 + 1, as that’s the next valid SQN we’d be expecting, and the HSS incriments the counters it has in the same way.

By constantly increasing the SQN and not allowing it to go backwards, means that even if we pre-generated a valid authentication vector for the SIM, it’d only be valid for as long as the SQN hasn’t been authenticated on the SIM by another authentication request.

Imagine for example that I get sneaky access to an operator’s HSS/AuC, I could get it to generate a stack of authentication challenges that I could use for my nefarious moustache-twirling purposes whenever I wanted.

This attack would work, but this all comes crumbling down if the SIM was to attach to the real network after I’ve generated my stack of authentication challenges.

If the SQN on the SIM passes where it was when the vectors were generated, those vectors would become unusable.

It’s worth pointing out, that it’s not just evil purposes that lead your SQN to get out of Sync; this happens when you’ve got subscriber data split across multiple HSSes for example, and there’s a mechanism to securely catch the HSS’s SQN counter up with the SQN counter in the SIM, without exposing any secrets, but it just ticks the HSS’s SQN up – It never rolls back the SQN in the SIM.

The Flaw – Draining the Pool

The Authentication Information Request is used by a cellular network to authenticate a subscriber, and the Authentication Information Answer is sent back by the HSS containing the challenges (vectors).

When we send this request, we can specify how many authentication challenges (vectors) we want the HSS to generate for us, so how many vectors can you generate?

TS 129 272 says the Number-of-Requested-Vectors AVP is an Unsigned32, which gives us a possible pool of 4,294,967,295 combinations. This means it would be legal / valid to send an Authentication Information Request asking for 4.2 billion vectors.

It’s worth noting that that won’t give us the whole pool.

Sequence numbers (SQN) shall have a length of 48 bits.
TS 133 102

While the SQN in the SIM is 48 bits, that gives us a maximum number of values before we “tick over” the odometer of 281,474,976,710,656.

If we were to send 65,536 Authentication-Information-Requests asking for 4,294,967,295 a piece, we’d have got enough vectors to serve the sub for life.

Except the standard allows for an unlimited number of vectors to be requested, this would allow us to “drain the pool” from an HSS to allow every combination of SQN to be captured, to provide a high degree of certainty that the SQN provided to a SIM is far enough ahead of the current SQN that the SIM does not reject the challenges.

Can we do this?

Our lab has access to HSSes from several major vendors of HSS.

Out of the gate, the Oracle HSS does not allow more than 32 vectors to be requested at the same time, so props to them, but the same is not true of the others, all from major HSS vendors (I won’t name them publicly here).

For the other 3 HSSes we tried from big vendors, all eventually timed out when asking for 4.2 billion vectors (don’t know why that would be *shrug*) from these HSSes, it didn’t get rejected.

This is a lab so monitoring isn’t great but I did see a CPU spike on at least one of the HSSes which suggests maybe it was actually trying to generate this.

Of course, we’ve got PyHSS, the greatest open source HSS out there, and how did this handle the request?

Well, being standards compliant, it did what it was asked – I tested with 1024 vectors I’ll admit, on my little laptop it did take a while. But lo, it worked, spewing forth 1024 vectors to use.

So with that working, I tried with 4,294,967,295…

And I waited. And waited.

And after pegging my CPU for a good while, I had to get back to real life work, and killed the request on the HSS.

In part there’s the fact that PyHSS writes back to a database for each time the SQN is incremented, which is costly in terms of resources, but also that generating Milenage vectors in LTE is doing some pretty heavy cryptographic lifting.

The Risk

Dumping a complete set of vectors with every possible SQN would allow an attacker to spoof base stations, and the subscriber would attach without issue.

Historically this has been very difficult to do for LTE, due to the mutual network authentication, however this would be bypassed in this scenario.

The UE would try for a resync if the SQN is too far forward, which mitigates this somewhat.

Cryptographically, I don’t know enough about the Milenage auth to know if a complete set of possible vectors would widen the attack surface to try and learn something about the keys.

Mitigations / Protections

So how can operators protect ourselves against this kind of attack?

Different commercial HSS vendors handle this differently, Oracle limits this to 32 vectors, and that’s what I’ve updated PyHSS to do, but another big HSS vendor (who I won’t publicly shame) accepts the full 4294967295 vectors, and it crashes that thread, or at least times it out after a period.

If you’ve got a decent Diameter Routing Agent in place you can set your DRA to check to see if someone is using this exploit against your network, and to rewrite the number of requested vectors to a lower number, alert you, or drop the request entirely.

Having common OP keys is dumb, and I advocate to all our operator customers to use OP keys that are unique to each SIM, and use the OPc key derived anyway. This means if one SIM spilled it’s keys, the blast doesn’t extend beyond that card.

In the long term, it’d be good to see 3GPP limit the practical size of the Number-of-Requested-Vectors AVP.

2G/3G Impact

Full disclosure – I don’t really work with 2G/3G stacks much these days, and have not tested this.

MAP is generally pretty bandwidth constrained, and to transfer 280 billion vectors might raise some eyebrows, burn out some STPs and take a long time…

But our “Send Authentication Info” message functions much the same as the Authentication Information Request in Diameter, 3GPP TS 29.002 shows we can set the number of vectors we want:

5GC Vulnerability

This only impacts LTE and 5G NSA subscribers.

TS 29.509 outlines the schema for the Nausf reference point, used for requesting vectors, and there is no option to request multiple vectors.

Summary

If you’ve got baddies with access to your HSS / HLR, you’ve got some problems.

But, with enough time, your pool could get drained for one subscriber at a time.

This isn’t going to get the master OP Key or plaintext Ki values, but this could potentially weaken the Milenage security of your system.

5G Core – The Service Based Architecture concept

16/08/20245G SA, Mobile Networks, RFCs & Standards5G, SBA, SBI, StandaloneNick

One of the new features of 5GC is the introduction of Service Based Interfaces (SBI) which is part of 5GC’s Service Based Architecture (SBA).

Let’s start with the description from the specs:

3GPP TS 23.501 [3] defines the 5G System Architecture as a Service Based Architecture, i.e. a system architecture in which the system functionality is achieved by a set of NFs providing services to other authorized NFs to access their services.
3GPP TS 29.500 – 4.1 NF Services

For that we have two key concepts, service discovery, and service consumption

Services Consumer / Producers

That’s some nice words, but let’s break down what this actually means, for starters, let’s talk about services.

In previous generations of core network we had interfaces instead of services. Interfaces were the reference point between two network elements, describing how the two would talk. The interfaces were the protocols the two interfaces used to communicate.

For example, in EPC / LTE S6a is the interface between the MME and the HSS, S5 is the interface between the S-GW and P-GW. You could lookup the 3GPP spec for each interface to understand exactly how it works, or decode it in Wireshark to see it in action.

5GC moves from interfaces to services. Interfaces are strictly between two network elements, the S6a interface is only used between the MME and the HSS, while a service is designed to be reusable.

This means the Service Based Interface N5g-eir can be used by the AMF, but it could equally be used by anyone else who wants access to that information.

3GPP defines the service in the form of service producer (The EIR produces the N5g-eir service) and the service consumer (The client connecting to the N5g-eir service), but doens’t restrict which network elements can

This gets away from the soup of interfaces available, and instead just defines the services being offered, rather than locking the

“service consumers” (which can be thought of clients in a client/server model) can discover “service producers” (like servers in a client/server model).

Our AMF, which acts as a “service consumer” consuming services from the UDM/UDR and SMF.

Service Discovery – Automated Discovery of NF Services

Service-Based Architecture enables 5G Core Network Function service discovery.

In simple terms, this means rather that your MME being told about your SGW, the nodes all talk to a “Network Repository Function” that returns a list of available nodes.

I’ve written a full blog post on the NRF that explains the concepts.

CM-Idle / CM-Connected and RM-Deregistered / RM-Registered States in 5GC

09/08/20245G SA, Mobile Networks, Uncategorized5G, 5GCNick

The mobility management and connection management process in 5GC focuses on Connection Management (CM) and Registration Management (RM).

Registration Management (RM)

The Registration Management state (RM) of a UE can either be RM-Registered or RM-Deregistered. This is akin to the EMM state used in LTE.

RM-Deregistered Mode

From the Core Network’s perspective (Our AMF) a UE that is in RM-Deregistered state has no valid location information in the AMF for that UE.
The AMF can’t page it, it doesn’t know where the UE is or if it’s even turned on.

From the UE’s perspective, being in RM-Deregistered state could mean one of a few things:

UE is in an area without coverage
UE is turned off
SIM Card in the UE is not permitted to access the network

In short, RM-Deregistered means the UE cannot be reached, and cannot get any services.

RM-Registered Mode

From the Core Network’s perspective (the AMF) a UE in RM-Registered state has sucesfully registered onto the network.

The UE can perform tracking area updates, period registration updates and registration updates.

There is a location stored in the AMF for the UE (The AMF knows at least down to a Tracking Area Code/List level where the UE is).

The UE can request services.

Connection Management (CM)

Connection Management (CM) focuses on the NAS signaling connection between the UE and the AMF.

To have a Connection Management state, the Registration Management procedure must have successfully completed (the UE being in RM-Registered) state.

A UE in CM-Connected state has an active signaling connection on the N1 interface between the UE and the AMF.

CM-Idle Mode

In CM-Idle mode the UE has no active NAS connection to the AMF.

UEs typically enter this state when they have no data to send / recieve for a period of time, this conserves battery on the UE and saves network resources.

If the UE wants to send some data, it performs a Service Request procedure to bring itself back into CM-Connected mode.

If the Network wants to send some data to the UE, the AMF sends a paging request for the UE, and upon hearing it’s identifier (5G-S-TMSI) on the paging channel, the UE performs the Service Request procedure to bring itself back into CM-Connected mode.

CM-Connected Mode

In CM-Connected mode the UE has an active NAS connection with the AMF over the N1 interface from the UE to the AMF.

When the access network (The gNodeB) determines this state should change (typically based on the UE being idle for longer than a set period of time) the gNodeB releases the connection and the UE transitions to CM-Idle Mode.

MTU in LTE & 5G Transmission Networks – Part 2

26/07/20245G SA, EUTRAN, LTE, Mobile Networks5G, LTE, MTU, TransportNick

So let’s roll up our sleeves and get a Lab scenario happening,

To keep things (relatively) simple, I’ve put the eNodeB on the same subnet as the MME and Serving/Packet-Gateway.

So the traffic will flow from the eNodeB to the S/P-GW, via a simple Network Switch (I’m using a Mikrotik).

While life is complicated, I’ll try and keep this lab easy.

Experiment 1: MTU of 1500 everywhere

Network Element	MTU
Advertised MTU in PCO	1500
eNodeB	1500
Switch	1500
Core Network (S/P-GW)	1500

So everything attaches and traffic flows fine. There is no problem right?

Well, not a problem that is immediately visible.

While the PCO advertises the MTU value at 1500 if we look at the maximum payload we can actually get through the network, we find that’s not the case.

This means if our end user on a mobile device tried to send a 1500 byte payload, it’d never get through.

While DNS would work, most TCP traffic would flow fine, certain UDP applications would start to fail if they were sending payloads nearing 1500 bytes.

So why is this?

Well GTP adds overhead.

8 bytes for the GTP header
8 bytes for the transport UDP header
20 bytes for the transport IPv4 header
14 bytes if our transport is using Ethernet

For a total of 50 bytes of overhead, assuming we’re not using MPLS, QinQ or anything else funky on our transport network and just using Ethernet.

So we have two options here – We can either lower the MTU advertised in our Protocol Configuration Options, or we can increase the MTU across our transport network. Let’s look at each.

Experiment 2: Lower Advertised MTU in PCO to 1300

Well this works, and looks the same as our previous example, except now we know we can’t handle payloads larger than 1300 without fragmentation.

Experiment 3: Increase MTU across transmission Network

While we need to account for the 50 bytes of overhead added by GTP, I’ve gone the safer option and upped the MTU across the transport to 1600 bytes.

With this, we can transport a full 1500 byte MTU on the UE layer, and we’ve got the extra space by enabling jumbo frames.

Obviously this requires a change on all of the transmission layer – And if you have any hops without support for this, you’ll loose packets.

Conclusions?

Well, fragmentation is bad, and we want to avoid it.

For this we up the MTU across the transmission network to support jumbo frames (greater than 1500 bytes) so we can handle the 1500 byte payloads that users want.

Transport Keys & A4 / K4 Keys in EPC & 5GC Networks

12/07/20245G SA, EPC, LTE, Mobile Networks, RFCs & Standards, SDM, SecurityA4, Diameter, HSS, K4, SIM, SIM Card, Transport KeysNick

If you’re working with the larger SIM vendors, there’s a good chance they key material they send you won’t actually contain the raw Ki values for each card – If it fell into the wrong hands you’d be in big trouble.

Instead, what is more likely is that the SIM vendor shares the Ki generated when mixed with a transport key – So what you receive is not the plaintext version of the Ki data, but rather a ciphered version of it.

But as long as you and the SIM vendor have agreed on the ciphering to use, an the secret to protect it with beforehand, you can read the data as needed.

This is a tricky topic to broach, as transport key implementation, is not covered by the 3GPP, instead it’s a quasi-standard, that is commonly used by SIM vendors and HSS vendors alike – the A4 / K4 Transport Encryption Algorithm.

It’s made up of a few components:

K2 is our plaintext key data (Ki or OP)
K4 is the secret key used to cipher the Ki value.
K7 is the algorithm used (Usually AES128 or AES256).

I won’t explain too much about the crypto, but here’s an example from IoT Connectivity’s KiOpcGenerator tool:

def aes_128_cbc_encrypt(key, text):
    """
    implements aes 128b encryption with cbc.
    """
    keyb = binascii.unhexlify(key)
    textb = binascii.unhexlify(text)
    encryptor = AES.new(keyb, AES.MODE_CBC, IV=IV)
    ciphertext = encryptor.encrypt(textb)
    return ciphertext.hex().upper()

It’s important when defining your electrical profile and the reuqired parameters, to make sure the operator, HSS vendor and SIM vendor are all on the same page regarding if transport keys will be used, what the cipher used will be, and the keys for each batch of SIMs.

Here’s an example from a Huawei HSS with SIMs from G&D:

%%LST K4: HLRSN=1;%%
RETCODE = 0 SUCCESS0001:Operation is successful

        "K4SNO" "ALGTYPE"     "K7SNO" "KEYNAME"
        1       AES128        NONE    G+D

We’re using AES128, and any SIMs produced by G&D for this batch will use that transport key (transport key ID 1 in the HSS), so when adding new SIMs we’ll need to specify what transport key to use.

Uncomfortable Questions to ask about 5G Standalone at MWC – Part 3 – We have to do 5GC.

21/06/20245G SA, Mobile Networks, RFCs & Standards5G, SA, StandaloneNick

This post follows on from Part 1 and Part 2 of this 3 part series.

We are forced to move to 5G-SA

Claim: We must use 5G-SA with this spectrum (It’s a condition of the license)

I’ll concede that if it is a requirement for a license or funding, that 5G-SA be used, then that’s a pretty ironclad reason to introduce 5G-SA.

Claim: Users will Leave if you don’t have 5G-SA

We could argue the opposite effect will happen; Shifting to SA will reduce your user base.
Here’s why:

Users experiencing 5G-NSA (Non-Standalone) today, are already getting the speed boost from “5G”.

From a user perspective, while 5G-NSA support has been becoming common on mid-to-high priced handsets, handsets supporting 5G-SA are far less common.

Dish’ Project Genesis is one of the only examples of a 5G SA network deployed on a large scale. It launched with only a single supported phone (A Motorola branded handset) and today the supported phone list is very short, limited to expensive flagships. This lack of handset support means users must purchase a handset through Dish rather than being able to bring their own phones, as the only way that compatibility can be guaranteed is by controlling the whole ecosystem.

Unless you are in a highly developed market with 2G and 3G turned off, where the majority of your user base has recent generation flagship phones capable of supporting these features, you’re shrinking your addressable market with 5G-SA, rather than expanding it.

Conclusion – 5G-SA doesn’t stack up, what do I do?

SA doesn’t make sense for a lot of operators and markets – for now. I’m sure this post will look pretty dated in a few years time as many of these factors change and as operators sunset 2G and 3G networks.

I’m not advocating for 5G-SA never, I’m advocating not 5G-SA today.

There are simply better options out there for spending that operations budget to make network improvements.

Off the bat some ideas to expore:

Optimize your existing network.
Roll out NSA to an even larger area.
Shutdown 2G/3G layers.
Simplify your operations.
Cut down the number of vendors and moving parts.
Simplify again.
Automate.
Simplify more.

Doing this will mean you can enjoy cost savings from reduced headcount thanks to a simpler network.
Simpler networks have better up-time, thanks to operating a network that’s less frankensteiny – less cobbled together from disparate legacy parts.
You’ll also Enjoy reduced opex from all the systems you’ve shut down and cheaper roaming from all the bilaterals you moved to VoLTE.

All of these tasks will keep project teams busy for years and put the MNO in a stronger position moving forward, without getting distracted by slick marketing and shiny brochures.

Getting started with PyHSS

07/06/20245G SA, IMS / VoLTE, LTE, Mobile NetworksDiameter, IMS, LTE, PyHSS, VoIPNick

PyHSS is our open source Home Subscriber Server, it’s written in Python, has a variety of different backends, and is highly perforate (We benchmark to 10K transactions per second) and infinitely scaleable.

In this post I’ll cover the basics of setting up PyHSS in your enviroment and getting some Diameter peers connected.

For starters, we’ll need a database (We’ll use MySQL for this demo) and an account on that database for a MySQL user.

So let’s get that rolling (I’m using Ubuntu 24.04):

sudo apt update
sudo apt install mysql-server

Next we’ll create the MySQL user for PyHSS to use:

CREATE USER 'pyhss_user'@'%' IDENTIFIED BY 'pyhss_password';
GRANT ALL PRIVILEGES ON *.* TO 'pyhss_user'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;

We’ll also need Redis as well (PyHSS uses Redis for inter-service communications and for caching), so go ahead an install that for your distro:

sudo apt install redis-server

So that’s our prerequisites sorted, let’s clone the PyHSS repo:

git clone https://github.com/nickvsnetworking/pyhss /etc/pyhss

And install the requirements with pip from the PyHSS repo:

pip3 install -r requirements.txt

Next we’ll need to configure PyHSS, for that we update the config file (config.yaml) with the settings we want to use.

We’ll start by setting the bind_ip to a list of IPs you want to listen on, and your transport – We can use either TCP or SCTP.

For Diameter, we will set OriginHost and OriginRealm to match the Diameter hostname you want to use for this peer, and the Realm of your Diameter network.

Lastly we’ll need to set the database parameters, updating the database: section to populate your credentials, setting your username and password and the database to match your SQL installation we setup at the start.

With that done, we can start PyHSS, which we do using systemctl.

Because there’s multiple microservices that make up PyHSS, there’s multiple systemctl files use to run PyHSS as a service, they’re all in the /systemd folder.

We’ll copy them all to our systemd folder.

cp /etc/pyhss/systemd/ /lib/systemd/system/
systemctl daemon-reload
systemctl start pyhss

And with that we’ve got PyHSS running and ready for a Diameter peer to connect.

Now you should be able to bring our Diameter peers up.

If you’re using something like Kamailio, with C-Diameter Peer, you can read about the config for that here, or FreeDiameter you can read about here.

In the next post, we’ll cover subscriber provisioning via the API.

OPc vs OP in SIM keys

10/05/20245G SA, GSM, LTE, Mobile Networks, RFCs & Standards, SDM, Security, SIM CardsAuthentication, HSS, OPc, OpenSER, SDM, SIMNick

Years ago I wrote an article looking at how Key generation works inside SIM cards for LTE & 5G-NR.

I got this great question the other day:

Hello Nick, thank you for the article.
What is the use of the OPc key to be derived from OP key ?
Why can’t it just be a random key like Ki ?

It’s a super good question, and something I see a lot of operators get “wrong” from a security best practices perspective.

Refresher on OP vs OPc Keys

The “OP Key” is the “operator” key, and was (historically) common for an operator.

This meant all SIMs in the network had a common OP Key, and each SIM had a unique Ki/K key.

The SIM knew both, and the HSS only needed to know what the Ki was for the SIM, as they shared a common OP Key (Generally you associate an index which translates to the OP Key for that batch of SIMs but you get the idea).

But having common key material is probably not the best idea – I’m sure there was probably some reason why using a common key across all the SIMs seemed like a good option, and the K / Ki key has always been unique, so there was one unique key per SIM, but previously, OP was common.

Over time, the issues with this became clear, so the OPc key was introduced. OPc is derived from mushing the K & OP key together. This means we don’t need to expose / store the original OP key in the SIM or the HSS just the derived OPc key output.

This adds additional security, if the Ki for a SIM were to be exposed along with the OP for that operator, that’s half the entropy lost. Whereas by storing the Ki and OPc you limit the blast radius if say a single SIMs data was exposed, to only the data for that particular SIM.

This is how most operators achieve this today; there is still a common OP Key, locked away in a vault alongside the recipe for Coca-cola and the moon landing set.

But his OP Key is no longer written to the SIMs or stored in the HSS.

Instead, during the personalization process (The bit in manufacturing where SIMs get the unique data written to them (The IMSI & keys)) a derived OPc key is written to the card itself, and to the output files the operator then loads into their HSS/HLR/AuC.

This is not my preferred method for handling key material however, today we get our SIM manufacturers to randomize the OP key for every card and then derive an OPc from that.

This means we have two unique keys for each SIM, and even if the Ki and OP were to become exposed for a SIM, there is nothing common between that SIM, and the other SIMs in the network.

Values stores on the LTE / EUTRAN / EPC Home Subscriber Server (HSS) including K Key, OP / OPc key and SQN SequenceNUmber

Do we want our Ki to leak? No. Do we want an OP Key to leak? No. But if we’ve got unique keys for everything we minimize the blast radius if something were to happen – Just minimizes the risk.

Uncomfortable Questions to ask about 5G Standalone at MWC – Part 2 – Has this Cash cow got Milk?

21/02/20245G SA, EPC, Mobile Networks, RFCs & Standards, Software5G, 5G Core, 5G-SA, 5GC, Mobile Networks, StandaloneNick

This is the second post of 3 presenting the argument against introducing 5G-SA.

There’s an old adage that businesses spend money for one of three reasons:

To Save Money (Which I covered yesterday)
To make more Money (This post, congratulations, you’re reading it!)
Because they have to (Regulatory compliance, insurance, taxes, etc) – That’s the next post

So let’s look at SA in this context.

5G-SA can drive new revenue streams

We (as an industry) suck at this.

Last year on the Telecoms.com podcast, Scott Bicheno made the point that if operators took all the money they’d gambled (and lost) on trying to play in the sports rights, involvement in media companies, building their own streaming apps, attempts at bundling other utilities, digital identity, etc, and just left the cash in the bank and just operated the network, they’d be better off.

Uber, Spotify, “OTTs”, etc, utilize MNOs to enable their services, but operators don’t see this extra revenue.
While some operators may talk of “fair share” the truth is, these companies add value to our product (connectivity) which as an industry, we’ve failed to add ourselves.

Last year at MWC we saw vendors were still beating the drum about 5G being critical for the “Metaverse”, just weeks before Meta announced they were moving away from the Metaverse.

Today the only device getting any attention from consumers is Apple’s Vision Pro, a very pricey, currently niche offering, which has no SIM card or cellular connectivity.

If the Metaverse does turn out to be a cash cow, it is unlikely the telecommunications industry will be the ones milking it.

Claim: Customers are willing to pay more for 5G-SA

This myth seems to be fairly persistent, but with minimal data to support this claim.

While BSS vendors talk about “5G Monetization”, the truth is, people use their MNO to provide them connectivity. If the coverage is adequate, and the speed enough to do what they need to do, few would be willing to pay any additional cash each month to see higher numbers on a speedtest result (enabled by 5G-NSA) and even fewer would pay extra cash for, well, whatever those features only enabled by 5G-Standalone are?

With most consumers now also holding onto their mobile devices for longer periods of time, and with interest rates reining in consumer spending across the board, we are seeing the rise of a more cost conscious consumer than ever before. If we want to see higher ARPUs, we need to give the consumer a compelling reason to care and spend their cash, beyond a speed test result.

We talk a little about APIs lower down in the post.

Claim: Users want Ultra-Low Latency / High Reliability Comms that only 5G-SA delivers

Wanting to offer a product to the market, is not the same as the market wanting a product to consume.

Telecom operators want customers to want these services, but customer take up rates tell a different story. For a product like this to be viable, it must have a wide enough addressable market to justify the investment.

Reliability

The URLCC standards focus on preventing packet loss, but the world has moved on from needing zero packet loss.

The telecom industry has a habit of deciding what customers want without actually listening.
When a customer talks about wanting “reliable” comms, they aren’t saying they want zero packet loss, but rather fewer dropouts or service flaps.
For us to give the customer what they are actually asking for involves us expanding RAN footprint and adding transmission diversity, not 5G-SA.

The “protocols of the internet” (TCP/IP) have been around for more than 50 years now.

These protocols have always flowed over transport links with varied reliability and levels of packet loss.

Thanks to these error correction and retransmission techniques built into these protocols, a lost packet will not interrupt the stream. If your nuclear command and control network were carried over TCP/IP over the public internet (please don’t do this), a missing packet won’t lead to worldwide annihilation, but rather the sender will see the receiver never acknowledged the receipt of the packet at the other end, and resend it, end of.

If you walk into a hospital today, you’ll find patient monitoring devices, tracking the vital signs for patients and alerting hospital staff if a patient’s vital signs change. It is hard to think of more important services for reliability than this.

And yet they use WiFi, and have done for a long time, if a packet is lost on WiFi (as happens regularly) it’s just retransmitted and the end user never knows.

Autonomous cars are unlikely to ever rely on a 5G connection to operate, for the simple reason that coverage will never be 100%. If your car stops because you’re in a not-spot, you won’t be a happy customer. While plenty of cars have cellular modems in them, that are used to upload telemetry data back to the manufacturer, but not to drive the car.

One example of wireless controlled vehicles in the wild is autonomous haul trucks in mines. Historically, these have used WiFi for their comms. Mine sites are often a good fit for Private LTE, but there’s nothing inherent in the 5G Standalone standard that means it’s the only tool for the job here.

Slicing

Slicing is available in LTE (4G), with an architecture designed to allow access to others. It failed to gain traction, but is in networks today.

See: Pre-5G Network Slicing.

What is different this time?

Low Latency

The RAN a piece of the latency puzzle here, but it is just one piece of the puzzle.

If we look at the flow a packet takes from the user’s device to the server they want to talk to we’ve got:

Time it takes the UE to craft the packet
Time it takes for the packet to be transmitted over the air to the base station
Time it takes for the packet to get through the RAN transmission network to the core
Time it takes the packet to traverse the packet core
Time it takes for the packet to get out to transit/peering
Time it takes to get the packet from the edge of the operators network to the edge of the network hosting the server
Time it takes the packet through the network the server is on
Time it takes the server to process the request

The “low latency” bit of the 5G puzzle only involves the two elements in bold.

If you’ve got to get from point A to point B along a series of roads, and the speed limit on two of the roads you traverse (short sections already) is increased. The overall travel time is not drastically reduced.

I’m lucky, I have access to a well kitted out lab which allows me to put all of these latency figures to the test and provide side by side metrics. If this is of interest to anyone, let me know. Otherwise in the meantime you’ll just have to accept some conjecture and opinion.

You could rebut this talking about Edge Compute, and having the datacenter at the base of the tower, but for a number of fairly well documented reasons, I think this is unlikely to attract widespread deployment in established carrier networks, and Intel’s recent yearly earning specifically called this out.

Claim: Customers want APIs and these needs 5G SA

Companies like Twilio have made it easy to interact with the carrier network via their APIs, but yet again, it’s these companies producing the additional value on a service operated by the MNOs.

My coffee machine does not have an API, and I’m OK with this because I don’t have a want or need to interact with it programatically.

By far, the most common APIs used by businesses involving telco markets are APIs to enable sending an SMS to a user.

These have been around for a long time, and the A2P market is pretty well established, and the good news is, operators already get a chunk of this pie, by charging for the SMS.

Imagine a company that makes medical booking software. They’re a tech company, so they want their stack to work anywhere in the world, and they want to be able to send reminder SMS to end users.

They could get an account manager with each of the telcos in each of the markets they work in, onboard and integrate the arcane complexities of each operators wholesale SMS system, or they could use Twilio or a similar service, which gives them global reach.

Often the cost of services like Twilio are cheaper than working directly with the carriers in each market, and even if it is marginally more expensive, the cost savings by not having to deal with dozens of carriers or integrate into dozens of systems, far outweighs this.

GSMA’s OpenGateway Initiative has sought to rectify this, but it lacks support for the use case we just discussed.

While it’s a great idea, in the context of 5G Standalone and APIs, it’s worth noting that none of the use cases in OpenGateway require 5G Standalone (Except possibly Edge discovery, but it is debatable).

Even Slicing existed before in LTE.

Critically, from a developer experience perspective:

I can sign up to services like Twilio without a credit card, and start using the service right away, with examples in my programming language of choice, the developer user experience is fantastic.

Jump on the OpenGateway website today and see if you can even find a way to sign up to use the service?

Claim: Fixed Wireless works best with 5G-SA

Of all the touted use cases and applications for 5G, Fixed Wireless (FWA) has been the most successful.

The great thing about FWA on Cellular networks is you can use the same infrastructure you use for your mobile customers, and then sell excess capacity in the network to deliver Fixed Wireless Access services, better utilizing an asset (great!).

But again, this does not require Standalone 5G. If you deploy your FWA network using 5G SA, then you won’t be able to sweat that same asset for both mobile subscribers and FWA subscribers.

Today at least, very few handsets short of this generation of flagship phones, supports 5G SA. Even the phones sold as supporting 5G over the past few years, are almost all only supporting 5G-NSA, so if you rolled out your FWA network as Standalone, you can’t better utilize the asset by sharing with your existing LTE/5G-NSA customers.

Claim: The Killer App is coming for 5G and it needs 5G SA

This space is reserved for the killer app that requires 5G Standalone.

Whenever that comes?

Anyone?

I’m not paying to build a marina berth for my mega yacht, mostly because I don’t have one. Ditto this.

Could you explain to everyone on an investor call that you’re investing in something where the vessel of the payoff isn’t even known to exist? Telecom is “blue chip”, hardly speculative.

The Future for Revenue Growth?

Maybe there isn’t one.

I know it’s an unthinkable thought for a lot of operators, but let’s look at it rationally; in the developed world, everyone who wants a mobile service already has one.

This leaves operators with two options; gaining market share from their competitors and selling more/higher priced services to existing customers.

You don’t steal away customers from other operators by offering a higher priced product, and with reduced consumer spending people aren’t queuing up to spend more each month.

But there is a silver lining, if you can’t grow revenues, you can still shrink expenditure, which in the end still gets the same result at the end of the quarter – More cash.

Simplify your operations, focus on what you do really well (mobile services), the whole 80/20 rule, get better at self service, all that guff.

There’s no shortage of pain points for consumers telecom operators could address, to make the customer experience better, but few that include the word Slicing.

Uncomfortable Questions to ask about 5G Standalone at MWC – Part 1 – Does $tandalone save $$$?

20/02/20245G SA, EPC, EUTRAN, Mobile Networks, RFCs & Standards5G, 5G-SA, 5GC, EPC, StandaloneNick

No one spends marketing dollars talking about the problems with a tech and vendors aren’t out there promoting sweating existing assets. But understanding your options as an operator is more important now than ever before.

Sidebar; This post got really long, so I’m splitting it into 3…

We’re often asked to help define a a 5G strategy for operators; while every case is different, there’s a lot of vendors pushing MNOs to move towards 5G standalone or 5G-SA.

I’m always a fan of playing “devil’s advocate“, and with so many articles and press releases singing the praises of standalone 5G/5G-SA, so as a counter in this post, I’ll be making the case against the narratives presented to operators by vendors that the “right” way to do 5G is to introduce 5G Standalone, that they should all be “upgrading” to Standalone 5G.

With Mobile World Congress around the corner, now seems like a good time to put forward the argument against introducing 5G Standalone, rebutting some common claims about 5G Standalone operators will be told. We’ll counterpoint these arguments and I’ll put forward the case for not jumping onto the 5G-SA bandwagon – just yet.

On a personal note, I do like 5G SA, it has some real advantages and some cool features, which are well documented, including on this blog. I’m not looking to beat up on any vendors, marketing hype or events, but just to provide the “other side” of the equation that operators should consider when making decisions and may not be aware of otherwise. It’s also all opinion of course (cited where possible), but if you’re going to build your network based on a blog post (even one as good as this) you should probably reconsider your life choices.

Some Arcane Detail: 5G Non-Standalone (NSA) vs Standalone (SA)

5G NSA (Non Standalone) uses LTE (4G) with an additional layer “bolted on” that uses 5G on the radio interface to provide “5G” speeds to users, while reusing the existing LTE (Evolved Packet Core) core and VoLTE for voice / SMS.

From an operator perspective there is almost no change required in the network to support NSA 5G, other than in the RAN, and almost all the 5G networks in commercial use today use 5G NSA.

5G NSA is great, it gives the user 5G speeds for users with phones that support it, with no change to the rest of the network needed.

Standalone 5G on the other hand requires an a completely new core network with all the trimmings.

While it is possible to handover / interwork with LTE/4G (Inter-RAT Handovers), this is like 3G/4G interworking, where each has a different core network. Introducing 5G standalone touches every element of the network, you need new nodes supporting the new standards for charging, policy, user plane, IMS, etc.

Scope

There’s an old adage that businesses spend money for one of three reasons:

To Save Money (Which we’ll cover in this post)
To make more Money (Covered next – Will link when published)
Because they have to (Regulatory compliance, insurance, taxes, etc)

Let’s look at 5G Standalone in each of these contexts:

5G Cost Savings – Counterpoint: The cost-benefit doesn’t stack up

As an operator with an existing deployed 4G LTE network, deploying a new 5G standalone network will not save you money.

From an capital perspective this is pretty obvious, you’re going to need to invest in a new RAN and a new core to support this, but what about from an opex perspective?

Claim: 5G RAN is more efficient than 4G (LTE) RAN

Spectrum is both finite and expensive, so MNOs must find the most efficient way to use that spectrum, to squeeze the most possible value out of it.

Let’s look at some numbers:

In the case of 3G vs 4G (LTE) there was a strong cost saving case to be made; a single 5Mhz UMTS (3G) cell could carry a total of 14Mbps, while if that same 5Mhz channel was refarmed / shifted to a 4×4 LTE (4G) carrier we hit 75Mbps of downlink data.

In rough numbers, we can say we get 5x the spectral efficiency by moving from 3G to 4G. This means we can carry 5.2x more with the same spectrum on 4G than we can on 3G – A very compelling reason to upgrade.

The like-for-like spectral efficiency of 5G is not significantly greater than that of LTE.

In numbers the same 5Mhz of spectrum we refarmed from UMTS (3G) to 4G (LTE) provided a 5x gain in efficiency to deliver 75Mbps on LTE. The same configuration refarmed to 5G-NR would provide 80Mbps.

Refarming spectrum from 4G (LTE) to 5G (NR) only provides a 6% increase in spectral efficiency.

While 6% is not nothing, if refarmed to a 5G standalone network, the spectrum can no longer be used by LTE only devices (Unless Dynamic Spectrum Sharing is used which in itself leads to efficiency losses), which in itself reduces the efficiency and would add additional load to other layers.

The crazy speeds demonstrated by 5G are not due to meaningful increases in efficiency, but rather the ability to use more spectrum, spectrum that operators need to purchase at auction, purchase equipment to utilize and pay to run.

Claim: 5G Standalone Core is Cheaper to operate as it is “Cloud Native”

It has been widely claimed that the shift for the 5G Core Architecture to being “Cloud Native” can provide cost savings.

Operators should regard this in a skeptical manner; after all, we’ve been here before.

Did moving from big-iron to VNFs provide the promised cost savings to operators?

For many operators the shift from hardware to software added additional complexity to the network and increased the headcount to support this.

What were once big-iron appliances dedicated to one job, that sat in the corner and chugged away, are now virtual machines (VNFs).
Many operators have naturally found themselves needing a larger team to manage the virtual environment, compared to the size of the team they needed to just to plug power and data into a big box in an exchange before everything was virtualized.

Introducing a “Cloud Native” Kubernetes layer on top of the VNF / virtualization layer, on top of the compute layer, leaves us with a whole lot of layers. All of which require resources to be maintain, troubleshoot and kept running; each layer having associated costs for staffing, licensing and support.

Many mid size enterprises rushed into “the cloud” for the promised cost savings only to sheepishly admit it cost more than the expected.

Almost none of the operators are talking about running these workloads in the public cloud, but rather “Private Clouds” built on-premises, using “Cloud Native” best practices.

One of the central arguments about cloud revolves around “elastic scaling” where the network can automatically scale to match demand; think extra instances spun up a times of peak demand and shut down when the demand drops.

I explain elastic scaling to clients as having to move people from one place to another. Most of the time, I’m just moving myself, a push bike is fine, or I’ve got a 4 seater car, but occasionally I’ll need to move 25 people and for that I’d need a bus.

If I provide the transportation myself, I need to own a bike, a car and a bus.

But if use the cloud I can start with the push bike, and as I need to move more people, the “cloud” will provide me the vehicle I need to move the people I need to move at that moment, and I’ll just pay for the time I need the bus, and when I’m done needing the bus, I drop back to the (cheaper) push bike when I’m not moving lots of people.

This is a really compelling argument, and telecom operators regularly announces partnerships with the hyperscalers, except they’re always for non-core-network workloads.

While telecom operators are going to provide the servers to run this in “On-prem-cloud”, they need to dimension for the maximum possible load. This means they need to own a bike/car/bus, even if they’re not using it most of the time, and there’s really no cost savings to having a bus but not using it when you’re not paying by the hour to hire it.

Infrastructure aside, introducing a Standalone 5G Core adds another core network to maintain. Alongside the Circuit Switched Core (MSC/GGSN/SGSN) serving 2G/3G subscribers, Evolved Packet Core serving 4G (LTE) and 5G-NSA subscribers, adding a 5G Standalone Core to for the 5G-SA subscribers served by the 5G SA cells, is going to be more work (and therefore cost).

While the majority of operators have yet to turn off their 2G/3G core networks, introducing another core network to run in parallel is unlikely to lead to any cost savings.

Claim: Upgrading now can save money in the Future / Future Proofing

Life cycles of telecommunications are two fold, one is the equipment/platform life cycle (like the RAN components or Core network software being used to deliver the service) the other is the technology life cycle (the generation of technology being used).

The technology lifecycles in telecommunications are vastly longer than that for regular tech.

GSM (2G) was introduced into the UK in 1991, and will be phased out starting in 2033, a 42 year long technology life cycle.

No vendor today could reasonably expect the 5G hardware you deploy in 2024 to still be in production in 2066 – The platform/equipment life cycle is a lot shorter than the technology life cycle.

Operators will to continue relying on LTE (4G) well into the late 2030s.

I’d wager that there is not a single piece of equipment in the Vodafone UK GSM network today, that was there in 1991.
I’d go even further to say that any piece of equipment in the network today, didn’t even replace the 1991 equipment, but was probably 3 or 4 generations removed from the network built in 1991.

For most operators, RAN replacements happen between 4 to 7 years, often with targeted augmentation / expansion as needed in the form of adding extra layers / sectors between these times.

The question operators should be asking is therefore not what will I need to get me through to 2066, but rather what will I need to get to 2030?

The majority of operators outside the US today still operate a 2G or 3G network, generally with minimal bandwidth to support legacy handsets and devices, while the 4G (LTE) network does most of the heavy lifting for carrying user traffic. This is often with the aid of an additional 5G-NSA (Non-Standalone) layer to provide additional capacity.

Is there a cost saving angle to adding support for 5G-Standalone in addition to 2G/3G/4G (LTE) and 5G (Non-Standalone) into your RAN?

A logical stance would be that removing layers / technologies (such as 2G/3G sunsetting) would lead to cost savings, and adding a 5G Standalone layer would increase cost.

All of the RAN solutions on the market today from the major vendors include support for both Standalone 5G and Non Standalone, but the feature licensing for a non-standalone 5G is generally cheaper than that for Standalone 5G.

The question operators should be asking is on what timescale do I need Standalone 5G?

If you’ve rolled out 5G-NSA today, then when are you looking to sunset your LTE network?
If the answer is “I hope to have long since retired by that time”, then you’ve just answered that question and you don’t need to licence / deploy 5G-SA in this hardware refresh cycle.

Other Cost Factors

Roaming: The majority of roaming traffic today relies on 2G/3G for voice. VoLTE roaming is (finally) starting to establish a foothold, but we are a long way from ubiquitous global roaming for LTE and VoLTE, and even further away for 5G-SA roaming. Focusing on 5G roaming will enable your network for roaming use by a miniscule number of operators, compared to LTE/VoLTE roaming which covers the majority of the operators in the developed world who can utilize your service.

I decided to split this into 3 posts, next I’ll post the “5G can make us more money” post and finally a “5G because we have to” post. I’ll post that on LinkedIn / Twitter / Mailing list, so stick around, and feel free to trash me in the comments.