All posts by Nick


Adventures with Tape

20/06/2026 | Linux | Backups, Tape | Nick

For a long time I’ve run my own backups to a portable hard drive using Duplicity / Déjà Dup.

It’s worked really well. It’s damned slow to pull back a file I’ve accidentally deleted or need an old version of (which I probably do every 6 months or so), but it’s fast to back up, and I do that a whole lot more than I recover. I’ve even used it when swapping laptops to move all my data as a test. Plus, the slowness of recovery acts as a good deterrent to me doing dumb things like dropping partition tables and needing to use the backup.

That worked fine for me as an individual, but at work the team is generating more and more data, and I needed a way to back up all our shared drives and docs. That pile is constantly growing, which means I needed a solution that didn’t involve just buying a lot of drives at crack money.

So I got a tape deck.

It seemed like a good solution; I like that I can eject the tape, and that I can just append to tape with (minimal) risk of nuking the data already there. These two points were how I justified it logically, but it’s also weird tech I haven’t used before (which was the main driver for me wanting to use it).

My first few problems with tape are:

I have more data than a single tape can handle
I have more data than I have free storage capacity to buffer it
I have no idea how to use a tape drive or how to connect it

The Bits

LTO-6 seems like a sweet spot for tape storage at 2.5TB uncompressed (Sure you can buy LTO-10 drives, with 40TB capacity, but they cost more than my car), however because this is an experiment I picked up an LTO-4 (800GB uncompressed) tape drive for $50, plus 10 tapes ($60 / $6 each) so I could try it all out.

The tape drive has SAS connectors on it, and my homelab server has SAS drives in it (a Dell R630 that lives under the couch in my office that my dog sleeps on when I’m at work), but there’s no way I’m fitting this chonky tape drive into a 2.5″ disk bay.

Searching online told me to buy a “Dell 0T93GD T93GD LSI 9300-8e 12Gbps Dual-Port External SAS HBA Controller” which I duly spent $55 on without really understanding what exactly an HBA is.

Being an HBA it’s got the Mini-SAS HD (SFF-8644) connector, which does not look like the SAS connector on my tape drive, so I needed a Mini-SAS HD (SFF-8644) to 29Pin SAS cable, and I grabbed a long one as the tape drive lives on the desk not under the couch.

Wiring it Up

I used my advanced plugging skills to plug things in. I’m not really sure why I added a heading here: I just plugged the female SAS cable into the tape drive, hooked up an ATX power supply (with a paperclip shoved into the ATX connector to turn it on) to power the drive, and plugged the other end of the cable into the server.

In the end I grabbed a spare 10K SAS magnetic drive with 1.8TB on it and used that as a cache on the same HBA, so I’d pull the files onto there, and then push them sequentially to the tape.

Then inside Proxmox I passed the PCI device for the HBA adapter into the VM.

Linux just detected it as a tape drive, and we’re good to go.

The next issue I ran into was read and write speeds of the magnetic cache drive, and the tape.

The fastest I could pull from Google and write to disk was ~480Mbps (okay, because we’re dealing with storage for the most part, I’ll deal in MB/s, which is ~58MB/s). I’ve got more capacity on the link to Google, but not on the drive, which tops out around here.

The tape writes at a pretty sustained 72MB/s, so the tape writes faster than the cache disk fills, which meant the tape jobs finished ~20% faster than the cache filled. This is good because it allowed me to sleep without needing to swap tapes in the middle of the night or fill the cache disk.
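Those numbers roughly hang together. Here's a small Python sketch of the arithmetic, assuming decimal units (1MB = 1,000,000 bytes) and treating the quoted rates as sustained. The straight 480/8 conversion lands slightly above the ~58MB/s I saw in practice.

```python
# Rates quoted in the post; decimal units are an assumption.
DOWNLOAD_MBPS = 480      # megabits per second into the cache disk
TAPE_MB_PER_S = 72       # megabytes per second onto tape


def mbps_to_mb_per_s(megabits_per_second: float) -> float:
    """Convert megabits per second to megabytes per second."""
    return megabits_per_second / 8


def hours_to_move(gigabytes: float, mb_per_s: float) -> float:
    """Hours needed to move the given amount of data at a sustained rate."""
    return gigabytes * 1000 / mb_per_s / 3600


fill_rate = mbps_to_mb_per_s(DOWNLOAD_MBPS)   # 60.0 MB/s
tape_lead = TAPE_MB_PER_S / fill_rate - 1     # about 0.2, i.e. tape is ~20% faster
```

By the same sums, `hours_to_move(800, TAPE_MB_PER_S)` puts a full 800GB LTO-4 tape at a bit over three hours of writing.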

I tried Bacula but it’s not great for this “I have a bunch of data to dump onto a drive” problem, so I instead used tar, which stands for “Tape Archive” (who knew), to compress the files and put them all onto tapes.

The write took a good long while – ~20 hours, but I validated each one as I went and created a file manifest so if I need a particular archive I know which tape it’s on (labeled with advanced Sharpie).
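I did this with the tar command line, but the same idea can be sketched with Python's standard `tarfile` module: stream a directory out as a compressed archive and record which tape each file landed on. The function name, the tape label and the paths in the usage note are all hypothetical.

```python
import tarfile
from pathlib import Path


def write_archive(source_dir: Path, tape, tape_label: str) -> dict:
    """Stream source_dir as a gzip-compressed tar into the open file object `tape`.

    Returns a manifest mapping every archived path to the tape label.
    """
    manifest = {}
    # "w|gz" writes a forward-only stream, so the target never needs to seek.
    with tarfile.open(fileobj=tape, mode="w|gz") as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                name = path.relative_to(source_dir).as_posix()
                archive.add(path, arcname=name)
                manifest[name] = tape_label
    return manifest
```

On Linux the tape would typically be opened with something like `open("/dev/nst0", "wb")` (the non-rewinding device, assuming the drive shows up there), and the returned manifest saved somewhere that is not the tape.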

So the result of the experiment?

Well, the data is backed up, and I now have a good long term storage solution. I also backed up all the family photo albums.

Will I buy an LTO-6 or LTO-7 drive and use this more frequently? Maybe. But did I enjoy learning about it and messing around with a weird storage format, you betcha.

Tales from the Trenches – Gx over Gy?

06/06/2026 | EPC, GSM, History, LTE, Mobile Networks, Notes, RFCs & Standards, SDM | Diameter, EPC, Gx, Gx over Gy, Gy | Nick

I was recently asked by a potential customer if we supported Gx over Gy.

I’d never heard of this before, so I gave my standard “If it’s in the spec we should support it, but I’ll check” answer, and got them to send me a PCAP, which I’ve got.

This is weird.

So for starters, Protocoldex has nothing for this application ID (16777225), even though it has all the LTE Diameter specs.

My starting point was TS 29.230 (Diameter applications; 3GPP specific codes and identifiers), which acknowledged the existence of “Gx over Gy” with IANA code 16777225 and pointed me to TS 29.210, which is a 3G spec (not an LTE / 4G spec).

The last version was from 2006, in 3GPP release 6, which is two years before LTE was standardized in Release 8. The word LTE does not appear in the doc or in the metadata tags.

It speaks of TPF (Traffic Plane Function) and CRF (Charging Rules Function).

LTE is “Long Term Evolution” – In later releases this draft TPF would evolve into the PGW (before the PGW-C / PGW-U divorce) and the CRF would go on to become the PCRF (and save spring break).

Reading through these early specs is like looking at Homo Erectus (get your mind out of the gutter) and knowing it evolves into Homo Sapiens.

So what does Gx over Gy do? Well, the concept is pretty straightforward, rather than needing a Sy interface between the PCRF and OCS, you can provision policy rules from the OCS, rather than on the PCRF.

Of course, you could never run VoLTE on this (the P-CSCF needs the Rx interface to the PCRF to provision dedicated bearers, and the PCRF provisions those as TFTs over the Gx interface).

So what network functions should implement this standard? Well, the P-GW specs do not reference this as something that’s included in the P-GW, nor is it in the GGSN – This was a “gooch” spec between the hypothetical standards land and real world implementations.

So will we be implementing it? Probably not. But an interesting bit of archaeology and a look through the genealogy of 3GPP.

PFCP and SIP Redirect

30/05/2026 | 5G SA, EPC, LTE, Mobile Networks, RF, RFCs & Standards, SDM | Charging, Diameter, IMS, OCS, Online Charging, PFCP, Redirect-Information, SIP | Nick

There’s a cool feature in PFCP that allows you to redirect traffic, which I’ve written about before.

But there’s a funky thing that’s left me scratching my head, in the Redirect information IE, you can set a SIP URI.

That’d be great and all, but PFCP is all about packets not about calls.

So what’s the deal?

Had I uncovered some Machiavellian plot to move channel-associated-signaling onto PFCP instead of TDM links as God intended?

Well, no…

The Redirect Information in PFCP comes from the Redirect Information in Diameter, that’s how your OCS can tell your SMF or your PGW-C (or your TAS) – hey this session is all out of usage, and should be redirected.

Of course, PFCP is just all about packets, but Diameter has a foot in both camps, Gy and Ro are both on Diameter.

So when the 3GPP specced this IE, they just copied the encoding from the Redirect Address Type AVP in Diameter charging base, which has support for calls.
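To make the inheritance concrete, here's a minimal Python sketch of the Redirect-Address-Type values as defined in the Diameter Credit-Control spec (RFC 4006). That PFCP reuses the same four values is my reading of the situation, and the helper function is purely illustrative.

```python
from enum import IntEnum


class RedirectAddressType(IntEnum):
    """Redirect-Address-Type values from RFC 4006."""
    IPV4_ADDRESS = 0
    IPV6_ADDRESS = 1
    URL = 2
    SIP_URI = 3


def makes_sense_on_user_plane(address_type: RedirectAddressType) -> bool:
    """Hypothetical check: a UPF forwards packets, so a SIP URI gives it nothing to act on."""
    return address_type is not RedirectAddressType.SIP_URI
```

A charging node handling calls over Ro can do something useful with `SIP_URI`; a UPF handling packets realistically only cares about the other three.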

I can put down my pitchfork and go and hug my E1 links knowing they’re safe.

Walled Gardens & Redirection in 4G/5G (RedirectInformation)

23/05/2026 | 5G SA, EPC, LTE, Mobile Networks, Notes, RFCs & Standards | 5G, LTE, PFCP, Redirect-Information, SA | Nick

PFCP includes a “Redirect Information” IE, which if set, allows you to change the forwarding action in PFCP to Redirect traffic.

We use this for walled garden redirects, when the OCS reports credit exhausted to the PGW-C, the PGW-C can tell the UPF (PGW-U) that all the traffic from a given subscriber should be redirected to a captive portal / walled garden, like a “Topup Now Page” you’d be used to seeing on Airport WiFi.

“Sign in to network” prompt presented on Cellular

Here’s what the spec says:

8.37. Redirect-Server AVP

The Redirect-Server AVP (AVP Code 434) is of type Grouped and contains the address information of the redirect server (e.g., HTTP redirect server, SIP Server) with which the end user is to be connected when the account cannot cover the service cost. It MUST be present when the Final-Unit-Action AVP is set to REDIRECT. It is defined as follows (per the grouped-avp-def of RFC 3588 [DIAMBASE]):

Redirect-Server ::= < AVP Header: 434 >
                    { Redirect-Address-Type }
                    { Redirect-Server-Address }

So how does this work in practice?

Once upon a time, you’d just intercept all HTTP requests and serve your own content, but it’s not 2005 on Starbucks WiFi anymore, and SSL is everywhere.

Luckily this is a (mostly) solved problem, Apple has “Captive Network Assistant” that probes http://captive.apple.com/hotspot-detect.html and checks for a specific response, Google’s Android has http://connectivitycheck.gstatic.com/generate_204 and does the same thing.

There is a draft RFC to do this better, but it’s not widely adopted.

But before I can tell you what we do, I’ll show you what we’re not doing before we do the doing so you can see what the do does by looking at what happens when we don’t – Clear?

Before we send any Session Modification Request with redirect I can do a DNS lookup, here’s an example from our test jig that goes to Facebook:

A Record lookup for `facebook.com` resolving to `57.145.8.1`

This is just a regular A record DNS query wrapped up in GTP-U as it’d look from an eNB/gNB/SGW, which gets an answer back also in GTP-U.

As we’ve already got a session up in our case, the SMF or PGW-C sends the PFCP Session Modification Request I shared in the screenshot earlier to the UPF.

The Redirect Server Address in the Redirect Information IE in PFCP

We do a few things on the UPF at this point, the first, is that we block forwarding access to all IPs except 10.179.2.135 (The redirect server in the screenshot), and we steal / intercept all DNS queries.

This means if you query facebook.com after the Redirect Information is in place, you get back an A-Record answer for facebook.com but it’s telling you Facebook lives on our redirect server.

We’ve got a whitelist on our UPF for certain domains. If we’re sending you to a self-signup page, you’re going to need to be able to hit our payment processors’ portals (Stripe, Paypal, etc), so we need to allow their domains, but we don’t know their IPs. Instead we do server side DNS lookups (via our DNS servers, before you sneaky kids get any other ideas) for the whitelisted domains, and if a domain is on our DNS whitelist, we allow resolution of it and allow access to the IPs returned in the DNS response.
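The DNS stealing and whitelist behaviour described above can be sketched roughly like this in Python. This is a toy model, not our UPF's code: the class, its fields and the injected `resolve` function are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class WalledGarden:
    """Toy model of per-subscriber redirect state (all names hypothetical)."""
    redirect_ip: str
    allowed_domains: Set[str]
    resolve: Callable[[str], List[str]]  # server-side lookup via our own DNS
    allowed_ips: Set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # The redirect server itself must always be reachable.
        self.allowed_ips.add(self.redirect_ip)

    def answer_dns(self, qname: str) -> List[str]:
        """Return the A records the subscriber should see for qname."""
        name = qname.rstrip(".").lower()
        if any(name == d or name.endswith("." + d) for d in self.allowed_domains):
            real_ips = self.resolve(name)
            self.allowed_ips.update(real_ips)  # open forwarding to these IPs
            return real_ips
        return [self.redirect_ip]  # everything else "lives" on the portal

    def may_forward(self, dst_ip: str) -> bool:
        """Forward only to the portal and to IPs learned from allowed lookups."""
        return dst_ip in self.allowed_ips
```

With `redirect_ip="10.179.2.135"` and `stripe.com` in the allowed set, a query for facebook.com comes back pointing at the redirect server, while a query for a Stripe hostname gets the real answer and punches a hole for exactly those IPs.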

In my lab I’m redirecting HTTP traffic to a management server

Turning it off just involves sending another PFCP Session Modification Request but without the redirect information.

Once this is set we’re back resolving addresses.

WebOS remote without the bloat

16/05/2026 | Notes, Software | Nick

I rooted my WebOS TV.

I’m trying to shift away from all the garbage that comes with so many Android apps. The LG app that acts as a TV remote in particular caught my ire – You need to sign in and give it location permission just to control the TV, and stream the audio to my phone so I can listen with my headphones.

So instead I made a very hacky app to run on the telly, that does both, that I can access from any device.

https://github.com/nickvsnetworking/webos-audiostream-remote


GSM SuperCell

09/05/2026 | GSM, Mobile Networks, RFCs & Standards | Nick

This is an idea I’ve been kicking around for a little while – A single GSM TRX being broadcast across multiple cell sites.

Generally in GSM land, a “TRX” is a cell or a sector – but it doesn’t need to be. Later GSM features like antenna diversity allow the same signal to be broadcast out of multiple ports and received on multiple ports, and for these to even work together.

Knowing this is possible, what if you run a single TRX across multiple cells / sectors / sites?

This means rather than cell site A & cell site B being “neighbors” they’re a single TRX. A subscriber moving between the two sees the same LAC and Cell ID, but they see the signal strength drop and then rise as they move between the cells, but there’s no “handover” – it’d look to the phone the same as going away from a cell then coming closer, which is what they’ve done, but the cell is being broadcast from two locations.

So why do this? Well, while VoLTE is nice, handset support on low end feature phones is still shite, and it means in some of the places we operate we couldn’t capture the low end of the market – those who just needed a basic voice service.
As such we’ve finally added an MSC to our product offering (after almost 10 years of me swearing there’s no way we’d build legacy crap) and a BSC that’s compatible with our existing RAN portfolio (Nokia Airscale), all to be able to address that gap, by running a small GSM layer off our existing radios.

The capacity would be shared of course – this is just one TRX, so 8 full rate channels, but for our use of CSFB for feature phones and garbage IoT devices, it doesn’t matter. On spectral efficiency this is way better than a 5MHz UMTS carrier (smallest you can do) and co-exists nicely with LTE.

Running a large number of sectors / cells on a common single TRX means when you’ve got a boundary where you need to hand to another TRX, you need fewer channels in your reuse pattern. Even running 3 TRXes in our “Super cell” area is only 600kHz of bandwidth consumed, and if the area is large enough we can do a ~3:1 reuse pattern.

For this to work we’ve got to serve all the cells off a single baseband, but baseband hotels are becoming the norm, and fiber is everywhere. I did start exploring if we could do one TRX on multiple BTSes on our BSC. From an Abis perspective it’d work, but we’d need to ensure timing, and I don’t know enough about how the clocking works on our BTSes to say for sure that they’d be in sync even with GLONASS/GPS.
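The spectrum sums are simple enough to sketch in Python. The 200kHz GSM channel spacing is the standard figure; ignoring guard bands is a simplifying assumption, so treat the results as upper bounds.

```python
GSM_CARRIER_KHZ = 200     # GSM channel spacing
UMTS_CARRIER_KHZ = 5000   # smallest UMTS carrier


def gsm_bandwidth_khz(trx_count: int) -> int:
    """Spectrum consumed by a number of GSM TRXes, ignoring guard bands."""
    return trx_count * GSM_CARRIER_KHZ


def gsm_carriers_in(bandwidth_khz: int) -> int:
    """How many GSM carriers fit in a block of spectrum, ignoring guard bands."""
    return bandwidth_khz // GSM_CARRIER_KHZ


supercell_khz = gsm_bandwidth_khz(3)                 # 600
carriers_per_umts = gsm_carriers_in(UMTS_CARRIER_KHZ)  # 25
```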

Would this work at scale? I’ve no idea, but I’m hoping to find out!

Adventures not building 3G UMTS RNCs

02/05/2026 | History, Mobile Networks, Notes | 3G, RNC, UMTS | Nick

I have run AMPS (1G) in my lab. I’ve run 2G (GSM) networks in production.

There’s a few dozen production LTE/5G networks out there I’ve put my stamp on, but…

Never, have I ever, run UMTS.

And that feels like a blind spot.

Sure, core wise 3G reuses the 2G core (MSC / SGSN and friends). As a company, Omnitouch supports 2G, 4G & 5G networks – But we’ve never had the need to deploy 3G.

There’s a common theory that the odd-numbered “Gs” are shit. 3G/UMTS was crap, and 5G / NR standalone is kinda shit. There’s some merit to this theory.

3G/UMTS was a transitional tech, when it was worked on at the turn of the century, there was a growing recognition that this whole internet thing was going to be a big deal, but without the benefit of knowing exactly how people would use it.

In 2G, operators would configure on their networks how many “timeslots” were for voice calls and how many were for data. Imagine drawing a pie chart and rationing out this percentage for the internets, and that percentage for voice calls, and that’s pretty much how it went. The major difference (without going into the “CDMA” code thing) that 3G brought was that operators no longer had to decide the split of cat videos vs calls; the network could decide the split.

Australia shut down GSM about a decade ago, and turned off UMTS close to two years ago, but many countries, including large parts of Europe, plan to operate 2G layers ’til the 2030s. While UMTS/3G survived ~20 years, GSM will be in its 40s at the point where it’s shut off in many countries.

As what my partner somewhat lovingly refers to as a “school holiday project” (Pointless thing I do rather than relaxing like a normal person) I set about trying to build an RNC – A “Radio Network Controller” the UMTS equivalent of a Base Station Controller in GSM, that I could connect to OmniMSC.

So I fired up the Airscale, configured an Iub link and looked at what it sent.

And the answer was a big fat nothing – the first rule of fight club is that the RNC initiates the connection.

I limped the project along to the stage where I had the cell reporting “Up” and visible, but to actually attach phones started to get into the CDMA part of allocating codes, and real life started to catch up with me.

Perhaps someday I’ll continue the work, but the sad truth is there are almost no scenarios today where you wouldn’t be better off deploying LTE and a GSM layer if you need to support legacy devices – the smallest UMTS carrier is 5MHz, and while you can’t do 1:1 frequency reuse in GSM, you’ve got 20-odd unique GSM TRXes (25 at 200kHz each, before guard bands) in that same 5MHz you can use.

So for now, I’m giving up, knowing slightly more about RNC architecture, but still having never done a UMTS attach.

Ansible & SSH – Screwing up en masse

25/04/2026 | Notes | Ansible | Nick

One of my favorite things about Ansible and network automation as a whole is that I can do things in a repeatable manner, super quickly.

One of the worst things about network automation is I can uniformly break things super quickly.

Recently I was working on spinning up a core for a customer, who had some funky VPN stuff, which meant I needed to jump traffic through a jumphost (no biggie).

But I’d set ControlPath in my ansible.cfg file with %%h, as if the percent sign needed escaping. It didn’t: the value was passed through literally, and to SSH %% means a literal percent sign, so the %h token never got expanded.

Protip: It’s just %h not %%h.

This meant that when Ansible created the socket, it didn’t fill in the target hostname, so I had a single socket, which happened to be created on the first VM that we connected to (not consistent).

Then all the other commands for all the playbooks were run on the single VM that the socket was on. Ansible reported it was running the roles and tasks across hosts, but it wasn’t, as everything was happening on one host.

This was very confusing to debug.

If I sshed into box X, its hostname would show box Y, and it’d have the roles deployed from box Z.

I’ve no idea if anyone else will make the same stupid mistake as I did today, but I probably will, so I wrote this down.
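To see why one stray percent sign collapses everything onto a single socket, here's a simplified Python model of how SSH expands tokens in a ControlPath. It only handles `%h`, `%r`, `%p` and `%%`; the function name and the `/tmp/cm-` prefix are made up for illustration, and this is not Ansible's or OpenSSH's actual code.

```python
def expand_control_path(template: str, host: str, user: str, port: int = 22) -> str:
    """Expand the ssh_config tokens %h, %r, %p and %% in a ControlPath template."""
    tokens = {"h": host, "r": user, "p": str(port), "%": "%"}
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template) and template[i + 1] in tokens:
            out.append(tokens[template[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# With %%h every host gets the same literal path, so they share one socket.
broken = {expand_control_path("/tmp/cm-%%h", h, "nick") for h in ("vm1", "vm2", "vm3")}
# With %h each host gets its own path.
fixed = {expand_control_path("/tmp/cm-%h", h, "nick") for h in ("vm1", "vm2", "vm3")}
```

`broken` ends up holding a single path and `fixed` holds three, which is the whole bug in two lines.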

Somebody’s watching me – Adventures in Cellular Location services

20/03/2026 | 5G SA, EPC, EUTRAN, IMS / VoLTE, LTE, Mobile Networks, Security | 5G, EPC, Location, Lpp, LTE | Nick

Preface: I build cellular networks for a job.

We support a network in Alaska, and one of the guys we work with there – John – has a story (which I’ll steal here) where he gets a phone call late at night from someone saying they’re in the US Air Force, and uh, they’ve, uh, lost a plane. And since John works for the phone company, he wouldn’t have any idea where it is would you? They ask him.

As a matter of fact, John could see the last cell the SIM the pilot was carrying was attached to, they sent a helicopter out and found the pilot, who survived.

This was a long time ago, and he was able to pin the location down to a cell (sector), and lookup which direction the sectors were pointing for that cell and the location of it, to give a pretty good idea of the general search area.

Now that everyone carries a GPS in their pockets, the level of accuracy here is a lot more than just which cell are you served by (although that’s a lot of accuracy anyway, and not to be ignored).

There are significant privacy implications here and a lot of misinformation about pinging cell towers and “zoom enhance” stuff.

I figured I’d actually share how this works IRL – There’s nothing ‘secret‘ here – All of this stuff is in the 3GPP standards which outline how mobile networks should behave.

I’ve written a precursor to this a few years ago – And the call was coming from… INSIDE THE HOUSE. A look at finding UE Locations in LTE.

Location Sources & Accuracy

There’s roughly 4 levels of accuracy in cell phone networks, we’ll cover each one, and how the network treats it.

(I’m talking 4G/5G here as most of the world has moved on or is already moving on from 2G/3G)

Tracking Area Level Accuracy

Cell sites get grouped into tracking areas, they’re kinda like broadcast domains in TCP/IP networking, when you need to “page” (find) a phone that’s “idle” (sleeping) you page the tracking area.

Tracking Area sizing has sweet spots: you want more than a few cells, and commonly about a dozen or so in the same geographic area get lumped into the same tracking area. In regional areas you might have a large geographic area – Up to a few hundred km in regional Australia for example, lumped into a single tracking area, whereas in a city that might be a single city block.

If you move between cells inside the same tracking area, then your phone doesn’t need to say to the network “hey I’m moving cells” – It’s only if we go over to a new tracking area that the phone needs to wake up and tell the network it’s now in this new tracking area.

(If you’ve got a tracking area that’s too big (too many cells) then it becomes a nightmare to find who you’re looking for, as the paging channels are always blaring out IDs, tracking areas too small and you’ve got phones having to constantly say “hey I’m moving to this tracking area now” – If you want to learn more about Tracking Areas I’ve written about them on the blog before)

The core network (MME/AMF) always knows the location of a phone at minimum to the tracking area – It’s the base level of location the network has to work with.

Cell ID Level Accuracy (CGI / E-CGI)

Every cell site sector (cell) has a unique ID to denote which carrier you’re connected to. If you’ve got a 3 sector site, with a single layer per cell sector, then that’s 3 Cell Global Identifiers (CGIs) – one for each sector.

Here’s a tower we put up recently, the CGIs I’ve drawn on are just examples, but if you’re connected to the sector facing North, you’d have CGI of 111, if you’re connected to the cell to the south east, you’d have 112, and the one to the south west would be 113.

CGIs are just numbers, they could be any number, all that matters is that number is unique (ish) in the network, they don’t need to be sequential, or have any common digits.

If we know the CGI of a given user we can kinda draw a 1/3rd wedge off the side of the tower in the direction the antenna is pointing, and if you’re inside that wedge, and that tower is still providing coverage, then we know the customer is somewhere inside that wedge.

But those wedges can still be large, so the margin of error for locating someone is still pretty large. You can probably answer the question of “Are they in the office or are they at home” if they’re in different suburbs.

There was a recent case of a misconfigured Mavenir IMS in the O2 network in the UK that was leaking CGI information on calls between two parties, as the SIP messages contained this and were not getting stripped before being passed to the B-Party.

When the network wants to know a bit more about where the phone is located, it can ask the cell site which Global Cell ID the phone is in, this is pretty rare, but can be done. When the phone is actively doing stuff, like making a call, using data or sending a text, the network knows the CGI of the event.

My lab is setup with CGI 4000 and TAC 100, and this information is littered across every signaling message.

Note: The encoding shows up as 0000 0000 0000 0000 1111 1010 0000 …. = cell-ID: 0x0000fa0 for CGI 4000, just roll with it, the spec explains why this is.

A SIP REGISTER message from my lab, showing the CGI (*00640000fa0*)
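The hex string in that capture is just the TAC and cell ID packed side by side. A minimal Python sketch, assuming a 16-bit TAC followed by a 28-bit cell identity (which is what the 4 + 7 hex digits suggest); the function name is my own.

```python
def tac_eci_hex(tac: int, eci: int) -> str:
    """Render a 16-bit TAC followed by a 28-bit cell identity as lowercase hex."""
    if not 0 <= tac < 2 ** 16:
        raise ValueError("TAC must fit in 16 bits")
    if not 0 <= eci < 2 ** 28:
        raise ValueError("cell identity must fit in 28 bits")
    # 16 bits is 4 hex digits, 28 bits is 7 hex digits.
    return f"{tac:04x}{eci:07x}"


lab_cell = tac_eci_hex(100, 4000)   # "00640000fa0"
```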

GNSS LPP Positioning

When the Cell ID level is not accurate enough, the network can request the phone to provide its location, using whatever it’s got available to it.

In reality, this is either done by an engineer from the phone company with the permissions to do so, or directly by law enforcement using the SLh/SLg Diameter interfaces.

When an engineer does it, there’s usually a portal they can go to, like this one in OmniMME, they search the IMSI or MSISDN, and then can get the location information via a variety of methods.

Your phone gets a message from the network, that says “Hey phone, tell me where you are”.

If you’ve got access to RRC messaging / NAS messaging on your phone through QXDM / Diag mode – You can see these requests.

If you’ve got enough access to the baseband you can even block these requests should you feel so inclined.

I’ve included some Wireshark captures of how this actually looks and how it looks from the Web UI of the MME, with the address removed.

OTDOA – “Pinging”

Sometimes you don’t get an indoor location with GPS or the phone might be too old to support LPP Positioning, no GPS built in or something.

In those scenarios, we use “Time Difference of Arrival” to calculate the position by measuring time between 2 or more cell sites, and calculating the time between when a signal was sent to a phone, and when it receives it, to calculate distance from the base station.

This is better than CGI as it gives you an idea of how far the phone is from each cell site, but it doesn’t return a map with “you are here”, but rather some rough distances, and CGIs for each cell it can see.

The engineer then pulls up a map of all the cell sites, finds the CGIs the cell phone can see, headings for each CGI and tries to do some early high school maths like someone’s life actually depends on it.
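The core of the maths is just time multiplied by the speed of light. A rough Python sketch, with hypothetical function names and ignoring everything that makes this hard in real life (multipath, clock error, measurement granularity):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def distance_m(time_s: float) -> float:
    """Distance a radio signal covers in time_s seconds."""
    return SPEED_OF_LIGHT_M_PER_S * time_s


def range_difference_m(arrival_a_s: float, arrival_b_s: float) -> float:
    """How much further the phone is from site A than from site B,
    assuming both sites transmitted at the same instant."""
    return distance_m(arrival_a_s - arrival_b_s)


one_microsecond = distance_m(1e-6)   # roughly 300 metres
```

So every microsecond of timing error is about 300 metres of position error, which is why the result is a rough distance and not a pin on a map.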

Protocoldex – A quick reference guide for AVPs / IEs / Commands / Applications

13/03/2026 | 5G SA, GSM, IMS / VoLTE, LTE, Mobile Networks, RFCs & Standards | 5G, Diameter, EPC, GTP-C, IMS, LTE, MAP, VoLTE | Nick

I do a lot of protocol testing, writing Diameter/PFCP/GTP-C etc, and spend a lot of time referencing the standards.

So I built this – Inspired by a 1990s video game / TV / Playing card franchise online reference tool, but rather than identifying pocket monsters, it’s identifying AVPs and stuff.

You can punch in the AVP code, AVP name, description, etc, for Diameter, PFCP, GTP-C, MAP or SBI and see all the details to go with it.

I’ve been using it a heap, hopefully some of you might find it useful:

Protocoldex

Tales from the Trenches – PGW-C Deleting Sessions

06/03/2026 | EPC, GSM, LTE, Mobile Networks, Notes | Nick

One of our customers is an MVNE and they reached out the other day with an issue.

They were turning up a new PGW and they’d see Create Session Request, everything looked OK, it’d get a response, but then in the GUI of the PGW-C they’d see the session drop.

The logs showed the newly setup session dropping shortly after being setup.

Have a look at the screenshot and see if you can work out why:

So what’s going on, and why is the PGW-C deleting sessions?

The initial reaction from the customer was there’s something up with the PGW, but the answer is a bit more nuanced.

Per the specs, you can’t have two PDN sessions for the same subscriber (IMSI) on the same APN (DNN).

So if 50557000000001 is connected to the PGW-C on the internet APN, if I send another Create Session Request to the same PGW-C, it deletes the old session, before starting the new one.

In this case, the MVNE it was going through was dropping the Create Session Response, so it never made it back to the MNO, and then the MME in the MNO sent it again.
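The behaviour is easy to reproduce with a toy session table keyed on (IMSI, APN). This is a sketch of the rule as described above, not our PGW-C's implementation; the class and log messages are hypothetical.

```python
from typing import Dict, List, Tuple

SessionKey = Tuple[str, str]  # (IMSI, APN)


class SessionTable:
    """Toy PGW-C session store: one session per (IMSI, APN)."""

    def __init__(self) -> None:
        self.sessions: Dict[SessionKey, int] = {}
        self.log: List[str] = []
        self._next_id = 1

    def create_session(self, imsi: str, apn: str) -> int:
        """Handle a Create Session Request, replacing any existing session."""
        key = (imsi, apn)
        if key in self.sessions:
            old = self.sessions.pop(key)
            self.log.append(f"deleted session {old}")
        session_id = self._next_id
        self._next_id += 1
        self.sessions[key] = session_id
        self.log.append(f"created session {session_id}")
        return session_id


table = SessionTable()
table.create_session("50557000000001", "internet")  # original request
table.create_session("50557000000001", "internet")  # retransmission from the MME
```

After the second call, `table.log` reads created, deleted, created: exactly the "session drops right after setup" pattern the customer was seeing in the GUI.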

Joys of GTPv2-C being UDP based and connectionless!

Telco to Tech-Co. No. Just Telco. Tech-No.

27/02/2026 | History, Rants | Rants, Telecom | Nick

Over a decade ago, Dan McKinley published a blog post titled “Choose Boring Technology” which advocates that software developers and engineers design systems using “boring” technology.

One of the most worthwhile exercises I recommend here is to consider how you would solve your immediate problem without adding anything new. First, posing this question should detect the situation where the “problem” is that someone really wants to use the technology. If that is the case, you should immediately abort.
Dan McKinley’s blog post “Choose Boring Technology”

In the long time since he wrote that post, I feel like a lot of the tech industry has matured in its approach and learned these lessons – We’re not jumping onto the bleeding edge new-fangled tools as much, and instead developers and engineers are sticking to tried and tested design patterns in order to achieve the business goals.

No Hayden, this doesn’t mean I’ll write PHP.

But it feels like in telecom at least, leadership teams have not learned this lesson.

Every telco conference I go to I hear about “telco to tech co transformation” in presentations from telecom CTOs on the verge of retirement, and it’s bullshit.

We are a boring tool – Telecom is the boring technology.

Boring it may be, but customers have only shown a demand for connectivity from telecom operators.

End customers aren’t showing demand for TV content from their operators (sports rights anyone?), bundled services, AI something, metaverse, connected cars, crypto, 5G slicing, containers, edge, robotic surgery or whatever else is being shilled as the next best thing for telecom operators.

The idea of diversifying into other revenue streams fails to ask the question of why telcos would be better suited to deliver value in those markets than literally every other organization on the planet. In the majority of cases, there’s no strong case to be made for telcos to take the lead here. Telecom projects generally have a much higher failure rate than that of most other sectors in tech, so why would telcos be best suited to these new industries?

Customers have consistently shown demand for fast, reliable, affordable access to connectivity, and only that.
That’s what customers want from us as an industry – We should have a laser like focus on delivering that, better and better year on year, and not chasing distractions.

As an example, go to Google Maps reviews, or Down Detector and find the feedback from customers of any given telco. While telcos crow about NPS the lived experience of customers – justified or not, is often pretty piss poor.

Chasing becoming a tech-co distracts from the core business for a network operator, of, well, operating the network.

Any talk of “business transformation” and shifting to becoming a tech-co just distracts from that mission.

We’re largely utilities and we’re not sexy, but that’s okay.

We all know Bell Labs, Australia’s Telecom labs and others produced some amazing technology inventions and were at the forefront of tech in their day – Shockley, Shannon, etc. But that was finished by the 1970s, and required a state monopoly with an R&D budget that rivaled that of many small countries.
That monopoly is gone, and that money is gone. We can’t compete on broad tech innovation.

But consistently, since the telephone was first introduced in the 1800s, communications has been what customers want.

What we do isn’t shiny, but it is critical, and there’s still plenty of room for improvement in our space, to do things better.
I hope we as an industry focus on just doing what we do now but better, and embrace being a boring technology.

Packet Buffering in the UPF

20/02/2026 | 5G SA, EPC, LTE, Mobile Networks | 5GC, EPC, GTP, LTE, PFCP | Nick

When a UE enters Idle mode, the network releases radio resources and the UE enters power saving mode.

When the UE wants to send data (Uplink) the UE just tells the network “hey I want to send something” and away it goes, nice and simple.

But when the network wants to send data to the UE (Downlink) then the UPF needs a method to tell the Control Plane (SGW-C or SMF) that there’s data waiting and to go and page the UE.

A prime example of this is when you’ve got a Mobile Terminated VoLTE call coming in, you need a way to tell the UE to wake up out of Idle mode because you’ve got something to send to it (a SIP INVITE).

But in order for this to work, we can’t just say “Hey I’ve got some packets for you” and let them get dropped, the UPF also needs to buffer (store temporarily) the downlink packets for the UE until the UE comes out of Idle mode, and then flush them out to deliver them to the UE.

So let’s look at the flow.

Enabling Buffering (Idle Mode)

When the sub enters idle mode, the Control Plane (SGW-C for an EPC or SMF for a 5GC) sends a Session Modification Request with the BUFF (Buffer) and NOCP (Notify Control Plane) flags set, and FORW (Forward) turned off.

What this means is now for packets to that bearer, the UPF must:

Not forward any traffic
Buffer the traffic
Notify the control plane when the first packet comes in that we buffer

Then the UPF just sits and waits for any incoming packets.
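As a rough sketch of what that flag change amounts to, the Apply Action can be modelled as a bitmask. The bit positions below follow my reading of the Apply Action IE in TS 29.244 (DROP, FORW, BUFF and NOCP as the low four bits), so treat them as an assumption to check against the spec. The helper names are made up for illustration and don't come from any real PFCP stack.

```python
from enum import IntFlag


class ApplyAction(IntFlag):
    """Low bits of the PFCP Apply Action IE (assumed layout, check TS 29.244)."""
    DROP = 0x01
    FORW = 0x02
    BUFF = 0x04
    NOCP = 0x08


def idle_mode_action() -> ApplyAction:
    # What the control plane asks for when the UE goes idle:
    # buffer downlink packets and notify on the first one, no forwarding.
    return ApplyAction.BUFF | ApplyAction.NOCP


def describe(action: ApplyAction) -> dict:
    # Hypothetical helper: spell out what the UPF must do for a given action.
    return {
        "forward": ApplyAction.FORW in action,
        "buffer": ApplyAction.BUFF in action,
        "notify_cp": ApplyAction.NOCP in action,
    }


print(hex(idle_mode_action()))   # 0xc
print(describe(idle_mode_action()))
```

With FORW cleared and BUFF plus NOCP set, the three bullet points above fall straight out of the flags.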

The Notify

When the UPF gets an incoming packet that it’s supposed to buffer and notify on, well, it does just that.

The packets are copied into a buffer, in sequence, and for the first packet, the UPF must send a notification to the Control Plane.

That looks like this, it’s just a Session Report Request with the Downlink Data Report flag.

Now the SMF/SGW-C sends back a Session Report Response and starts the process of paging the UE.

At the same time the UPF keeps buffering. Its work is not done.

Flushing and Forwarding

Once the UE has become reachable, the Control Plane needs to modify the bearer to turn forwarding back on. It does this through another Session Modification Request, which is the inverse of the one it sent to turn on buffering: we’re turning off buffering and notifications, and turning on forwarding.

Now the UPF flushes its buffer. It’ll send all the packets that were queued up out over the wire towards the gNB / eNodeB, so the SIP INVITE for the MT call or whatever will make it through.
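The whole buffer / notify / flush cycle can be sketched as a toy model of a single bearer. This is only an illustration of the behaviour described here, not how any particular UPF implements it; the class and method names are invented, and real UPFs also cap the buffer size, which is left out.

```python
from collections import deque


class ToyUpfBearer:
    """Toy model of one bearer's downlink handling; not a real UPF."""

    def __init__(self):
        self.forwarding = True
        self.buffer = deque()
        self.notified = False
        self.reports = []     # Session Report Requests sent to the control plane
        self.delivered = []   # packets sent towards the eNodeB / gNB

    def enable_buffering(self):
        # Session Modification Request: BUFF + NOCP on, FORW off
        self.forwarding = False
        self.notified = False

    def enable_forwarding(self):
        # Session Modification Request: FORW on, BUFF + NOCP off
        self.forwarding = True
        while self.buffer:                # flush in arrival order
            self.delivered.append(self.buffer.popleft())

    def downlink(self, packet):
        if self.forwarding:
            self.delivered.append(packet)
            return
        self.buffer.append(packet)
        if not self.notified:             # only the first buffered packet notifies
            self.reports.append("Downlink Data Report")
            self.notified = True


bearer = ToyUpfBearer()
bearer.enable_buffering()
for pkt in ["SIP INVITE", "retransmit 1", "retransmit 2"]:
    bearer.downlink(pkt)
print(bearer.reports)     # one report, however many packets were buffered
bearer.enable_forwarding()
print(bearer.delivered)
```

Three packets arrive while the UE is idle, but the control plane only hears about the first one, and everything is delivered in order once forwarding is re-enabled.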

One thing to note is that the buffered packets are going to take some time to get delivered, as the whole notify / page UE / reconnect UE / Session Modification Request (to enable forwarding again) sequence needs to happen before the buffers are flushed and delivered.

Notice the latency spike on the first packet? 610ms? That’s because the UE had to be paged to wake up.

And that’s pretty much it, the UPF has now flushed its buffers and moves back to forwarding actions.

Field Trip – UTS Nokia Lab

13/02/2026 | Australian Telco, Mobile Networks | 5G, EUTRAN | Nick

I was recently in Sydney for a GSMA event, with a few extra days in town to show one of our team members, who was visiting from the US, around a city I don’t even live in.

So I reached out to our Nokia account manager and asked if I could visit “The pointy room”, the antenna chamber they’ve got there, and the Nokia lab.

It did not disappoint.

As well as all the fancy RAN kit and pointy room, they’ve got the kind of machine shop I dream of.

Thanks to Paul, Dave and Ed for showing me around.

We also had a few days hiking in the Blue Mountains, not as impressive as RAN kit.

Network Instance in PFCP

06/02/2026 | 5G SA, Mobile Networks | 5GC, PFCP, UPF | Nick

I was recently looking for a field I could use in PFCP to denote the VRF / Network Segment to be used, and initially thought Network Instance would be perfect for this.

It’s not.

Network Instance is kinda preferred over the APN/DNN for decisions, for example a Packet Detection Rule (PDR) does not give a damn what you’ve set as the APN/DNN, only what the Network Instance is set to:

a combination of the parameters, that incoming packets are requested to match, among: Local F-TEID, Network Instance, UE IP address(es), SDF Filter(s) and/or Application ID. For 5GC, the PDI may additionally contain one or more QFI(s) to detect traffic pertaining to specific QoS flow(s), Ethernet Packet Filter(s) and/or Ethernet PDU Session Information (see clause 5.13.1).
TS 129 244 – 5.2.1A Packet Detection Rule Handling

So the PDRs actually look at Network Instance not APN/DNN.
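To make that concrete, here is a minimal sketch of PDI matching using only a few of the fields from the quoted spec text. The class names, the wildcard-on-None rule and the example values are all assumptions for illustration; a real UPF matches on far more than this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pdi:
    """A cut-down PDI: only some of the fields named in the spec text."""
    network_instance: Optional[str] = None
    local_teid: Optional[int] = None
    ue_ip: Optional[str] = None


@dataclass
class Packet:
    network_instance: str
    teid: Optional[int]
    ue_ip: str
    apn: str  # carried here only to show it plays no part in matching


def pdi_matches(pdi: Pdi, pkt: Packet) -> bool:
    # A field left as None in the PDI is treated as a wildcard (assumption).
    checks = [
        (pdi.network_instance, pkt.network_instance),
        (pdi.local_teid, pkt.teid),
        (pdi.ue_ip, pkt.ue_ip),
    ]
    return all(want is None or want == got for want, got in checks)


pdi = Pdi(network_instance="internet", ue_ip="10.0.0.5")
same_ni = Packet("internet", None, "10.0.0.5", apn="ims")
other_ni = Packet("enterprise", None, "10.0.0.5", apn="internet")
print(pdi_matches(pdi, same_ni), pdi_matches(pdi, other_ni))   # True False
```

The APN on the packet never enters the comparison; only the Network Instance (and the other PDI fields) decide the match.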

Well, back to the drawing board…

Filtering OpenStreetMap Data for use in Forsk Atoll

30/01/2026 | Mobile Networks, RF, Software | Atoll, Forsk, OpenStreetMap, RF Design | Nick

Taking a full extract from OpenStreetMap using the export tool on the website provides a .osm output file, which includes all the layers.

I’ve written about using GlobalMapper to convert this data into shapefiles for use in Forsk Atoll, but when I want roads, and building footprints, and an outline of the area, I need all of those as separate files.

Enter the osmfilter application, which does just that.

It’s a simple binary (Download from here) you can use to extract just the layers you need, for example:

osmfilter.exe map.osm --keep="natural=coastline" -o=coastline.osm

Gets me just the coastline outline in coastline.osm

osmfilter.exe map.osm --keep="building=" -o=buildings.osm

Gets me just the building footprints in buildings.osm.

osmfilter.exe map.osm --keep="highway=" -o=roads.osm

Gets me all the roads.

I can then import this data using GlobalMapper and export it as a Shapefile, which I import into Forsk Atoll and style (the buildings, the roads, the whatever) however I want.
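If you’re pulling the same layers out of every extract, the three commands can be generated from a small table. This is just a sketch: the layer names and the helper function are my own, and the only osmfilter options used are the --keep= and -o= ones shown in the commands above.

```python
import shlex

# Layer name -> osmfilter --keep expression, taken from the commands above.
LAYERS = {
    "coastline": "natural=coastline",
    "buildings": "building=",
    "roads": "highway=",
}


def osmfilter_command(source, layer, binary="osmfilter"):
    """Build the argument list for one layer extract."""
    return [binary, source, f"--keep={LAYERS[layer]}", f"-o={layer}.osm"]


for layer in LAYERS:
    print(shlex.join(osmfilter_command("map.osm", layer)))
    # To actually run it:
    # subprocess.run(osmfilter_command("map.osm", layer), check=True)
```

Passing the arguments as a list means no shell quoting is needed when the commands are run through subprocess.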

Stupid Mistakes – New UPF and IMS

23/01/2026 | 5G SA, EPC, IMS / VoLTE, LTE, Mobile Networks, Notes, VoIP | EPC, GTP, IMS, LTE, SIP, UPF, VoIP, VoLTE | Nick

Our team recently shipped a new UPF which is a huge improvement on our old UPF, and I drew the short straw of doing all the interop testing for the IMS.

Initially I thought there was an issue with IP routing, as I’d never see the SIP REGISTER from the UE, but I would see the IMS APN coming up.

I could access the internet from the UE IPs just fine, but that’s going to public IPs, whereas the P-CSCF is in private address space, and hosted on the same box as the UPF.

I spent hours on this as my lab servers do routing on a stick, and I thought some hardware offload somewhere was trying to fast path my packets and send them back to the server without going via the router.

Then I dug a little deeper and found I could see the 3 way handshake between the UE and the P-CSCF, but no SIP packets.

Successful 3 way handshake between the UE and the P-CSCF on TCP 5060

This was confusing, clearly we had at least intermittent two way comms – the 3 way TCP handshake confirmed that, but then why were packets not getting across?

We have an XCAP server hosted on our P-CSCF instances, so I tried hitting that from the phone in case there was something weird about routing to the network segment that hosts the P-CSCF. I could hit the XCAP server just fine, so now I was certain the UE IP pool could route to the P-CSCF, the TCP 3 way handshake was working, and payload could be pushed.

Clearly we can route to the P-CSCF as that’s where this XCAP server is hosted

Then I dug into what happened after the 3 way handshake, and I found a TCP payload containing the start of the SIP REGISTER.

Hmm, we have a SIP Fragment here at least…

I traced it all the way through and lo, it’s hitting the P-CSCF:

And the fragment is received on the P-CSCF

Okay, but then what happens, because it’s only a fragment, not the complete re-assembled packet, so what’s going on?

Well, the P-CSCF sends a TCP ACK back to the UE.

And the TCP fragment containing the first part of the REGISTER gets an ACK back from the P-CSCF

The ACK gets forwarded to the UPF:

And then… Nothing? The UPF never encaps the TCP ACK back into GTP-U and never sends it on to the base station.

Eventually the UE re-sends the payload with the start of the REGISTER, but it does not get the ACK from the P-CSCF.

Retransmitted TCP segment containing the REGISTER from the UE

So naughty UPF right? Not forwarding that ACK for some reason?

I started digging, maybe the ACK was getting routed weirdly and landing on the UPF without going through the router?

Well not quite…

When I started digging into the QER rules being installed I noticed the MBR bitrate we had on the IMS APN in the HSS was tiny.

The UPF can only gate traffic towards the UE, so it was gating the ACK: the tiny MBR in the QER left no bandwidth, and the ACK never made it back.
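A token bucket is one common way to enforce a bitrate cap, and it shows how a tiny MBR can swallow even a bare ACK. The sketch below is not how our UPF implements QERs, and every number in it (1 kbps MBR, 64 byte burst, the packet sizes and timing) is invented for illustration, since the real values aren’t recorded here.

```python
class TokenBucket:
    """Very small token-bucket policer; rate in bits/sec, sizes in bytes."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes = rate_bps / 8
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, now, size):
        # Refill for the time elapsed, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_bytes)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False   # over the MBR: the packet is dropped


# Assumed values: 1 kbps downlink MBR, 64 byte burst allowance.
qer = TokenBucket(rate_bps=1000, burst_bytes=64)
syn_ack_ok = qer.allow(0.00, 60)   # SYN-ACK towards the UE fits in the burst
ack_ok = qer.allow(0.10, 52)       # ACK for the REGISTER segment, 100 ms later
print(syn_ack_ok, ack_ok)          # True False
```

The handshake squeezes through on the initial burst, which matches seeing a working 3 way handshake followed by silence.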

Time wasted – About 4 hours, but I will not make this mistake again!

SS7/MAP – Call Forward

16/01/2026 | GSM, Mobile Networks, RFCs & Standards | Forward, HLR, MAP, MSC, SS7 | Nick

I’ve covered how SS7/ISUP handles call forward before, but the HLR can also store call forwarding information.

This is returned to the MSC when the SendRoutingInfo dialog is performed against the HLR.

If it’s present the MSC will redirect the call to that destination, after bouncing it through CAMEL (if enabled).

A lot simpler than Call Forward in IMS, but same outcome.

A lot of HSSes we see are just HLRs under the hood, and for this reason only implement a minimalist MMTel feature set for call forwarding, so that it tracks across both.

Loading OpenTopography.org data into Forsk Atoll

09/01/2026 | Mobile Networks, RF, Software | Atoll, DEM, Forsk, GIS, RF Design | Nick

Looks like this is my 3rd (and hopefully final) post on the topic of loading Digital Elevation Models / Topographic data into Forsk Atoll, because this time we’ve got global data, which gives us Digital Elevation Models at 30m resolution for anywhere on the planet.

The Copernicus DEM is a Digital Surface Model (DSM) which represents the surface of the Earth including buildings, infrastructure and vegetation. This DSM is derived from an edited DSM named WorldDEM, where flattening of water bodies and consistent flow of rivers has been included. In addition, editing of shore- and coastlines, special features such as airports, and implausible terrain structures has also been applied.

The WorldDEM product is based on the radar satellite data acquired during the TanDEM-X Mission, which is funded by a Public Private Partnership between the German State, represented by the German Aerospace Centre (DLR) and Airbus Defence and Space. OpenTopography is providing access to the global GLO-30 Defence Gridded Elevation Data (DGED) 2023_1 version of the data hosted by ESA via the PRISM service. Details on the Copernicus DSM can be found on this ESA site.

This is a tool for a job. 30m resolution is not crazy high (LIDAR scans achieve sub 1m accuracy) but LIDAR isn’t available everywhere, whereas the COP30 dataset is global, meaning we can do RF design for anywhere on the planet.

So how do we get this into Atoll to do RF modeling?

We visit the OpenTopography.org website, and select the target area for the data we want to download.

Then you punch in an email address, grab a drink and find the download link in your inbox.

You’ll get a zip file, containing a file named something like output_hh.tif.

Alas, we can’t import this straight into Atoll, if we do, we’ll see this issue:

So I converted it using GlobalMapper as I’ve talked about in the past, but ran into this issue on the exported products:

The pixel size was always zero meters, even though I’d changed it inside GlobalMapper.

The issue, I later worked out, is with projections.

So to work around this I imported the data into GlobalMapper, where I could see it fine (and I added some OSM data like roads and building footprints).

I had to re-project the data, so inside GlobalMapper I went to Tools -> Configure.

Then change the projection to UTM and use the UTM Zone Finder to find out what zone I needed.
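If you’d rather work the zone out than look it up, the standard UTM grid is just 6 degree wide strips counted east from 180°W. A minimal sketch (the function names are mine, and it ignores the special-case zones around Norway and Svalbard):

```python
import math


def utm_zone(longitude_deg: float) -> int:
    """Standard 6-degree UTM zone number for a longitude in [-180, 180)."""
    if not -180.0 <= longitude_deg < 180.0:
        raise ValueError("longitude must be in [-180, 180)")
    return int(math.floor((longitude_deg + 180.0) / 6.0)) + 1


def utm_hemisphere(latitude_deg: float) -> str:
    # UTM projections are defined separately for each hemisphere.
    return "N" if latitude_deg >= 0 else "S"


# Sydney is roughly 33.87 S, 151.21 E
print(utm_zone(151.21), utm_hemisphere(-33.87))   # 56 S
```

Use the centre of the area you downloaded; if the extract straddles a zone boundary, pick the zone holding most of it.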

Then I exported the data as an Erdas Imagine file and was able to import it happily into Atoll, after setting the matching projection in Atoll (which you generally do at the start of the project).

Elixir – ETS vs Map Gut Feels (With Data)

02/01/2026 | Elixir, Software | Elixir | Nick

Some questions I wanted to empirically answer with Elixir:

Does the size of the value in a Map or ETS object impact the time it takes to retrieve a given record (specifically to find that key from others)?
Does the number of keys in ETS or a Map impact the time it takes to retrieve a given record?

So what did I find:

Question: Does the size of the value in a Map or ETS object impact the time it takes to retrieve a given record (specifically to find that key from others)?

Answer: In Maps, a 1000x larger payload has a fairly minimal impact (about 4%) on read speed, but the impact is ~30% when using ETS.

Methodology used:

Compared 100000 and 100 byte payloads for read and write times.

ETS Write is 24% slower with the larger payload (That’s to be expected, we’ve got more data to copy)
ETS Read is 30% slower with the larger payload (Suggests that the size of the payload has a bearing on how quickly it can be indexed)
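Map write and read only moved by about 13% and 4% respectively between the two payloads, and in the runs pasted below the larger payload was actually the quicker of the two, so this looks more like run-to-run noise than a payload effect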

  @reads 14_880_000
  @distinct 100_000
  @payload_bytes 100000

=== Results ===
Payload gen: 0 ms for 100000 bytes

ETS write:   223 ms (448,430.5 writes/sec)
ETS read:    15933 ms total (933,910.8 reads/sec)

Map write:   108 ms (925,925.9 writes/sec)
Map read:    2355 ms total (6,318,471.3 reads/sec)

  @reads 14_880_000
  @distinct 100_000
  @payload_bytes 100
=== Results ===
Payload gen: 0 ms for 100 bytes

ETS write:   177 ms (564,971.8 writes/sec)
ETS read:    11149 ms total (1,334,648.8 reads/sec)

Map write:   125 ms (800,000.0 writes/sec)
Map read:    2470 ms total (6,024,291.5 reads/sec)

Question: Does the number of keys in ETS or a Map impact the time it takes to retrieve a given record?

Answer: Writing 1000 records is of course super quick. That much should be obvious, but the retrieve / read time is what I’m interested in here.

ETS: Yes, there is a ~40% performance hit in retrieving a record from an ETS DB with 14m records compared to a DB with 1k records. This doesn’t appear to be linear, but there is an impact.

Map: Boy howdy, there’s a 10x increase in the time to get data when the data set is larger. This is probably to do with how simple Maps are compared to ETS, which is definitely a better tool for the job when working with more records.

This led to an interesting realization: Map is faster for smaller sets of data (fewer keys, not smaller values), but ETS is faster for larger data sets (again more keys, not counting the values), and there’s a break-even point for ETS usage.

TL;DR: For ETS, the number of keys has a manageable impact (40%) on retrieve time. For Map, the number of keys has a very real (10x) impact on read times.

Compared 1000 distinct keys vs 14.8m distinct keys.

  @reads 14_880_000
  @distinct 1000
  @payload_bytes 1000
ETS write:   1 ms (1,000,000.0 writes/sec)
ETS read:    9769 ms total (1,523,185.6 reads/sec)

Map write:   9 ms (111,111.1 writes/sec)
Map read:    1292 ms total (11,517,027.9 reads/sec)

  @reads 14_880_000
  @distinct 14_880_000
  @payload_bytes 1000
ETS write:   26176 ms (568,459.7 writes/sec)
ETS read:    16363 ms total (909,368.7 reads/sec)

Map write:   45503 ms (327,011.4 writes/sec)
Map read:    12926 ms total (1,151,168.2 reads/sec)
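For anyone wanting to check the percentages against the raw timings, here is a small Python sketch (not the Elixir benchmark itself, which isn’t shown) that recomputes them from the read times printed above. The ~30% and ~40% ETS figures line up with the drop in reads/sec rather than the increase in elapsed time. These are single runs, so small differences can easily be noise.

```python
def pct_change(baseline_ms: float, other_ms: float) -> float:
    """Percent change in elapsed time going from baseline to other."""
    return (other_ms - baseline_ms) / baseline_ms * 100


def throughput_drop(baseline_ms: float, other_ms: float) -> float:
    """Percent drop in ops/sec for the same number of operations."""
    return (1 - baseline_ms / other_ms) * 100


# Read timings (ms) copied from the runs above; same 14_880_000 reads each.
payload = {"ETS": (11149, 15933), "Map": (2470, 2355)}   # 100 B vs 100000 B
keys = {"ETS": (9769, 16363), "Map": (1292, 12926)}      # 1000 vs 14_880_000 keys

for name, runs in (("payload size", payload), ("key count", keys)):
    for store, (small, large) in runs.items():
        print(f"{name:12} {store}: time {pct_change(small, large):+.1f}%, "
              f"throughput {-throughput_drop(small, large):+.1f}%")
```

Which of the two measures you quote matters: a 30% drop in throughput is the same thing as a roughly 43% increase in elapsed time.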

Nick vs Networking

Telco Network Engineering