Technology is constantly evolving, and new research papers are published every day.
But recently I was shocked to discover I'd missed a critical development in communications, one that upended Shannon's "A Mathematical Theory of Communication".
I’m talking of course, about the GENERATION X PLUS SP-11 PRO CELL ANTENNA.
I've been doing telecom work for a long time. While I mostly write here about Core & IMS, I'm a licenced rigger too – I've bolted a few things to towers and built my fair share of mobile coverage over the years – which is why I found this development so astounding.
With this, existing antennas can be extended: mobile phone antennas, walkie talkies and cordless phones can all benefit from the improvement of this small adhesive sticker, which is "Like having a four foot antenna on your phone".
So for the bargain price of $32.95 (or $2 on AliExpress) I secured myself this amazing technology and couldn't wait to quantify its performance.
Think of the applications – we could put these stickers on 6 ft panel antennas and they'd become 10 ft panels. This would have a huge effect on new site builds: lower wind loading, less need for tower strengthening, and more room for collocation on the towers thanks to the smaller equipment footprint.
Luckily I have access to some fancy test equipment to really understand exactly how revolutionary this is.
The packaging says it's like having a 4 foot antenna on your phone, so let's do some very simple calculations: let's assume the antenna in the phone is currently 10cm, and that with this sticker it will improve to 121cm (four feet).
Projected Gain (Post Sticker) – Formulas Used
According to some basic projections we should see ~21dB of gain by adding the sticker – that's a 146x increase in performance!
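If you want to check my working, here's the back-of-the-envelope maths in Python (naively treating gain as scaling with the antenna length ratio – the same generous assumption the packaging seems to make):

import math

current_length_cm = 10    # assumed existing antenna length in the phone
sticker_length_cm = 121   # "like having a four foot antenna"

projected_gain_db = 20 * math.log10(sticker_length_cm / current_length_cm)
linear_factor = 10 ** (projected_gain_db / 10)

print(f"Projected gain: {projected_gain_db:.1f} dB")   # ~21.7 dB
print(f"Linear increase: {linear_factor:.0f}x")         # ~146x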
Man am I excited to see this in action.
Fortunately I have access to some fun cellular test equipment, including the Viavi CellAdvisor, and an environmentally controlled lab (my kitchen bench).
I put up an 1800MHz (Band 3) LTE carrier in my office in the other room as a reference and placed the test equipment into the test jig (between the sink and the kettle).
We then took baseline readings from the omni shown in the pictures, to get a reading on the power levels before adding the sticker.
We are reading exactly -80dBm without the sticker in place, so we expertly put some masking tape on the omni (so we could peel it off) and applied the sticker antenna to the tape on the omni antenna.
At -80dBm before, adding the 21dB of gain should put us at around -59dBm. These Viavi units are solid, but I was fearful of potentially overloading the receive end with that much gain; after a long discussion we agreed that at these levels it was unlikely to blow the unit, so no in-line attenuation was used.
Okay, </sarcasm> I was genuinely a little surprised by what we found; there was some gain, as shown in the screenshot below.
Marker 1 was our reference reading without the sticker, while marker 2 was our reading with the sticker – that's a 1.12dB gain with the sticker in place. In linear terms that's a ~30% increase in signal strength.
Screenshot
So does this magic sticker work? Well, kinda, in as much as holding onto the omni changes its characteristics, as would wrapping a few turns of wire around it, putting it in the kettle or wrapping it in aluminum foil. Anything you do to an antenna to change it is going to cause minor changes in its characteristic behavior, and generally if you're getting better at one frequency, you get worse at another – so the small gain on Band 3 may also lead to a small loss on Band 1, or something similar.
So what to make of all this? Maybe this difference is an artifact from moving the unit to make a cup of tea, the tape we applied or just a jump in the LTE carrier, or maybe the performance of this sticker is amazing after all…
Recently we were on a project and our RAN guy was seeing UEs hand between one layer and another over and over. The hysteresis and handover parameters seemed correct, but we needed a way to see what was going on, what the eNB was actually advertising and what the UE was sending back.
In a past life I had access to expensive complicated dedicated tooling that could view this information transmitted by the eNB, but now, all I need is a cellphone or a modem with a Qualcomm chip.
Namespace(k='11111111111111111111111111111111', op='22222222222222222222222222222222', opc=None)
Generating OPc key from OP & K
Generating Multimedia Authentication Vector
Input K: b'11111111111111111111111111111111'
Input OPc: b'2f3466bd1bea1ac9a8e1ab05f6f43245'
Input AMF: b'\x80\x00'
Of course, being open source, you can grab the functions out of this and make a little script to convert everything in a CSV or whatever format your key data is in.
So what about OPc to OP? Well, this is a one-way transaction – we can't get the OP key back from an OPc & Ki.
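If you're curious what the derivation actually looks like under the hood, here's a minimal sketch of the OPc calculation (OPc = AES-128_K(OP) XOR OP, per TS 35.206) using pycryptodome – the inputs are just the dummy values from above, not real key material:

from Crypto.Cipher import AES

def derive_opc(k_hex: str, op_hex: str) -> str:
    k = bytes.fromhex(k_hex)
    op = bytes.fromhex(op_hex)
    ciphered = AES.new(k, AES.MODE_ECB).encrypt(op)    # E_K(OP)
    opc = bytes(a ^ b for a, b in zip(ciphered, op))   # XOR the result with OP
    return opc.hex()

print(derive_opc('11111111111111111111111111111111',
                 '22222222222222222222222222222222'))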
I’ve written about Milenage and SIM based security in the past on this blog, and the component that prevents replay attacks in cellular network authentication is the Sequence Number (Aka SQN) stored on the SIM.
Think of the SQN as an incrementing odometer of authentication vectors. Odometers can go forward, but never backwards. So if a challenge comes in with an SQN behind the odometer (a lower number), it’s no good.
Why the SQN is important for Milenage Security
Every time the SIM authenticates it ticks up the SQN value, and when authenticating it checks the challenge from the network doesn’t have an SQN that’s behind (lower than) the SQN on the SIM.
Let’s take a practical example of this:
The HSS in the network has the SQN for the SIM as 8232, and generates an authentication challenge vector for the SIM which includes the SQN of 8232. The SIM receives this challenge, and makes sure that the SQN stored in the SIM is equal to or less than 8232. If the authentication passes, the new SQN stored in the SIM is 8232 + 1, as that's the next valid SQN we'd be expecting, and the HSS increments the counters it has in the same way.
Constantly increasing the SQN and never allowing it to go backwards means that even if we pre-generated a valid authentication vector for the SIM, it'd only be valid for as long as that SQN hasn't already been passed on the SIM by another authentication request.
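As a toy illustration of that odometer logic (just the check described above – not the real MILENAGE / resync procedure):

class Sim:
    def __init__(self, sqn=8232):
        self.sqn = sqn

    def check_challenge(self, challenge_sqn):
        if challenge_sqn < self.sqn:
            return "Rejected - SQN is behind the odometer (possible replay)"
        self.sqn = challenge_sqn + 1   # tick the odometer forward
        return "OK"

sim = Sim()
print(sim.check_challenge(8232))   # OK - SIM SQN is now 8233
print(sim.check_challenge(8000))   # Rejected - behind the odometer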
Imagine for example that I get sneaky access to an operator’s HSS/AuC, I could get it to generate a stack of authentication challenges that I could use for my nefarious moustache-twirling purposes whenever I wanted.
This attack would work, but this all comes crumbling down if the SIM was to attach to the real network after I’ve generated my stack of authentication challenges.
If the SQN on the SIM passes where it was when the vectors were generated, those vectors would become unusable.
It's worth pointing out that it's not just evil purposes that lead your SQN to get out of sync; this happens when you've got subscriber data split across multiple HSSes, for example. There's a mechanism to securely catch the HSS's SQN counter up with the SQN counter in the SIM, without exposing any secrets, but it only ever ticks the HSS's SQN up – it never rolls back the SQN in the SIM.
The Flaw – Draining the Pool
The Authentication Information Request is used by a cellular network to authenticate a subscriber, and the Authentication Information Answer is sent back by the HSS containing the challenges (vectors).
When we send this request, we can specify how many authentication challenges (vectors) we want the HSS to generate for us, so how many vectors can you generate?
TS 129 272 says the Number-of-Requested-Vectors AVP is an Unsigned32, which gives us a possible pool of 4,294,967,295 combinations. This means it would be legal / valid to send an Authentication Information Request asking for 4.2 billion vectors.
It's worth noting that even that won't give us the whole pool.
Sequence numbers (SQN) shall have a length of 48 bits.
TS 133 102
The SQN in the SIM is 48 bits, which gives us a maximum of 281,474,976,710,656 values before we "tick over" the odometer.
If we were to send 65,536 Authentication-Information-Requests asking for 4,294,967,295 vectors apiece, we'd have enough vectors to serve the sub for life.
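For anyone wanting to sanity check those numbers, the back-of-the-envelope maths looks like this:

number_of_requested_vectors_max = 2**32 - 1   # Unsigned32 max - 4,294,967,295
sqn_space = 2**48                             # 48 bit SQN - 281,474,976,710,656 values

print(f"{number_of_requested_vectors_max:,} vectors per request")
print(f"{sqn_space:,} possible SQN values")
print(f"{sqn_space // number_of_requested_vectors_max:,} requests to cover the whole SQN space")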
The standard doesn't put any practical limit on the number of vectors that can be requested, which would allow us to "drain the pool" from an HSS – capturing every combination of SQN – to provide a high degree of certainty that a vector we hold is far enough ahead of the SIM's current SQN that the SIM does not reject the challenge.
Can we do this?
Our lab has access to HSSes from several major vendors.
Out of the gate, the Oracle HSS does not allow more than 32 vectors to be requested at the same time, so props to them, but the same is not true of the others, all from major HSS vendors (I won’t name them publicly here).
The other three HSSes we tried, all from big vendors, eventually timed out when asked for 4.2 billion vectors (can't imagine why that would be *shrug*), but the request didn't get rejected.
This is a lab, so monitoring isn't great, but I did see a CPU spike on at least one of the HSSes, which suggests it may actually have been trying to generate them.
Of course, we’ve got PyHSS, the greatest open source HSS out there, and how did this handle the request?
Well, being standards compliant, it did what it was asked – I'll admit I tested with 1024 vectors, and on my little laptop it did take a while. But lo, it worked, spewing forth 1024 vectors to use.
So with that working, I tried with 4,294,967,295…
And I waited. And waited.
And after pegging my CPU for a good while, I had to get back to real life work, and killed the request on the HSS.
In part that's because PyHSS writes back to a database each time the SQN is incremented, which is costly in terms of resources, but also because generating Milenage vectors for LTE involves some pretty heavy cryptographic lifting.
The Risk
Dumping a complete set of vectors with every possible SQN would allow an attacker to spoof base stations, and the subscriber would attach without issue.
Historically this has been very difficult to do for LTE, due to the mutual network authentication, however this would be bypassed in this scenario.
The UE would try for a resync if the SQN is too far forward, which mitigates this somewhat.
Cryptographically, I don’t know enough about the Milenage auth to know if a complete set of possible vectors would widen the attack surface to try and learn something about the keys.
Mitigations / Protections
So how can operators protect ourselves against this kind of attack?
Different commercial HSS vendors handle this differently – Oracle limits this to 32 vectors, and that's what I've updated PyHSS to do, but another big HSS vendor (who I won't publicly shame) accepts the full 4,294,967,295 vectors, and it crashes that thread, or at least times it out after a period.
If you’ve got a decent Diameter Routing Agent in place you can set your DRA to check to see if someone is using this exploit against your network, and to rewrite the number of requested vectors to a lower number, alert you, or drop the request entirely.
Having common OP keys is dumb. I advocate that all our operator customers use OP keys that are unique to each SIM, and use the derived OPc key anyway. This means if one SIM spills its keys, the blast radius doesn't extend beyond that card.
In the long term, it’d be good to see 3GPP limit the practical size of the Number-of-Requested-Vectors AVP.
2G/3G Impact
Full disclosure – I don’t really work with 2G/3G stacks much these days, and have not tested this.
MAP is generally pretty bandwidth constrained, and transferring ~281 trillion vectors might raise some eyebrows, burn out some STPs and take a very long time…
But our “Send Authentication Info” message functions much the same as the Authentication Information Request in Diameter, 3GPP TS 29.002 shows we can set the number of vectors we want:
5GC Impact
This only impacts LTE and 5G NSA subscribers.
TS 29.509 outlines the schema for the Nausf reference point, used for requesting vectors, and there is no option to request multiple vectors.
Summary
If you’ve got baddies with access to your HSS / HLR, you’ve got some problems.
But, with enough time, your pool could get drained for one subscriber at a time.
This isn’t going to get the master OP Key or plaintext Ki values, but this could potentially weaken the Milenage security of your system.
One of the new features of 5GC is the introduction of Service Based Interfaces (SBI) which is part of 5GC’s Service Based Architecture (SBA).
Let’s start with the description from the specs:
3GPP TS 23.501 [3] defines the 5G System Architecture as a Service Based Architecture, i.e. a system architecture in which the system functionality is achieved by a set of NFs providing services to other authorized NFs to access their services.
3GPP TS 29.500 – 4.1 NF Services
For that we have two key concepts: service discovery and service consumption.
Service Consumers / Producers
Those are some nice words, but let's break down what this actually means. For starters, let's talk about services.
In previous generations of core network we had interfaces instead of services. An interface was the reference point between two network elements, describing how the two would talk – the interface defined the protocol the two elements used to communicate.
For example, in EPC / LTE S6a is the interface between the MME and the HSS, S5 is the interface between the S-GW and P-GW. You could lookup the 3GPP spec for each interface to understand exactly how it works, or decode it in Wireshark to see it in action.
5GC moves from interfaces to services. Interfaces are strictly between two network elements, the S6a interface is only used between the MME and the HSS, while a service is designed to be reusable.
This means the Service Based Interface N5g-eir can be used by the AMF, but it could equally be used by anyone else who wants access to that information.
3GPP defines each service in terms of a service producer (the EIR produces the N5g-eir service) and a service consumer (the client connecting to the N5g-eir service), but doesn't restrict which network elements can consume a given service.
This gets away from the soup of interfaces, and instead just defines the services being offered, rather than locking a service to the two network elements on either end of an interface.
"Service consumers" (which can be thought of as clients in a client/server model) can discover "service producers" (like servers in a client/server model).
Our AMF, for example, acts as a "service consumer", consuming services from the UDM/UDR and SMF.
Service Discovery – Automated Discovery of NF Services
Service-Based Architecture enables 5G Core Network Function service discovery.
In simple terms, this means rather than your MME being told about your SGW, the nodes all talk to a "Network Repository Function" that returns a list of available nodes.
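As a rough sketch of what that discovery looks like on the wire, the NRF's Nnrf_NFDiscovery service (TS 29.510) is just a REST API the service consumer queries – the NRF address below is made up for illustration:

import requests

NRF = "http://nrf.5gc.mnc001.mcc001.3gppnetwork.org"

# e.g. an AMF (service consumer) asking the NRF for available UDMs (service producers)
response = requests.get(
    f"{NRF}/nnrf-disc/v1/nf-instances",
    params={"target-nf-type": "UDM", "requester-nf-type": "AMF"},
)
for nf in response.json().get("nfInstances", []):
    print(nf.get("nfInstanceId"), nf.get("ipv4Addresses"))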
Mobility management and connection management in 5GC are handled through two processes: Connection Management (CM) and Registration Management (RM).
Registration Management (RM)
The Registration Management state (RM) of a UE can either be RM-Registered or RM-Deregistered. This is akin to the EMM state used in LTE.
RM-Deregistered Mode
From the Core Network’s perspective (Our AMF) a UE that is in RM-Deregistered state has no valid location information in the AMF for that UE. The AMF can’t page it, it doesn’t know where the UE is or if it’s even turned on.
From the UE’s perspective, being in RM-Deregistered state could mean one of a few things:
UE is in an area without coverage
UE is turned off
SIM Card in the UE is not permitted to access the network
In short, RM-Deregistered means the UE cannot be reached, and cannot get any services.
RM-Registered Mode
From the Core Network's perspective (the AMF) a UE in RM-Registered state has successfully registered onto the network.
The UE can perform tracking area updates, periodic registration updates and registration updates.
There is a location stored in the AMF for the UE (The AMF knows at least down to a Tracking Area Code/List level where the UE is).
The UE can request services.
Connection Management (CM)
Connection Management (CM) focuses on the NAS signaling connection between the UE and the AMF.
To have a Connection Management state, the Registration Management procedure must have successfully completed (the UE being in RM-Registered) state.
A UE in CM-Connected state has an active signaling connection on the N1 interface between the UE and the AMF.
CM-Idle Mode
In CM-Idle mode the UE has no active NAS connection to the AMF.
UEs typically enter this state when they have no data to send / receive for a period of time; this conserves battery on the UE and saves network resources.
If the UE wants to send some data, it performs a Service Request procedure to bring itself back into CM-Connected mode.
If the network wants to send some data to the UE, the AMF sends a paging request for the UE, and upon hearing its identifier (5G-S-TMSI) on the paging channel, the UE performs the Service Request procedure to bring itself back into CM-Connected mode.
CM-Connected Mode
In CM-Connected mode the UE has an active NAS connection with the AMF over the N1 interface from the UE to the AMF.
When the access network (The gNodeB) determines this state should change (typically based on the UE being idle for longer than a set period of time) the gNodeB releases the connection and the UE transitions to CM-Idle Mode.
So let's roll up our sleeves and get a lab scenario happening.
To keep things (relatively) simple, I’ve put the eNodeB on the same subnet as the MME and Serving/Packet-Gateway.
So the traffic will flow from the eNodeB to the S/P-GW, via a simple Network Switch (I’m using a Mikrotik).
While life is complicated, I’ll try and keep this lab easy.
Experiment 1: MTU of 1500 everywhere
Network Element          MTU
Advertised MTU in PCO    1500
eNodeB                   1500
Switch                   1500
Core Network (S/P-GW)    1500
So everything attaches and traffic flows fine. There is no problem right?
Well, not a problem that is immediately visible.
While the PCO advertises an MTU of 1500, if we look at the maximum payload we can actually get through the network, we find that's not the case.
This means if our end user on a mobile device tried to send a 1500 byte payload, it'd never get through.
DNS would work and most TCP traffic would flow fine, but certain UDP applications would start to fail if they were sending payloads nearing 1500 bytes.
So why is this?
Well GTP adds overhead.
8 bytes for the GTP header
8 bytes for the transport UDP header
20 bytes for the transport IPv4 header
14 bytes if our transport is using Ethernet
For a total of 50 bytes of overhead, assuming we’re not using MPLS, QinQ or anything else funky on our transport network and just using Ethernet.
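To put numbers on that, here's the quick arithmetic (ignoring GTP extension headers and assuming plain IPv4 / Ethernet transport as above):

# Overhead added to each user packet when carried over GTP-U
GTP_HEADER = 8
UDP_HEADER = 8
IPV4_HEADER = 20
ETHERNET_HEADER = 14

inner_mtu = 1500   # what we advertise to the UE in the PCO

transport_ip_mtu_needed = inner_mtu + GTP_HEADER + UDP_HEADER + IPV4_HEADER
frame_size_needed = transport_ip_mtu_needed + ETHERNET_HEADER

print(f"Transport IP MTU needed: {transport_ip_mtu_needed}")   # 1536 - more than 1500
print(f"On-the-wire frame size:  {frame_size_needed}")         # 1550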
So we have two options here – We can either lower the MTU advertised in our Protocol Configuration Options, or we can increase the MTU across our transport network. Let’s look at each.
Experiment 2: Lower Advertised MTU in PCO to 1300
Well this works, and looks the same as our previous example, except now we know we can’t handle payloads larger than 1300 without fragmentation.
Experiment 3: Increase MTU across transmission Network
We only need to account for the 50 bytes of overhead added by GTP and the transport, but I've gone the safer option and upped the MTU across the transport to 1600 bytes.
With this, we can transport a full 1500 byte MTU on the UE layer, and we’ve got the extra space by enabling jumbo frames.
Obviously this requires a change across all of the transmission layer – and if you have any hops without support for this, you'll lose packets.
Conclusions?
Well, fragmentation is bad, and we want to avoid it.
For this we up the MTU across the transmission network to support jumbo frames (greater than 1500 bytes) so we can handle the 1500 byte payloads that users want.
I generally do this with Python or via the Swagger UI for the Web UI, but here’s how we can create a fixed-line IMS subscriber in PyHSS, so we can register it with a softphone, without using EAP-AKA.
Firstly we create the AuC object for this password combo.
If you're working with the larger SIM vendors, there's a good chance the key material they send you won't actually contain the raw Ki values for each card – if it fell into the wrong hands you'd be in big trouble.
Instead, what is more likely is that the SIM vendor shares the Ki ciphered with a transport key – so what you receive is not the plaintext version of the Ki data, but rather a ciphered version of it.
But as long as you and the SIM vendor have agreed beforehand on the ciphering to use, and the secret to protect it with, you can read the data as needed.
This is a tricky topic to broach, as transport key implementation is not covered by 3GPP; instead it's a quasi-standard commonly used by SIM vendors and HSS vendors alike – the A4 / K4 Transport Encryption Algorithm.
It’s made up of a few components:
K2 is our plaintext key data (Ki or OP)
K4 is the secret key used to cipher the Ki value.
K7 is the algorithm used (Usually AES128 or AES256).
It's important when defining your electrical profile and the required parameters to make sure the operator, HSS vendor and SIM vendor are all on the same page regarding whether transport keys will be used, which cipher will be used, and the keys for each batch of SIMs.
Here’s an example from a Huawei HSS with SIMs from G&D:
We’re using AES128, and any SIMs produced by G&D for this batch will use that transport key (transport key ID 1 in the HSS), so when adding new SIMs we’ll need to specify what transport key to use.
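As a rough sketch of what deciphering one of these vendor output files involves (assuming AES-128 in ECB mode – confirm the exact cipher and mode agreed with your SIM vendor; the values below are placeholders, not real key material):

from Crypto.Cipher import AES

# K4 transport key agreed with the SIM vendor (placeholder value only)
k4 = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
# Ciphered Ki as it appears in the vendor's output file (placeholder value only)
ciphered_ki = bytes.fromhex("00112233445566778899aabbccddeeff")

ki = AES.new(k4, AES.MODE_ECB).decrypt(ciphered_ki)
print(ki.hex())   # the plaintext Ki to load into the HSS / AuC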
In our last post we covered the basics of NB-IoT Non-IP Data Deliver (NIDD), and if that acronym soup wasn’t enough for you, we’re going to take a deep dive into the flows for attaching, sending, receiving and closing a NIDD session.
The attach for NIDD is very similar to the standard attach for wideband LTE, except the MME establishes a connection on the T6a Diameter interface toward the SCEF, to indicate the sub is online and available.
The NIDD Attach
The SCEF is now able to send/receive NIDD traffic for the subscriber on the T6a interface, but in reality developers don't / won't interact with Diameter, so the SCEF exposes the T8 API – an abstraction layer developers can interact with to reach the SCEF, and through it, the UE.
If you’re wondering what the status of Open Source SCEF implementations are, then you may have already guessed we’re working on one! PyHSS should have support for NB-IoT SCEF features in the future.
NB-IoT provides support for Non-IP Data Delivery (NIDD) over 3GPP networks, but to handle this, some new network elements are introduced; in a home network scenario that's the SCEF and the SCS/AS.
On the 3GPP side, the SCEF communicates with the MME via the T6a interface, which is based upon Diameter.
On the side towards our IoT service consumers (referred to in the standards as the "SCS/AS" or "Service Capability Server / Application Server" – catchy names as always), it communicates via the RESTful HTTP based T8 interface.
The start of the S1 Attach procedure is very similar to a regular S1 attach.
The initial S1 PDU Connectivity Request indicates in the ESM Message Container that the PDN Type is Non IP.
S1 PDU Connectivity Request from attach procedure
Other than that, the initial attach procedure looks very similar to the regular S1 attach procedure.
On the S6a interface the Update Location Request from the MME to the HSS indicates that this is an EUTRAN-NB-IoT Radio Access Type.
And the Update Location Answer APN Configuration contains some additional AVPs on the APN to indicate that the APN supports Non-IP-PDN-Type and that the SCEF is used for Data Delivery.
The SCEF-ID (Diameter Host) and SCEF-Realm (Diameter realm) to serve this user is also specified in the APN Configuration in the Update Location Answer.
This is how our MME determines where to send the T6a traffic.
With this, the MME sends a Connection Management Request (CMR) towards the SCEF specified in the SCEF-ID returned by the HSS.
The Connection Management Request / Response
The MME now sends a Diameter T6a Connection Management Request to the SCEF specified in the Update Location Answer.
In it we have a Session-Id, which continues for the life of our NIDD session, the service-selection which contains our APN (In our case “non-ip”) and the User-Identifier AVP which contains the MSISDN and/or IMSI of the subscriber.
To accept this, the SCEF sends back a Connection-Management-Answer to confirm we’re all good to go:
At this point our SCEF now knows about the subscriber who’s just attached to our network, and correlates it with the APN and the session-ID.
On the S1 side the connection is confirmed and we’re ready to roll.
Mobile Originated Data Request / Response
When the UE wants to send NIDD it’s carried in NAS messaging, so we see an Uplink NAS transport from the UE and inside the NAS payload itself is our HEX data.
Our MME grabs this out and sends it in the form of a Mobile-Originated-Data-Request (MODR) to the SCEF, along with the same Session-ID that was set up earlier:
At this stage our Non-IP Data is exposed over the T8 RESTful API, which we won’t cover in this post.
Note: I’m lazily posting this as its been in my drafts folder for an exceedingly long time – Before going too much further, it’s worth pointing out that eMBMS never really made it anywhere – no production networks of note use eMBMS. I started researching it and my interest petered out once I discovered I couldn’t get any UEs or hardware that supported eMBMS.
Mobile networks are designed as point to point – all traffic is unicast.
But multicast and broadcast traffic is real, and becoming more common in some applications.
In areas where users stream the same radio program, or TV show, live, each of them is consuming the same data stream, but each one gets sent a unique copy of the data, on a resource block allocated to them for reception of the data.
If we have 10 users on a cell, each streaming a 5Mbps live video, that’s 50Mbps of capacity taken up on the radio / air interface. If that stream was moved onto a eMBMS service, only 5Mbps of capacity would be used, regardless of how many people on the cell are consuming it.
For Mission Critical Push to Talk applications, the lack of broadcast/multicast support was highlighted again. For a PTT app with 10 users in a talk group, you'd need to schedule resource blocks for 10 users, allocate radio resources 10 times and send GTP packets 10 times, all to send the same data to 10 people.
So enter eMBMS – The Evolved Multimedia Broadcast and Multicast Service, providing multicast service for LTE.
Overall Architecture
eMBMS introduces a few changes to the RAN side to handle support for a shared data channel, which is sent by the eNodeB and that UEs can listen on to get data. (More on admission control later)
From a core perspective two new network elements are introduced, the Broadcast/Multicast Service Centre (BM-SC) and the Multimedia Broadcast Multicast Services Gateway (MBMS-GW); these elements function in much the same way as the P-GW and S-GW respectively, but for multicast services.
Like so many 3GPP specs before it, MBMS relies on GTP for transporting the data to be distributed, and relies on GTPv2-C for control plane data.
BM-SC – Broadcast/Multicast Service Centre
The Broadcast Multicast Service Centre acts as the gateway between content providers (providing streams of data to be distributed) and the EPC.
The BM-SC sets up eMBMS sessions and pulls broadcast data from the content providers and collects receipts from subscribers of some streams to charge / track consumption of the services.
In this regard the BM-SC is akin to the P-GW, which acts as the border between the EPC and external networks, except it's largely unidirectional.
MBMS Gateway
The MBMS Gateway (MBMS-GW) takes the broadcast data stream from the BM-SC and encapsulates it into GTP packets to be distributed to eNBs across the network.
The MBMS-GW allocates a multicast transport address for each broadcast data stream.
MME Interaction
For this a new interface is introduced on the MME – the Sm interface, which interconnects the MME and the MBMS-Gateways assigned to it.
This post follows on from Part 1 and Part 2 of this 3 part series.
We are forced to move to 5G-SA
Claim: We must use 5G-SA with this spectrum (It’s a condition of the license)
I’ll concede that if it is a requirement for a license or funding, that 5G-SA be used, then that’s a pretty ironclad reason to introduce 5G-SA.
Claim: Users will Leave if you don’t have 5G-SA
We could argue the opposite effect will happen; Shifting to SA will reduce your user base. Here’s why:
Users experiencing 5G-NSA (Non-Standalone) today, are already getting the speed boost from “5G”.
From a user perspective, while 5G-NSA support has been becoming common on mid-to-high priced handsets, handsets supporting 5G-SA are far less common.
Dish's Project Genesis is one of the only examples of a 5G SA network deployed on a large scale. It launched with only a single supported phone (a Motorola branded handset) and today the supported phone list is very short, limited to expensive flagships. This lack of handset support means users must purchase a handset through Dish rather than being able to bring their own phones, as the only way compatibility can be guaranteed is by controlling the whole ecosystem.
Unless you are in a highly developed market with 2G and 3G turned off, where the majority of your user base has recent generation flagship phones capable of supporting these features, you’re shrinking your addressable market with 5G-SA, rather than expanding it.
Conclusion – 5G-SA doesn’t stack up, what do I do?
SA doesn’t make sense for a lot of operators and markets – for now. I’m sure this post will look pretty dated in a few years time as many of these factors change and as operators sunset 2G and 3G networks.
I’m not advocating for 5G-SA never, I’m advocating not 5G-SA today.
There are simply better options out there for spending that operations budget to make network improvements.
Off the bat, some ideas to explore:
Optimize your existing network.
Roll out NSA to an even larger area.
Shutdown 2G/3G layers.
Simplify your operations.
Cut down the number of vendors and moving parts.
Simplify again.
Automate.
Simplify more.
Doing this will mean you can enjoy cost savings from reduced headcount thanks to a simpler network. Simpler networks have better up-time, thanks to operating a network that's less frankensteiny – less cobbled together from disparate legacy parts. You'll also enjoy reduced opex from all the systems you've shut down, and cheaper roaming from all the bilaterals you've moved to VoLTE.
All of these tasks will keep project teams busy for years and put the MNO in a stronger position moving forward, without getting distracted by slick marketing and shiny brochures.
PyHSS is our open source Home Subscriber Server; it's written in Python, has a variety of different backends, is highly performant (we benchmark to 10K transactions per second) and infinitely scalable.
In this post I'll cover the basics of setting up PyHSS in your environment and getting some Diameter peers connected.
For starters, we'll need a database (we'll use MySQL for this demo) and a MySQL user account on that database for PyHSS to use.
So let’s get that rolling (I’m using Ubuntu 24.04):
sudo apt update
sudo apt install mysql-server
Next we’ll create the MySQL user for PyHSS to use:
CREATE USER 'pyhss_user'@'%' IDENTIFIED BY 'pyhss_password';
GRANT ALL PRIVILEGES ON *.* TO 'pyhss_user'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
We'll also need Redis (PyHSS uses Redis for inter-service communications and for caching), so go ahead and install that for your distro:
sudo apt install redis-server
So that’s our prerequisites sorted, let’s clone the PyHSS repo:
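(At the time of writing the repo lives on GitHub at github.com/nickvsnetworking/pyhss.)

git clone https://github.com/nickvsnetworking/pyhss.git
cd pyhss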
And install the requirements with pip from the PyHSS repo:
pip3 install -r requirements.txt
Next we’ll need to configure PyHSS, for that we update the config file (config.yaml) with the settings we want to use.
We’ll start by setting the bind_ip to a list of IPs you want to listen on, and your transport – We can use either TCP or SCTP.
For Diameter, we will set OriginHost and OriginRealm to match the Diameter hostname you want to use for this peer, and the Realm of your Diameter network.
Lastly we'll need to set the database parameters, updating the database: section to populate your credentials, setting the username, password and database to match the SQL installation we set up at the start.
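For reference, the relevant sections of config.yaml end up looking something like the trimmed sketch below – key names and nesting may differ between PyHSS versions, so check the example config shipped in the repo for the authoritative structure:

hss:
  transport: "TCP"
  bind_ip: ["10.0.1.5"]
  bind_port: 3868
  OriginHost: "hss01.epc.mnc001.mcc001.3gppnetwork.org"
  OriginRealm: "epc.mnc001.mcc001.3gppnetwork.org"

database:
  db_type: mysql
  server: localhost
  username: pyhss_user
  password: pyhss_password
  database: hss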
With that done, we can start PyHSS, which we do using systemctl.
Because there are multiple microservices that make up PyHSS, there are multiple systemd unit files used to run PyHSS as a service – they're all in the /systemd folder.
The Equipment Identity Register (EIR) is a pretty handy function in 3GPP networks.
Via the Diameter based S13 interface, the MME is able to query the EIR to ask if a given IMEI & IMSI combination should be allowed to attach.
This allows stolen / grey market / unauthorized devices (IMEIs) to be rejected from the network – the EIR can have a list of "bad" IMEIs, and if one of those is seen, the request is rejected.
It also allows us to lock a SIM (IMSI) to a given device (IMEI) or type of device – We can use this for say a Fixed Wireless service, to lock the SIMs (IMSIs) to a range of modems (IMEI Prefixes).
Lastly it gives us insight and analytics into the devices used on the network, by mapping the IMEI to a device, we can say that IMEI 1234567890 is an Apple iPhone 12 Pro Max, or a Nokia Fastmile 5G-24W-A.
PyHSS supports all these capabilities, so let’s have a look at how we’d manage / access them.
Setting up EIR Rules
These rules are set via the RESTful API in PyHSS.
The Equipment Identity Register built into PyHSS supports matching in one of two modes, set by regex_mode.
In Exact Mode (regex_mode: 0) matches are based on an exact matching IMEI, and matching the IMSI if set (If IMSI is set to nothing (”), then only the IMEI is evaluated).
Exact Mode is suited for IMEI/IMSI locking, to ensure a SIM is locked to a particular device, or to blacklist stolen devices.
Regex Mode (regex_mode: 1) matches based on Regex, this is suited for whitelisting IMEI prefixes for say, specific validated vendors.
The match_response_code maps to the Equipment-Status AVP output, so the specified values are:
0 : ‘Whitelist’
1: ‘Blacklist’
2: ‘Greylist’
Some end to end examples of this provisioned into the API:
If the IMEI starts with 777 and the IMSI is 1234123412341234 then return 2 (Greylist).
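As a rough example of provisioning that rule over the RESTful API (the endpoint path and port here are assumptions – check the Swagger UI on your PyHSS instance for the exact schema):

import requests

eir_rule = {
    "imei": "^777.*",                # Regex matching IMEIs starting with 777
    "imsi": "1234123412341234",
    "regex_mode": 1,                 # Regex Mode
    "match_response_code": 2,        # 2 = Greylist
}
print(requests.put("http://pyhss-api:8080/eir/", json=eir_rule).json())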
No Match Behaviour
If there is no match from the backend, then the config parameter no_match_response dictates the response code returned (Blacklist/Whitelist/Greylist).
Mapping Type Allocation Codes (TACs) to IMEIs
There are several data feeds of the Type Allocation Codes (TACs) which map a given IMEI prefix to a model number.
TAC database extract
Unfortunately, this data is not freely available, so we can’t bundle it with PyHSS, but if you have the IMEI Database, you can load it into PyHSS using Redis, to allow us to report on this data.
In your config.yaml you’ll just need to set the tac_database parameter, which will read the data on startup.
PyHSS YAML Config extract
Triggering on SIM Swap
If we keep track of the current IMSI/IMEI combination used for each SIM/Device, we can get notified every time it changes.
You might want to use this to trigger OTA provisioning or clear old data in your IMS.
For that we can use the sim_swap_notify_webhook in the config to send an HTTP POST to a given endpoint, informing it that a SIM is now in a different device.
We also have to have imsi_imei_logging set to true in the config in order to log the history.
Reporting on IMEIs
We can also log/capture historical data about IMSI/IMEI combinations.
We use this from a customer support perspective to be able to see if a customer has recently changed phones, so if they call support, our staff can ask the customer about it to help troubleshoot.
“I can see you were connected previously on a Samsung Galaxy S22, but now you’re using a Nokia 3310, did the issues happen before you moved phones?”
This is super handy.
We can get a general log of IMSI vs IMEI like this:
Feed of IMSI vs IMEI along with a timestamp and the response that was sent back
But what’s more useful is searching for a IMSI or an IMEI and then getting back a full list of devices / SIMs that have been used.
Searching for an IMSI I can see it’s only ever been used in this Samsung Galaxy
Lastly via Grafana we export all this data, which allows us to visualize this data and build dashboards showing the devices on the network.
Visualizing EIR Data in Grafana
PyHSS includes a Prometheus exporter; the prom_eir_devices_total metric lists each Type Allocation Code / UE seen in the network, along with the count of each.
Raw it looks like this:
But visualized in Grafana we can get a dashboard to give us a breakdown per vendor:
Hello Nick, thank you for the article. What is the use of the OPc key to be derived from OP key? Why can't it just be a random key like Ki?
It’s a super good question, and something I see a lot of operators get “wrong” from a security best practices perspective.
Refresher on OP vs OPc Keys
The “OP Key” is the “operator” key, and was (historically) common for an operator.
This meant all SIMs in the network had a common OP Key, and each SIM had a unique Ki/K key.
The SIM knew both, and the HSS only needed to know what the Ki was for the SIM, as they shared a common OP Key (Generally you associate an index which translates to the OP Key for that batch of SIMs but you get the idea).
But having common key material is probably not the best idea – I'm sure there was some reason why using a common key across all the SIMs seemed like a good option, and the K / Ki key has always been unique, so there was one unique key per SIM, but previously, OP was common.
Over time, the issues with this became clear, so the OPc key was introduced. OPc is derived from mushing the K & OP key together. This means we don’t need to expose / store the original OP key in the SIM or the HSS just the derived OPc key output.
This adds additional security: if the Ki for a SIM were to be exposed along with the OP for that operator, that's half the entropy lost. Whereas by storing the Ki and OPc, you limit the blast radius if, say, a single SIM's data were exposed, to only the data for that particular SIM.
This is how most operators achieve this today; there is still a common OP Key, locked away in a vault alongside the recipe for Coca-cola and the moon landing set.
But this OP Key is no longer written to the SIMs or stored in the HSS.
Instead, during the personalization process (the bit in manufacturing where the unique data – the IMSI & keys – gets written to each SIM), a derived OPc key is written to the card itself, and to the output files the operator then loads into their HSS/HLR/AuC.
This is not my preferred method for handling key material however; today we get our SIM manufacturers to randomize the OP key for every card and then derive an OPc from that.
This means we have two unique keys for each SIM, and even if the Ki and OP were to become exposed for a SIM, there is nothing common between that SIM, and the other SIMs in the network.
Do we want our Ki to leak? No. Do we want an OP Key to leak? No. But if we’ve got unique keys for everything we minimize the blast radius if something were to happen – Just minimizes the risk.
S8 Home Routing is a really simple concept, the traffic goes from the SGW in the visited PLMN to the PGW in the home PLMN, so the PCRF, OCS/OFCS, IMS, IP Addresses, etc, etc, are all in the home network, and this avoids huge amounts of complexity.
But in order for this to work, the visited network MME needs to find the PGW of the home network, and with over 700 roaming networks in commercial use, each one with potentially hundreds of unique APNs each routing to a different PGW, this is a tricky proposition.
If you've configured your PGW peers statically on your MME, that's fine, but it doesn't scale very well – and if you add an MVNO who wants their own PGW for serving their APN, well, you'll be adding some complexity there too. So what to do?
Well, the answer is DNS.
By taking the APN to be served, the home PLMN and the interface type desired, with some funky DNS queries, our MME can determine which PGW should be selected for a request.
Let’s take a look, for a UE from MNC XXX MCC YYY roaming into our network, trying to access the “IMS” APN.
Our MME knows the network code of the roaming subscriber from the IMSI is MNC XXX, MCC YYY, and that the UE is requesting the IMS APN.
So our MME crafts a DNS request for the NAPTR query for ims.apn.epc.mncXXX.mccYYY.3gppnetwork.org:
Because the domain is epc.mncXXX.mccYYY.3gppnetwork.org it’s routed to the authoritative DNS server in the home network, which sends back the response:
We’ve got a few peers to pick from, so we need to filter this list of Answers to only those that are relevant to us.
First we filter by the Service tag, which for each listed peer shows what services that peer supports.
Since we're looking for S8, we need to find a peer whose "Service" tag string contains:
x-3gpp-pgw:x-s8-gtp
We’re looking for two bits of info here, the presence of x-3gpp-pgw in the Service to indicate that this peer is a PGW and x-s8-gtp to indicate that this peer supports the S8 interface.
A service string like this:
x-3gpp-pgw:x-s5-gtp
Would be excluded as it only supports S5 not S8 (Even though they are largely the same interface, S8 is used in roaming).
It’s also not uncommon to see both services indicated as supported, in which case that peer could be selected too:
x-3gpp-pgw:x-s5-gtp:x-s8-gtp
(The answers in the screenshot include :x-gp which means the PGWs advertised are also co-located with a GGSN)
So with our answers whittled down to only those that meet our needs, we next use the Order and the Preference to pick our best candidate, this is the same as regular DNS selection logic.
From our candidate, we’ve also got the Regex Replacement, which allows our original DNS request to be re-written, which allows us to point at a single peer.
In our answer, we see the original request ims.apn.epc.mncXXX.mccYYY.3gppnetwork.org is to be re-written to topon.lb1.pgw01.epc.mncXXX.mccYYY.3gppnetwork.org.
This is the FQDN of the PGW we should use.
Now we know the FQDN we should use, we just do an A-record lookup (or AAAA record lookup if it is IPv6) for that peer we are targeting, to turn that FQDN into an IP address we can use.
And then in comes the response:
So now our MME knows the IP of the PGW, it can craft a Create Session request where the F-TEID for the S8 interface has the PGW IP set on it that we selected.
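To pull the whole selection flow together, here's a rough sketch of that logic in Python using dnspython – the FQDN and tags are the illustrative ones from above, and a real MME obviously does all of this internally:

import dns.resolver

apn_fqdn = "ims.apn.epc.mncXXX.mccYYY.3gppnetwork.org"

# NAPTR lookup, keeping only PGWs that advertise S8 support
candidates = []
for naptr in dns.resolver.resolve(apn_fqdn, "NAPTR"):
    service = naptr.service.decode()
    if "x-3gpp-pgw" in service and "x-s8-gtp" in service:
        candidates.append(naptr)

# Pick the best candidate by Order, then Preference (lower wins)
best = sorted(candidates, key=lambda n: (n.order, n.preference))[0]
pgw_fqdn = best.replacement.to_text().rstrip(".")

# A-record lookup to turn the PGW FQDN into an IP for the Create Session Request
pgw_ips = [a.to_text() for a in dns.resolver.resolve(pgw_fqdn, "A")]
print(pgw_fqdn, pgw_ips)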
For more info on this TS 129.303 (Domain Name System Procedures) is the definitive doc, but the GSMA’s IR.88 “LTE and EPC Roaming Guidelines” provides a handy reference.
How does one encode / interpret the value of this AVP / IE? That was the question I set out to answer.
TS 29.274 says:
For the encoding of this information element see 3GPP TS 32.298
TS 32.298 says:
The functional requirements for the Charging Characteristics as well as the profile and behaviour bits are further defined in normative Annex A of TS 32.251
TS 32.251 Annex A says:
The Charging Characteristics parameter consists of a string of 16 bits designated as Behaviours (B), freely defined by Operators, as shown in TS 32.298 [51]. Each bit corresponds to a specific charging behaviour which is defined on a per operator basis, configured within the PCN and pointed when bit is set to “1” value.
After a few circular references I found this is imported from 32.298.
Finally we find some solid answers hidden away in TS 132 215, under the Charging Characteristics Profile index.
Charging Characteristics consists of a string of 16 bits designated as Profile (P) and Behaviour (B), shown in Figure 4. The first four bits (P) shall be used to select different charging trigger profiles, where each profile consists of the following trigger sets:
S-CDR: activate/deactivate CDRs, time limit, volume limit, maximum number of charging conditions, tariff times;
G-CDR: same as SGSN, plus maximum number of SGSN changes;
M-CDR: activate/deactivate CDRs, time limit, and maximum number of mobility changes;
SMS-MO-CDR: activate/deactivate CDRs;
SMS-MT-CDR: active/deactivate CDRs.
The Charging Characteristics field allows the operator to apply different kind of charging methods in the CDRs. A subscriber may have Charging Characteristics assigned to his subscription. These characteristics can be supplied by the HLR to the SGSN as part of the subscription information, and, upon activation of a PDP context, the SGSN forwards the charging characteristics to the GGSN on the Gn / Gp reference point according to the rules specified in Annex A of TS 32.251 [11].
This information can be used by the GSNs to activate CDR generation and control the closure of the CDR or the traffic volume containers (see clause 5.1.2.2.23) and is included in CDRs transmitted to nodes handling the CDRs via the Ga reference point. It can also be used in nodes handling the CDRs (e.g., the CGF or the billing system) to influence the CDR processing priority and routing.
These functions are accomplished by specifying the charging characteristics as sets of charging profiles and the expected behaviour associated with each profile.
The interpretations of the profiles and their associated behaviours can be different for each PLMN operator and are not subject to standardisation. In the present document only the charging characteristic formats and selection modes are specified.
The functional requirements for the Charging Characteristics as well as the profile and behaviour bits are further defined in normative Annex A of TS 32.251 [11], including the definitions of the trigger profiles associated with each CDR type.
The format of charging characteristics field is depicted in Figure 4. Px (x =0..3) refers to the Charging Characteristics Profile index. Bits classified with a “B” may be used by the operator for non-standardised behaviour (see Annex A of TS 32.251 [11]).
Right, well hopefully next time someone goes looking for this info you’ll find it a bit more easily than I did!
The S8 Home Routing approach for LTE roaming works really well, and as more and more operators switch off their legacy circuit switched 2G/3G networks and shift to LTE & VoLTE for roaming, we're seeing more and more S8-HR deployments.
When LTE was being standardised in 2008, Local Breakout (LBO) and S8 Home Routing were both considered options for how roaming may look. Fast forward to today, and S8 Home routing is the only way roaming is done for modern deployments.
In light of this, there are some "best practices" we've developed for an "all S8 Home Routed" world that I thought I'd share.
The Basics
When roaming, the SGW in the Visited Network, sends user traffic back to the PGW in the Home Network.
This means Online/Offline charging, IMS, PCRF, etc, is all done in the Home PLMN. As long as data packets can get from the SGW in the Visited PLMN to the PGW in the Home PLMN, and authentication flows from the Visited MME to the HSS in the Home PLMN, you’re golden.
The Constraints
Of course real networks don't look as simple as this; in reality a roaming scenario for a visited network has a lot more nodes, which all need to be taken into account.
Building Distributed Packet Core & IMS
Virtualization (VNF / CNF) has led operators away from “big iron” hardware for Packet Core & IMS nodes, towards software based solutions, which in turn offer a lot more flexibility.
Best practice for user plane design is to keep the latency down by bringing the user plane closer to the user (the idea of "Edge" UPFs in 5GC is a great example of this), and the move away from "big iron" in central locations for SGW and PGW nodes has been the trend for the past decade.
So to achieve these goals in the networks we build, we geographically distribute the core network.
This means we’ve got quite a few S-GW, P-GW, MME & HSS instances across the network. There’s some real advantages to this approach:
From a redundancy perspective this allows us to “spread the load” and build far more resilient networks. A network with 20 smaller HSS instances spread around the country, is far more resilient than 2 massive ones, regardless of how many power feeds or redundant disks it may have.
This allows us to be more resource efficient. MNOs have always provisioned excess capacity to cater for the loss of a node. If we have 2 MMEs serving a country, then each node has to have at least 50% capacity free, so if one MME were to fail, the other MME could handle the additional load from its dead friend. This is costly in resources. Having 20 MMEs means each MME only has to keep 5% capacity free to handle the loss of one MME in the pool.
It also forces our infrastructure teams to manage infrastructure “as cattle” rather than pets. These boxes don’t get names or lovingly crafted, they’re automatically spun up and destroyed without thinking about it.
For security, we only use internal IP addresses for the nodes in our packet core, this provides another layer of protection for the “crown jewels” of our network, so no one messing with BGP filtering can accidentally open the flood gates to our core, as one US operator learned leaving a GGSN open to the world leading to the private information for 100 million customers being leaked.
What this all adds to, is of course, the end user experience. For the end subscriber / customer, they get a better experience thanks to the reduced latency the connection provides, better uptime and faster call setup / SMS delivery, and less cost to deliver services.
I love this approach and could proselytise about it all day, but in a roaming context it presents some challenges.
The distributed networks we build are in a constant state of flux: new capacity is being provisioned in some areas, nodes decommissioned in others, and our core nodes are only reachable on internal IPs, so they wouldn't be reachable by roaming networks.
Our Distributed-Core Roaming Solution
To resolve this we’ve taken a novel approach, we’ve deployed a pair of S-GWs we call the “Roaming SGWs”, and a pair of P-GWs we call the “Roaming PGWs”, these do have public IPs, and are dedicated for use only by roaming traffic.
We really like this approach for a few reasons:
It allows us to be really flexible and do what we want inside the network, without impacting roaming customers or operators who use our network for roaming. All the benefits I described from the distributed architecture can still be realised.
From a security standpoint, only these SGW/PGW pairs have public IPs; all the others are on internal IPs. This is good for security – our core network is the 'crown jewels' of the network and we only expose an edge to other providers. Even though IPX networks are supposed to be secure, one of the largest IPX providers had their systems breached for 5 years before it was detected, so being almost as distrustful of IPX traffic as Internet traffic is a good thing. This allows us to put these PGWs / SGWs at the "edge" of our network, and keep all our MMEs, as well as our on-net PGWs and SGWs, on internal IPs, safe and secure inside our network.
For charging on the SGWs, we only need to worry about collecting CDRs from one set of SGWs (to go into the TAP files we use to bill the other operators), rather than running around hoovering up SGW CDRs from large numbers of Serving Gateways, which may get blown away and replaced without warning.
Of course, there is a latency angle to this: for international roaming, the traffic has to cross the sea / international borders to get to us. By putting these gateways at the edge we're seeing increased MOS on our calls, as the traffic is as close to the edge of the network as can be.
Caveat: Increased S11 Latency on Core Network sites over Satellite
This is probably not relevant to most operators, but some of our core network sites are fed only by satellite, and the move to this architecture shifted something: Rather than having latency on the S8 interface from the SGW to the PGW due to the satellite hop, we’ve got latency between the MME and the SGW due to the satellite hop.
It just shifts where in the chain the latency lies, but it did lead to us having to boost some timers in the MME and tune out-of-sequence delivery detection, on what had always been an internal interface previously.
Evolution to 5G Standalone Roaming
This approach aligns with the Home Routed options for 5G-SA roaming; UPF chaining means the roaming traffic can still be routed this way, which seems to be the direction the industry is going.
SA roaming is in its infancy, without widely deployed SA networks, we’re not going to see common roaming using SA for a good long while, but I’ll be curious to see if this approach becomes the de facto standard going forward.
Where to from here?
We’re pretty happy with this approach in the networks we’ve been building.
So far it’s made IREG testing easier as we’ve got two fixed points the IPX needs to hit (The DRAs and the SGWs) rather than a wide range of networks.
Operators with a vast number of APNs they need to drop into different VRFs may have to do some traffic engineering here – Our operations are generally pretty flat, but I can see where this may present some challenges for established operators shifting their traffic.
I’d be keen to hear if other operators are taking this approach and if they’ve run into any issues, or any issues others can see in this, feel free to drop a comment below.
SGs-AP, which is used for CSFB & SMS, doesn't span network borders (you can't roam with SGs-AP), and with SMSoIP out of the question, that left us with the option of MAP or Diameter – so we picked Diameter.
This introduces the S6c and SGd Diameter interfaces, in the diagrams below Orange is the Home Network (HPMN) and the Green is the Visited Network (VPMN).
The S6c interface is used between the SMSc and the HSS, in order to retrieve the routing information. This is like the SRI-for-SM in MAP.
The SGd interface is used between the MME serving the UE and the SMSc, and is used for actual delivery of the MO/MT messages.
I haven’t shown the Diameter Routing Agents in these diagrams, but in reality there would be a DRA on the VPLMN and a DRA on the HPMN, and probably a DRA in the IPX between them too.
The Attach
The attach looks like a regular roaming attach – the MME in the Visited PLMN sends an Update Location Request to the HSS, so the HSS knows which MME is serving the subscriber.
S6a Update Location Request to indicate the MME serving the Subscriber
The Mobile Terminated SMS Flow
Now we introduce the S6c interface and the SGd interfaces.
When the Home SMSc has a message to send to the subscriber (Mobile Terminated SMS) it runs a Send-Routing-Info-for-SM-Request (SRR) dialog to the HSS.
The Send-Routing-Info-for-SM-Answer (SRA) back from the HSS contains the info on the MME Diameter Host name and Diameter Realm serving the subscriber.
S6c – Send-Routing-Info-for-SM request to get the MME serving the subscriber
With this info, we can now craft a Diameter Request that will get sent to the MME serving the subscriber, containing the SMS PDU to send to the UE.
SGd MT-Forward-Short-Message to deliver Mobile Terminated SMS to the serving MME
We make sure it’s sent to the correct MME by setting the Destination-Host and Destination-Realm in the Diameter request.
Here’s how the request looks from the SMSc towards our DRA:
As you can see, the Destination-Realm and Destination-Host are set, as is the User-Name, which is set to the IMSI of the UE we want to send the message to.
And down the bottom you can see the SMS-TPDU, the same as it’s been all the way back since GSM days.
The Mobile Originated SMS Flow
The Mobile Originated flow is even simpler, because we don’t need to look up where to route it to.
The MME receives the MO SMS from the UE, and shoves it into a Diameter message with Application ID set to SGd and Destination-Realm set to the HPMN Realm.
When the message reaches the DRA in the HPMN it forwards the request to an SMSc and then the Home SMSc has the message ready to roll.
Having rated CDRs in CGrateS is great, but in reality, you probably want to get them into a billing system, CSV file, S3 bucket, CRM, invoice, Grafana, SQL table, etc, etc.
The Event Exporter Service (EES (previously called CDRe)) handles exporting CDRs from CGrateS.
Like everything in CGrateS, it’s highly configurable, and, again, like everything in CGrateS, supports every combination of services you can think of, plus a stack you haven’t thought of.
CDRs can be exported one of two ways, in real time, as the CDR is generated (online), or after the fact, exporting from the database containing the CDRs (offline).
Exporting in realtime (online) is a great option if you don't want (or need) to store the CDRs in CGrateS; if you're just using CGrateS to rate calls and spit them into a separate system, this is a fantastic option, as it allows your CGrateS instances to remain light and not get clogged up with lots of old CDRs. That said, you can of course export the CDRs in realtime and still store them in CGrateS – that's also a totally valid approach.
The more traditional approach is offline CDR export, where periodically or when an event is triggered, you scrape up a pile of CDRs and send them to your external systems.
For both options, we’ll need to define at least one exporter in our cgrates.json config file. For this example we’ll define a HTTP POST that we will trigger for realtime (online) CDR exporting, and a CSV file we dump to periodically when called from the API.
So first things first, we enable the EES module in the config:
"ees": {
"enabled": true,
"exporters": [
]
}
We’ll start with defining one exporter, named CSVExporter, that will output files to a folder named “testCSV” in the /tmp/ directory, but you can plonk these files wherever you like:
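Something along these lines should do it – this is a sketch based on the CGrateS docs, so double-check the field names against the version you're running:

"ees": {
    "enabled": true,
    "exporters": [
        {
            "id": "CSVExporter",
            "type": "*file_csv",
            "export_path": "/tmp/testCSV",
            "synchronous": true,
            "attempts": 1,
            "field_separator": ",",
            "flags": ["*log"]
        }
    ]
}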
We’ve got a lot of different types of export available to us, but type *file_csv is the easiest, so that’s where we’ll start.
Setting synchronous to true will mean we'll only run one export job at a time, but it also means we'll get the result back via the API, which allows us to keep track of the ID of the last record we exported, so we don't export the same record multiple times – more on this later.
Flags allows us to, if we wanted, bounce the event through AttributeS, for example, by adding *attributes to the flags, but in this case, it’s just logging to syslog.
Of course, just enabling ees won’t actually send calls to it, we’ll need to add “ees_conns“: [“*localhost”], to “apiers”: and “cdrs” so they know to bounce the events through it:
If you’ve already got CDRs on your system from our previous tutorial, fantastic, but if not, let’s get up and running with a quick and dirty script to define some destinations, a charger, an account balance and then use some of the balance to generate a CDR:
import cgrateshttpapi
import pprint
import uuid
import datetime
now = datetime.datetime.now()
CGRateS_Obj = cgrateshttpapi.CGRateS('localhost', 2080)
#Define Destinations
CGRateS_Obj.SendData({'method':'ApierV2.SetTPDestination','params':[{"TPid":'cgrates.org',"ID":"Dest_AU_Mobile","Prefixes":["614"]}]})
#Load TariffPlan we just defined from StorDB to DataDB
CGRateS_Obj.SendData({"method":"APIerSv1.LoadTariffPlanFromStorDb","params":[{"TPid":'cgrates.org',"DryRun":False,"Validate":True,"APIOpts":None,"Caching":None}],"id":0})
#Define default Charger
print(CGRateS_Obj.SendData({"method": "APIerSv1.SetChargerProfile","params": [{"Tenant": "cgrates.org","ID": "DEFAULT",'FilterIDs': [],'AttributeIDs' : ['*none'],'Weight': 0,}]}))
account = "Nick_Test_123"
#Add a balance to the account with type *sms with 100 sms events
pprint.pprint(CGRateS_Obj.SendData({"method": "ApierV1.SetBalance","params": [{"Tenant": "cgrates.org","Account": account,"BalanceType": "*sms","DestinationIDs": 'Dest_NZ_Mobile;Dest_AU_Mobile',"Categories": "*any","Balance": {"ID": "100_SMS_Bundle_AU_NZ_Mobile","Value": 100,"Weight": 25}}]}))
#Process CDR Event for a single SMS
pprint.pprint(CGRateS_Obj.SendData({"method": "CDRsV2.ProcessExternalCDR","params": [{"OriginID": str(uuid.uuid1()),"ToR": "*sms","RequestType": "*pseudoprepaid","AnswerTime": now.strftime("%Y-%m-%d %H:%M:%S"),"SetupTime": now.strftime("%Y-%m-%d %H:%M:%S"),"Tenant": "cgrates.org","Account": account,"Destination" : "61412345678","Usage": "1",}]}))
Right, with that out of the way, we should now have something in our CDRs table, a quick SQL query confirms this is the case:
So, as you may have guessed, we’ve called the ExportCDRs API endpoint, we’ve specified which ExporterIDs we want to reference (these link back to the objects in the config, and the one we have defined currently is named CSVExporter).
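If you're playing along at home, the call looks something like this (using the same Python helper as the earlier examples in this series):

result = CGRateS_Obj.SendData({"method": "APIerSv1.ExportCDRs", "params": [
    {"ExporterIDs": ["CSVExporter"],   # which exporter(s) from the config to run
     "Verbose": True,
     "Accounts": [account]}
]})
pprint.pprint(result)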
Setting Verbose: True means that CGrateS gives us back a lot of info from the API call, here’s what we get back:
Now that looks pretty positive, we got 12 events of SMS usage exported, which we can see in the file /tmp/testCSV/CSVExporter_21e9bc2.csv – and if we cat out the file, yeap, there’s all the CDRs.
But it’s a bit of a mess, there’s a lot of fields in there, so let’s adjust what goes into the CSV.
Let’s start by filtering what goes into the exporter, to only give us SMS events, of course you could adjust the filters here to target exporting only the records you want, based on anything you can define with Filters (and there’s a lot you can define with filters).
Now we’re only exporting SMS records, so let’s clean up the output of the CSV to just give us the data we want, which is the CDR ID, time, account, destination and usage.
Now after a restart of CGrateS, our exports look like this:
Stunning, truly beautiful, look at that output!
Right, well you may at this point have noticed a problem if you've run this more than once. The problem is that every time we run this, we get all the CDRs since the beginning of time.
Where filtering by date/time falls down is that if an offline CDR for a call on Monday only got ingested on Tuesday, it would be missed by the export.
But setting Verbose: True on the ExportCDRs API call gives us a handy trick: we're told the highest ID in the CDRs table we just exported, in the LastExpOrderID field of the API response.
If we jump over to the SQL database we use for StorDB, we can see that 33 is the ID of the highest CDR in the system.
So let’s try something, let’s run the exporter again, but this time let’s get all the CDRs where the ID is higher than 33:
#Process CDR Event for a single SMS
pprint.pprint(CGRateS_Obj.SendData({"method": "CDRsV2.ProcessExternalCDR","params": [{"OriginID": str(uuid.uuid1()),"ToR": "*sms","RequestType": "*pseudoprepaid","AnswerTime": now.strftime("%Y-%m-%d %H:%M:%S"),"SetupTime": now.strftime("%Y-%m-%d %H:%M:%S"),"Tenant": "cgrates.org","Account": account,"Destination" : "61412345678","Usage": "1",}]}))
#Trigger export where the OrderID is above 33
result = CGRateS_Obj.SendData({"method":"APIerSv1.ExportCDRs","params":[
{"ExporterIDs": ["CSVExporter"],
"Verbose" : True,
"ExtraArgs" : {
"OrderIDStart" : int(33),
},
"Accounts" : [account]}
]})
pprint.pprint(result)
Boom, now if we have a look at the output we can see the export covered two records, and the last ID was 35.
So as long as we keep track of the LastExpOrderID value, and feed that as in input every time we run ExportCDRs, we can ensure we never miss a CDR, and never get the same CDR twice.