For most Voice / Telco engineers IPsec is a VPN technology, maybe something used when backhauling over an untrusted link, etc, but voice over IP traffic is typically secured with TLS and SRTP.
IMS / Voice over LTE handles things a bit differently: it encapsulates the SIP & RTP traffic between the UE and the P-CSCF in IPsec Encapsulating Security Payload (ESP) packets.
In this post we’ll take a look at how it works and what it looks like.
It’s worth noting that Kamailio recently added support for IPsec encapsulation on a P-CSCF, in the IMS IPSec-Register module. I’ll cover usage of this at a later date.
The Message Exchange
The exchange starts off looking like any other SIP Registration session, in this case using TCP for transport. The UE sends a REGISTER to the Proxy-CSCF which eventually forwards the request through to a Serving-CSCF.
This is where we diverge from the standard SIP REGISTER message exchange. The Serving-CSCF generates a 401 Unauthorized response containing an authentication challenge in the WWW-Authenticate header, along with a Ciphering Key and an Integrity Key (ck= and ik=) carried as parameters in that same header.
The Serving-CSCF sends the 401 response it created to the Proxy-CSCF. The Proxy-CSCF assigns an SPI for the IPsec ESP to use, along with a server port and a client port, and indicates the encryption algorithm (ealg) and the integrity algorithm (alg, in this case HMAC-SHA-1-96). It shares this information with the UE by adding a new header, Security-Server, to the 401 Unauthorized.
The Proxy-CSCF also strips the Ciphering Key (ck=) and Integrity Key (ik=) parameters from the SIP authentication challenge (WWW-Authenticate) and uses them as the ciphering and integrity keys for the IPsec connection.
Finally after setting up the IPsec server side of things, it forwards the 401 Unauthorized response onto the UE.
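To give a feel for what the UE has to pull out of that new header, here’s a minimal Python sketch that splits a Security-Server value into its parameters. The header value is made up for illustration (the SPI and port numbers are assumptions, not taken from a real trace); the parameter names are the ones used by the ipsec-3gpp mechanism in RFC 3329 / TS 33.203.

```python
def parse_security_server(header_value):
    """Split a Security-Server header value into (mechanism, parameters)."""
    parts = [p.strip() for p in header_value.split(";") if p.strip()]
    mechanism, params = parts[0], {}
    for part in parts[1:]:
        key, _, value = part.partition("=")
        params[key.strip().lower()] = value.strip()
    return mechanism, params


# Hypothetical header value, for illustration only
example = ("ipsec-3gpp; q=0.1; prot=esp; mod=trans; "
           "spi-c=4096; spi-s=4097; port-c=5100; port-s=6100; "
           "alg=hmac-sha-1-96; ealg=null")

mechanism, params = parse_security_server(example)
print(mechanism)                        # ipsec-3gpp
print(params["alg"], params["ealg"])    # hmac-sha-1-96 null
print(params["spi-s"], params["port-s"])
```

With ealg=null the ESP payload is integrity protected but not encrypted.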
Upon receipt of the 401 response, the UE looks at the authentication challenge.
If the UE considers the network authenticated, it generates a response to the authentication challenge, but it doesn’t deliver it over plain TCP. Using the information in the 401 response, the UE encapsulates everything from the network layer (IPv4) up and sends it to the P-CSCF in IPsec ESP.
Communication between the UE and the P-CSCF is now encapsulated in IPsec.
Wireshark trace of IPsec IMS Traffic between UE and P-CSCF
When learning to use Kamailio you might find yourself wondering whether you really want to learn to write a Kamailio configuration file, which means picking up yet another weird scripting language just to achieve a task.
Enter KEMI – Kamailio Embedded Interface. KEMI allows you to abstract the routing logic to another programming language. In layman’s terms this means you can write your routing blocks, like request_route{}, reply_route{}, etc, in languages you already know – like Lua, JavaScript, Ruby – and my favorite – Python!
Why would you use KEMI?
Write in a language you already know;
You don’t need to learn how to write complex routing logic in Kamailio’s native scripting language; you can instead write your routing blocks in a language you’re already familiar with.
Change Routing on the Fly;
Writing the routing logic in KEMI allows you to change your routing blocks without having to restart Kamailio, something you can’t do with the “native” scripting language. This means you can change your routing live.
Note: This isn’t yet in place for all languages – Some still require a restart.
Leverage your preferred language’s libraries;
While Kamailio’s got a huge list of modules to interface with a vast number of different things, the ~200 Kamailio modules don’t compare with the thousands of premade libraries that exist for languages like Python, Ruby, JavaScript, etc.
Prerequisites
We’ll obviously need Kamailio installed, but we’ll also need the programming language we want to leverage setup (fairly obvious).
Configuring Kamailio to talk to KEMI
KEMI only takes care of the routing of SIP messages inside our routing blocks – So we’ve still got the Kamailio cfg file (kamailio.cfg) that we use to bind and setup the service as required, load the modules we want and configure them.
Essentially we need to load the app for the language we use. In this example we’ll load app_python3.so and use it as our Config Engine.
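For a rough idea of what the Python side looks like, here’s a minimal sketch of a routing file in the shape app_python3 expects: a mod_init() function that returns an object with a ksr_request_route() method. The reply logic is just a placeholder of mine, and KSR.sl.send_reply() assumes the sl module is loaded in kamailio.cfg. The fallback stub at the top is only there so the file can be imported outside Kamailio for a quick sanity check; inside Kamailio the real KSR module is used.

```python
try:
    import KSR  # provided by Kamailio's app_python3 module at runtime
except ImportError:
    # Stand-in so the file can be imported outside Kamailio (assumption for testing only)
    from types import SimpleNamespace
    KSR = SimpleNamespace(
        info=lambda text: print(text, end=""),
        pv=SimpleNamespace(get=lambda name: name),
        sl=SimpleNamespace(send_reply=lambda code, reason: 1),
    )


def mod_init():
    """Called once by Kamailio; returns the object that holds our routing blocks."""
    KSR.info("Python routing script loaded\n")
    return kamailio()


class kamailio:
    def child_init(self, rank):
        # Called for each Kamailio child process
        return 0

    def ksr_request_route(self, msg):
        # Equivalent of request_route{} in the native config
        KSR.info("Received " + str(KSR.pv.get("$rm")) + "\n")
        KSR.sl.send_reply(200, "OK")  # placeholder logic: answer everything with 200 OK
        return 1
```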
IPsec ESP can be used in 3 different ways on the Gm interface between the UE and the P-CSCF:
Integrity Protection – To prevent tampering
Ciphering – To prevent interception / eavesdropping
Integrity Protection & Ciphering
In Wireshark you’ll see the ESP packets, but you won’t see the payload contents, just the fact that it’s an Encapsulating Security Payload, its SPI and its sequence number.
By default, Kamailio’s P-CSCF only acts in Integrity Protection mode, meaning the ESP payloads aren’t actually encrypted. With a few clicks we can get Wireshark to decode this data.
Just open up Wireshark Preferences, expand Protocols and jump to ESP.
Now we can set the decoding preferences for our ESP payloads.
In our case we’ll tick the “Attempt to detect/decode NULL encrypted ESP payloads” box and close the dialog by clicking the OK button.
Now Wireshark will scan through all the frames again, anything that’s an ESP payload it will attempt to parse.
Now if we go back to the ESP payload with SQN 1 I showed a screenshot of earlier, we can see the contents are a TCP SYN.
Now we can see what’s going on inside this ESP data between the P-CSCF and the UE!
As a matter of interest, if you can see the IK and CK values in the 401 response before they’re stripped, you can also decode encrypted ESP payloads in Wireshark. From the same Protocols -> ESP section you can load the ciphering and integrity keys used in that session to decrypt them.
On top of plain vanilla RFC 3261, there’s a series of “extension” methods added to SIP to expand its functionality. Common extension methods are INFO, MESSAGE, NOTIFY, PRACK and UPDATE. Although now commonplace, none of these is defined in RFC 3261, so each is considered an “extension” to SIP.
It’s worth just pausing here to reiterate we’re not talking extensions like in a PBX context, like extra phones, we’re talking extensions like you’d add to a house, like extra functionality.
A SIP client can request functionality from a server (UAC to a UAS). If the server does not support that functionality, it can reject the session on those grounds and send back a response indicating it doesn’t know how to handle that extension, such as 420 Bad Extension (a SIP protocol extension was used that the server did not understand).
So clients can determine what functionality a server doesn’t support if it rejects the request, but there was no way to see what functionality the server does support, and what functionality the client requires.
If a UAC or UAS requires support for an extension, for example a Media Gateway that has to use PRACK, it can use the Require header to specify that the request should be rejected if support for the listed extensions is not provided.
These headers are most commonly seen in SIP OPTIONS requests.
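As a rough illustration of the Require logic, here’s a small Python sketch of how a server might decide between accepting a request and sending a 420. The set of supported option tags is invented for the example (100rel is the option tag that goes with PRACK); per RFC 3261 the rejected tags are what would be listed in the Unsupported header of the 420 response.

```python
# Option tags this hypothetical server understands
SUPPORTED_EXTENSIONS = {"100rel", "timer"}


def check_require(require_header):
    """Return (status_code, reason, unsupported_tags) for a Require header value."""
    required = [tag.strip().lower() for tag in require_header.split(",") if tag.strip()]
    unsupported = [tag for tag in required if tag not in SUPPORTED_EXTENSIONS]
    if unsupported:
        # These tags would be echoed back in the Unsupported header of the 420
        return 420, "Bad Extension", unsupported
    return 200, "OK", []


print(check_require("100rel"))          # (200, 'OK', [])
print(check_require("100rel, gruu"))    # (420, 'Bad Extension', ['gruu'])
```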
Kamailio is generally thought of as a SIP router, but it can in fact handle Diameter signaling as well.
Everything to do with Diameter in Kamailio relies on the C Diameter Peer and CDP_AVP modules which abstract the handling of Diameter messages, and allow us to handle them sort of like SIP messages.
CDP on its own doesn’t actually allow us to send Diameter messages, but it’s relied upon by other modules, like CDP_AVP and many of the Kamailio IMS modules, to handle Diameter signaling.
Before we can start shooting Diameter messages all over the place we’ve first got to configure our Kamailio instance, to bring up other Diameter peers, and learn about their capabilities.
C Diameter Peer (Aka CDP) manages the Diameter connections, the Device Watchdog Request/Answers etc, all in the background.
We’ll need to define our Diameter peers for CDP to use so Kamailio can talk to them. This is done in an XML file which lays out our Diameter peers and all the connection information.
In our Kamailio config we’ll add the following lines:
This will load the CDP modules and instruct Kamailio to pull its CDP info from an XML config file at /etc/kamailio/diametercfg.xml
Let’s look at the basic example given when installed:
<?xml version="1.0" encoding="UTF-8"?>
<!--
DiameterPeer Parameters
- FQDN - FQDN of this peer, as it should appear in the Origin-Host AVP
- Realm - Realm of this peer, as it should appear in the Origin-Realm AVP
- Vendor_Id - Default Vendor-Id to appear in the Capabilities Exchange
- Product_Name - Product Name to appear in the Capabilities Exchange
- AcceptUnknownPeers - Whether to accept (1) or deny (0) connections from peers with FQDN
not configured below
- DropUnknownOnDisconnect - Whether to drop (1) or keep (0) and retry connections (until restart)
unknown peers in the list of peers after a disconnection.
- Tc - Value for the RFC3588 Tc timer - default 30 seconds
- Workers - Number of incoming messages processing workers forked processes.
- Queue - Length of queue of tasks for the workers:
- too small and the incoming messages will be blocked too often;
- too large and the senders of incoming messages will have a longer feedback loop to notice that
this Diameter peer is overloaded in processing incoming requests;
- a good choice is to have it about 2 times the number of workers. This will mean that each worker
will have about 2 tasks in the queue to process before new incoming messages will start to block.
- ConnectTimeout - time in seconds to wait for an outbound TCP connection to be established.
- TransactionTimeout - time in seconds after which the transaction timeout callback will be fired,
when using transactional processing.
- SessionsHashSize - size of the hash-table to use for the Diameter sessions. When searching for a
session, the time required for this operation will be that of sequential searching in a list of
NumberOfActiveSessions/SessionsHashSize. So higher the better, yet each hashslot will consume an
extra 2xsizeof(void*) bytes (typically 8 or 16 bytes extra).
- DefaultAuthSessionTimeout - default value to use when there is no Authorization Session Timeout
AVP present.
- MaxAuthSessionTimeout - maximum Authorization Session Timeout as a cut-out measure meant to
enforce session refreshes.
-->
<DiameterPeer
FQDN="pcscf.ims.smilecoms.com"
Realm="ims.smilecoms.com"
Vendor_Id="10415"
Product_Name="CDiameterPeer"
AcceptUnknownPeers="0"
DropUnknownOnDisconnect="1"
Tc="30"
Workers="4"
QueueLength="32"
ConnectTimeout="5"
TransactionTimeout="5"
SessionsHashSize="128"
DefaultAuthSessionTimeout="60"
MaxAuthSessionTimeout="300"
>
<!--
Definition of peers to connect to and accept connections from. For each peer found in here
a dedicated receiver process will be forked. All other unknown peers will share a single
receiver. NB: You must have a peer definition for each peer listed in the realm routing section
-->
<Peer FQDN="pcrf1.ims.smilecoms.com" Realm="ims.smilecoms.com" port="3868"/>
<Peer FQDN="pcrf2.ims.smilecoms.com" Realm="ims.smilecoms.com" port="3868"/>
<Peer FQDN="pcrf3.ims.smilecoms.com" Realm="ims.smilecoms.com" port="3868"/>
<Peer FQDN="pcrf4.ims.smilecoms.com" Realm="ims.smilecoms.com" port="3868"/>
<Peer FQDN="pcrf5.ims.smilecoms.com" Realm="ims.smilecoms.com" port="3868"/>
<Peer FQDN="pcrf6.ims.smilecoms.com" Realm="ims.smilecoms.com" port="3868"/>
<!--
Definition of incoming connection acceptors. If no bind is specified, the acceptor will bind
on all available interfaces.
-->
<Acceptor port="3868" />
<Acceptor port="3869" bind="127.0.0.1" />
<Acceptor port="3870" bind="192.168.1.1" />
<!--
Definition of Auth (authorization) and Acct (accounting) supported applications. This
information is sent as part of the Capabilities Exchange procedures on connecting to
peers. If no common application is found, the peers will disconnect. Messages will only
be sent to a peer if that peer actually has declared support for the application id of
the message.
-->
<Acct id="16777216" vendor="10415" />
<Acct id="16777216" vendor="0" />
<Auth id="16777216" vendor="10415"/>
<Auth id="16777216" vendor="0" />
<!--
Supported Vendor IDs - list of values which will be sent in the CER/CEA in the
Supported-Vendor-ID AVPs
-->
<SupportedVendor vendor="10415" />
<!--
Realm routing definition.
Each Realm can have a different table of peers to route towards. In case the Destination
Realm AVP contains a Realm not defined here, the DefaultRoute entries will be used.
Note: In case a message already contains a Destination-Host AVP, Realm Routing will not be
applied.
Note: Routing will only happen towards connected and application id supporting peers.
The metric is used to order the list of preferred peers, while looking for a connected and
application id supporting peer. In the end, of course, just one peer will be selected.
-->
<Realm name="ims.smilecoms.com">
<Route FQDN="pcrf1.ims.smilecoms.com" metric="3"/>
<Route FQDN="pcrf2.ims.smilecoms.com" metric="5"/>
</Realm>
<Realm name="temp.ims.smilecoms.com">
<Route FQDN="pcrf3.ims.smilecoms.com" metric="7"/>
<Route FQDN="pcrf4.ims.smilecoms.com" metric="11"/>
</Realm>
<DefaultRoute FQDN="pcrf5.ims.smilecoms.com" metric="15"/>
<DefaultRoute FQDN="pcrf6.ims.smilecoms.com" metric="13"/>
</DiameterPeer>
First we need to start by telling CDP about the Diameter peer it’s going to be – we do this in the <DiameterPeer section where we define the FQDN and Diameter Realm we’re going to use, as well as some general configuration parameters.
<Peer> entries are, of course, Diameter peers. Defining them here means a connection is established to each one, capabilities are exchanged and Watchdog requests/responses are managed. We define the usage of each peer further on in the config.
The Acceptor section – fairly obviously – sets the bindings for the addresses and ports we’ll listen on.
Next up we need to define the Diameter applications we support in the <Acct>, <Auth> and <SupportedVendor> elements. This can be a little unintuitive, as we could list support for every Diameter application here, but unless you’ve got a module that can handle those applications, it’s of no use.
Instead of using Dispatcher to manage sending Diameter requests, CDP handles this for us. CDP keeps track of each peer’s status and its capabilities, and we can group like peers together; for example we may have a pool of PCRF NEs, which we can group into a <Realm>. Instead of calling a peer directly we can call the realm and CDP will dispatch the request to an up peer inside the realm, similar to Dispatcher Groups.
Finally we can configure a <DefaultRoute> which will be used if we don’t specify the peer or realm the request needs to be sent to. Multiple default routes can exist, differentiated based on preference.
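To sanity check a config like this before pointing Kamailio at it, a few lines of Python can list which peers a realm would route to. This is only a sketch: the hostnames are placeholders, the config is cut down to the elements discussed above, and it assumes a lower metric means a more preferred peer. It says nothing about which peers are actually connected; that’s CDP’s job at runtime.

```python
import xml.etree.ElementTree as ET

# Cut-down config in the same shape as the example above (hostnames are placeholders)
CONFIG = """
<DiameterPeer FQDN="pcscf.example.com" Realm="example.com">
  <Peer FQDN="pcrf1.example.com" Realm="example.com" port="3868"/>
  <Peer FQDN="pcrf2.example.com" Realm="example.com" port="3868"/>
  <Realm name="example.com">
    <Route FQDN="pcrf2.example.com" metric="5"/>
    <Route FQDN="pcrf1.example.com" metric="3"/>
  </Realm>
  <DefaultRoute FQDN="pcrf2.example.com" metric="15"/>
</DiameterPeer>
"""


def routes_for_realm(root, realm):
    """Return route FQDNs for a realm ordered by metric, falling back to DefaultRoute."""
    for realm_el in root.findall("Realm"):
        if realm_el.get("name") == realm:
            routes = realm_el.findall("Route")
            break
    else:
        routes = root.findall("DefaultRoute")
    # Assumption: lower metric = more preferred
    ordered = sorted(routes, key=lambda r: int(r.get("metric")))
    return [r.get("FQDN") for r in ordered]


root = ET.fromstring(CONFIG.strip())
defined_peers = {p.get("FQDN") for p in root.findall("Peer")}
routed = set(routes_for_realm(root, "example.com")) | set(routes_for_realm(root, "elsewhere.example.net"))

print(routes_for_realm(root, "example.com"))
print(routes_for_realm(root, "elsewhere.example.net"))
# Every routed FQDN needs a matching <Peer> definition
print("Routes without a Peer definition:", routed - defined_peers)
```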
We can check the status of peers using Kamcmd’s cdp.list_peers command which lists the peers, their states and capabilities.
One thing that’s not as obvious as it perhaps should be is what the different states shown by the kamcmd dispatcher.list command mean.
So what do the flags for state mean?
The first letter in the flag is the current state: Active (A), Inactive (I) or Disabled (D).
The second letter in the flag is the monitor status: Probing (P), meaning the destination is actively checked with SIP OPTIONS pings, or Not Set (X), meaning it isn’t.
AP – Actively Probing – SIP OPTIONS are getting a response, routing to this destination is possible, and it’s “Up” for all intents and purposes.
IP – Inactively Probing – Destination is not meeting the threshold of SIP OPTIONS request responses it needs to be considered active. The destination is either down or not responding to all SIP OPTIONS pings. Often this is due to needing X number of positive responses before considering the destination as “Up”.
DX – Disabled & Not Probing – This device is disabled, no SIP OPTIONS are sent.
AX – Active & Not Probing – No SIP OPTIONS are sent to check state, but it is effectively “Up” even though the remote end may not be reachable.
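If you’re scraping these flags in a script, the two letters can be decoded independently. Here’s a minimal Python sketch covering only the letters described above; the function name is my own.

```python
STATE = {"A": "Active", "I": "Inactive", "D": "Disabled"}
PROBE = {"P": "Probing", "X": "Not Set"}


def decode_flags(flags):
    """Turn a two letter dispatcher flag like 'AP' into (state, probing) text."""
    if len(flags) != 2:
        raise ValueError("expected a two letter flag such as 'AP'")
    state, probe = flags[0].upper(), flags[1].upper()
    return STATE[state], PROBE[probe]


for flag in ("AP", "IP", "DX", "AX"):
    print(flag, decode_flags(flag))
```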
In the third part of the Kamailio 101 series I briefly touched upon pseudovariables, but let’s look into what exactly they are and how we can manipulate them to change headers.
The term “pseudo-variable” is used for special tokens that can be given as parameters to different script functions and they will be replaced with a value before the execution of the function.
You’ve probably seen me use pseudovariables in any number of the previous Kamailio Bytes posts, often in xlog or in if statements. They’re generally short strings prefixed with a $ sign like $fU, $tU, $ua, etc.
When Kamailio gets a SIP message it explodes it into a pile of variables, for example taking the username from the To URI and putting it into a pseudovariable called $tU.
We can update the value of say $tU and then forward the SIP message on, but the To URI will now use our updated value.
When it comes to rewriting caller ID, changing domains, manipulating specific headers etc, pseudovariables are where it mostly happens.
Kamailio allows us to read these variables and for most of them rewrite them – But there’s a catch. We can mess with the headers which could result in our traffic being considered invalid by the next SIP proxy / device in the chain, or we could mess with the routing headers like Route, Via, etc, and find that our responses never get where they need to go.
So be careful! Headers exist for a reason, some are informational for end users, others are functional so other SIP proxies and UACs can know what’s going on.
Rewriting SIP From Username Header (Caller ID)
When Kamailio’s SIP parser receives a SIP request/response it decodes the vast majority of the SIP headers into a variety of pseudovariables, which we can then reference from our routing logic.
Let’s pause here and go back to the Stateless SIP Proxy Example, as we’ll build directly on that.
Follow the instructions in that post to get your stateless SIP proxy up and running, and we’ll make this simple change:
####### Routing Logic ########
/* Main SIP request routing logic
 * - processing of any incoming SIP request starts with this route
 * - note: this is the same as route { ... } */
request_route {
    xlog("Received $rm to $ru - Forwarding");
    $fU = "Nick Blog Example"; # Set From Username to this value
    # Forward to new IP
    forward("192.168.1.110");
}
Now when our traffic is proxied the From Username will show “Nick Blog Example” instead of what it previously showed.
The blurry photo didn’t make anything that much clearer, but they looked like two motion switches, and being a big fan of really old telco hardware, I found myself driving to an auction selling things very much unrelated to telephone exchanges to bid on what I thought might have been a step-by-step exchange.
Photo from listing
$50 later I am now the proud owner of an Automatic Telephone & Electric Co (ATE) Liverpool works 50 line PAX (Private Automatic Exchange).
The switch
My office now has less room, a big burly battery eliminator and ring machine take up the space on my desk, but I couldn’t be happier with it.
Uniselectors
Of the 5 final selectors I’ve got two somehow worked “out of the box”, while the other 3 all need some serious adjustment, but she clicks and she’s mostly complete, so should be a good summer holiday project!
I’ll post some video up when she’s fully functional.
While poking around the development and debugging features on Samsung handsets I found the ability to run IMS Debugging directly from the handset.
Alas, the option is only available in the commercial version, it’s just there for carriers, and requires a One Time Password to unlock.
When tapping on the option a challenge is generated with a key.
Interestingly I noticed that the key changes each time and can reject you even in aeroplane mode, suggesting the authentication happens client side.
This left me thinking – If the authentication happens client side, then the App has to know what the valid password for the key shown is…
Some research revealed you can pull APKs off an Android phone, so I downloaded a utility called “APK Extractor” from the Play store, and used it to extract the Samsung Sysdump utility.
So now I was armed with the APK on my local machine, the next step was to see if I could decompile the APK back into source code.
Some Googling found me an online APK decompiler, which I fed the compiled APK file and got back the source code.
I did some poking around inside the source code, and then I found an interesting directory:
Here’s a screenshot of the vanilla code that came out of the app.
I’m not a Java expert, but even I could see the “CheckOTP” function and understand that that’s what validates the One Time Passwords.
The while loop threw me a little – until I read through the rest of the code; the “key” in the popup box is actually a text string representing the current UNIX timestamp down to the minute level. The correct password is an operation done on the “key”, however the CheckOTP function doesn’t know the challenge key, but has the current time, so generates a challenge key for each timestamp back a few minutes and a few minutes into the future.
I modified the code slightly to allow me to enter the presented “key” and get the correct password back. It’s worth noting you need to act quickly, enter the “key” and enter the response within a minute or so.
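The shape of that check, a time-based key validated against a window of nearby minutes, can be sketched in a few lines of Python. To be clear, this is not Samsung’s code: derive_password() is a made-up placeholder for the real operation, and the key format (minutes since the UNIX epoch as text) and the window size are assumptions for illustration.

```python
import time


def key_for(minute):
    # Assumption: the on-screen "key" is just the minute-resolution timestamp as text
    return str(minute)


def derive_password(key):
    # Placeholder transform, NOT the real operation used by the app
    return key[::-1]


def check_otp(entered, now=None, window=3):
    """Accept the password for any key within +/- `window` minutes of now."""
    now_minute = int((time.time() if now is None else now) // 60)
    for offset in range(-window, window + 1):
        if entered == derive_password(key_for(now_minute + offset)):
            return True
    return False


# Because everything needed is on the device, the same maths works for an attacker
shown_key = key_for(int(time.time() // 60))
print(check_otp(derive_password(shown_key)))   # True
```

The window is why you need to be quick: once the key you were shown falls outside it, the derived password stops being accepted.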
In the end I’ve posted the code on an online Java compiler,
Samsung handsets have a feature built in to allow debugging from the handset, called Sysdump.
Entering *#9900# from the Dialing Screen will bring up the Sysdump App, from here you can dump logs from the device, and run a variety of debugging procedures.
But for private LTE operators, the two most interesting options are by far the TCPDUMP START option and IMS Logger, but both are grayed out.
Tapping on them asks for a one-time password and has a challenge key.
These options are not available in the commercial version of the OS and need to be unlocked with a one-time key generated by a Samsung tool for unlocking engineering firmware on handsets.
Luckily this authentication happens client side, which means we can work out the password it’s expecting.
Once you’ve entered the code and successfully unlocked the IMS Debugging tool there’s a few really cool features in the hamburger menu in the top right.
DM View
This shows the SIP / IMS messaging and the current signal strength parameters, which are used to determine which RAN type to use (i.e. falling back from VoLTE to UMTS / Circuit Switched when the LTE signal strength drops).
Tapping on the SIP messages expands them and allows you to see the contents of the SIP messages.
Viewing SIP Messaging directly from the handset
Interestingly, the actual nitty-gritty parameters in the SIP headers are missing, replaced with X for anything “private” or identifiable.
Luckily all this info can be found in the Pcap.
The DM View is great for getting a quick look at what’s going on, on the mobile device itself, without needing a PC.
Logging
The real power comes in the logging functions,
There’s a lot of logging options, including screen recording, TCPdump (as in Packet Captures) and Syslog logging.
From the hamburger menu we can select the logging parameters we want to change.
From the Filter Options menu we can set what info we’re going to log,
There’s always a lot of talk and opinion about the technologies the NBN employs, its effectiveness, etc.
I’ve made a conscious decision to steer clear of opinion in this blog, but there’s often talk and blame shifting between NBNco and RSPs, so I thought it’d be interesting to write about how the business model and the network actually work between carriers (RSPs) and NBNco.
Physical Structure
Last Mile
The last mile in US terms, CAN in Australian Telecom lingo, is connecting the subscriber edge to the network.
NBNCo employs a few different technologies for this, depending on a number of factors;
Fiber to the Premises (Original standard – End to end fibre)
All these last mile services get consolidated and eventually end up at a local PoI – Point of Interconnect, (typically called a POP if you’re any other telco).
These are typically hosted inside exchanges, but not every exchange is an NBNco PoI, if it’s not it uses NBN Backhaul to get to the nearest PoI.
NBNco currently operates 121 PoI sites.
NBNco don’t exclusively use TEBA sites, some are hosted in NBNco “Depots”, there’s currently 10 sites not in TEBA footprints.
At the PoI
Retail Service Providers (RSPs) have to have racks inside the PoI locations, and essentially setup layer 2 cross-connects to the NBNco racks.
Once the traffic is on the RSP network, it’s the RSP’s responsibility to carry it where it needs to go, via their own network / backhaul.
Billing and Metering
Of course, if NBNco is handing pipes of customer traffic off to each RSP they need a way to charge the RSPs for this. This is handled by two elements: CVCs, equating to the bandwidth at the PoI, and AVCs, equating to a fixed monthly standing charge per connection.
CVC – Connectivity Virtual Circuit
At the PoI the connection between the NBNco rack and the RSP rack is metered over a CVC – Connectivity Virtual Circuit.
This is shared across all users of that RSP at that PoI.
Let’s say I’m an RSP and I’ve purchased a 1Mbps CVC shared across my 1,000 customers at that PoI, the customers aren’t going to have a good experience.
Of course, CVC bandwidth isn’t free, previously NBNco charged on average $15.25/Mbps.
This had the effect of ensuring each RSP had just enough CVC bandwidth for their customers, but this led to some customers having a poor experience on switching to NBNco as they found their speeds dropped due to not enough CVC bandwidth at the PoI for that RSP.
In June 2017 NBNco announced a change to the pricing structure to try and encourage RSPs to buy more CVC bandwidth to ensure customers’ speeds weren’t bottlenecked at the CVC.
The new pricing structure makes it more financially attractive to buy more CVC bandwidth based on how many active connections (AVCs) an RSP has in place.
NBNco now charges $17.50 per symmetrical Mbps for each traffic class. (More on traffic classes later)
This means at each PoI the RSP must have a pool of CVC bandwidth large enough to meet the bandwidth needs of all its customers’ connections (AVCs) at that PoI.
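To put some numbers on that trade-off, here’s a back-of-the-envelope Python sketch using the $17.50 per Mbps figure quoted above. The customer counts and CVC sizes are made up, and the per-customer figure is the worst case where every customer pulls traffic at once.

```python
CVC_PRICE_PER_MBPS = 17.50   # figure quoted above, per Mbps of CVC


def cvc_cost(cvc_mbps):
    """Cost of a CVC of the given size at one PoI."""
    return cvc_mbps * CVC_PRICE_PER_MBPS


def per_customer_kbps(cvc_mbps, customers):
    """Bandwidth each customer gets if everyone is busy at the same time."""
    return cvc_mbps * 1000 / customers


# Illustrative numbers only
for cvc_mbps in (1, 100, 500):
    print(f"{cvc_mbps} Mbps CVC: ${cvc_cost(cvc_mbps):,.2f}, "
          f"{per_customer_kbps(cvc_mbps, 1000):,.0f} kbps per customer across 1,000 customers")
```

The 1 Mbps CVC from the earlier example works out to 1 kbps per customer at full contention, which is why under-provisioned CVCs show up as slow evenings.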
AVC – Access Virtual Circuit
NBNco charges AVC fees based on the speed tier the end user will have and the traffic class (QoS) the service has applied.
(This speed tier is regardless of if the RSP has the CVC bandwidth to support this)
Pricing of TC4 (Best Effort) AVCs
Introducing NNIs
NBNco acknowledged in July 2018 that for some carriers (RSPs) having a presence in 121 sites puts up a large barrier to entry.
To counteract this they introduced Network-Network Interface (NNI).
Imagine you’re operating an RSP with a footprint in capital cities and PoI / CVCs in populated areas, you can’t serve customers in remote areas without having a presence at their local NBNco PoI location and buying CVC bandwidth for that location – It just wouldn’t stack up financially.
NBNco introduced the NNI product to essentially backhaul the traffic from these customers to the nearest PoI their RSP is at and share the CVC bandwidth at that PoI.
The PLMN Identifier is used to identify the radio networks in use, it’s made up of the MCC – Mobile Country Code and MNC – Mobile Network Code.
But sadly it’s not as simple as just concatenating MCC and MNC like in the IMSI, there’s a bit more to it.
In the example above the Tracking Area Identity includes the PLMN Identity, and Wireshark has been kind enough to split it out into MCC and MNC, but how does it get that from the value 12f410?
This one took me longer to work out than I’d like to admit, and saw me looking through the GSM spec, but here goes:
PLMN Contents: Mobile Country Code (MCC) followed by the Mobile Network Code (MNC). Coding: according to TS GSM 04.08 [14].
If storage for fewer than the maximum possible number n is required, the excess bytes shall be set to ‘FF’. For instance, using 246 for the MCC and 81 for the MNC and if this is the first and only PLMN, the contents reads as follows: Bytes 1-3: ‘42’ ‘F6’ ‘18’ Bytes 4-6: ‘FF’ ‘FF’ ‘FF’ etc.
TS GSM 04.08 [14].
Making sense to you now? Me neither.
Here’s the Python code I wrote to encode MCC and MNCs to PLMN Identifiers and to decode PLMN into MCC and MNC, and then we’ll talk about what’s happening:
In the above example I take MCC 505 (Australia) and MNC 93 and generate the PLMN ID 05f539.
The first step in decoding is to take the first two hex digits (in our case 05) and reverse them to get 50. Then we take the third and fourth digits (f5), reverse them too, and strip the letter f, leaving just 5. We join that with what we had earlier and there’s our MCC – 505.
Next we get our MNC: for this we take digits 5 & 6 (39) and reverse them, and there’s our MNC – 93.
Together we’ve got MCC 505 and MNC 93.
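The swapping described above can be sketched in Python like this. It’s a minimal reconstruction rather than a definitive implementation, and the function names are my own. It also handles 3 digit MNCs, where the third MNC digit takes the place of the f filler.

```python
def encode_plmn(mcc, mnc):
    """Encode MCC/MNC digit strings into a 6 hex digit PLMN identifier."""
    if len(mcc) != 3 or len(mnc) not in (2, 3) or not (mcc + mnc).isdigit():
        raise ValueError("MCC must be 3 digits and MNC 2 or 3 digits")
    mnc3 = mnc[2] if len(mnc) == 3 else "f"   # filler nibble for 2 digit MNCs
    # Each pair of digits is stored swapped
    return mcc[1] + mcc[0] + mnc3 + mcc[2] + mnc[1] + mnc[0]


def decode_plmn(plmn):
    """Decode a 6 hex digit PLMN identifier into (mcc, mnc) digit strings."""
    plmn = plmn.lower()
    if len(plmn) != 6:
        raise ValueError("PLMN identifier must be 6 hex digits")
    mcc = plmn[1] + plmn[0] + plmn[3]
    mnc = plmn[5] + plmn[4]
    if plmn[2] != "f":
        mnc += plmn[2]    # third MNC digit, only present for 3 digit MNCs
    return mcc, mnc


print(encode_plmn("505", "93"))   # 05f539
print(decode_plmn("05f539"))      # ('505', '93')
print(decode_plmn("12f410"))      # ('214', '01')
```

The spec example checks out too: MCC 246 and MNC 81 encode to 42f618.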
The one answer I’m still looking for; why not just encode 50593? What is gained by encoding it as 05f539?
After a few quiet months I’m excited to say I’ve pushed through some improvements recently to PyHSS and it’s growing into a more usable HSS platform.
MongoDB Backend
This has a few obvious advantages – More scalable, etc, but also opens up the ability to customize more of the subscriber parameters, like GBR bearers, etc, that simple flat text files just wouldn’t support, as well as the obvious issues with threading and writing to and from text files at scale.
Knock knock.
Race condition.
Who’s there?
— Threading Joke.
For now I’m using the Open5GS MongoDB schema, so the Open5GS web UI can be used for administering the system and adding subscribers.
The CSV / text file backend is still there and still works, the MongoDB backend is only used if you enable it in the YAML file.
The documentation for setting this up is in the readme.
SQN Resync
If you’re working across multiple different HSS’ or perhaps messing with some crypto stuff on your USIM, there’s a chance you’ll get the SQN (The Sequence Number) on the USIM out of sync with what’s on the HSS.
This manifests itself as the UE rejecting the authentication challenge with a synchronisation failure, and the MME sending a fresh Authentication Information Request containing a Re-Synchronization-Info AVP inside the Requested-EUTRAN-Authentication-Info AVP. I’ll talk more about how this works in another post, but in short PyHSS now looks at this value and uses it, combined with the original RAND value sent in the Authentication Information Answer, to find the correct SQN value, update whichever database backend you’re using accordingly, and then send another Authentication Information Answer with authentication vectors using the correct SQN.
SQN Resync is something that’s really cryptographically difficult to implement / confusing, hence this taking so long.
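The non-cryptographic half of the resync is at least easy to show. The Re-Synchronization-Info AVP is RAND (16 bytes) followed by AUTS (14 bytes), and AUTS is the USIM’s SQN XORed with an anonymity key (AK*), followed by MAC-S. The Python sketch below is not PyHSS’s actual code: it assumes AK* has already been computed elsewhere with the Milenage f5* function from K, OPc and RAND, and it doesn’t verify MAC-S, which a real HSS should do.

```python
def split_resync_info(resync_info):
    """Re-Synchronization-Info = RAND (16 bytes) || AUTS (14 bytes)."""
    if len(resync_info) != 30:
        raise ValueError("expected 30 bytes (RAND + AUTS)")
    return resync_info[:16], resync_info[16:]


def recover_sqn(auts, ak_star):
    """AUTS = (SQN_MS xor AK*) || MAC-S. Returns (sqn, mac_s)."""
    concealed_sqn, mac_s = auts[:6], auts[6:]
    sqn = bytes(a ^ b for a, b in zip(concealed_sqn, ak_star))
    return int.from_bytes(sqn, "big"), mac_s


# Made-up values purely to exercise the functions
ak_star = bytes.fromhex("a1b2c3d4e5f6")
sqn_ms = (1000).to_bytes(6, "big")
concealed = bytes(a ^ b for a, b in zip(sqn_ms, ak_star))
resync = bytes(16) + concealed + bytes(8)    # RAND || (SQN xor AK*) || MAC-S

rand, auts = split_resync_info(resync)
sqn, mac_s = recover_sqn(auts, ak_star)
print(sqn)    # 1000
```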
What’s next? – IMS / Multimedia Auth
The next feature that’s coming soon is the Multimedia Authentication Request / Answer to allow CSCFs to query for IMS Registration and manage the Cx and Dx interfaces.
Code for this is already in place but failing some tests, not sure if that’s to do with the MAA response or something on my CSCFs,
LTE has great concepts like NAS that abstract the actual transport layers, so the NAS packet is generated by the UE and then read by the MME.
One thing that’s a real headache about private LTE is the authentication side of things. You’ll probably bash your head against a SIM programmer for some time.
As you probably know, when connecting to a network the UE shares its IMSI / TMSI with the network, and the MME requests authentication information from the HSS using the Authentication Information Request over Diameter.
The HSS then returns a random value (RAND), expected result (XRES), authentication token (AUTN) and a KASME for generating further keys.
The RAND and AUTN values are sent to the UE, the USIM in the UE calculates the RES (result) and sends it back to the MME. If the RES value received by the MME is equal to the expected RES (XRES) then the subscriber is mutually authenticated.
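The comparison the MME does at the end is as simple as it sounds. Here’s a tiny Python sketch with made-up values; using hmac.compare_digest for the comparison is my own choice, not something the specs mandate.

```python
import hmac


def subscriber_authenticated(res, xres):
    """Compare the RES from the UE with the XRES from the HSS (both bytes)."""
    return hmac.compare_digest(res, xres)


xres = bytes.fromhex("0011223344556677")   # made-up XRES from the HSS
print(subscriber_authenticated(bytes.fromhex("0011223344556677"), xres))   # True
print(subscriber_authenticated(bytes.fromhex("ffffffffffffffff"), xres))   # False
```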
Using this tool I was able to plug a USIM into my USIM reader. Then, using the Diameter client built into PyHSS, I sent an Authentication Information Request to the HSS asking for authentication vectors for the UE, and was sent back an Authentication Information Answer containing the RAND and AUTN values, as well as the XRES value.
Diameter – Authentication Information Response showing E-UTRAN Vectors
Then I used the osmo-sim-auth app to run the RAND and AUTN values against the USIM and get back the RES.
The RES I got back matched the XRES, meaning the HSS and the USIM are in sync (SQNs match) and they mutually authenticated.
I thought I’d expand a little on how the Crypto side of things works in LTE & NR (also known as 4G & 5G).
Authentication primarily happens in two places, one at each end of the network, the Home Subscriber Server and in the USIM card. Let’s take a look at each of them.
On the USIM
On the USIM we’ve got two values that are entered in when the USIM is provisioned, the K key – Our secret key, and an OPc key (operator key).
These two keys are the basis of all the cryptography that goes on, so should never be divulged.
The only other place to have these two keys is the HSS, which associates each K key and OPc key combination with an IMSI.
The USIM also stores the SQN, a sequence number. This is used to prevent replay attacks and is incremented after each authentication challenge, starting at 1 for the first authentication challenge and counting up from there.
On the HSS
On the HSS we have the K key (Secret key), OPc key (Operator key) and SQN (Sequence Number) for each IMSI on our network.
Each time an IMSI authenticates itself we increment the SQN, so the value of the SQN on the HSS and on the USIM should (almost) always match.
Authentication Options
Let’s imagine we’re designing the authentication between the USIM and the Network; let’s look at some options for how we can authenticate everyone and why we use the process we use.
Failed Option 1 – Passwords in the Clear
The HSS could ask the USIM to send its K and OPc values, compare them to what the HSS has in place and then either accept or reject the USIM depending on if they match.
The obvious problem with this is that to send this information we broadcast our supposedly secret K and OPc keys over the air, so anyone listening would get our secret values, and they’re not so secret anymore.
This is why we don’t use this method.
Failed Option 2 – Basic Crypto
So we’ve seen that sending our keys publicly is out of the question.
The HSS could ask the USIM to mix its K key and OPc key in such a way that only someone with both keys could produce the same result.
This is done with some cryptographic black magic. All you need to know is it’s a one-way function: you enter in values and you get the same result every time with the same input, but you can’t work out the input from the result.
The HSS could then get the USIM to send back the result of mixing up both keys, mix the two keys it knows itself, and compare the two results.
The result of mixing the keys by the USIM is called RES (Result), while the result of the HSS mixing the keys is called XRES (Expected Result). If the RES from the USIM matches the XRES the HSS calculated, the user is authenticated.
This is a better solution but has some limitations. Because our special mixing of keys gets the same result each time we put in our OPc and K keys, the RES is going to be the same every time a subscriber authenticates to the network.
This is vulnerable to replay attacks. An attacker doesn’t need to know the two secret keys (K & OPc) that went into creating the RES; the attacker just needs to know the RES itself, which is sent over the air for anyone to hear. If the attacker sends the same RES they could still authenticate.
This is why we don’t use this method.
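To make the replay problem concrete, here’s a minimal sketch. It assumes HMAC-SHA-256 as a stand-in for the one-way “mixing” function (real USIMs use Milenage or similar, not this), and the key values are made up.

```python
import hashlib
import hmac


def mix(k: bytes, opc: bytes) -> bytes:
    # Stand-in one-way function, NOT the algorithm a real USIM runs
    return hmac.new(k, opc, hashlib.sha256).digest()[:8]


# Made-up example keys, shared by the USIM and the HSS
k = bytes.fromhex("00112233445566778899aabbccddeeff")
opc = bytes.fromhex("ffeeddccbbaa99887766554433221100")

xres = mix(k, opc)        # what the HSS expects
res_first = mix(k, opc)   # USIM's answer on the first attach
res_second = mix(k, opc)  # USIM's answer on the next attach: identical

# An attacker who sniffed the first RES never needs K or OPc
captured = res_first
replay_accepted = hmac.compare_digest(captured, xres)
```

With nothing changing between attempts, `replay_accepted` ends up true, which is exactly the weakness described above.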
Failed Option 3 – Mix keys & add Random
To prevent these replay attacks we add an element of randomness, so the HSS generates a random string of garbage called RAND, and sends it to the USIM.
The USIM then mixes RAND (the random string), the K key and the OPc key, and sends back the RES (Result).
Because we introduced a RAND value, every time the RAND is different the RES is different. This protects against the replay attacks we were vulnerable to in our last example.
If the result the USIM calculated with the K key, OPc key and random data is the same as the result the HSS calculated with the same K key, OPc key and same random data, the user is authenticated.
While an attacker could reply with the same RES, the random data (RAND) will change each time the user authenticates, meaning that response will be invalid.
The problem here is that while the network has now authenticated the USIM, the USIM hasn’t actually verified it’s talking to the real network.
This is why we don’t use this method.
GSM authentication worked like this, but in a GSM network you could set up your HLR (the GSM version of a HSS) to allow in every subscriber regardless of the value of RES they sent back, meaning it didn’t look at the keys at all. This meant attackers could set up fake base stations to capture users.
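Going back to the RAND idea in this option, here’s a small sketch of why a captured RES stops being useful once a random challenge is mixed in. Again HMAC-SHA-256 is just a stand-in for the real mixing function, and the function names and keys are invented for the example.

```python
import hashlib
import hmac
import secrets


def mix(k: bytes, opc: bytes, rand: bytes) -> bytes:
    # Stand-in one-way function over the keys plus the random challenge
    return hmac.new(k, opc + rand, hashlib.sha256).digest()[:8]


def hss_challenge(k: bytes, opc: bytes):
    """HSS side: pick a fresh RAND and work out the expected result."""
    rand = secrets.token_bytes(16)
    return rand, mix(k, opc, rand)


def usim_response(k: bytes, opc: bytes, rand: bytes) -> bytes:
    """USIM side: mix the same keys with the RAND it was sent."""
    return mix(k, opc, rand)


k = bytes.fromhex("00112233445566778899aabbccddeeff")
opc = bytes.fromhex("ffeeddccbbaa99887766554433221100")

rand1, xres1 = hss_challenge(k, opc)
res1 = usim_response(k, opc, rand1)       # genuine answer to challenge 1

rand2, xres2 = hss_challenge(k, opc)      # next attach, new RAND
replay_works = hmac.compare_digest(res1, xres2)
```

The genuine response matches `xres1`, but replaying it against the second challenge fails because `xres2` was built from a different RAND. Note that nothing here lets the USIM check who sent the RAND.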
Option 4 – Mutual Authentication (Real World*)
So from the previous options we’ve learned:
Our network needs to authenticate our subscribers, in a way that can’t be spoofed / replayed so we know who to bill & where to route traffic.
Our subscribers need to authenticate the network so they know they can trust it to carry their traffic.
So our USIM needs to authenticate the network, in the same way the network authenticates the USIM.
To do this we introduce a new key for network authentication, called AUTN.
The AUTN key is generated by the HSS by mixing the secret keys and RAND values together, but in a different way to how we mix the keys to get RES. (Otherwise we’d get the same key).
This AUTN key is sent to the USIM along with the RAND value. The USIM runs the same mixing on its private keys and RAND that the HSS did to generate the AUTN, except this is the USIM-generated version, an Expected AUTN key (XAUTN). The USIM compares XAUTN and AUTN to make sure they match. If they do, the USIM then knows the network knows its secret keys.
The USIM then does the same mixing it did in the previous option to generate the RES key and send it back.
The network has now authenticated the subscriber (HSS has authenticated the USIM via RES key) and the subscriber has authenticated the network (USIM authenticates HSS via AUTN key).
*This is a slightly simplified version of how E-UTRAN / LTE authentication works between the HSS and the USIM – In reality there are a few extra values, such as SQN, to take into consideration, and the USIM talks to the MME, not the HSS directly.
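The simplified mutual authentication above can be sketched as below. This follows the simplified version in this post rather than the real thing: HMAC-SHA-256 stands in for the mixing functions, the AUTN here is just a second mix of the keys and RAND (a real AUTN also carries the concealed SQN and AMF), and the class names are made up.

```python
import hashlib
import hmac
import secrets


def mix(secret: bytes, label: bytes, rand: bytes) -> bytes:
    # A different label gives a different "way of mixing" the same inputs
    return hmac.new(secret, label + rand, hashlib.sha256).digest()[:8]


class Hss:
    def __init__(self, k: bytes, opc: bytes):
        self.secret = k + opc

    def challenge(self):
        rand = secrets.token_bytes(16)
        autn = mix(self.secret, b"AUTN", rand)  # proves the network knows the keys
        xres = mix(self.secret, b"RES", rand)   # what we expect back from the USIM
        return rand, autn, xres


class Usim:
    def __init__(self, k: bytes, opc: bytes):
        self.secret = k + opc

    def respond(self, rand: bytes, autn: bytes):
        xautn = mix(self.secret, b"AUTN", rand)
        if not hmac.compare_digest(xautn, autn):
            return None  # network failed authentication, send nothing back
        return mix(self.secret, b"RES", rand)
```

A genuine HSS gets back a RES that matches its XRES, while a fake network that doesn’t hold the keys can’t produce a valid AUTN, so the USIM never answers it.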
I’ll do a follow up post covering the more nitty-gritty elements: AMF and SQN fields, OP vs OPc keys, SQN Resync, how this information is transferred in the Authentication Information Answer and how KASME keys are used / distributed.
The Australian Government publishes elevation data online that’s freely available for anyone to use. There’s a catch – If you’re using Forsk Atoll, it won’t import without a fair bit of monkeying around with the data…
You draw around the area you want to download, enter your email address and you’re linked to a download of the dataset you’ve selected.
So now we download the data from the link, unzip it and we’re provided with a .tiff image with the elevation data in the pixel colour and geocoded with the positional information.
Problem is, this won’t import into Atoll – Unsupported depth.
I fired it up, and imported the elevation tiff file we’d downloaded.
Selected “Elevation” waited a few seconds and presto!
We can export from here in the PNG 16 bit grayscale format Atoll takes, but there’s a catch: negative elevation values and blank data will show up as giant spikes, which will totally mess with your propagation modeling.
So I found an option to remove elevation data from a set range, but it won’t deal with negative values…
So I found another option in the elevation menu to offset elevation vertically, I added 100 ft (It’s all in ft for some reason) to everything which meant my elevation data that was previously negative was now just under 100.
So if an area was -1ft before it was now 99ft.
Now I was able to use the remove range for anything from 0 to 100 ft (previously sea level and below)
Now my map only shows data above sea level
Now I offset the elevation vertically again and remove 100ft so we get back to real values
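The offset, remove and offset-back trick is easier to see on a tiny made-up grid. This is only a sketch of the arithmetic in plain Python, not what the mapping software does internally; `None` stands in for “no data”, and anything below -100 ft would still be negative after the first offset, so it would need a bigger offset.

```python
def offset(grid, delta):
    """Shift every elevation value by delta, leaving blank cells alone."""
    return [[None if v is None else v + delta for v in row] for row in grid]


def remove_range(grid, low, high):
    """Blank out every cell whose value falls inside low..high (inclusive)."""
    return [
        [None if v is not None and low <= v <= high else v for v in row]
        for row in grid
    ]


# Made-up sample elevations in feet, including some below sea level
elevation_ft = [[-1, 0, 12], [-30, 5, 250]]

shifted = offset(elevation_ft, 100)      # -1 ft becomes 99 ft
cleaned = remove_range(shifted, 0, 100)  # drop old sea level and below
result = offset(cleaned, -100)           # back to real values
```

After the three steps only the cells that were above sea level keep a value, and those values are unchanged.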
Now I was able to export the elevation data from the Elevation -> Export to menu
Atoll seems to like PNG 16 bit greyscale so that’s what we’ll feed it.
In Atoll we’ll select File -> Import and open the PNG we just generated.
Data type will be Altitude, Pixel size is 5m (as denoted in email / dataset metadata).
Next question is offset, which took me a while to work out…
The email has the Lat & Long but Atoll deals in WGS co-ordinates.
Luckily the GeoPlanner website allows you to enter the lat & long of the top corner and get the equivalent West and North values for the UTM datum.
Enter these values as your coordinates and you’re sorted.
I can even enable a Map layer and confirm it lines up:
Let’s take a look at GTP, the workhorse of mobile user plane packet data.
This post covers all generations of mobile data (2.5 -> 5G), so I’m using generic terms.
GSM, UMTS, LTE & NR all have one protocol in common – GTP – The GPRS Tunneling Protocol.
So why does every generation of mobile data networks, from GSM/GPRS in 2000 to 5G NR Standalone in 2020, rely on this one protocol for transporting user data?
So Why GTP?
GTP – the GPRS Tunnelling Protocol, is what encapsulates and tunnels IP packets from the internet / packet data network, to and from the User.
So why encapsulate the packets? What if the Base Station had access to the internet and routed the traffic to the users?
Let’s say we did that, we’d have to have large pools of IP addresses available at each Base Station and when a user connected they’d be assigned an IP Address and traffic for these users would be routed to the Base Station which would forward it onto the user.
This would work well until a user moves from one Base Station to another, when they’d have to get a new IP Address allocated.
TCP/IP was never designed to be mobile, an IP address only exists in a single location.
Breaking out traffic directly from a base station would have other issues, such as no easy way to enforce QoS or traffic policies, meter usage, etc.
How to fix IP’s lack of mobility? GTP.
GTP addressed the mobility issue by having a single fixed point the IP Address is assigned to (in GSM/GPRS/UMTS this is the Gateway GPRS Support Node, in LTE this is the P-GW and in 5G-SA this is the UPF), which encapsulates IP traffic to/from a mobile user into GTP packets.
You can think of GTP like GRE or any of the other common encapsulation protocols, wrapping up the IP packets into a GTP packet which can be rerouted to different Base Stations as the users move from being served by one Base Station to another.
This easy redirecting / rerouting of user traffic is why GTP is used for NR (5G), LTE (4G), UMTS (3G) & GPRS (2.5G) architectures.
GTP Packets
When looking at a GTP packet of user data you’d be forgiven for thinking nothing much goes on.
Example GTP packet containing a DNS query
Like in most tunneling / encapsulation protocols we’ve got the original network / protocol stack of IPv4 and UDP, and a payload of a GTP packet.
The packet itself is pretty bare bones: there are flags denoting a few basics like version number, the message type (T-PDU), the length of the GTP packet and its payload (used for delineating the end of the payload), a sequence number and a Tunnel Endpoint Identifier (TEID).
In the payload, we can see the network / protocol stack and application layer of the contents of the GTP packet.
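To show how little there is to the header, here’s a minimal sketch of parsing a GTPv1 user plane header with Python’s `struct` module. The function name is made up, it assumes the standard 8 byte mandatory header with the optional 4 bytes (sequence number, N-PDU number, next extension header type) present when any of the E, S or PN flags are set, and it doesn’t walk extension headers.

```python
import struct


def parse_gtpu_header(packet: bytes) -> dict:
    """Parse a GTPv1-U header; extension headers are not handled."""
    flags, msg_type, length, teid = struct.unpack("!BBHI", packet[:8])
    header = {
        "version": flags >> 5,
        "protocol_type": (flags >> 4) & 1,
        "message_type": msg_type,  # 255 (0xFF) carries user data
        "length": length,          # bytes following the first 8
        "teid": teid,
        "sequence": None,
    }
    payload_start = 8
    if flags & 0x07:  # E, S or PN set: 4 optional bytes follow
        seq, _npdu, _next_ext = struct.unpack("!HBB", packet[8:12])
        if flags & 0x02:  # S flag: sequence number is meaningful
            header["sequence"] = seq
        payload_start = 12
    header["payload"] = packet[payload_start:8 + length]
    return header


# Hand-built example: version 1, S flag set, TEID 0x1234, sequence 7
inner = b"inner-ip-packet"
example = struct.pack("!BBHI", 0x32, 0xFF, 4 + len(inner), 0x1234)
example += struct.pack("!HBB", 7, 0, 0) + inner
parsed = parse_gtpu_header(example)
```

Everything after the header is the untouched inner IP packet, which is why the encapsulated traffic in the capture reads like any other IP traffic.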
From a mobility standpoint, the beauty of GTP is that it takes IP packets and puts them into a media stream of sorts, with out of band signalling. This means we can change the parameters of our GTP stream easily without touching the encapsulated IP packet.
When a UE moves from one base station to another, all that has to happen is the destination the GTP packets are sent to is changed from the old base station to the new base station. This is signalled using GTP-C in GPRS/UMTS, GTPv2-C in LTE and HTTP in 5G-SA.
Traffic to and from the UE would look the same as the screenshot above, the only difference would be the first IPv4 address would be different, but the IPv4 address in the GTP tunnel would be the same.
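The handover behaviour can be sketched as a lookup table on the fixed anchor point. This is a deliberately simplified model with made-up names and addresses: it treats the anchor as tunnelling straight to the base station (skipping any intermediate gateways), and assumes the new base station hands out its own TEID.

```python
from dataclasses import dataclass


@dataclass
class Tunnel:
    base_station_ip: str  # outer IP destination, changes on handover
    teid: int             # Tunnel Endpoint Identifier at that base station


class MobilityAnchor:
    """Stand-in for the GGSN / P-GW / UPF that owns the UE's IP address."""

    def __init__(self):
        self.tunnels = {}  # UE IP address -> Tunnel

    def attach(self, ue_ip, base_station_ip, teid):
        self.tunnels[ue_ip] = Tunnel(base_station_ip, teid)

    def handover(self, ue_ip, new_base_station_ip, new_teid):
        # Only the tunnel endpoint changes; the UE keeps its IP address
        self.tunnels[ue_ip] = Tunnel(new_base_station_ip, new_teid)

    def encapsulate(self, ue_ip, inner_packet):
        tunnel = self.tunnels[ue_ip]
        return tunnel.base_station_ip, tunnel.teid, inner_packet


anchor = MobilityAnchor()
anchor.attach("10.0.0.5", "192.0.2.10", teid=100)
before = anchor.encapsulate("10.0.0.5", b"dns-query")
anchor.handover("10.0.0.5", "192.0.2.20", new_teid=200)
after = anchor.encapsulate("10.0.0.5", b"dns-query")
```

Comparing `before` and `after`, the outer destination and TEID differ while the inner packet is byte for byte the same.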
NBNco’s FTTC technology is accounting for a larger and larger share of the access network mix as the rollout nears completion, but let’s take a look at the hardware doing the heavy lifting.
I won’t go into the fiber network build NBNco are using (Squids, etc) in this post, we’ll just focus on the DSLAM that lives in the pit outside your house, or in NBN parlance – DPU or Distribution Point Unit.
In short, this is a 4 port DSLAM, fed by a fiber service and reverse powered.
The unit itself is waterproof, allowing it to live in the pit outside a customer premises. For FTTC deployments it’s common for every second P3 pit to contain a DPU (each pit typically feeds two premises).
There’s 4 copper tails for connecting in each of the 4 copper pairs to feed 4 premises. The copper run is typically less than 100m and is pretty easy to work out – Pace the distance from your first telephone outlet (TO) to your nearest pit, and there’s a 50% chance that’s the length of your cable run. Because of the short run of cable there’s a lot less to go wrong in the CAN, the only joint on the pair being the one on the DPU itself and anything inside the demarcation point.
The DPU is powered by the customer’s modem via a reverse power feed. This means NBNco don’t have to worry about powering the unit, something that on FTTN cabinets has been a maintenance headache due to battery backup maintenance.
The lead in cable to the customer premises is joined to the DPU via a “Snot Box”.
Snot box for joining lead in to DPU
DPU in the pit
Unfortunately due to the enclosures being water tight and sealed, they don’t have the best thermal management. It’s not uncommon for them to reach 50+ degrees C in the field, which leads to a high failure rate, especially during summer.
NetComm Wireless / Casa NDD-4100
In 2016 NetComm Wireless (Now owned by Casa Systems) signed an agreement with NBNco to provide Fiber to the Distribution Point (FTTdp) Distribution Point Unit (DPU) equipment to NBNco for the launch in 2018, using their NetComm NDD-4100 units.
The unit has 4 ports for customer connections over a VDSL2 G.993.2 interface, with reverse power feed, meaning the unit is fed by the CPE.
For backhaul the unit has a GPON G.984 interface.
The device may not be powered at all times so a management proxy caches commands that are fed to the system when it comes back online.
Promo video
Nokia lightspan sx-f
In June 2018 NBNco started trialing Nokia DPUs, and many later installations since then are using the Nokia DPU.
I’ve heard a bunch of complaints about the NetComm having issues and dying, and for a sealed unit there’s very little debugging that can be done to it.
In the Melbourne office of NBNco there’s a Nokia DPU that’s been running in a fish tank for a number of years.
I started working on a private LTE project a while ago; RAN hardware (eNodeBs) were on the way, down to a shortlist of a few EPC platforms, but I still needed USIMs before anyone was connecting to the network.
So why are custom USIMs a requirement? Can’t you just use any old USIM/SIMs?
For roaming to work between carriers they’ve got to have their HSS / DRA connecting to the DRA or HSS of other carriers, to allow roaming subscribers to access the network, otherwise they too would fall foul of the mutual network authentication and the USIM wouldn’t connect to the network.
The first USIMs I purchased online through a popular online marketplace with a focus on connecting you to Chinese manufacturers. They listed a package of USIMS, a USB reader/writer that supported all the standard USIM form factors and the software to program it, which I purchased.
The USIMs worked fairly well – They are programmable via a card reader and software that, although poorly translated/documented, worked fairly well.
USIM Programming Interface
K and OP/OPc values could be written to the card but not read, while the other values could be read and written from the software. The software also has the ability to sequentially program the USIMs to make bulk operations easier. The pricing worked out at about $8 USD per USIM, which, although expensive, is pretty reasonable given the quantity and programmable element.
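If you’re preparing values for a batch like this yourself, a sketch of generating them might look like the below. This isn’t how the vendor software works; the function name is invented, the starting IMSI uses the 001/01 test PLMN purely as an example, and the same values would need to be loaded into your HSS as well as written to the cards.

```python
import secrets


def build_batch(first_imsi: str, count: int):
    """Generate sequential IMSIs, each with its own random 128 bit K."""
    start = int(first_imsi)
    width = len(first_imsi)  # keep leading zeros, IMSIs are fixed width
    return [
        {"imsi": str(start + i).zfill(width), "k": secrets.token_hex(16)}
        for i in range(count)
    ]


batch = build_batch("001010000000001", 3)
```

Each entry gets a fresh K from the operating system’s random source, so no two cards in the batch share a key.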
Every now and then the Crypto values for some reason or another wouldn’t get updated, which is exactly as irritating as it sounds.
Pretty quickly into the build I learned the USIMs didn’t include an ISIM service on the card, ISIM being the service that runs on the UICC responsible for IMS / VoLTE authentication.
Again I went looking and reached out to a few manufacturers of USIMs.
The big vendors, Gemalto, Kona, etc, weren’t interested in providing USIMs in quantities less than 100,000 and their USIMs came from the factory pre-programmed, meaning the values could only be changed through remote SIM provisioning, a form of black magic.
In the end I reached out to an OEM manufacturer from China who provided programmable USIM / ISIMs for less than I was paying on the online marketplace and at any quantity I wanted with custom printing options, allocated ICCIDs, etc.
The non-programmable USIMs worked out less than $0.40 USD each in larger quantities, and programmable USIM/ISIMs for about $5 USD.
The software was almost identical except for the additional tab for ISIM operations.
USIM / ISIM programming
ISIM parameters
Smart Card Readers
In theory this software and these USIMs could be programmed by any smart card reader.
In practice, the fact that the ISO standard smart card is the same size as a credit card, means most smart card readers won’t fit the bill.
I tried a few smart card readers, from the one built into my Thinkpad to a Bluedrive II from one of the USIM vendors. In the end I went with the MCR3516 Smart Card Reader, which reads 4FF USIMs (standard ISO size smart card, full size SIM, Micro SIM and Nano SIM form factors), which saved on so much mucking about with form factor adapters etc.
4FF Smart Card Reader for programming SIM/USIM/ISIM
Future Projects
I’ve got some very cool “Multi Operator Neutral Host” (MoNEH) USIMs from the guys at Telet Research I’m looking forward to playing with.
eSIMs are on my to-do list too, and the supporting infrastructure, as well as Over the Air updating of USIMs.
The numbering system lists all the phone numbers in Australia and the carrier they’re assigned to (Donor Carrier when ported).
This means each carrier downloads this data from what is now ACMA (but was at the time ACA) and, for each outgoing call, looks the number up in that data set and routes to the correct carrier.
This routing is used when a number has not been ported – The CSP it’s allocated to by ACMA is the CSP to route calls to.
During Porting – Donor & Losing CSP the Same
At the time the number is ported the Losing CSP provides Donor Transit Routing – In practice this is nothing more than a fancy redirection/ forward to the new carrier.
Once the port is completed (and the Emergency Porting Return period has expired) the Losing CSP (who in this case is also the donor CSP) updates their PLNR file, to include the number that’s just been ported out and the new carrier’s code.
During Porting – Donor & Losing CSP Different (3rd Party Porting)
At the time the number is ported the Losing CSP provides Donor Transit Routing – In practice this is nothing more than a fancy redirection/ forward to the new carrier.
Once the port is completed (and the Emergency Porting Return period has expired) the Donor CSP updates their PLNR file, to update the record for the ported number, which previously showed the code of the Losing CSP and now must show the new CSP code.
After Porting – PLNR Update
After the PLNR has been updated by the Donor CSPs, other CSPs that participate in porting must update their routing records to ensure they route directly to the Gaining CSP, rather than routing to the Donor CSP and relying on Donor Routing.
Donor Routing is only required to be in place for a short time, meaning carriers that don’t update their routes will find the ported destination unreachable once donor transit routing has been stopped.
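Put together, the routing decision a carrier makes can be sketched as a two step lookup: check the ported numbers first, then fall back to the allocated number ranges. The carrier codes, prefixes and function name below are all invented for illustration, and real PLNR files and numbering data have their own formats that aren’t shown here.

```python
def route(number: str, ported: dict, allocations: dict) -> str:
    """Pick the carrier to route a call to."""
    # Ported numbers override the original allocation
    if number in ported:
        return ported[number]
    # Otherwise use the longest matching allocated prefix (the donor carrier)
    for length in range(len(number), 0, -1):
        carrier = allocations.get(number[:length])
        if carrier is not None:
            return carrier
    raise LookupError(f"no route for {number}")


# Hypothetical data: number ranges from the numbering system...
allocations = {"03555": "CARRIER_A", "0355512": "CARRIER_B"}
# ...and ported numbers learned from other carriers' PLNR files
ported = {"0355500001": "CARRIER_C"}

unported = route("0355500002", ported, allocations)
ported_out = route("0355500001", ported, allocations)
```

A carrier that never loads the ported entry would keep sending `0355500001` to the donor, which only works for as long as donor transit routing is in place.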