Network Slicing is a new 5G technology. Or is it?
Before 3GPP Release 16, the capability to “slice” a network already existed; in fact, the functionality was introduced way back at the advent of GPRS. So what is so new about 5G’s Network Slicing?
Network Slice: A logical network that provides specific network capabilities and network characteristics
3GPP TS 23.501 (ETSI TS 123 501) / 3 Definitions and Abbreviations
Let’s look at the old and the new ways of slicing up networks, starting with the pre-Release 16 options on LTE, UMTS and GSM.
Old Ways: APN Separation
The APN or “Access Point Name” tells the SGSN / MME which gateway that subscriber’s traffic should be terminated on when setting up the session.
APN separation is used heavily by MVNOs where the MVNO operates their own P-GW / GGSN. This allows the MVNO to handle their own rating / billing / subscriber management when it comes to data. A network operator just needs to set up their SGSN / MME to point all requests to set up a bearer on the MVNO’s APN to the MVNO’s gateways, and presto, it’s no longer their problem.
Later as customers wanted MPLS solutions extended over mobile (Typically LTE), MNOs were able to offer “private APNs”. An enterprise could be allocated an APN by the MNO that would ensure traffic on that APN would be routed into the enterprise’s MPLS VRF. The MNO handles the P-GW / GGSN side of things, adding the APN configuration onto it and ensuring the traffic on that APN is routed into the enterprise’s VRF.
Different QCI values can be assigned to each APN, to allow some to have higher priority than others, but by slicing at an APN level you lock all traffic to those QoS characteristics (typically mobile devices only support one primary APN used for routing all traffic), and you don’t have the flexibility to steer individual traffic flows from a subscriber to different networks.
It’s not really practical for everyone to have their own APN, due in part to namespace limitations and the architecture of how this is usually done, and the simple fact that everyone having to populate an APN unique to them would be a real headache.
5G replaces APNs with “DNNs” – Data Network Names, but the functionality is otherwise the same.
In Summary: APN separation slices all traffic from a subscriber using a special APN and provides a bearer with QoS/QCI values set for that APN, but does not allow granular slicing of individual traffic flows. It’s an all-or-nothing approach and all traffic in the APN is treated equally.
Old Ways: Dedicated Bearers
Dedicated bearers allow traffic matching a set rule to be given a different QCI value to the default bearer. This allows certain traffic to/from a UE to use GBR or Non-GBR bearers.
The rule itself is known as a “TFT” (Traffic Flow Template) and is made up of a 5 value Tuple consisting of IP Source, IP Destination, Source Port, Destination Port & Protocol Number. Both the UE and core network need to be aware of these TFTs, so the traffic matching the TFT can get the QCI allocated to it.
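To make the 5 value tuple a bit more concrete, here is a minimal Python sketch of matching a packet against a list of filters and picking a QCI. The PacketFilter class, the select_qci helper, the QCI values and the default of 9 are all made up for illustration; this is not how any particular P-GW or UE implements TFTs.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class PacketFilter:
    """One hypothetical TFT packet filter built from the 5 value tuple."""
    src_net: str    # source network in CIDR form
    dst_net: str    # destination network in CIDR form
    src_port: int
    dst_port: int
    protocol: int   # IANA protocol number, e.g. 6 = TCP, 17 = UDP
    qci: int        # QCI of the dedicated bearer this filter maps to (illustrative)

    def matches(self, src_ip, dst_ip, src_port, dst_port, protocol):
        """True when every element of the packet's 5-tuple matches this filter."""
        return (
            ip_address(src_ip) in ip_network(self.src_net)
            and ip_address(dst_ip) in ip_network(self.dst_net)
            and src_port == self.src_port
            and dst_port == self.dst_port
            and protocol == self.protocol
        )


def select_qci(filters, packet, default_qci=9):
    """Return the QCI of the first matching filter, else the default bearer's QCI."""
    for packet_filter in filters:
        if packet_filter.matches(*packet):
            return packet_filter.qci
    return default_qci


filters = [PacketFilter("10.98.254.0/24", "10.98.0.0/24", 5061, 5060, 6, qci=5)]
print(select_qci(filters, ("10.98.254.7", "10.98.0.20", 5061, 5060, 6)))  # matches the filter
print(select_qci(filters, ("10.98.254.7", "8.8.8.8", 40000, 53, 17)))     # falls back to default
```

Anything that doesn’t match a filter just rides the default bearer, which is the behaviour described above.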
This can be done a variety of different ways. In LTE this ranges from rules defined in a PCRF, to an external interface like that of an IMS network using the Rx interface to request dedicated bearers matching the specified TFTs via the PCRF.
Unlike with 5G network slicing, dedicated bearers still traverse the same network elements: the same MME, S-GW & P-GW are used for this traffic. This means you can’t “locally break out” certain traffic.
In Summary: Dedicated bearers allow you to treat certain traffic to/from subscribers with different precedence & priority, but the traffic still takes the same path to its ultimate destination.
This means one eNodeB can broadcast more than one PLMN and serve more than one mobile network.
This slicing is very coarse – it allows two operators to share the same eNodeBs, but going beyond a handful of PLMNs on one eNB isn’t practical, and the PLMN space is quite limited (1000 PLMNs per country code max).
In Summary: MOCN allows slicing of the RAN on a very coarse level, to slice traffic from different operators/PLMNs sharing the same RAN.
Its use is focused on sharing RAN rather than slicing traffic for users.
If you’re building IMS networks, the AMR codec is a must. FreeSWITCH does not ship with AMR due to licensing constraints, but it has all the hard work done; you just need to add the headers for AMR support and compile.
LibOpenCore has support for AMR, so we build that, then make a few minor tweaks to copy the C++ header files over to the FreeSWITCH source directory, and enable support in modules.conf.
Then when building FreeSWITCH you’ve got the AMR Codec to enable you to manage IMS / VoLTE media streams from mobile devices.
Instead of copying and pasting a list of commands to do this, I’ve published a Dockerfile here you can use to build a Docker image. If you’re working on VMs or bare metal, you can run the commands from the Dockerfile on a straight Debian Buster machine instead.
When it comes to setting up dedicated bearers, the Flow-Description AVP is perhaps the most important.
The specially encoded string (IPFilterRule) in the Flow-Description AVP is what our P-GW (Ok, our PCEF) uses to create Traffic Flow Templates to steer certain types of traffic down Dedicated Bearers.
So let’s take a look at how we can lovingly craft an artisanal Flow-Description.
The contents of the AVP are technically not a string, but an IPFilterRule.
IPFilterRules are actually defined in the Diameter Base Protocol (IETF RFC 6733), where we can learn the basics of encoding them.
They are in turn based loosely on the ipfw utility in BSD.
They take the format:
action dir proto from src to dst
The action is fairly simple: for all our Dedicated Bearer needs, and the Flow-Description AVP, the action is going to be permit. We’re not blocking here.
The direction (dir) in our case is either in or out, from the perspective of the UE.
Next up is the protocol number (proto), as defined by IANA, but chances are you’ll be using 17 (UDP) or 6 (TCP) in most scenarios.
The from value is followed by an IP address with an optional subnet mask in CIDR format, for example from 10.45.0.0/16 would match everything in the 10.45.0.0/16 network. Following the address you can also specify the port you want the rule to apply to, or a range of ports. For example, to match a single port you could use 10.45.0.0/16 1234 to match anything on port 1234, but we can also specify ranges of ports like 10.45.0.0/16 0-4069, or even mix and match ranges and single ports in a comma separated list, like 10.45.0.0/16 5060,1000-2000
Protip: using any is the same as 0.0.0.0/0
Like the from, the to is encoded in the same way, with either a single IP, or a subnet, and optional ports specified.
And that’s it!
Keep in mind that Flow-Descriptions are typically sent in pairs as a minimum, as you want to match the traffic into and out of the network (not just one way). Often quite a few are sent, in order to match all the possible traffic, which may be spread across multiple different subnets, etc.
There is an optional Options parameter that allows you to set things like only applying the rule to established TCP sessions, fragments, etc., although I’ve not seen this implemented in the wild.
Example IP filter Rules
permit in 6 from 10.98.254.0/24 5061 to 10.98.0.0/24 5060
permit out 6 from 10.98.254.0/24 5060 to 10.98.0.0/24 5061
permit in 6 from any 80 to 172.16.1.1 80
permit out 6 from 172.16.1.1 80 to any 80
permit in 17 from 10.98.254.0/24 50000-60100 to 10.98.0.0/24 50000-60100
permit out 17 from 10.98.254.0/24 50000-60100 to 10.98.0.0/24 50000-60100
permit in 17 from 10.98.254.0/24 5061,5064 to 10.98.0.0/24 5061,5064
permit out 17 from 10.98.254.0/24 5061,5064 to 10.98.0.0/24 5061,5064
permit in 17 from 172.16.0.0/16 50000-60100,5061,5064 to 172.16.0.0/16 50000-60100,5061,5064
permit out 17 from 172.16.0.0/16 50000-60100,5061,5064 to 172.16.0.0/16 50000-60100,5061,5064
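If you’re generating these by hand it’s easy to slip up, so here is a rough Python sketch that assembles rules in the action dir proto from src to dst layout described above. The function names build_flow_description and build_pair are my own invention for this example, the action is hard coded to permit, and port lists are passed through as a pre-formatted string (comma separated, no spaces) rather than validated.

```python
def build_flow_description(direction, proto, src, dst, src_ports=None, dst_ports=None):
    """Assemble one IPFilterRule string: permit <dir> <proto> from <src> [ports] to <dst> [ports]."""
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    parts = ["permit", direction, str(proto), "from", src]
    if src_ports:
        parts.append(src_ports)   # e.g. "5060" or "50000-60100,5061"
    parts += ["to", dst]
    if dst_ports:
        parts.append(dst_ports)
    return " ".join(parts)


def build_pair(proto, src, dst, src_ports=None, dst_ports=None):
    """Return the 'in' and 'out' rules for the same addresses and ports."""
    return [
        build_flow_description(direction, proto, src, dst, src_ports, dst_ports)
        for direction in ("in", "out")
    ]


for rule in build_pair(17, "10.98.254.0/24", "10.98.0.0/24", "50000-60100", "50000-60100"):
    print(rule)
```

This only builds the string; whatever Diameter stack you use still has to wrap it in the Flow-Description AVP.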
Since the beginning of time, SIP has used the 2xx responses to confirm all went OK.
If you thought sending an SMS in a VoLTE/IMS network would see a 2xx OK response and then that’s the end of it, you’d be wrong.
So let’s take a look into sending SMS over VoLTE/IMS networks!
Our story starts with the Subscriber sending an SMS, which generates a SIP MESSAGE.
The Content-Type of this SIP MESSAGE is set to application/vnd.3gpp.sms rather than text, and that’s because SMS over IMS uses the Short Message Transfer Protocol (SM-TP) inherited from GSM.
The Short Message Transfer Protocol (SM-TP) (not related to the Simple Mail Transfer Protocol used for email) is made up of Transfer Protocol Data Units (TPDU) that contain our message information. Even though we have the destination in our SIP headers, it’s defined again in the SM-TP body.
At first this may seem like a bit of duplication, but this allows older SMS Switching Centers (SMSc) to add support for IMS networks without any major changes; only what the SM-TP payload is wrapped up in changes.
SIP MESSAGE Request Body encoded in SM-TP
So back to our SIP MESSAGE request: typed out by the Subscriber, the UE sends it as a SIP MESSAGE onto our IMS Network.
The IMS network follows its IFCs and routing rules, and the request makes it to the termination point for SMS traffic – the SMSc.
The SMSc sends back either a 200 OK or a 202 Accepted, and you’d think that’s the end of it, but no.
Our Subscriber still sees “Sending” on the screen, and the SMS is not shown as sent yet.
Instead, when the SMS has been delivered or buffered, relayed, etc, the SMSc generates a new SIP request, (as in new Call-ID / Dialog) with the request type MESSAGE, addressed to the Subscriber.
The payload of this request is another application/vnd.3gpp.sms encoded request body, again, containing SM-TP encoded data.
When the UE receives this, it will then consider the message delivered.
SM-TP encoded Delivery Report
Of course things change slightly when delivery reports are enabled, but that’s another story!
PDU Session Type: The type of PDU Session which can be IPv4, IPv6, IPv4v6, Ethernet or Unstructured
ETSI TS 123 501 – System Architecture for the 5G System
No longer are we limited to just IP transport, meaning at long last I can transport my Token Ring traffic over 5G. Or in reality, customers can extend Layer 2 networks (Ethernet) over 3GPP technologies without resorting to overlay networking and, much more importantly, fixed line networks, which typically run at Layer 2, can leverage the 5G core architecture.
How does this work?
With TFTs and the N6 interfaces relying on the 5 value tuple with IPs/Ports/Protocol #s to make decisions, transporting Ethernet or Non-IP Data over 5G networks presents a problem.
But with fixed (aka Wireline) networks being able to leverage the 5G core (“Wireline Convergence”), we need a mechanism to handle Ethernet.
For starters, in the PDU Session Establishment Request the UE indicates which PDU session type it wants. Historically this was IPv4/6, but now, if supported by the UE, Ethernet or Unstructured are available as PDU session types.
We’ll focus on Ethernet as that’s the most defined so far.
Once an Ethernet PDU session has been set up, the N6 interface looks a bit different. For starters, how does it know where, or how, to route unstructured traffic?
As far as 3GPP is concerned, that’s your problem:
Regardless of addressing scheme used from the UPF to the DN, the UPF shall be able to map the address used between the UPF and the DN to the PDU Session.
5.6.10.3 Support of Unstructured PDU Session type
In short, the UPF will need to be able to make the routing decisions to support this, and that’s up to the implementer of the UPF.
In the Ethernet scenario, the UPF would need to learn the MAC addresses behind the UE, handle ARP and use this to determine which traffic to send to which UE, encapsulate it into trusty old GTP, fill in the correct TEID and then send it to the gNodeB serving that user (if they are indeed on a RAN not a fixed network).
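As a thought experiment, the MAC learning part could look something like the toy Python table below. The class name EthernetUpfTable and its methods are hypothetical, and this ignores ARP handling, ageing, VLANs and the actual GTP encapsulation; it only shows the lookup from a destination MAC to the tunnel it should go down.

```python
class EthernetUpfTable:
    """Toy MAC learning table: MAC address -> (gNodeB address, downlink TEID)."""

    def __init__(self):
        self._by_mac = {}

    def learn_uplink(self, src_mac, gnb_ip, dl_teid):
        """Record which tunnel a source MAC was last seen behind on uplink traffic."""
        self._by_mac[src_mac.lower()] = (gnb_ip, dl_teid)

    def lookup_downlink(self, dst_mac):
        """Return (gnb_ip, teid) for a known MAC, or None if we have not learned it yet."""
        return self._by_mac.get(dst_mac.lower())


table = EthernetUpfTable()
# An uplink frame arrives from a device sitting behind a UE
table.learn_uplink("AA:BB:CC:00:00:01", "10.0.1.10", 0x1001)
# A downlink frame for that MAC can now be mapped to the right tunnel
print(table.lookup_downlink("aa:bb:cc:00:00:01"))
# Unknown destinations return None; whether to flood or drop is up to the implementer
print(table.lookup_downlink("aa:bb:cc:00:00:02"))
```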
So where does this leave QoS? Without IPs to match with TFTs and Packet Filter Sets, how is this handled? In short, it’s not – only the default QoS rule exists for a PDU Session of type Unstructured. The QoS control for Unstructured PDUs is performed at the PDU Session level, meaning you can set the QFI when the PDU session is set up, but not based on traffic through that bearer.
Does this mean 5G RAN can transport Ethernet?
Well, it remains to be seen.
The specifications don’t cover if this is just for wireline scenarios or if it can be used on RAN.
The 5G PDU Creation signaling has a field to indicate if the traffic is Ethernet, but to work over a RAN we would need UE support as well as support on the Core.
And for E-UTRAN?
For the foreseeable future we’re going to be relying on LTE/E-UTRAN as well as 5G. So if you’re mobile with a non-IP PDU, and you enter an area only served by LTE, what happens?
PDU Session types “Ethernet” and “Unstructured” are transferred to EPC as “non-IP” PDN type (when supported by UE and network). … It is assumed that if a UE supports Ethernet PDU Session type and/or Unstructured PDU Session type in 5GS it will also support non-IP PDN type in EPS.
5.17.2 Interworking with EPC
If you were not aware of support in the EPC for Non-IP PDNs, I don’t blame you. So far, support for the Non-IP PDN type has come via the CIoT EPS optimizations, which were initially for NB-IoT, supporting Non-IP Data Delivery (NIDD) for lightweight LwM2M traffic.
So why is this? Well, it may have to do with WO 2017/032399 A1, a patent held by Ericsson regarding “COMMUNICATION OF NON-IP DATA OVER PACKET DATA NETWORKS”, which may be restricting wide scale deployment of this.
Open5GS has introduced network slicing, which led to a change in the database schema used.
Alas, many users had subscribers provisioned in the old DB schema and no way to migrate the SDM data between the old and new schema.
If you’ve created subscribers on the old schema, and now after the updates your Subscriber Authentication is failing, check out this tool I put together, to migrate your data over.
I’d been trying for some time to get Kamailio acting as a Diameter Routing Agent with mixed success, and eventually got it working, after a few changes to the codebase of the ims_diameter_server module.
It is rather unstable, in that if it fails to dispatch to a Diameter peer, the whole thing comes crumbling down, but incoming Diameter traffic is proxied off to another Diameter peer, and Kamailio even adds an extra AVP.
Having used Kamailio for so long I was really hoping I could work with Kamailio as a DRA as easily as I do for SIP traffic, but it seems the Diameter module still needs a lot more love before it’ll be stable enough and simple enough for everyone to use.
I created a branch containing the fixes I made to make it work, and with an example config for use, but use with caution. It’s a long way from being production-ready, but hopefully in time will evolve.
One feature I’m pretty excited to share is the addition of a single config file for defining how PyHSS functions.
In the past you’d set variables in the code or comment out sections to change behaviour, which, let’s face it – isn’t great.
Instead the config.yaml file defines the PLMN, transport type (TCP or SCTP), the origin host and realm.
We can also set the logging parameters, SNMP info and the database backend to be used:
# HSS Parameters
hss:
  transport: "SCTP"
  #IP Addresses to bind on (List) - For TCP only the first IP is used, for SCTP all used for Transport (Multihomed).
  bind_ip: ["10.0.1.252"]
  #Port to listen on (Same for TCP & SCTP)
  bind_port: 3868
  #Value to populate as the OriginHost in Diameter responses
  OriginHost: "hss.localdomain"
  #Value to populate as the OriginRealm in Diameter responses
  OriginRealm: "localdomain"
  #Value to populate as the Product name in Diameter responses
  ProductName: "pyHSS"
  #Your Home Mobile Country Code (Used for PLMN calculation)
  MCC: "999"
  #Your Home Mobile Network Code (Used for PLMN calculation)
  MNC: "99"
  #Enable GMLC / SLh Interface
  SLh_enabled: True

logging:
  level: DEBUG
  logfiles:
    hss_logging_file: log/hss.log
    diameter_logging_file: log/diameter.log
    database_logging_file: log/db.log
  log_to_terminal: true

database:
  mongodb:
    mongodb_server: 127.0.0.1
    mongodb_username: root
    mongodb_password: password
    mongodb_port: 27017

# Stats Parameters
redis:
  enabled: True
  clear_stats_on_boot: False
  host: localhost
  port: 6379

snmp:
  port: 1161
  listen_address: 127.0.0.1
The MSISDN AVP (AVP code 701, vendor ID 10415) is used to advertise the subscriber’s MSISDN in signaling.
I formatted the data as an Octet String, with the MSISDN from the database and moved on my merry way.
Not so fast…
The MSISDN AVP is of type OctetString.
This AVP contains an MSISDN, in international number format as described in ITU-T Rec E.164 [8], encoded as a TBCD-string, i.e. digits from 0 through 9 are encoded 0000 to 1001;
1111 is used as a filler when there is an odd number of digits; bits 8 to 5 of octet n encode digit 2n; bits 4 to 1 of octet n encode digit 2(n-1)+1.
ETSI TS 129 329 / 6.3.2 MSISDN AVP
Come again?
In practice this means if you have an odd-length MSISDN value, we need to add some padding to round it out to an even length.
Because the two digits in each octet are swapped, this padding ends up between the last and second last digit of the MSISDN in the encoded output (if we added it at the start we’d break the Country Code, etc.), and it is needed because MSISDNs are variable length subscriber numbers.
1111 in binary is best known as the hex digit F.
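Here is how I’d sketch that encoding in Python, following the bit layout quoted from the spec above. The function name encode_tbcd and the sample numbers are made up for the example, and it only handles plain digits.

```python
def encode_tbcd(msisdn: str) -> bytes:
    """Encode a digit string as TBCD: two digits per octet, nibbles swapped, F as filler."""
    if not msisdn or any(c not in "0123456789" for c in msisdn):
        raise ValueError("MSISDN must contain digits 0-9 only")
    digits = msisdn
    if len(digits) % 2:
        digits += "F"                      # 1111 filler for an odd number of digits
    encoded = bytearray()
    for i in range(0, len(digits), 2):
        low = int(digits[i], 16)           # first digit of the pair goes in bits 4 to 1
        high = int(digits[i + 1], 16)      # second digit of the pair goes in bits 8 to 5
        encoded.append((high << 4) | low)
    return bytes(encoded)


print(encode_tbcd("12345").hex())          # 2143f5
print(encode_tbcd("61412345678").hex())    # 1614325476f8
```

Reading the hex back, you can see the F sitting between the second last and last digit, which is exactly the padding behaviour described above.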
A separate server (hss_sctp.py) is run to handle SCTP connections, and if you’re looking for Multihoming, we got you dawg – Just edit the config file and set the bind_ip list to include each of the IPs you want to listen on.
Ok, admittedly I haven’t actually seen “When a Stranger Calls”, or the less popular sequel “When a stranger Redials” (Ok may have made the last one up).
But the premise (as I read Wikipedia) is that the babysitter gets the call on the landline, and the police trace the call as originating from the landline.
But you can’t phone yourself, that’s not how local loops work – when the murderer goes off hook it loops the circuit, which busies it. You could apply ring current to the line externally I guess, but unless our murderer has a ring generator or has set up a PBX inside the house, the call probably isn’t coming from inside the house.
On Topic – The GMLC
The GMLC (Gateway Mobile Location Centre) is a central server that’s used to locate subscribers within the network on different RATs (GSM/UMTS/LTE/NR).
The GMLC typically has interfaces to each of the radio access technologies: there is a link between the GMLC and the CS network elements (used for GSM/UMTS) such as the HLR, MSC & SGSN via Lh & Lg interfaces, and a link to the PS network elements (LTE/NR) via Diameter based SLh and SLg interfaces with the MME and HSS.
The GMLC’s tentacles run out to each of these network elements so it can query them as to a subscriber’s location.
LTE Call Flow
To find a subscriber’s location in LTE, Diameter based signaling is used to query the MME, which in turn queries the eNodeB to find the location.
But which MME to query?
The SLh Diameter interface is used to query the HSS to find out which MME is serving a particular Subscriber (identified by IMSI or MSISDN).
The LCS-Routing-Info-Request is sent by the GMLC to the HSS with the subscriber identifier, and the LCS-Routing-Info-Response is returned by the HSS to the GMLC with the details of the MME serving the subscriber.
Now we’ve got the serving MME, we can use the SLg Diameter interface to query the MME for the location of that particular subscriber.
The MME can report locations to the GMLC periodically, or the GMLC can request the MME provide a location at that point. For the GMLC to request a subscriber’s current location, a Provide-Location-Request is sent by the GMLC to the MME with the subscriber’s IMSI, and the MME responds, after querying the eNodeB and optionally the UE, with the location info in the Provide-Location-Response.
(I’m in the process of adding support for these interfaces to PyHSS and, all going well, will release some software shortly to act as a GMLC so people can use this.)
Finding the actual Location
There are a few different ways the actual location of the UE is determined:
At the most basic level, Cell Global Identity (CGI) gives the identity of the eNodeB serving a user. If you’ve got a 3 sector site each sector typically has its own Cell Global Identity, so you can determine to a certain extent, with the known radiation pattern, bearing and location of the sector, in which direction a subscriber is. This happens on the network side and doesn’t require any input from the UE. But if we query the UE’s signal strength, this can then be combined with existing RF models and the signal strength reported by the UE to further pinpoint the user with a bit more accuracy. (Uplink and downlink cell coverage based positioning methods) Barometric pressure and humidity can also be reported by the base station as these factors will impact resulting signal strengths.
Timing Advance (TA) and Time of Arrival (TOA) both rely on timing signals to/from a UE to determine its distance from the eNodeB. If the UE is only served by a single cell this gives you a distance from the cell and potentially an angle inside which the subscriber is. This becomes far more useful with 3 or more eNodeBs in working range of the UE, where you can “triangulate” the UE’s location. This part happens on the network side with no interaction with the UE. If the UE supports it, E-UTRAN can use the Enhanced Observed Time Difference (E-OTD) positioning method, which performs the timing calculation in conjunction with the UE.
GPS Assisted (A-GPS) positioning gives good accuracy but requires the device to get its current location using GPS, which isn’t part of the baseband typically, so isn’t commonly implemented.
Uplink Time Difference of Arrival (UTDOA) can also be used, which is done by the network.
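To give a feel for how coarse the Timing Advance method is on its own, here is a back-of-the-envelope Python calculation. It assumes the LTE timing advance granularity of 16 Ts (with Ts = 1 / (15000 × 2048) seconds) and treats the value as a round trip, so the one-way distance is half the delay times the speed of light. The function name ta_to_distance_m is mine, and real positioning has to deal with multipath and measurement error that this ignores.

```python
SPEED_OF_LIGHT_M_S = 299_792_458
LTE_TS_SECONDS = 1 / (15_000 * 2048)      # basic LTE time unit Ts
TA_STEP_SECONDS = 16 * LTE_TS_SECONDS     # assumed granularity of one timing advance step


def ta_to_distance_m(ta_index: int) -> float:
    """Approximate one-way distance for a timing advance index (round trip halved)."""
    if ta_index < 0:
        raise ValueError("timing advance index cannot be negative")
    return ta_index * TA_STEP_SECONDS * SPEED_OF_LIGHT_M_S / 2


for ta in (1, 10, 100):
    print(f"TA {ta}: roughly {ta_to_distance_m(ta):.0f} m from the cell")
```

Each step works out to somewhere around 78 metres, so with a single cell you get a ring around the site rather than a point.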
So why do we need to get Subscriber Locations?
The first (and most noble) use case that springs to mind is finding the location of a subscriber making a call to emergency services. Often upon calling an emergency services number the GMLC is triggered to get the subscriber’s location in case the call is cut off, battery dies, etc.
But GMLCs can also be used for lots of other purposes, marketing purposes (track a user’s location and send targeted ads), surveillance (track movements of people) and network analytics (look at subscriber movement / behavior in a specific area for capacity planning).
Different countries have different laws regulating access to the subscriber location functions.
Hack to disable Location Reporting on Mobile Networks
If you’re wondering how you can disable this functionality, you can try the below hack to ensure that your phone does not report your location.
Press the power button on your phone
Turn it off
In reality, no magic super stealth SIM cards, special phones or fancy firmware will prevent the GMLC from finding your location. So far none of the “privacy” products I’ve looked at have actually done anything special at the Baseband level. Most are just snake oil.
For as long as your device is connected to the network, the passive ways of determining location, such as Uplink Time Difference of Arrival (UTDOA) and the CGI are going to report your location.
Every now and then when looking into a problem I have to really stop and think about how things work low down, things I haven’t thought about for a long time, and MTU is one of those things.
Faced with an LTE MTU issue recently, I thought I’d go back and brush up on my MTU knowhow and do some experimenting.
Note: This is an IPv4 discussion; in IPv6, routers do not fragment packets in transit.
The very, very basics
MTU is the Maximum Transmission Unit.
In practice this is the largest datagram the layer can handle, and more often than not, this is based on a physical layer constraint, in that different physical layers can only stuff so much into a frame.
“The Internet” from a consumer perspective typically has an MTU of 1500 bytes, or perhaps a bit under depending on their carrier, such as 1472 bytes. SANs in data centers typically use an MTU of around 9000 bytes. Out of the box, most devices will use an MTU of 1500 bytes if you don’t specify one.
As a general rule, service providers typically try to offer an MTU as close to 1500 as possible.
Messages that are longer than the Maximum Transmission Unit need to be broken up in a process known as “Fragmenting”. Fragmenting allows large frames to be split into smaller frames to make their way across hops with a lower MTU.
All about Fragmentation
So we can break up larger packets into smaller ones by Fragmenting them, so case closed on MTU right? Sadly not.
Fragmentation leads to reduced efficiency – fragmenting frames takes up precious CPU cycles on the router performing it, and each time a frame is broken up, additional overhead is added by the device breaking it up, and by the receiver to reassemble it.
Fragmentation can happen multiple times across a path (Multi-Stage Fragmentation). For example if a frame is sent with a length of 9000 bytes, and needs to traverse a hop with an MTU of 4000, it would need to be fragmented (broken up) into 3 frames (Frame 1 and Frame 2 would be ~4000 bytes long and Frame 3 would be ~1000 bytes long). If it then needs to traverse another hop with an MTU of 1500, then the 3 fragmented frames would each need to be checked again, with each of the ~4000 byte frames being split up into 3 more fragmented frames. Lost track of what just happened? Spare a thought for the routers having to do the fragmentation and the recipient having to reassemble their packets.
Fragmented frames are reassembled by the end recipient; other devices along the transmission path don’t reassemble packets.
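If you want to check the numbers in the multi-stage example, the Python sketch below works through it. It assumes a plain 20 byte IPv4 header with no options, and that every fragment except the last must carry a payload that is a multiple of 8 bytes; the function name fragment_sizes is just something I made up for this.

```python
IPV4_HEADER_BYTES = 20  # assumes no IP options


def fragment_sizes(packet_len, mtu):
    """Return the on-the-wire sizes of the fragments needed to fit packet_len into mtu."""
    if packet_len <= mtu:
        return [packet_len]               # fits as is, no fragmentation needed
    # Payload of every fragment except the last must be a multiple of 8 bytes
    max_payload = (mtu - IPV4_HEADER_BYTES) // 8 * 8
    payload = packet_len - IPV4_HEADER_BYTES
    sizes = []
    while payload > max_payload:
        sizes.append(max_payload + IPV4_HEADER_BYTES)
        payload -= max_payload
    sizes.append(payload + IPV4_HEADER_BYTES)
    return sizes


first_hop = fragment_sizes(9000, 4000)
print(first_hop)                          # [3996, 3996, 1048]

second_hop = [size for frag in first_hop for size in fragment_sizes(frag, 1500)]
print(second_hop)                         # [1500, 1500, 1036, 1500, 1500, 1036, 1048]
print(sum(second_hop) - 9000, "extra bytes of IP header on the wire")
```

One 9000 byte packet ends up as 7 packets by the time it gets past the 1500 byte hop, each carrying its own IP header.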
In the end it boils down to this trade off: The larger the packet can be, the more user data we can stuff into each one as a percentage of the overall data. We want the percentage of user data for each packet to be as high as can be. This means we want to use the largest MTU possible, without having to fragment packets.
Overhead eats into our MTU
A 1500 byte MTU that has to be encapsulated in IPsec, GTP or PPP, is no longer a 1500 byte MTU as far as the customer is concerned.
Any of these encapsulation techniques add overhead, which shrinks the MTU available to the end customer.
This means we’ve got 50 bytes of transmission / transport overhead. This will be important later on!
How do subscribers know what to use as MTU?
Typically when a subscriber buys a DSL service or HFC connection, they’ll either get a preconfigured router from their carrier, or they will be given a list of values to use that includes MTU.
LTE and 5G on the other hand tell us the value we should use.
Inside the Protocol Configuration Options in the NAS PDU, the UE requests the MTU and DNS server to be used, and these are provided back by the network.
This MTU value is actually set on the MME, not the P-GW. As the MME doesn’t actually know the maximum MTU of the network, it’s up to the operator to configure this to be a value that represents the network.
Why this Matters for LTE & 5G Transmission
As we covered earlier, fragmentation is costly. If we’re fragmenting packets we are:
Wasting resources on our transmission network / core networks – as we fragment Subscriber packets it’s taking up compute resources and therefore limiting throughput
Wasting radio resources as additional overhead is introduced for fragmented packets, and additional RBs need to be scheduled to handle the fragmented packets
To test this I’ve set up a scenario in the lab, and we’ll look at the packet captures to see how the MTU is advertised, and see how big we can make our MTU on the subscriber side.
I never cease to be amazed as to what I can do with Wireshark.
While we’re working with Smart Card readers and SIM cards, capturing and Decoding USB traffic to see what APDUs are actually being sent can be super useful, so in this post we’ll look at how we can use Wireshark to sniff the USB traffic to view APDUs being sent to smart cards from other software.
For the purposes of this post I’ll be reading the SIM cards with pySim, but in reality it’ll work with any proprietary SIM software, allowing you to see what’s actually being said to the card by your computer.
If you want to see what’s being sent between your phone and SIM card, the Osmocom SIMtrace is the device for you (And yes it also uses Wireshark for viewing this data!).
Ok, that’s all the prerequisites sorted. Next we need to find the bus and device ID of our smart card reader.
We can get this listed with:
lsusb
Here you can see I have a Smart Card reader on Bus 1 device 03 and another on Bus 2 device 10.
The reader I want to use is the “SCM Microsystems, Inc. SCR35xx USB Smart Card Reader” so I’ll jot down Bus 2 device 10. Yours will obviously be different, but you get the idea.
Finding the USB traffic in Wireshark
Next we’ll fire up Wireshark. If you’ve got your permissions right and followed along, you should see a few more interfaces starting with usbmonX in the capture list.
Because the device I want to capture from is on Bus 2, we’ll select usbmon2 and start capturing.
As you can see we’ve got a bit of a firehose of data, and we only care about device 10 on bus 2, so let’s filter for that.
To generate some data I’m going to run pySim-read to read the data on a smart card that’s connected to my PC, and then filter to only see traffic on that USB device.
In my case the USB device is 10 and it’s got two sub-addresses, so I’ll filter for USB Bus 2, device 10, sub-address 1 and 2. The filter I’ll use is:
usb.addr=="2.10.1" or usb.addr=="2.10.2"
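If you swap readers a lot, a few lines of Python can turn a line of lsusb output into that filter for you. This is a rough sketch: the sample line below (including the USB ID) is made up to look like typical lsusb output, the helper names are my own, and the sub-addresses default to 1 and 2 simply because that’s what my reader uses.

```python
import re

# Matches lines shaped like: "Bus 002 Device 010: ID 1234:abcd Some Device Name"
LSUSB_LINE = re.compile(r"Bus (\d+) Device (\d+): ID ([0-9a-fA-F]{4}:[0-9a-fA-F]{4})\s*(.*)")


def parse_lsusb_line(line):
    """Return (bus, device, usb_id, name) from one lsusb line, or None if it doesn't match."""
    match = LSUSB_LINE.match(line.strip())
    if not match:
        return None
    bus, device, usb_id, name = match.groups()
    return int(bus), int(device), usb_id, name


def usb_addr_filter(bus, device, endpoints=(1, 2)):
    """Build a Wireshark display filter for the given bus, device and sub-addresses."""
    return " or ".join(f'usb.addr=="{bus}.{device}.{ep}"' for ep in endpoints)


# Hypothetical sample line; the ID shown here is illustrative only
sample = "Bus 002 Device 010: ID 04e6:5410 SCM Microsystems, Inc. SCR35xx USB Smart Card Reader"
bus, device, _usb_id, name = parse_lsusb_line(sample)
print(name)
print(usb_addr_filter(bus, device))
```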
But this doesn’t really show us much, so let’s tell Wireshark this is PC/SC USB CCID data so it can decode it as such;
So we’ll select some of this traffic -> Decode as -> USBCCID
Still not seeing straight APDUs, so let’s tell Wireshark one more bit of information – That we want to decode this information as GSM SIM data;
Again, we’ll select the data part of the USBCCID traffic -> Decode As -> GSM_SIM
And bingo, just like that we can now filter by gsm_sim and see the APDUs being sent / received.
This is part 3 of an n part tutorial series on working with SIM cards.
So in our last post we took a whirlwind tour of what an APDU does, is, and contains.
Interacting with a card involves sending the APDU data to the card as hex, which luckily isn’t as complicated as it seems.
While reading what the hex should look like on the screen is all well and good, actually interacting with cards is the name of the game, so that’s what we’ll be doing today, and we’ll start to abstract some of the complexity away.
Getting Started
To follow along you will need:
A Smart Card reader – SIM card / Smart Card readers are baked into some laptops, some of those multi-card readers that read flash/SD/CF cards, or if you don’t have either of these, they can be found online very cheaply ($2-3 USD).
A SIM card – No need to worry about ADM keys or anything fancy, one of those old SIM cards you kept in the drawer because you didn’t know what to do with them is fine, or the SIM in your phone if you can find the pokey pin thing. We won’t go breaking anything, promise.
You may end up fiddling around with the plastic adapters to change the SIM form factor between regular smart card, SIM card (standard), micro and nano.
A USB SIM / Smart Card reader that supports all the standard form factors makes life a lot easier!
To keep it simple, we’re not going to concern ourselves too much with the physical layer side of things for interfacing with the card, so we’ll start with sending raw APDUs to the cards, and then we’ll use some handy libraries to make life easier.
PCSC Interface
To abstract away some complexity we’re going to use the industry-standard PCSC (PC – Smart Card) interface to communicate with our SIM card. Throughout this series we’ll be using a few Python libraries to interface with the Smart Cards, but under the hood all will be using PCSC to communicate.
pyscard
I’m going to use Python3 to interface with these cards, but keep in mind you can find similar smart card libraries in most common programming languages.
At this stage as we’re just interfacing with Smart Cards, our library won’t have anything SIM-specific (yet).
We’ll use pyscard to interface with the PCSC interface. pyscard supports Windows and Linux and you can install it using PIP with:
pip install pyscard
So let’s get started by getting pyscard to list the readers we have available on our system:
#!/usr/bin/env python3
from smartcard.System import *
print(readers())
Running this will output a list of the readers on the system:
Here we can see the two readers that are present on my system (To add some confusion I have two readers connected – One built in Smart Card reader and one USB SIM reader):
(If your device doesn’t show up in this list, double check it’s PCSC compatible, and you can see it in your OS.)
So we can see when we run readers() we’re returned a list of readers on the system.
I want to use my USB SIM reader (The one identified by Identiv SCR35xx USB Smart Card Reader CCID Interface 00 00), so the next step will be to start a connection with this reader, which is the first in the list.
So to make life a bit easier we’ll store the list of smart card readers and access the one we want from the list:
#!/usr/bin/env python3
from smartcard.System import *
r = readers()
connection = r[0].createConnection()
connection.connect()
So now we have an object for interfacing with our smart card reader, let’s try sending an APDU to it.
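Indexing into the list works, but the order of the readers can change as devices are plugged in and removed. Below is a minimal sketch of picking a reader by a fragment of its name instead. `pick_reader` and `connect_by_name` are hypothetical helpers of mine (not part of pyscard), and the sketch assumes each reader object turns into its name with `str()`, as in the list printed by `readers()`.

```python
def pick_reader(available, name_fragment):
    """Return the first reader whose name contains name_fragment (case-insensitive)."""
    for reader in available:
        if name_fragment.lower() in str(reader).lower():
            return reader
    names = [str(reader) for reader in available]
    raise LookupError("No reader matching %r in %s" % (name_fragment, names))


def connect_by_name(name_fragment):
    """Connect to the card in the matching reader (needs pyscard and a PCSC reader)."""
    from smartcard.System import readers  # imported here so pick_reader works without pyscard
    connection = pick_reader(readers(), name_fragment).createConnection()
    connection.connect()
    return connection
```

With my USB reader attached, `connect_by_name("Identiv")` would stand in for `r[0].createConnection()` followed by `connection.connect()`.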
Actually Doing something Useful
Today we’ll select the EF that contains the ICCID of the card, and then we will read that file’s binary contents.
This means we’ll need to create two APDUs, one to SELECT the file, and the other to READ BINARY to get the file’s contents.
We’ll set the instruction byte to A4 to SELECT, and B0 to READ BINARY.
Table of Instruction bytes from TS 102 221
APDU to select EF ICCID
The APDU we’ll send will SELECT (using the INS byte value of A4 as per the above table) the file that contains the ICCID.
Each file on a smart card has been pre-created and in the case of SIM cards at least, is defined in a specification.
For this post we’ll be selecting the EF ICCID, which is defined in TS 102 221.
Information about EF-ICCID from TS 102 221
To select it we will need its identifier, aka File ID (FID). For us the FID of the ICCID EF is 2FE2, so we’ll SELECT file 2FE2.
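| Code | Meaning | Value |
| --- | --- | --- |
| CLA | Class bytes – Coding options | 00 (ISO 7816-4 coding) |
| INS | Instruction (Command) to be called | A4 (SELECT) |
| P1 | Parameter 1 – Selection Control (Limit search options) | 00 (Select by File ID) |
| P2 | Parameter 2 – More selection options | 04 (Return FCP template) |
| Lc | Length of Data | 02 (2 bytes of data to come) |
| Data | File ID of the file to Select | 2FE2 (File ID of ICCID EF) |

So that’s our APDU encoded; its final value will be 00 A4 00 04 02 2FE2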
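To make the field-by-field encoding concrete, here is a small sketch that assembles the same bytes from their parts. `build_apdu` is a hypothetical helper of mine, not something pyscard provides, and it only handles short APDUs (up to 255 data bytes). It uses the class byte 00, matching the hex string the code in this post sends to the card.

```python
def build_apdu(cla, ins, p1, p2, data=b"", le=None):
    """Assemble a short APDU as a list of ints: header, optional Lc + data, optional Le."""
    apdu = [cla, ins, p1, p2]
    if data:
        if len(data) > 255:
            raise ValueError("short APDUs carry at most 255 data bytes")
        apdu.append(len(data))  # Lc: number of data bytes that follow
        apdu.extend(data)
    if le is not None:
        apdu.append(le)         # Le: number of bytes we expect back
    return apdu


# SELECT (A4) by File ID (P1=00), P2=04, data = File ID of the ICCID EF
select_iccid = build_apdu(0x00, 0xA4, 0x00, 0x04, bytes.fromhex("2FE2"))
print(" ".join("%02X" % byte for byte in select_iccid))  # 00 A4 00 04 02 2F E2
```

The resulting list of integers is the same shape that `toBytes()` returns, so it can be handed to `connection.transmit()` directly.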
So let’s send that to the card, building on our code from before:
#!/usr/bin/env python3
from smartcard.System import *
from smartcard.util import *
r = readers()
connection = r[0].createConnection()
connection.connect()
print("Selecting ICCID File")
data, sw1, sw2 = connection.transmit(toBytes('00a40004022fe2'))
print("Returned data: " + str(data))
print("Returned Status Word 1: " + str(sw1))
print("Returned Status Word 2: " + str(sw2))
If we run this let’s have a look at the output we get,
We got back:
Selecting ICCID File
Returned data: []
Returned Status Word 1: 97
Returned Status Word 2: 33
So what does this all mean?
Well for starters no data has been returned, and we’ve got two status words returned, with a value of 97 and 33.
We can lookup what these status words mean, but there’s a bit of a catch, the values we’re seeing are the integer format, and typically we work in Hex, so let’s change the code to render these values as Hex:
#!/usr/bin/env python3
from smartcard.System import *
from smartcard.util import *
r = readers()
connection = r[0].createConnection()
connection.connect()
print("Selecting ICCID File")
data, sw1, sw2 = connection.transmit(toBytes('00a40004022fe2'))
print("Returned data: " + str(data))
print("Returned Status Word 1: " + str(hex(sw1)))
print("Returned Status Word 2: " + str(hex(sw2)))
Now we’ll get this as the output:
Selecting ICCID File
Returned data: []
Returned Status Word 1: 0x61
Returned Status Word 2: 0x1e
Status Word 2 contains a value of 1e which tells us that there are 30 bytes of extra data available with additional info about the file. (We’ll cover this in a later post).
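Looking status words up by hand gets old quickly, so here is a rough sketch of a lookup helper. `describe_status` is a hypothetical function of mine, and the handful of values it knows are common ISO 7816-4 status words; the real list is much longer, so treat anything it doesn’t recognise as "go and check the spec".

```python
def describe_status(sw1, sw2):
    """Give a human-readable hint for a few common status word pairs."""
    if (sw1, sw2) == (0x90, 0x00):
        return "OK"
    if sw1 == 0x61:
        # SW2 carries the number of response bytes waiting to be fetched
        return "OK, %d more bytes available (fetch with GET RESPONSE)" % sw2
    if sw1 == 0x6C:
        # SW2 carries the length the card wanted to see in Le
        return "Wrong Le, retry with Le=%d" % sw2
    if (sw1, sw2) == (0x6A, 0x82):
        return "File not found"
    if (sw1, sw2) == (0x69, 0x82):
        return "Security status not satisfied"
    return "Unrecognised status %02X%02X" % (sw1, sw2)


print(describe_status(0x61, 0x1E))
```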
So now we’ve successfully selected the ICCID file.
Keep in mind that with smart cards we have to select a file before we can read it, so now let’s read the binary contents of the file we selected.
The READ BINARY command is used to read the binary contents of a selected file, and as we’ve already selected the file 2FE2 that contains our ICCID, if we run it, it should return our ICCID.
If we consult the table of values for the INS (Instruction) byte we can see that the READ BINARY instruction byte value is B0, and so let’s refer to the spec to find out how we should format a READ BINARY instruction:
| Code | Meaning | Value |
| --- | --- | --- |
| CLA | Class bytes – Coding options | 00 (ISO 7816-4 coding) |
| INS | Instruction (Command) to be called | B0 (READ BINARY) |
| P1 | Parameter 1 – Coding / Offset | 00 (No Offset) |
| P2 | Parameter 2 – Offset Low | 00 |
| Le | How many bytes to read | 0A (10 bytes to read) |
We know the ICCID file is 10 bytes from the specification, so the length of the data to return will be 0A (10 bytes).
Let’s add this new APDU into our code and print the output:
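#!/usr/bin/env python3
from smartcard.System import *
from smartcard.util import *
r = readers()
connection = r[0].createConnection()
connection.connect()
# SELECT the ICCID file (File ID 2FE2)
print("Selecting ICCID File")
data, sw1, sw2 = connection.transmit(toBytes('00a40000022fe2'))
print("Returned data: " + str(data))
print("Returned Status Word 1: " + str(hex(sw1)))
print("Returned Status Word 2: " + str(hex(sw2)))
# READ BINARY (INS B0), no offset (P1 / P2 = 00 00), Le = 0A (read 10 bytes)
print("Reading ICCID File")
data, sw1, sw2 = connection.transmit(toBytes('00b000000a'))
print("Returned data: " + toHexString(data))
print("Returned Status Word 1: " + str(hex(sw1)))
print("Returned Status Word 2: " + str(hex(sw2)))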
And we have read the ICCID of the card.
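One thing to be aware of: the 10 bytes in the ICCID file don’t read as the number printed on the card straight away. The digits are stored as BCD with the two digits in each byte swapped, and padded with F if there’s an odd number of them. A minimal sketch of turning the bytes into the printed ICCID follows; `decode_iccid` is a hypothetical helper of mine and the sample bytes are made up, not read from a real card.

```python
def decode_iccid(raw):
    """Decode nibble-swapped BCD bytes (as a list of ints) into the ICCID digits."""
    digits = ""
    for byte in raw:
        # Low nibble holds the first digit of each pair, high nibble the second
        digits += "%X%X" % (byte & 0x0F, byte >> 4)
    return digits.rstrip("F")  # drop the F padding


# Made-up sample of 10 bytes, in the form connection.transmit() returns data
sample = [0x98, 0x10, 0x32, 0x54, 0x76, 0x98, 0x10, 0x32, 0x54, 0xF6]
print(decode_iccid(sample))
```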
Phew.
That’s the hardest part over.
From now on we’ll be building on the concepts we covered here to construct other APDUs that get our cards to do useful things. Now you’ve got the basics of how to structure an APDU down, the rest is just changing values here and there to get what you want.
In our next post we’ll read a few more files, write some files and delve a bit deeper into exactly what it is we are doing.
Australia is a strange country; As a kid I was scared of dogs, and in response, our family got a dog.
This year started off with adventures working with ASN.1 encoded data, and after a week of banging my head against the table, I was scared of ASN.1 encoding.
But now I love dogs, and slowly, I’m learning to embrace ASN.1 encoding.
What is ASN.1?
ASN.1 is an encoding scheme.
The best analogy I can give is to imagine a sheet of paper with a form on it; the form has fields for all the different bits of data it needs.
Each of the fields on the form has a data type, and the box is sized to restrict input, and some fields are mandatory.
Now imagine you take this form and cut a hole where each of the text boxes would be.
We’ve made a key that can be laid on top of a blank sheet of paper, then we can fill the details through the key onto the blank paper and reuse the key over and over again to fill the data out many times.
When we lift the key off the top of our paper, what we have left on the paper below is the data from the form. Without the key on top this data doesn’t make much sense, but we can always add the key back and presto, it’s back to making sense.
While this may seem kind of pointless let’s look at the advantages of this method;
The data is validated by the key – People can’t put a name wherever, and country code anywhere, it’s got to be structured as per our requirements. And if we tried to enter a birthday through the key form onto the paper below, we couldn’t.
The data is as small as can be – Without all the metadata on the key above, such as the name of the field, the paper below contains only the pertinent information, and if a field is left blank it doesn’t take up any space at all on the paper.
It’s these two things, rigidly defined data structures (no room for errors or misinterpretation) and the minimal size on the wire (saves bandwidth), that led to 3GPP selecting ASN.1 encoding for many of its protocols, such as S1, SBc, X2, etc.
It’s also these two things that make ASN.1 kind of a jerk; If the data structure you’re feeding into your ASN.1 compiler does not match it will flat-out refuse to compile, and there’s no way to make sense of the data in its raw form.
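The form-and-key analogy can be sketched in a few lines of Python. To be clear, this is a toy schema-driven encoder of my own and not real ASN.1 or any of its encoding rules; the field names, types and limits in `SCHEMA` are made up for illustration. It shows the two properties above: the schema rejects data that doesn’t fit, and the bytes on the wire carry no field names at all.

```python
import struct

# The "key": field order, types and limits live here, never on the wire
SCHEMA = [
    ("country_code", "uint16"),
    ("name", "string", 16),  # at most 16 bytes
]


def encode(record):
    """Validate record against SCHEMA and pack it into bare bytes."""
    out = b""
    for field in SCHEMA:
        name, kind = field[0], field[1]
        value = record[name]  # a missing field raises KeyError
        if kind == "uint16":
            if not isinstance(value, int) or not 0 <= value <= 0xFFFF:
                raise ValueError("%s must be an integer 0..65535" % name)
            out += struct.pack("!H", value)
        elif kind == "string":
            raw = value.encode("utf-8")
            if len(raw) > field[2]:
                raise ValueError("%s is longer than %d bytes" % (name, field[2]))
            out += struct.pack("!B", len(raw)) + raw
    return out


def decode(blob):
    """Lay the key back over the bytes to recover the fields."""
    record, offset = {}, 0
    for field in SCHEMA:
        name, kind = field[0], field[1]
        if kind == "uint16":
            (record[name],) = struct.unpack_from("!H", blob, offset)
            offset += 2
        elif kind == "string":
            length = blob[offset]
            record[name] = blob[offset + 1:offset + 1 + length].decode("utf-8")
            offset += 1 + length
    return record


wire = encode({"country_code": 61, "name": "Nick"})
print(wire.hex())    # 003d044e69636b
print(decode(wire))
```

Without `SCHEMA`, the seven bytes on the wire are just as hard to interpret as the sheet of paper without its key.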
But working with a super simple ASN.1 definition you’ve created is one thing; using the 3GPP-defined ASN.1 definitions is another.
This is where the fantastic PyCrate library comes in, and where the real magic happens. The nut I cracked this week was compiling a 3GPP ASN.1 definition and communicating a standards-based protocol with it.
One of the key advantages of SCTP over TCP is the support for Multihoming,
From an application perspective, this enables one “socket”, to be shared across multiple IP Addresses, allowing multiple IP paths and physical NICs.
Through multihoming we can protect against failures in IP Routing and physical links, from a transport layer protocol.
So let’s take a look at how this actually works,
For starters there are a few ways multihoming can be implemented. Ideally we have multiple IPs on both ends (“client” and “server”), but this isn’t always achievable, so SCTP supports partial multi-homing, where for example the client has only one IP but can contact the server on multiple IP Addresses, and vice versa.
The below image (Courtesy of Wikimedia) shows the ideal scenario, where both the client and the server have multiple IPs they can be reached on.
This would mean a failure of any one of the IP Addresses or the routing between them, would see the other secondary IP Addresses used for Transport, and the application not even necessarily aware of the interruption to the primary IP Path.
The Process
For starters, our SCTP Client/Server will each need to be aware of the IPs that can be used,
This is advertised in the INIT message, sent by the “client” to a “server” when the SCTP session is established.
SCTP INIT sent by the client at 10.0.1.185, but advertising two IPs
In the above screenshot we can see the two IPs for SCTP to use, the primary IP is the first one (10.0.1.185) and also the from IP, and there is just one additional IP (10.0.1.187) although there could be more.
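If you’d rather pull the advertised addresses out of a capture programmatically, the layout is simple enough to walk by hand. The sketch below assumes you’ve already extracted the raw bytes of the INIT (or INIT ACK) chunk itself, without the 12 byte SCTP common header in front of it. `init_addresses` is a hypothetical helper of mine, and the sample chunk is built by hand with made-up tag, window and TSN values rather than taken from the PCAP.

```python
import ipaddress
import struct


def init_addresses(chunk):
    """Return the IP addresses carried as parameters in an INIT / INIT ACK chunk."""
    chunk_type, _flags, chunk_len = struct.unpack_from("!BBH", chunk, 0)
    if chunk_type not in (1, 2):  # 1 = INIT, 2 = INIT ACK
        raise ValueError("not an INIT / INIT ACK chunk")
    addresses = []
    offset = 20  # 4 byte chunk header + 16 bytes of fixed INIT fields
    while offset + 4 <= chunk_len:
        param_type, param_len = struct.unpack_from("!HH", chunk, offset)
        if param_len < 4:
            break  # malformed parameter, stop rather than loop forever
        value = chunk[offset + 4:offset + param_len]
        if param_type == 5:    # IPv4 Address parameter
            addresses.append(str(ipaddress.IPv4Address(value)))
        elif param_type == 6:  # IPv6 Address parameter
            addresses.append(str(ipaddress.IPv6Address(value)))
        offset += (param_len + 3) & ~3  # parameters are padded to 4 bytes
    return addresses


def ipv4_param(ip):
    """Build an IPv4 Address parameter (type 5, length 8) for the sample below."""
    return struct.pack("!HH", 5, 8) + ipaddress.IPv4Address(ip).packed


# Hand-built sample: initiate tag, a_rwnd, out streams, in streams, initial TSN
fixed = struct.pack("!IIHHI", 0x12345678, 106496, 10, 10, 1)
params = ipv4_param("10.0.1.185") + ipv4_param("10.0.1.187")
sample = struct.pack("!BBH", 1, 0, 4 + len(fixed) + len(params)) + fixed + params
print(init_addresses(sample))
```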
In a production environment you’d want to ensure each of your IPs is in a different subnet, with different paths, hardware and routes.
The INIT is then responded to by the server with an INIT_ACK, and this time the server advertises its IP addresses: the primary IP is the From IP address (10.0.1.252) and there is just one additional IP, 10.0.1.99.
SCTP INIT ACK showing Server’s Multi-homed IP Options
Next up we have the cookie exchange, which is used to protect against synchronization attacks, and then our SCTP session is up.
So what happens at this point? How do we know if a path is up and working?
Well the answer is heartbeat messages,
Sent from each of the IPs on the client to each of the IPs on the server, to make sure that there’s a path from every IP, to every other IP.
SCTP Heartbeats from each local IP to each remote IP
This means the SCTP stack knows if a path fails. For example, if the route to IP 10.0.1.252 on the server were to fail, the SCTP stack knows it has another option, 10.0.1.99, which it’s been monitoring.
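As a toy model of the bookkeeping involved (and only that: a real SCTP stack’s path selection, retransmission counters and timers are more involved than this), the heartbeat-monitored paths can be thought of as one entry per local / remote IP pair. `pick_path` is a hypothetical function of mine, and the IPs are the ones from the capture above.

```python
from itertools import product

local_ips = ["10.0.1.185", "10.0.1.187"]
remote_ips = ["10.0.1.252", "10.0.1.99"]

# One heartbeat-monitored path per (local, remote) pair, all up to begin with
path_up = {pair: True for pair in product(local_ips, remote_ips)}


def pick_path(path_up, primary):
    """Use the primary path while it is up, otherwise the first path still up."""
    if path_up.get(primary):
        return primary
    for pair, up in path_up.items():
        if up:
            return pair
    raise ConnectionError("no working path left")


primary = ("10.0.1.185", "10.0.1.252")
print(len(path_up), "paths monitored")

# Simulate the route to 10.0.1.252 failing: heartbeats from both local IPs go unanswered
for local in local_ips:
    path_up[(local, "10.0.1.252")] = False
print(pick_path(path_up, primary))
```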
So that’s multi-homed SCTP in action – While a lot of work has historically been done with LACP for aggregating multiple NICs together, and VRRP for ensuring a host is alive, SCTP handles this in a clean and efficient way.
I’ve attached a PCAP showing multi-homing on a Diameter S6a (HSS) interface between an MME and a HSS.
So dedicated appliances are dead and all our network functions are VMs or Containers, but there’s a performance hit when going virtual as the L2 processing has to be handled by the Hypervisor before being passed onto the relevant VM / Container.
If we have a 10Gb NIC in our server, we want to achieve a 10Gbps “Line Speed” on the Network Element / VNF we’re running on.
When we talked about appliances, if you purchased a P-GW with a 10Gbps NIC, it was a given you could get 10Gbps through it (without DPI, etc), but when we talk about virtualized network functions / network elements there’s a very real chance you won’t achieve the “line speed” of your interfaces without some help.
When you’ve got a Network Element like a S-GW, P-GW or UPF, you want to forward packets as quickly as possible – bottlenecks here would impact the user’s achievable speeds on the network.
To speed things up there are two technologies that, if supported by your software stack and hardware, allow you to significantly increase throughput on network interfaces: DPDK & SR-IOV.
DPDK – Data Plane Development Kit
Usually *Nix OSs handle packet processing on the Kernel level. As I type this the packets being sent to this WordPress server by Firefox are being handled by the Linux 5.8.0-36-generic kernel running on my machine.
The problem is the kernel has other things to do (interrupts), meaning increased delay in processing (due to waiting for processing capability) and decreased capacity.
DPDK shunts this processing to the “user space” meaning your application (the actual magic of the VNF / Network Element) controls it.
To go back to me writing this – If Firefox and my laptop supported DPDK, then the packets wouldn’t traverse the Linux kernel at all, and Firefox would be talking directly to my NIC. (Obviously this isn’t the case…)
So DPDK increases network performance by shifting the processing of packets to the application, bypassing the kernel altogether. You are still limited by the CPU and Memory available, but with enough of each you should reach very near to line speed.
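To get a feel for why the per-packet overhead matters so much, here’s a quick back-of-the-envelope calculation of the worst case on a 10Gbps link: minimum-size 64 byte Ethernet frames, each of which also costs 20 bytes on the wire for the preamble and inter-frame gap. The numbers are standard Ethernet figures rather than anything measured on my setup.

```python
LINK_BPS = 10e9          # 10Gbps link
FRAME_BYTES = 64         # minimum Ethernet frame, including FCS
OVERHEAD_BYTES = 8 + 12  # preamble + start-of-frame delimiter, then inter-frame gap

bits_on_wire = (FRAME_BYTES + OVERHEAD_BYTES) * 8
packets_per_second = LINK_BPS / bits_on_wire
ns_per_packet = 1e9 / packets_per_second

print("%.2f Mpps, %.1f ns per packet" % (packets_per_second / 1e6, ns_per_packet))
```

That works out to roughly 14.88 million packets per second, or about 67 nanoseconds to deal with each one, which doesn’t leave much room for waiting on the kernel.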
SR-IOV – Single Root Input Output Virtualization
Going back to the analogy of me writing this: I’m running Linux on my laptop, but let’s imagine I’m running a VM running Firefox under Linux to write this.
If that’s the case then we have an even more convoluted packet processing chain!
I type the post into Firefox, which sends the packets to the Linux kernel, which waits to be scheduled resources by the hypervisor, which then processes the packets in the hypervisor kernel before they finally make it onto the NIC.
We could add DPDK which skips some of these steps, but we’d still have the bottleneck of the hypervisor.
With PCIe passthrough we could pass the NIC directly to the VM running the Firefox browser window I’m typing this in, but then we have a problem: no other VMs can access these resources.
SR-IOV provides an interface to passthrough PCIe to VMs by slicing the PCIe interface up and then passing it through.
My VM would be able to access the PCIe side of the NIC, but so would other VMs.
So that’s the short of it: SR-IOV and DPDK enable better packet forwarding speeds on VNFs.
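If you want to check whether a NIC on a Linux host can be sliced up this way, the kernel exposes the number of virtual functions (VFs) through sysfs. A minimal sketch, assuming a Linux host; `sriov_status` is a hypothetical helper of mine and `eth0` is just a placeholder interface name.

```python
from pathlib import Path


def sriov_status(interface, sysfs_root="/sys/class/net"):
    """Return total / active VF counts for a NIC, or None if it has no SR-IOV support."""
    device = Path(sysfs_root) / interface / "device"
    total = device / "sriov_totalvfs"
    if not total.exists():
        return None  # the driver / NIC doesn't expose SR-IOV
    return {
        "total_vfs": int(total.read_text()),
        "active_vfs": int((device / "sriov_numvfs").read_text()),
    }


print(sriov_status("eth0"))
```

On a machine without an SR-IOV capable NIC (like my laptop) this simply prints `None`.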
While we’ve already covered the inputs required by the authentication elements of the core network (The HSS in LTE/4G, the AuC in UMTS/3G and the AUSF in 5G) to generate an output, it’s worth noting that the Confidentiality Algorithms used in the process determine the output.
This means the Authentication Vector (Also known as an F1 and F1*) generated for a subscriber using the Milenage Confidentiality Algorithms will differ from one generated using the XOR or Comp128 Confidentiality Algorithms.
To put it another way – given the same input of K key, OPc Key (or OP key), SQN & RAND (Random), a run with Milenage (F1 and F1* algorithm) would yield a totally different result (AUTN & XRES) to the same inputs run with a simple XOR.
Technically, as operators control the network element that generates the challenges, and the USIM that responds to them, it is an option for an operator to implement their own Confidentiality Algorithms (Beyond just Milenage or XOR) so long as it produced the same number of outputs. But rolling your own cryptographic anything is almost always a terrible idea.
So what are the differences between the Confidentiality Algorithms and which one to use? Spoiler alert, the answer is Milenage.
Milenage
Milenage is based on AES (Originally called Rijndael) and is (compared to a lot of other crypto implementations) fairly easy to understand.
AES is very well studied and understood and, unlike the Comp128 variants, is open for anyone to study/analyse/break. Although AES is not without shortcomings, its problems are, at this stage, fairly well understood and mitigated.
There are a few clean open source examples of Milenage implementations, such as this C example from FreeBSD.
XOR
It took me a while to find the specifications for the XOR algorithm – it turns out XOR is available as an alternate to Milenage available on some SIM cards for testing only, and the mechanism for XOR Confidentiality Algorithm is only employed in testing scenarios, not designed for production.
Instead of using AES under the hood like Milenage, it’s just plain old XOR of the keys.
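To show just how simple it is, here is a sketch of the XOR test algorithm as I understand it from the 3GPP test specification (TS 34.108). Treat the exact byte positions as my reading of the spec rather than gospel, and check them against the spec before relying on them. `xor_test_algorithm` is a hypothetical function name and the key, RAND, SQN and AMF values are made up.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings together."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_test_algorithm(k, rand, sqn, amf):
    """Derive the outputs of the XOR test algorithm (assumed layout, see lead-in)."""
    if len(k) != 16 or len(rand) != 16 or len(sqn) != 6 or len(amf) != 2:
        raise ValueError("expected 16 byte K and RAND, 6 byte SQN, 2 byte AMF")
    xdout = xor_bytes(k, rand)    # the whole algorithm hangs off K XOR RAND
    cdout = sqn + amf
    return {
        "RES": xdout,                        # f2: XDOUT as-is
        "CK": xdout[1:] + xdout[:1],         # f3: XDOUT rotated left by one byte
        "IK": xdout[2:] + xdout[:2],         # f4: XDOUT rotated left by two bytes
        "AK": xdout[3:9],                    # f5: six bytes out of the middle
        "MAC": xor_bytes(xdout[:8], cdout),  # f1: first 8 bytes XOR (SQN || AMF)
    }


vector = xor_test_algorithm(
    k=bytes.fromhex("00112233445566778899aabbccddeeff"),     # made-up key
    rand=bytes.fromhex("ffeeddccbbaa99887766554433221100"),  # made-up challenge
    sqn=bytes(6),
    amf=bytes.fromhex("8000"),
)
for name, value in vector.items():
    print(name, value.hex())
```

No AES, no OP / OPc, which is exactly why it’s only fit for the test bench.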
Comp128
Comp128 was originally a closed source algorithm, with the maths behind it not publicly available to scrutinise. It is used in the GSM A3 and A8 functions, akin to the F1 and F1* in later releases.
Due to its secretive nature it wasn’t able to be studied or analysed prior to deployment, with the idea that if you never said how your crypto worked no one would be able to break it. Spoiler alert; public weaknesses became exposed as far back as 1998, which led to Toll Fraud, SIM cloning and eventually the development of two additional variants, with the original Comp128 renamed Comp128-1, and Comp128-2 (stronger algorithm than the original addressing a few of its flaws) and Comp128-3 (Same as Comp128-2 but with a 64 bit long key generated).
Our BTS is going to need an accurate clock source in order to run, so without access to crazy accurate Timing over Packet systems or TDM links to use as reference sources, I’ve opted to use the GPS/GLONASS receiver built into the LMPT card.
Add new GPS with ID 0 on LMPT in slot 7 of cabinet 1:
Check GPS has sync (May take some time) using the Display GPS command;
DSP GPS: GN=0;
Assuming you’ve got an antenna connected and can see the sky, after ~10 minutes running the DSP GPS:; command again should show you an output like this:
+++ 4-PAL0089624 2020-11-28 01:06:55
O&M #806355684
%%DSP GPS: GN=0;%%
RETCODE = 0 Operation succeeded.
Display GPS State
-----------------
GPS Clock No. = 0
GPS Card State = Normal
GPS Card Type = M12M
GPS Work Mode = GPS
Hold Status = UNHOLDED
GPS Satellites Traced = 4
GLONASS Satellites Traced = 0
BDS Satellites Traced = 0
Antenna Longitude(1e-6 degree) = 144599999
Antenna Latitude(1e-6 degree) = -37000000
Antenna Altitude(m) = 613
Antenna Angle(degree) = 5
Link Active State = Activated
Feeder Delay(ns) = 15
GPS Version = NULL
(Number of results = 1)
--- END
This shows the GPS has got sync and a location fix.
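If you’re collecting this output from a script rather than reading it off the screen, the `Key = Value` lines are easy to pick apart. A minimal sketch, assuming you’ve captured the command output as text; `parse_mml_fields` is a hypothetical helper of mine, and the sample is a few lines copied from the output above.

```python
def parse_mml_fields(output):
    """Collect the 'Key = Value' lines of an MML response into a dict of strings."""
    fields = {}
    for line in output.splitlines():
        key, separator, value = line.partition(" = ")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


sample = """GPS Card State = Normal
GPS Satellites Traced = 4
Antenna Longitude(1e-6 degree) = 144599999
Antenna Latitude(1e-6 degree) = -37000000"""

fields = parse_mml_fields(sample)
# The position is reported in millionths of a degree
longitude = int(fields["Antenna Longitude(1e-6 degree)"]) / 1e6
latitude = int(fields["Antenna Latitude(1e-6 degree)"]) / 1e6
print(fields["GPS Card State"], fields["GPS Satellites Traced"], latitude, longitude)
```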
Next we set the BTS to use GPS as its time source:
SET TIMESRC: TIMESRC=GPS;
Finally we’ll verify the time is in sync on the BTS using the Display Time command:
DSP TIME:;
+++ 4-PAL0089624 2020-11-28 01:09:22
O&M #806355690
%%DSP TIME:;%%
RETCODE = 0 Operation succeeded.
Time Information
----------------
Time = 2020-11-28 01:09:22 GMT+00:00
--- END
Optionally you may wish to add a timezone, using the SET TZ:; command, but I’ve opted to keep it in UTC for simplicity.