Category Archives: LTE

3GPP Long Term Evolution (4G)

SCTP Parameter Tuning

There’s a lot to like about SCTP. No head of line blocking, no MTU issues, sequenced and acknowledged delivery of messages, not to mention multi-homing and message bundling.

But if you really want to get the most bang for your buck, you’ll need to tune your SCTP parameters to match the network conditions.

While tuning the parameters per-association would be time consuming, most SCTP stacks allow you to set templates for SCTP parameters, for example you would have a different set of parameters for the SCTP stacks inside your network, compared to SCTP stacks for say a roaming scenario or across microwave links.

IETF kindly provides a table with their recommended starting values for SCTP parameter tuning:

RTO.Initial              3 seconds
RTO.Min                  1 second
RTO.Max                  60 seconds
Max.Burst                4
RTO.Alpha                1/8
RTO.Beta                 1/4
Valid.Cookie.Life        60 seconds
Association.Max.Retrans  10 attempts
Path.Max.Retrans         5 attempts (per destination address)
Max.Init.Retransmits     8 attempts
HB.interval              30 seconds
HB.Max.Burst             1
IETF – RFC4960: SCTP – Suggested Protocol Parameter Values

But by adjusting the Max Retrans and Retransmission Timeout (RTO) values, we can detect failures on the network more quickly, and reduce the number of packets we’ll lose should we have a failure.

We begin with the engineered round-trip time (RTT) – that is made up of the time it takes to traverse the link, processing time for the remote SCTP stack and time for the response to traverse the link again. For the examples below we’ll take an imaginary engineered RTT of 200ms.

RTO.min is the minimum retransmission timeout.
If this value is set too low then before the other side has had time to receive the request, process it and send a response, we’ve already retransmitted it.

This should be set to the round trip delay, plus the processing needed to send and acknowledge a packet, plus some allowance for variability due to jitter; a value of 1.15 times the Engineered RTT is often chosen.

So for us, 200 * 1.15 = 230ms RTO.min value.

RTO.max is the maximum amount of time we should wait before retransmitting a request.
Typically three times the Engineered RTT.

So for us, 200 * 3 = 600ms RTO.max value.

Path.Max.Retransmissions is the maximum number of retransmissions to be sent down a path before the path is considered to be failed.
For example, if we lose a transmission path on a multi-homed server, how many retransmissions along that path should we send until we consider it to be down?

Values set are dependent on whether you’re multi-homing or not (you can be more picky if you are) and the level of acceptable packet loss on your transmission link.

Typical values are 4 Retransmissions (per destination address) for a Single-Homed association, and 2 Retransmissions (per destination address) for a Multi-Homed association.

Association.Max.Retransmissions is the maximum number of retransmissions for an association. If a transmission link in a multi-homed SCTP scenario were to go down, we would pass the Path.Max.Retransmissions value and the SCTP stack would stop sending traffic out that path, and try another, but what if the remote side is down? In that scenario all our paths would fail, so we need another counter – Association.Max.Retransmissions – to count the total number of retransmissions across the association. When Association.Max.Retransmissions is reached the association is considered down.

In practice this value would be the number of paths, multiplied by the Path.Max.Retransmissions.
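To tie the arithmetic above together, here’s a quick Python sketch using the example values from this post (the imaginary 200ms engineered RTT and a two-path multi-homed association) – the numbers are illustrative, not recommendations for your network:

engineered_rtt_ms = 200

rto_min_ms = round(engineered_rtt_ms * 1.15)   # 230 ms - RTT plus an allowance for jitter
rto_max_ms = engineered_rtt_ms * 3             # 600 ms - three times the engineered RTT

paths = 2                                      # multi-homed association with two transmission paths
path_max_retrans = 2                           # retransmissions per destination address
association_max_retrans = paths * path_max_retrans   # 4 - paths multiplied by Path.Max.Retransmissions

print(rto_min_ms, rto_max_ms, association_max_retrans)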

IMS Routing with iFCs

SIP routing is complicated; there are edge cases, traffic that can be switched locally and other traffic that needs to be proxied off to another Proxy or Application Server. How can you define these rules and logic in a flexible way that allows them to be distributed out to multiple different network elements and adjusted on a per-subscriber basis?

Enter iFCs – The Initial Filter Criteria.

iFCs are XML encoded rules to define which servers should handle traffic matching a set of rules.

Let’s look at some example rules we might want to handle through iFCs:

  • Send all SIP NOTIFY, SUBSCRIBE and PUBLISH requests to a presence server
  • Any Mobile Originated SMS to an SMSc
  • Calls to a specific destination to an MGC
  • Route any SIP INVITE requests with video codecs present to a VC bridge
  • Send calls to Subscribers who aren’t registered to a Voicemail server
  • Use 3rd party registration to alert a server that a Subscriber has registered

All of these can be defined and executed through iFCs, so let’s take a look,

iFC Structure

iFCs are encoded in XML and typically contained in the Cx-user-data AVP presented in a Cx Server Assignment Answer response.

Let’s take a look at an example iFC and then break down the details as to what we’re specifying.

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeCNF>1</ConditionTypeCNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

Each rule in an iFC is made up of a Priority, TriggerPoint and ApplicationServer.

So for starters we’ll look at the Priority tag.
The Priority tag allows us to have multiple tiers of priority and multiple levels of matching.
For example, if we had traffic matching the conditions outlined in this rule (TriggerPoint) but also matching another rule with a lower Priority value, the rule with the lower Priority value would be evaluated first.

The <TriggerPoint> tag contains the specifics of the rules and how the rules will be joined / matched, which is what we’ll focus on predominantly, and is followed by the <ApplicationServer> which is where we will route the traffic to if the TriggerPoint is matched / triggered.

So let’s look a bit more about what’s going on inside the TriggerPoint.

Each TriggerPoint is made up of Service Point Triggers (SPTs) – individual rules that are matched or not matched, and that are combined as logical AND or logical OR statements when evaluated.

By using fairly simple building blocks of SPTs we can create a complex set of rules by joining them together.

Service Point Triggers (SPTs)

Let’s take a closer look at what goes on in an SPT.
Below is a simple SPT that will match all SIP requests using the SIP MESSAGE method request type:

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>

So as you may have guessed, the <Method> tag inside the SPT defines what SIP request method we’re going to match.

Method is only one example of the matching mechanisms we can use; we can also match on other attributes, such as Request URI, SIP Header, Session Case (Mobile Originated vs Mobile Terminated) and Session Description such as SDP.

Here’s an example of an SPT matching anything originating from the Subscriber, utilizing the <SessionCase> tag inside the SPT:

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <SessionCase>0</SessionCase>
        </SPT>

Below is another SPT that’s matching any requests where the request URI is sip:[email protected] by setting the <RequestURI> tag inside the SPT:

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <RequestURI>sip:[email protected]</RequestURI>
        </SPT>

We can match SIP headers, either looking for the existence of a header or the value it is set to:

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <SIPHeader>
              <Header>To</Header>
              <Content>"Nick"</Content>
            </SIPHeader>
        </SPT>

Having <Header> will match if the header is present, while the optional Content tag can be used to match the value of the header.

In terms of the Content, this is matched using Regular Expressions – but in this case, not so regular regular expressions. 3GPP selected Extended Regular Expressions (ERE, as per IEEE POSIX) to be used, which are similar to the de facto standard PCRE Regex, but with a few fewer features.

Condition Negated

The <ConditionNegated> tag inside the SPT allows us to do an inverse match.

In short it will match anything other than what is specified in the SPT.

For example if we wanted to match any SIP Methods other than MESSAGE, setting <ConditionNegated>1</ConditionNegated> would do just that, as shown below:

        <SPT>
            <ConditionNegated>1</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>

And another example of ConditionNegated in use, this time we’re matching anything where the Request URI is not sip:[email protected]:

        <SPT>
            <ConditionNegated>1</ConditionNegated>
            <Group>0</Group>
            <RequestURI>sip:[email protected]</RequestURI>
        </SPT>

Finally the <Group> tag allows us to group rules together for the purpose of evaluation.
We’ll go into this more in the section below.

ConditionTypeCNF / ConditionTypeDNF

As we touched on earlier, <TriggerPoints> contain all the SPTs, but also, very importantly, specify how they will be interpreted.

SPTs can be joined in AND or OR conditions.

For some scenarios we may want to match where METHOD is MESSAGE and RequestURI is sip:[email protected], which is different to matching where the METHOD is MESSAGE or RequestURI is sip:[email protected].

This behaviour is set by the presence of one of the ConditionTypeCNF (Conjunctive Normal Form) or ConditionTypeDNF (Disjunctive Normal Form) tags.

If each SPT has a unique number in the GroupTag and ConditionTypeCNF is set then we evaluate as AND.

If each SPT has a unique number in the GroupTag and ConditionTypeDNF is set then we evaluate as OR.

Let’s look at how the below rule is evaluated as AND as ConditionTypeCNF is set:

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeCNF>1</ConditionTypeCNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

This means we will match if the method is MESSAGE and Session Case is 0 (Mobile Originated) as each SPT is in a different Group which leads to “and” behaviour.

If we were to flip to ConditionTypeDNF, each of the SPTs is evaluated as OR.

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeDNF>1</ConditionTypeDNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

This means we will match if the method is MESSAGE or the Session Case is 0 (Mobile Originated).

Where this gets a little bit more complex is when we have multiple entries in the same Group tag.

Let’s say we have a trigger point made up of:

<SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
<SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

<SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

How would this be evaluated?

If we use ConditionTypeDNF, every SPT inside the same Group is matched as AND, and SPTs in distinct Groups are matched as OR.

Let’s look at our example rule evaluated as ConditionTypeDNF:

<ConditionTypeDNF>1</ConditionTypeDNF>
  <SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
  <SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

  <SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

This means the two entries in Group 1 are evaluated as AND – so Method is MESSAGE and Session Case is 0, OR the header “P-Some-Header” is present.

Let’s do another one, this time as ConditionTypeCNF:

<ConditionTypeCNF>1</ConditionTypeCNF>
  <SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
  <SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

  <SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

This means the two entries in Group 1 are evaluated as OR – so Method is MESSAGE or Session Case is 0, AND the header “P-Some-Header” is present.
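To make the grouping logic concrete, below is a small Python sketch of how grouped SPT results could be combined under CNF and DNF. It’s purely illustrative of the logic described above, not how any particular S-CSCF implements it:

def evaluate_trigger_point(spt_results, condition_type_cnf=True):
    """spt_results is a list of (group, matched) tuples, one per SPT.
    CNF: OR within a Group, AND across Groups.
    DNF: AND within a Group, OR across Groups."""
    groups = {}
    for group, matched in spt_results:
        groups.setdefault(group, []).append(matched)
    if condition_type_cnf:
        return all(any(results) for results in groups.values())
    return any(all(results) for results in groups.values())

# The example above: Method=MESSAGE matched and SessionCase=0 matched (both Group 1),
# the header "P-Some-Header" was not present (Group 2)
spt_results = [(1, True), (1, True), (2, False)]
print(evaluate_trigger_point(spt_results, condition_type_cnf=False))  # DNF: (True AND True) OR False -> True
print(evaluate_trigger_point(spt_results, condition_type_cnf=True))   # CNF: (True OR True) AND False -> False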

Pre-5G Network Slicing

Network Slicing is a new 5G technology. Or is it?

Pre 3GPP Release 16 the capability to “Slice” a network already existed, in fact the functionality was introduced way back at the advent of GPRS, so what is so new about 5G’s Network Slicing?

Network Slice: A logical network that provides specific network capabilities and network characteristics

3GPP TS 123 501 / 3 Definitions and Abbreviations

Let’s look at the old ways of slicing up networks on LTE, UMTS and GSM, pre Release 16.

Old Ways: APN Separation

The APN or “Access Point Name” is used so the SGSN / MME knows which gateway that subscriber’s traffic should be terminated on when setting up the session.

APN separation is used heavily by MVNOs where the MVNO operates their own P-GW / GGSN.
This allows the MVNO to handle their own rating / billing / subscriber management when it comes to data.
A network operator just needs to set up their SGSN / MME to point all requests to set up a bearer on the MVNO’s APN to the MVNO’s gateways, and presto, it’s no longer their problem.

Later as customers wanted MPLS solutions extended over mobile (Typically LTE), MNOs were able to offer “private APNs”.
An enterprise could be allocated an APN by the MNO that would ensure traffic on that APN would be routed into the enterprise’s MPLS VRF.
The MNO handles the P-GW / GGSN side of things, adding the APN configuration onto it and ensuring the traffic on that APN is routed into the enterprise’s VRF.

Different QCI values can be assigned to each APN, to allow some to have higher priority than others, but by slicing at an APN level you lock all traffic to those QoS characteristics (typically mobile devices only support one primary APN used for routing all traffic), and you don’t have the flexibility to steer which traffic from a subscriber goes to which network.

It’s not really practical for everyone to have their own APN: the namespace is limited, the architecture of how this is usually done limits it, and the simple fact that everyone would have to populate an APN unique to them would be a real headache.

5G replaces APNs with “DNNs” – Data Network Names, but the functionality is otherwise the same.

In Summary:
APN separation slices all traffic from a subscriber using a special APN and provides a bearer with QoS/QCI values set for that APN, but does not allow granular slicing of individual traffic flows. It’s an all-or-nothing approach and all traffic in the APN is treated equally.

Old Ways: Dedicated Bearers

Dedicated bearers allow traffic matching a set rule to be provided a different QCI value to the default bearer. This allows certain traffic to/from a UE to use GBR or Non-GBR bearers for traffic matching the rule.

The rule itself is known as a “TFT” (Traffic Flow Template) and is made up of a 5 value Tuple consisting of IP Source, IP Destination, Source Port, Destination Port & Protocol Number. Both the UE and core network need to be aware of these TFTs, so the traffic matching the TFT can get the QCI allocated to it.
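As a quick illustration of the five values that make up a TFT packet filter, here’s a trivial Python sketch – the addresses and ports are made-up values purely for the example:

from collections import namedtuple

# The five values of a TFT packet filter described above
PacketFilter = namedtuple("PacketFilter", ["ip_src", "ip_dst", "port_src", "port_dst", "protocol"])

# Hypothetical filter matching RTP media between a UE and a media gateway (protocol 17 = UDP)
rtp_filter = PacketFilter("10.45.0.5", "203.0.113.10", 50000, 50000, 17)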

This can be done in a variety of different ways; in LTE this ranges from rules defined in a PCRF, to an external interface like that of an IMS network using the Rx interface to request dedicated bearers matching the specified TFTs via the PCRF.

Unlike with 5G network slicing, dedicated bearers still traverse the same network elements, the same MME, S-GW & P-GW is used for this traffic. This means you can’t “locally break out” certain traffic.

In Summary:
Dedicated bearers allow you to treat certain traffic to/from subscribers with different precedence & priority, but the traffic still takes the same path to its ultimate destination.

Old Ways: MOCN

Multi-Operator Core Network (MOCN) allows multiple MNOs to share the same active (tower) infrastructure.

This means one eNodeB can broadcast more than one PLMN and serve more than one mobile network.

This slicing is very coarse – it allows two operators to share the same eNodeBs, but going beyond a handful of PLMNs on one eNB isn’t practical, and the PLMN space is quite limited (1000 PLMNs per country code max).

In Summary:
MOCN allows slicing of the RAN on a very coarse level, to slice traffic from different operators/PLMNs sharing the same RAN.

Its use is focused on sharing RAN rather than slicing traffic for users.

Diameter Droplets – The Flow-Description AVP and IPFilterRules

When it comes to setting up dedicated bearers, the Flow-Description AVP is perhaps the most important.

The specially encoded string (IPFilterRule) in the Flow-Description AVP is what our P-GW (OK, our PCEF) uses to create Traffic Flow Templates to steer certain types of traffic down Dedicated Bearers.

So let’s take a look at how we can lovingly craft an artisanal Flow-Description.

The contents of the AVP are technically not a string, but an IPFilterRule.

IPFilterRules are actually defined in the Diameter Base Protocol (IETF RFC 6733), where we can learn the basics of encoding them; they are in turn based loosely on the ipfw utility from BSD.

They take the format:

action dir proto from src to dst

The action is fairly simple, for all our Dedicated Bearer needs, and the Flow-Description AVP, the action is going to be permit. We’re not blocking here.

The direction (dir) in our case is either in or out, from the perspective of the UE.

Next up is the protocol number (proto), as defined by IANA, but chances are you’ll be using 17 (UDP) or 6 (TCP) in most scenarios.

The from keyword is followed by an IP address with an optional subnet mask in CIDR format; for example from 10.45.0.0/16 would match everything in the 10.45.0.0/16 network.
After the address you can also specify the port you want the rule to apply to, or a range of ports.
For example to match a single port you could use 10.45.0.0/16 1234 to match anything on port 1234, but we can also specify ranges of ports like 10.45.0.0/16 0-4069, or even mix and match lists and single ports, like 10.45.0.0/16 5060, 1000-2000

Protip: using any is the same as 0.0.0.0/0

Like the from, the to is encoded in the same way, with either a single IP, or a subnet, and optional ports specified.

And that’s it!

Keep in mind that Flow-Descriptions are typically sent in pairs as a minimum, as you want to match the traffic into and out of the network (not just one way), but often there can be quite a few sent, in order to match all the possible traffic that needs to be matched that may be across multiple different subnets, etc.

There is an optional options parameter that allows you to set things like only applying the rule to established TCP sessions, handling of fragments, etc, although I’ve not seen this implemented in the wild.

Example IP filter Rules

permit in 6 from 10.98.254.0/24 5061 to 10.98.0.0/24 5060
permit out 6 from 10.98.254.0/24 5060 to 10.98.0.0/24 5061

permit in 6 from any 80 to 172.16.1.1 80
permit out 6 from 172.16.1.1 80 to any 80

permit in 17 from 10.98.254.0/24 50000-60100 to 10.98.0.0/24 50000-60100
permit out 17 from 10.98.254.0/24 50000-60100 to 10.98.0.0/24 50000-60100

permit in 17 from 10.98.254.0/24 5061, 5064 to 10.98.0.0/24  5061, 5064
permit out 17 from 10.98.254.0/24 5061, 5064 to 10.98.0.0/24  5061, 5064

permit in 17 from 172.16.0.0/16 50000-60100, 5061, 5064 to 172.16.0.0/16  50000-60100, 5061, 5064
permit out 17 from 172.16.0.0/16 50000-60100, 5061, 5064 to 172.16.0.0/16  50000-60100, 5061, 5064
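If you’re generating these strings in code it’s really just string formatting; here’s a rough Python sketch that composes a rule, mirroring the RTP example above (the helper name and argument layout are mine, not from any spec):

def ip_filter_rule(direction, proto, src, src_ports, dst, dst_ports):
    # src / dst are an IP, a CIDR subnet or "any"; ports are strings like "5060" or "50000-60100" ("" for none)
    src_part = f"{src} {src_ports}".strip()
    dst_part = f"{dst} {dst_ports}".strip()
    return f"permit {direction} {proto} from {src_part} to {dst_part}"

# Matching the RTP traffic in both directions, as in the examples above
flow_descriptions = [
    ip_filter_rule("in", 17, "10.98.254.0/24", "50000-60100", "10.98.0.0/24", "50000-60100"),
    ip_filter_rule("out", 17, "10.98.254.0/24", "50000-60100", "10.98.0.0/24", "50000-60100"),
]
print("\n".join(flow_descriptions))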

For more info see:

RFC 6733 – Diameter Base Protocol – IP Filter Rule

3GPP TS 29.214 section 5.3.8 Flow-Description AVP

The Surprisingly Complicated world of MO SMS in IMS/VoLTE

Since the beginning of time, SIP has used the 2xx responses to confirm all went OK.

If you thought sending an SMS in a VoLTE/IMS network would see a 2xx OK response and then that’s the end of it, you’d be wrong.

So let’s take a look into sending SMS over VoLTE/IMS networks!

So our story starts with the Subscriber sending an SMS, which generates a SIP MESSAGE.

The Content-Type of this SIP MESSAGE is set to application/vnd.3gpp.sms rather than plain text, and that’s because SMS over IMS uses the Short Message Transfer Protocol (SM-TP) inherited from GSM.

The Short Message Transfer Protocol (SM-TP) (not related to the Simple Mail Transfer Protocol used for email) is made up of Transfer Protocol Data Units (TPDUs) that contain our message information; even though we have the destination in our SIP headers, it’s defined again in the SM-TP body.

At first this may seem like a bit of duplication, but this allows older SMS Switching Centers (SMSc) to add support for IMS networks without any major changes, just what the SM-TP payload is wrapped up in changes.

SIP MESSAGE Request Body encoded in SM-TP

So back to our SIP MESSAGE request, typed out by the Subscriber; the UE sends this as a SIP MESSAGE onto our IMS network.

The IMS network follows its iFCs and routing rules, and the message makes it to the termination point for SMS traffic – the SMSc.

The SMSc sends back either a 200 OK or a 202 Accepted, and you’d think that’s the end of it, but no.

Our Subscriber still sees “Sending” on the screen, and the SMS is not shown as sent yet.

Instead, when the SMS has been delivered or buffered, relayed, etc, the SMSc generates a new SIP request, (as in new Call-ID / Dialog) with the request type MESSAGE, addressed to the Subscriber.

The payload of this request is another application/vnd.3gpp.sms encoded request body, again, containing SM-TP encoded data.

When the UE receives this, it will then consider the message delivered.

SM-TP encoded Delivery Report

Of course things change slightly when delivery reports are enabled, but that’s another story!


Open5Gs Database Schema Change

Open5Gs has introduced network slicing, which led to a change in the database schema used.

Alas, many users had subscribers provisioned in the old DB schema and no way to migrate the SDM data between the old and new schema.

If you’ve created subscribers on the old schema, and now after the updates your Subscriber Authentication is failing, check out this tool I put together, to migrate your data over.

The Open5Gs Python library I wrote has also been updated to support the new schema.

A very unstable Diameter Routing Agent (DRA) with Kamailio

I’d been trying for some time to get Kamailio acting as a Diameter Routing Agent with mixed success, and eventually got it working, after a few changes to the codebase of the ims_diameter_server module.

It is rather unstable, in that if it fails to dispatch to a Diameter peer, the whole thing comes crumbling down, but incoming Diameter traffic is proxied off to another Diameter peer, and Kamailio even adds an extra AVP.

Having used Kamailio for so long I was really hoping I could work with Kamailio as a DRA as easily as I do for SIP traffic, but it seems the Diameter module still needs a lot more love before it’ll be stable enough and simple enough for everyone to use.

I created a branch containing the fixes I made to make it work, and with an example config for use, but use with caution. It’s a long way from being production-ready, but hopefully in time will evolve.

https://github.com/nickvsnetworking/kamailio/tree/Diameter_Fix

PyHSS Update – YAML Config Files

One feature I’m pretty excited to share is the addition of a single config file for defining how PyHSS functions.

In the past you’d set variables in the code or comment out sections to change behaviour, which, let’s face it – isn’t great.

Instead, the config.yaml file defines the PLMN, transport type (TCP or SCTP), the origin host and realm.

We can also set the logging parameters, SNMP info and the database backend to be used.

HSS Parameters

hss:
  transport: "SCTP"
  #IP Addresses to bind on (List) - For TCP only the first IP is used, for SCTP all used for Transport (Multihomed).
  bind_ip: ["10.0.1.252"]
  #Port to listen on (Same for TCP & SCTP)
  bind_port: 3868
  #Value to populate as the OriginHost in Diameter responses
  OriginHost: "hss.localdomain"
  #Value to populate as the OriginRealm in Diameter responses
  OriginRealm: "localdomain"
  #Value to populate as the Product name in Diameter responses
  ProductName: "pyHSS"
  #Your Home Mobile Country Code (Used for PLMN calculation)
  MCC: "999"
  #Your Home Mobile Network Code (Used for PLMN calculation)
  MNC: "99"
  #Enable GMLC / SLh Interface
  SLh_enabled: True


logging:
  level: DEBUG
  logfiles:
    hss_logging_file: log/hss.log
    diameter_logging_file: log/diameter.log
    database_logging_file: log/db.log
  log_to_terminal: true

database:
  mongodb:
    mongodb_server: 127.0.0.1
    mongodb_username: root
    mongodb_password: password
    mongodb_port: 27017

Stats Parameters

redis:
  enabled: True
  clear_stats_on_boot: False
  host: localhost
  port: 6379
snmp:
  port: 1161
  listen_address: 127.0.0.1

MSISDN Encoding in Diameter AVPs – Brought to you by the letter F

So this one knocked me for six the other day.

The MSISDN AVP (AVP code 701, vendor ID 10415) is used to advertise the subscriber’s MSISDN in signaling.

I formatted the data as an Octet String, with the MSISDN from the database and moved on my merry way.

Not so fast…

The MSISDN AVP is of type OctetString.

This AVP contains an MSISDN, in international number format as described in ITU-T Rec E.164 [8], encoded as a TBCD-string, i.e. digits from 0 through 9 are encoded 0000 to 1001;

1111 is used as a filler when there is an odd number of digits; bits 8 to 5 of octet n encode digit 2n; bits 4 to 1 of octet n encode digit 2(n-1)+1.

ETSI TS 129 329 / 6.3.2 MSISDN AVP

Come again?

In practice this means that if you have an odd-length MSISDN value, we need to add some padding to round it out to an even-length value.

This padding happens between the last and second-to-last digit of the MSISDN (because if we added it at the start we’d break the Country Code, etc.), as MSISDNs are variable-length subscriber numbers.

The filler 1111, written out as a hex digit, is best known as the letter F.

Not that complicated, just kind of confusing.
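Here’s a little Python sketch of TBCD encoding an MSISDN for the AVP payload, just to show where the F ends up (the example number is made up):

def encode_tbcd(msisdn: str) -> bytes:
    # TBCD: swap the nibbles of each digit pair, padding odd-length numbers with the 1111 (F) filler
    digits = msisdn + "F" if len(msisdn) % 2 else msisdn
    out = bytearray()
    for i in range(0, len(digits), 2):
        # the second digit of the pair goes in the high nibble (bits 8-5), the first in the low nibble
        out.append((int(digits[i + 1], 16) << 4) | int(digits[i], 16))
    return bytes(out)

print(encode_tbcd("61412345678").hex())  # 1614325476f8 - note the f sitting between the last two digits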

PyHSS Update – SCTP Support

Pleased to announce that PyHSS now supports SCTP for transport.

If you’re not already aware, SCTP is the surprisingly attractive cousin of TCP that addresses head of line blocking and enables multi-homing.

The fantastic PySCTP library from P1sec made adding this feature a snap. If you’re looking to add SCTP to a Python project, it’s surprisingly easy.

A separate server (hss_sctp.py) is run to handle SCTP connections, and if you’re looking for multihoming, we got you dawg – just edit the config file and set the bind_ip list to include each of the IPs you want to multi-home listen on.
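For the curious, here’s roughly what a bare-bones SCTP listener looks like in Python using just the kernel’s SCTP support via the standard socket module – the IP and port are the example values from the config file post above, and the multi-homed bindx() behaviour is what you’d lean on PySCTP itself for:

import socket

# One-to-one style SCTP socket using the kernel SCTP stack (Linux with SCTP support loaded)
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_SCTP)
sock.bind(("10.0.1.252", 3868))   # the first bind_ip / bind_port from the example config
sock.listen(5)
conn, peer = sock.accept()        # a Diameter-over-SCTP peer connects here

# Binding the additional bind_ip addresses (multi-homing) needs sctp_bindx(),
# which the standard library doesn't expose - that's where PySCTP comes in.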

And the call was coming from… INSIDE THE HOUSE. A look at finding UE Locations in LTE

Opening Tirade

Ok, admittedly I haven’t actually seen “When a Stranger Calls”, or the less popular sequel “When a stranger Redials” (Ok may have made the last one up).

But the premise (as I read Wikipedia) is that the babysitter gets the call on the landline, and the police trace the call as originating from the landline.

But you can’t phone yourself, that’s not how local loops work – when the murderer goes off hook it loops the circuit, which busies it. You could apply ring current to the line externally I guess, but unless our murderer has a ring generator or has set up a PBX inside the house, the call probably isn’t coming from inside the house.

On Topic – The GMLC

The GMLC (Gateway Mobile Location Centre) is a central server that’s used to locate subscribers within the network on different RATs (GSM/UMTS/LTE/NR).

The GMLC typically has interfaces to each of the radio access technologies, there is a link between the GMLC and the CS network elements (used for GSM/UMTS) such as the HLR, MSC & SGSN via Lh & Lg interfaces, and a link to the PS network elements (LTE/NR) via Diameter based SLh and SLg interfaces with the MME and HSS.

The GMLC’s tentacles run out to each of these network elements so it can query them as to a subscriber’s location.

LTE Call Flow

To find a subscriber’s location in LTE, Diameter based signaling is used to query the MME, which in turn queries the eNodeB to find the location.

But which MME to query?

The SLh Diameter interface is used to query the HSS to find out which MME is serving a particular Subscriber (identified by IMSI or MSISDN).

The LCS-Routing-Info-Request is sent by the GMLC to the HSS with the subscriber identifier, and the LCS-Routing-Info-Answer is returned by the HSS to the GMLC with the details of the MME serving the subscriber.

Now we’ve got the serving MME, we can use the SLg Diameter interface to query the MME for the location of that particular subscriber.

The MME can report locations to the GMLC periodically, or the GMLC can request the MME provide a location at that point.
For the GMLC to request a subscriber’s current location, a Provide-Location-Request is sent by the GMLC to the MME with the subscriber’s IMSI, and the MME responds, after querying the eNodeB and optionally the UE, with the location info in the Provide-Location-Answer.

(I’m in the process of adding support for these interfaces to PyHSS and all going well will release some software shortly to act at a GMLC so people can use this.)

Finding the actual Location

There are a few different ways the actual location of the UE is determined.

At the most basic level, Cell Global Identity (CGI) gives the identity of the eNodeB serving a user.
If you’ve got a 3 sector site each sector typically has its own Cell Global Identity, so you can determine to a certain extent, with the known radiation pattern, bearing and location of the sector, in which direction a subscriber is. This happens on the network side and doesn’t require any input from the UE.
But if we query the UE’s reported signal strength, this can then be combined with existing RF models to further pinpoint the user with a bit more accuracy (uplink and downlink cell coverage based positioning methods).
Barometric pressure and humidity can also be reported by the base station as these factors will impact resulting signal strengths.

Timing Advance (TA) and Time of Arrival (TOA) both rely on timing signals to/from a UE to determine its distance from the eNodeB. If the UE is only served by a single cell this gives you a distance from the cell and potentially an angle inside which the subscriber is. This becomes far more useful with 3 or more eNodeBs in working range of the UE, where you can “triangulate” the UE’s location. This part happens on the network side with no interaction with the UE.
If the UE supports it, EUTRAN can use the Enhanced Observed Time Difference (E-OTD) positioning method, which does the time-of-arrival calculation in conjunction with the UE.

GPS Assisted (A-GPS) positioning gives good accuracy, but requires the device to get its current location using GPS, which typically isn’t part of the baseband, so it isn’t commonly implemented.

Uplink Time Difference of Arrival (UTDOA) can also be used, which is done by the network.

So why do we need to get Subscriber Locations?

The first (and most noble) use case that springs to mind is finding the location of a subscriber making a call to emergency services. Often upon calling an emergency services number the GMLC is triggered to get the subscriber’s location in case the call is cut off, battery dies, etc.

But GMLCs can also be used for lots of other purposes: marketing (track a user’s location and send targeted ads), surveillance (track movements of people) and network analytics (look at subscriber movement / behavior in a specific area for capacity planning).

Different countries have different laws regulating access to the subscriber location functions.

Hack to disable Location Reporting on Mobile Networks

If you’re wondering how you can disable this functionality, you can try the below hack to ensure that your phone does not report your location.

  1. Press the power button on your phone
  2. Turn it off

In reality, no magic super stealth SIM cards, special phones or fancy firmware will prevent the GMLC from finding your location.
So far none of the “privacy” products I’ve looked at have actually done anything special at the Baseband level. Most are just snakeoil.

For as long as your device is connected to the network, the passive ways of determining location, such as Uplink Time Difference of Arrival (UTDOA) and the CGI are going to report your location.

PyHSS New Features

Thanks to some recent developments, PyHSS has had a major overhaul, and is getting better than ever.

Some features that are almost ready for public release are:

Config File

Instead of having everything defined all over the place a single YAML config file is used to define how the HSS should function.

SCTP Support

No longer just limited to TCP, PyHSS now supports SCTP as well for transport.

SLh Interface for Location Services

So the GMLC can query the HSS as to the serving MME of a subscriber.

Additional Database Backends (MSSQL & MySQL)

No longer limited to just MongoDB – there are simple functions to add additional backends, flexible enough to fit your existing database schema.

All these features will be merged into the mainline soon, and documented even sooner. I’ll share some posts on each of these features as I go.

MTU in LTE & 5G Transmission Networks – Part 1

Every now and then when looking into a problem I have to really stop and think about how things work down low, things I haven’t thought about for a long time, and MTU is one of those things.

Faced with an LTE MTU issue recently, I thought I’d go back and brush up on my MTU knowhow and do some experimenting.

Note: This is an IPv4 discussion; in IPv6, routers along the path do not fragment packets.

The very, very basics

MTU is the Maximum Transmission Unit.

In practice this is the largest datagram the layer can handle, and more often than not, this is based on a physical layer constraint, in that different physical layers can only stuff so much into a frame.

“The Internet” from a consumer perspective typically has an MTU of 1500 bytes or perhaps a bit under depending on their carrier, such as 1472 bytes.
SANs in data centers typically use an MTU of around 9000 bytes.
Out of the box, most devices, if you don’t specify otherwise, will use an MTU of 1500 bytes.

As a general rule, service providers typically try to offer an MTU as close to 1500 as possible.

Messages that are longer than the Maximum Transmission Unit need to be broken up in a process known as “Fragmenting”.
Fragmenting allows large frames to be split into smaller frames to make their way across hops with a lower MTU.

All about Fragmentation

So we can break up larger packets into smaller ones by Fragmenting them, so case closed on MTU right? Sadly not.

Fragmentation leads to reduced efficiency – Fragmenting frames takes up precious CPU cycles on the router performing it, and each time a frame is broken up, additional overhead is added by the device breaking it up, and by the receiver to reassemble it.

Fragmentation can happen multiple times across a path (Multi-Stage Fragmentation).
For example if a frame is sent with a length of 9000 bytes, and needs to traverse a hop with an MTU of 4000, it would need to be fragmented (broken up) into 3 frames (Frame 1 and Frame 2 would be ~4000 bytes long and frame 3 would be ~1000 bytes long).
If it then needs to traverse another hop with an MTU of 1500, then the 3 fragmented frame would each need to be further fragmented, with the first frame of ~4000 bytes being split up into 3 more fragmented frames.
Lost track of what just happened? Spare a thought for the routers having to do the fragmentation and the recipient having to reassemble their packets.
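A quick sanity check of that arithmetic in Python – deliberately simplified, ignoring the per-fragment IP header overhead and the 8-byte offset alignment real IPv4 fragmentation has to respect:

import math

def fragments(payload_bytes, mtu):
    # Very rough fragment count - real fragmentation also burns bytes on a new IP header per fragment
    return math.ceil(payload_bytes / mtu)

print(fragments(9000, 4000))  # 3 frames on the first hop, as in the example above
print(fragments(4000, 1500))  # each ~4000 byte fragment becomes 3 more frames at the 1500 byte hop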

Fragmented frames are reassembled by the end recipient, other devices along the transmission path don’t reassemble packets.

In the end it boils down to this trade off:
The larger the packet can be, the more user data we can stuff into each one as a percentage of the overall data. We want the percentage of user data for each packet to be as high as can be.
This means we want to use the largest MTU possible, without having to fragment packets.

Overhead eats into our MTU

A 1500 byte MTU that has to be encapsulated in IPsec, GTP or PPP, is no longer a 1500 byte MTU as far as the customer is concerned.

Any of these encapsulation techniques add overhead, which shrinks the MTU available to the end customer.

Keep in mind we’re going to be encapsulating our subscriber’s data in GTP before it’s transmitted across LTE/NR, and this means we’ll be adding:

  • 8 bytes for the GTP header
  • 8 bytes for the transport UDP header
  • 20 bytes for the transport IPv4 header
  • 14 bytes if our transport is using Ethernet

This means we’ve got 50 bytes of transmission / transport overhead. This will be important later on!
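Adding that up in a couple of lines of Python (assuming, as we do throughout this post, a 1500 byte MTU on the transport network):

transport_overheads = {"GTP-U header": 8, "UDP header": 8, "IPv4 header": 20, "Ethernet header": 14}
print(sum(transport_overheads.values()))   # 50 bytes of overhead on the wire in total

# The GTP-U + UDP + IPv4 portion (36 bytes) has to fit inside the transport network's 1500 byte IP MTU,
# leaving roughly 1500 - 36 = 1464 bytes for the subscriber's own IP packet before fragmentation kicks in
print(1500 - (8 + 8 + 20))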

How do subscribers know what to use as MTU?

Typically when a subscriber buys a DSL service or HFC connection, they’ll either get a preconfigured router from their carrier, or they will be given a list of values to use that includes MTU.

LTE and 5G on the other hand tell us the value we should use.

Inside the Protocol Configuration Options in the NAS PDU, the UE requests the MTU and the DNS server to be used, and these are provided back by the network.

This MTU value is actually set on the MME, not the P-GW. As the MME doesn’t actually know the maximum MTU of the network, it’s up to the operator to configure this to be a value that represents the network.

Why this Matters for LTE & 5G Transmission

As we covered earlier, fragmentation is costly. If we’re fragmenting packets we are:

  • Wasting resources on our transmission network / core networks – as we fragment Subscriber packets it’s taking up compute resources and therefore limiting throughput
  • Wasting radio resources as additional overhead is introduced for fragmented packets, and additional RBs need to be scheduled to handle the fragmented packets

To test this I’ve setup a scenario in the lab, and we’ll look at the packet captures to see how the MTU is advertised, and see how big we can make our MTU on the subscriber side.

Getting the GTP-U Packets flowing Fast – DPDK & SR-IOV

So dedicated appliances are dead and all our network functions are VMs or Containers, but there’s a performance hit when going virtual as the L2 processing has to be handled by the Hypervisor before being passed onto the relevant VM / Container.

If we have a 10Gb NIC in our server, we want to achieve a 10Gbps “Line Speed” on the Network Element / VNF we’re running on.

When we talked about appliances, if you purchased a P-GW with a 10Gbps NIC, it was a given you could get 10Gbps through it (without DPI, etc), but when we talk about virtualized network functions / network elements there’s a very real chance you won’t achieve the “line speed” of your interfaces without some help.

When you’ve got a Network Element like a S-GW, P-GW or UPF, you want to forward packets as quickly as possible – bottlenecks here would impact the user’s achievable speeds on the network.

To speed things up there are two technologies that, if supported by your software stack and hardware, allow you to significantly increase throughput on network interfaces: DPDK & SR-IOV.

DPDK – Data Plane Development Kit

Usually *Nix OSs handle packet processing on the Kernel level. As I type this the packets being sent to this WordPress server by Firefox are being handled by the Linux 5.8.0-36-generic kernel running on my machine.

The problem is the kernel has other things to do (interrupts), meaning increased delay in processing (due to waiting for processing capability) and decreased capacity.

DPDK shunts this processing to the “user space” meaning your application (the actual magic of the VNF / Network Element) controls it.

To go back to me writing this – If Firefox and my laptop supported DPDK, then the packets wouldn’t traverse the Linux kernel at all, and Firefox would be talking directly to my NIC. (Obviously this isn’t the case…)

So DPDK increases network performance by shifting the processing of packets to the application, bypassing the kernel altogether. You are still limited by the CPU and Memory available, but with enough of each you should reach very near to line speed.

SR-IOV – Single Root Input Output Virtualization

Going back to the me writing this analogy I’m running Linux on my laptop, but let’s imagine I’m running a VM running Firefox under Linux to write this.

If that’s the case then we have an even more convoluted packet processing chain!

I type the post into Firefox, which sends the packets to the Linux kernel, which waits to be scheduled resources by the hypervisor, which then processes the packets in the hypervisor kernel before they finally make it onto the NIC.

We could add DPDK which skips some of these steps, but we’d still have the bottleneck of the hypervisor.

With PCIe passthrough we could pass the NIC directly to the VM running the Firefox browser window I’m typing this into, but then we have a problem: no other VMs can access these resources.

SR-IOV provides a way to pass PCIe through to VMs by slicing the PCIe interface up and then passing it through.

My VM would be able to access the PCIe side of the NIC, but so would other VMs.

So that’s the short of it: SR-IOV and DPDK enable better packet forwarding speeds on VNFs.

Information stored on USIM / SIM Card for LTE / EUTRAN / EPC - K key, OP/OPc key and SQN Sequence Number

Confidentiality Algorithms in 3GPP Networks: MILENAGE, XOR & Comp128

We’ve covered a fair bit on authentication in 3GPP networks (SIM cards, HSS / AuC, etc), but never actually looked at the Confidentiality Algorithms in use.

LTE USIM Authentication - Mutual Authentication of the Network and Subscriber

While we’ve already covered the inputs required by the authentication elements of the core network (The HSS in LTE/4G, the AuC in UMTS/3G and the AUSF in 5G) to generate an output, it’s worth noting that the Confidentiality Algorithms used in the process determines the output.

This means the Authentication Vector (Also known as an F1 and F1*) generated for a subscriber using Milenage Confidentiality Algorithms will generate a different output to that of Confidentiality Algorithms XOR or Comp128.

To put it another way – given the same input of K key, OPc Key (or OP key), SQN & RAND (Random) a run with Milenage (F1 and F1* algorithm) would yield totally different result (AUTN & XRES) to the same inputs run with a simple XOR.

Technically, as operators control the network element that generates the challenges and the USIM that responds to them, it is an option for an operator to implement their own Confidentiality Algorithms (beyond just Milenage or XOR), so long as it produces the same number of outputs. But rolling your own cryptographic anything is almost always a terrible idea.

So what are the differences between the Confidentiality Algorithms and which one to use?
Spoiler alert, the answer is Milenage.

Milenage

Milenage is based on AES (originally called Rijndael) and is (compared to a lot of other crypto implementations) fairly easy to understand.

AES is very well studied and understood and, unlike the Comp128 variants, is open for anyone to study/analyse/break. Although AES is not without shortcomings, its problems are, at this stage, fairly well understood and mitigated.

There are a few clean open source examples of Milenage implementations, such as this C example from FreeBSD.

XOR

It took me a while to find the specifications for the XOR algorithm – it turns out XOR is available as an alternative to Milenage on some SIM cards for testing only, and the XOR Confidentiality Algorithm is only employed in testing scenarios, not designed for production.

Instead of using AES under the hood like Milenage, it’s just plain old XOR of the keys.

Osmocom have an implementation of this in their CN code, which you can find here.

Defined under 3GPP TS 34.108 8.1.2.1

Comp128

Comp128 was originally a closed source algorithm, with the maths behind it not publicly available to scrutinise. It is used in the GSM A3 and A8 functions, akin to the F1 and F1* in later releases.

Due to its secretive nature it wasn’t able to be studied or analysed prior to deployment, the idea being that if you never said how your crypto worked, no one would be able to break it. Spoiler alert: public weaknesses became exposed as far back as 1998, which led to toll fraud, SIM cloning and eventually the development of two additional variants – the original Comp128 was renamed Comp128-1, joined by Comp128-2 (a stronger algorithm addressing a few of its flaws) and Comp128-3 (the same as Comp128-2 but with a 64 bit long key generated).

Into the Future & 5G later releases

As options beyond just USIM authentication become available for authentication in 5G SA networks, additional algorithms can be used beyond EAP and AKA, but at the time of writing only TLS has been added. 5G adds SUCI and SUPI which provide a mechanism to keep the private identifier (IMSI) away from prying eyes (or antenna), which I’ve detailed in this post.

Enable GPS/GLONASS Sync on Huawei BTS3900

Our BTS is going to need an accurate clock source in order to run, so without access to crazy accurate Timing over Packet systems or TDM links to use as reference sources, I’ve opted to use the GPS/GLONASS receiver built into the LMPT card.

Add new GPS with ID 0 on LMPT in slot 7 of cabinet 1:

ADD GPS: GN=0, CN=1, SRN=7, CABLE_LEN=3, MODE=GPS/GLONASS;

Check the GPS has sync (may take some time) using the Display GPS command:

DSP GPS: GN=0;

Assuming you’ve got an antenna connected and can see the sky, after ~10 minutes running the DSP GPS:; command again should show you an output like this:

+++    4-PAL0089624        2020-11-28 01:06:55
O&M    #806355684
%%DSP GPS: GN=0;%%
RETCODE = 0  Operation succeeded.

Display GPS State
-----------------
                 GPS Clock No.  =  0
                GPS Card State  =  Normal
                 GPS Card Type  =  M12M
                 GPS Work Mode  =  GPS
                   Hold Status  =  UNHOLDED
         GPS Satellites Traced  =  4
     GLONASS Satellites Traced  =  0
         BDS Satellites Traced  =  0
Antenna Longitude(1e-6 degree)  =  144599999
 Antenna Latitude(1e-6 degree)  =  -37000000
           Antenna Altitude(m)  =  613
         Antenna Angle(degree)  =  5
             Link Active State  =  Activated
              Feeder Delay(ns)  =  15
                   GPS Version  =  NULL
(Number of results = 1)


---    END

Showing the GPS has sync and a location fix.

Next we set the BTS to use GPS as its time source:

SET TIMESRC: TIMESRC=GPS;

Finally we’ll verify the time is in sync on the BTS using the display time command:

DSP TIME:;
+++    4-PAL0089624        2020-11-28 01:09:22
O&M    #806355690
%%DSP TIME:;%%
RETCODE = 0  Operation succeeded.

Time Information
----------------
Time  =  2020-11-28 01:09:22 GMT+00:00

---    END

Optionally you may wish to add a timezone, using the SET TZ:; command, but I’ve opted to keep it in UTC for simplicity.

SIM / Smart Card Deep Dive – Part 3 – APDUs and Hello Card

In our last post we covered the file system structure of a smart card and the basic concepts of communication with cards. In this post we’ll look at what happens on the application layer, and how to interact with a card.

For these examples I’ll be using SIM cards, because admit it, you’ve already got a pile sitting in a drawer, and this is a telco blog after all. You won’t need the ADM keys for the cards; we’ll modify files we’ve got write access to by default.

Commands & Instructions

So to do anything useful with the card we need to issue commands / instructions to the card, to tell it to do things. Instructions like select this file, read its contents, update the contents to something else, verify my PIN, authenticate to the network, etc.

The terms Command and Instruction are used somewhat interchangeably in the spec (I realise I’ve done the same here to make it just as confusing), but instruction means the name of the specific command to be called, and command typically means the APDU as a whole.

The “Generic Commands” section of 3GPP TS 31.101 specifies the common commands, so let’s take a look at one.

The creatively named SELECT command/instruction is used to select the file we want to work with. In the SELECT command we’ll include some parameters: options to limit the file selection to a specific area, the length of the file identifier to come, and the identifier of the file itself.

The card responds with a Status Word to indicate if the command was successful. For example, if we selected a file that existed and that we had permission to select, we’d get back a status word indicating the card had successfully selected the file. Status Words are 2 byte responses that indicate if the instruction was successful and, if the card has data it wants to send to the terminal as a result of the instruction, how much data the terminal should expect.

So if we just run a SELECT command, telling the card to select a file, we’ll get back a successful response from the card with a data length. Next we need to get that data from the card. As the card can’t initiate communication, the GET RESPONSE instruction is sent to the card to get the data from it, along with the length of the data to be returned.

The GET RESPONSE instruction/command is answered by the card with an APDU containing the data the card has to send, and the last 2 bytes contain the Status Word indicating if it was successful or not.

APDUs

So having covered the physical and link layers, we now move onto the Application Layer – where the magic happens.

Smart card communications is strictly master-slave based when it comes to the application layer.

The terminal sends a command to the card, which in turn sends back a response. Command -> Response, Command -> Response, over and over.

These commands are contained inside Application Protocol Data Units (APDUs).

So let’s break down a simple APDU as it appears on the wire, so to speak.

The first byte of our command APDU is taken up with a header called the class byte, abbreviated to CLA. This specifies class coding, secure messaging options and channel options.

In the next byte we specify the Instruction for the command, that’s the task / operation we want the card to perform, in the spec this is abbreviated to INS.

The next two bytes, called P1 & P2 (Parameter 1 & Parameter 2), specify the parameters of how the instruction is to be used.

Next comes Lc – Length of Command, which specifies the length of the command data to follow.

Data comes next, this is instruction data of the length specified in Lc.

Finally an optional Le – Length of expected response can be added to specify how long the response from the card should be.

Crafting APDUs

So let’s encode our own APDU to send to a card, for this example we’ll create the APDU to tell the card to select the Master File (MF) – akin to moving to the root directory on a *nix OS.

For this we’ll want a copy of ETSI TS 102 221 – the catchily named “Smart cards; UICC-Terminal interface; Physical and logical characteristics” which will guide in the specifics of how to format the command, because all the commands are encoded in hexadecimal format.

So here’s the coding for a SELECT command from section 11.1.1.1 “SELECT”.

For the CLA byte in our example we’ll indicate in our header that we’re using ISO 7816-4 encoding, with nothing fancy, which is denoted by the byte A0.

For the next byte we’ve got INS (Instruction), which needs to be set to the hex value for SELECT, which is represented by the hex value A4, so our second byte will have that as its value.

The next byte is P1, which specifies “Selection Control”; the table in the specification outlines all the possible options, but we’ll use 00 as our value, meaning we’ll “Select DF, EF or MF by file id”.

The next byte, P2, specifies more selection options; we’ll use “First or only occurrence”, which is represented by 00.

The Lc byte defines the length of the data (file ID) we’re going to give in the subsequent bytes; we’ve got a two byte File ID so we’ll specify 2 (represented by 02).

Finally we have the Data field, where we specify the file ID we want to select. For the example we’ll select the Master File (MF), which has the file ID ‘3F00‘, so that’s the hex value we’ll use.

So let’s break this down;

Code   Meaning                                                   Value
CLA    Class byte – Coding options                               A0 (ISO 7816-4 coding)
INS    Instruction (Command) to be called                        A4 (SELECT)
P1     Parameter 1 – Selection Control (Limit search options)    00 (Select by File ID)
P2     Parameter 2 – More selection options                      00 (First occurrence)
Lc     Length of Data                                            02 (2 bytes of data to come)
Data   File ID of the file to Select                             3F00 (File ID of master file)

So that’s our APDU encoded; its final value will be A0 A4 00 00 02 3F00

So there we have it, a valid APDU to select the Master File.
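Just to prove the encoding out, here’s a couple of lines of Python assembling the same APDU byte for byte:

CLA, INS, P1, P2 = 0xA0, 0xA4, 0x00, 0x00    # ISO 7816-4 coding, SELECT, by File ID, first occurrence
file_id = bytes.fromhex("3F00")              # Master File
apdu = bytes([CLA, INS, P1, P2, len(file_id)]) + file_id   # Lc is the length of the data field
print(apdu.hex().upper())                    # A0A40000023F00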

In the next post we’ll put all this theory into practice and start interacting with real life SIM cards using PySIM, and take a look at the APDUs in Wireshark.

SIM / Smart Card Deep Dive – Part 2 – Meet & Greet

Layer 1 – Pinout and Connections

Before we can get all excited about talking to cards, let’s look at how we interface with them on a physical level.

For “classic” smart cards, the interface is through the fingernail-sized contacts on the card.

As you’d expect there’s a VCC & Ground line for powering the card, a clock input pin for clocking it and a single I/O pin.

ISO/IEC 7816-3 defines the electrical interface and transmission protocols.

The pins on the terminal / card reader are arranged so that when inserting a card, the ground contact is the first contact made with the reader; this clever design consideration protects the card and the reader from ESD damage.

Operating Voltages

When Smart Cards were selected for use in GSM for authenticating subscribers, all smart cards operated at 5v. However as mobile phones got smaller, the operating voltage range became more limited, the amount of space inside the handset became a premium and power efficiency became imperative. The 5v supply for the SIM became a difficult voltage to provide (needing to be buck-boosted) so lower 3v operation of the cards became a requirement, these cards are referred to as “Class B” cards. This has since been pushed even further to 1.8v for “Class C” cards.

If you found a SIM from 1990 it’s not going to operate in a 1.8v phone, but it’s not going to damage the phone or the card.

The same luckily goes in reverse, a card designed for 1.8v put into a phone from 1990 will work just fine at 5v.

This is thanks to the class flag in the ATR response, which we’ll cover later on.

Clocks

As we’re sharing one I/O pin for TX and RX, clocking is important for synchronising the card and the reader. But when smart cards were initially designed the clock pin on the card also served as the clock for the micro controller it contained, as stable oscillators weren’t available in such a tiny form factor. Modern cards implement their own clock, but the clock pin is still required for synchronising the communication.

I/O Pin

The I/O pin is used for TX & RX between the terminal/phone/card reader and the Smart Card / SIM card. Having only one pin means the communication is half duplex, with the terminal and then the card taking it in turns to transmit.

Reset Pin

Resets the card’s communications with the terminal.

Filesystem

So a single smart card can run multiple applications – the “SIM” is just an application, as are USIM, ISIM and any other applications on the card.

These applications are arranged on a quasi-filesystem, with 3 types of files which can be created, read, updated or deleted (if authorised by the card).

Because the file system is very basic and handled somewhat like a block of contiguous storage, you often can’t expand a file – when it is created the required number of bytes are allocated to it, and no more can be added. And if you add files A, B and C, then delete file B, the space file B occupied won’t be available again until file C is also deleted.

This is why, if you cast your mind back to when contacts were stored on your phone’s SIM card, you could only have a finite number of contacts – that space on the card had been allocated for contacts, and additional space couldn’t be allocated for extra contacts.

So let’s take a look at our 3 file types:

MF (Master File)

The MF is like the root directory in Linux – everything else on the card sits underneath it.

DF (Dedicated File)

A dedicated file (DF) is essentially a folder – they’re sometimes (incorrectly) referred to as Directory Files (which would arguably be a better name).

They contain one or more Elementary Files (see below), and can contain other DFs as well.

Dedicated Files make organising the file system cleaner and easier. DFs group all the relevant EFs together. 3GPP defines a dedicated file for Phonebook entries (DFphonebook), MBMS functions (DFtv) and 5G functions (DF5gs).

We also have ADFs – Application Dedicated Files, for specific applications, for example ADFusim contains all the EFs and DFs for USIM functionality, while ADFgsm contains all the GSM SIM functionality.

The actual difference with an ADF is that it’s not sitting below the MF, but for the level of depth we’re going into it doesn’t matter.

DFs have a name – an Application Identifier (AID) used to address them, meaning we can select them by name.

EF (Elementary File)

Elementary files are what would actually be considered a file in Linux systems.

Like in a Linux file system, EFs can have permissions: some EFs can be read by anyone, while others have access control restrictions in place to limit who & what can access their contents.

There are multiple types of Elementary Files – Linear, Cyclic, Purse, Transparent and SIM files – each with their own treatment by the OS and terminal.

Most of the EFs we’ll deal with will be Transparent, meaning they’re just an unstructured block of bytes that we read or update by giving an offset and a length.
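To tie the three file types together, here’s a rough sketch of how a few well-known files hang off the MF. The file IDs and the USIM AID shown are the standard 3GPP values, but they’re here purely as an illustration – the contents of a real card will vary.

# Illustrative only: a fragment of a typical SIM/USIM file tree, using
# well-known identifiers from the 3GPP specs. Real cards will differ.
card_filesystem = {
    "MF (3F00)": {
        "EF_ICCID (2FE2)": "Transparent EF - the card's serial number",
        "DF_TELECOM (7F10)": {
            "EF_ADN (6F3A)": "Linear fixed EF - the classic SIM phonebook",
        },
        "ADF_USIM (selected by AID A000000087 1002...)": {
            "EF_IMSI (6F07)": "Transparent EF - the subscriber's IMSI",
        },
    },
}

def walk(tree, depth=0):
    # Print each DF/ADF/EF with indentation so the hierarchy is visible.
    for name, contents in tree.items():
        print("  " * depth + name)
        if isinstance(contents, dict):
            walk(contents, depth + 1)

walk(card_filesystem)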

ATR – Answer to Reset

So before we can go about working with all our files we’ll need a mechanism so the card, and the terminal, can exchange capabilities.

There’s an old saying that the best thing about standards is that there are so many to choose from, and yes, we’ve got multiple variants/implementations of the smart card standard, so the card and the terminal need to agree on a standard to use before we can do anything.

This is handled in a process called Answer to Reset (ATR).

When the card is powered up, it sends its first suggestion for a standard to communicate over. If the terminal doesn’t want to support that, it just sends a pulse down the reset line, the card resets and comes back with a new offer.

If the card offers a standard that the terminal does like, and does support, the terminal sends the first command to the card via the I/O line. This tells the card the protocol preferences of the terminal, and the card responds with its protocol preferences. After that, communications can start.
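If you have a PC/SC card reader handy you can watch this from the terminal side. Here’s a minimal sketch using the pyscard Python library (my choice for the example, not something this series depends on) that powers up the first card it finds and prints the ATR it answers with:

# A minimal sketch using the pyscard library (pip install pyscard) to power up
# the card in the first PC/SC reader found and print its ATR.
from smartcard.System import readers
from smartcard.util import toHexString

reader_list = readers()
if not reader_list:
    raise SystemExit("No PC/SC readers found")

connection = reader_list[0].createConnection()
connection.connect()                      # Powers the card; the ATR is exchanged here

print("ATR:", toHexString(connection.getATR()))

# As a teaser, we can also fire off a basic SELECT command (APDUs are covered
# in the next post) and look at the status word the card returns.
# CLA A0 is the old GSM class byte - some modern cards may expect CLA 00 instead.
data, sw1, sw2 = connection.transmit([0xA0, 0xA4, 0x00, 0x00, 0x02, 0x3F, 0x00])
print("Status: %02X %02X" % (sw1, sw2))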

Basic Principles of Smart Cards Communications

So with a single I/O line to the card, it kind of goes without saying that communications with the card are half-duplex – the card and the terminal can’t both transmit at the same time.

Instead a master-slave relationship is set up, where the smart card is sent a command and sends back a response. Command messages have a clear ending, so the card knows when it can send its response, and away we go.

Like most protocols, smart card communications is layered.

At layer 1, we have the physical layer, defining the operating voltages, encoding, etc. This is standardised in ISO/IEC 7816-3.

Above that comes our layer 2 – the link layer. This is also specified in ISO/IEC 7816-3, and typically operates in one of two modes – T0 or T1 – with the difference being that T0 is byte-oriented while T1 is block-oriented. For telco applications T0 is typically used.

Our top layer (layer 7) is the application layer. We’ll cover the details of this in the next post, but it carries Application Protocol Data Units (APDUs) to and from the card, in the form of commands from the terminal and responses from the card.

Coming up Next…

In the next post we’ll look into application layer communications with cards, the commands and the responses.

SIM / Smart Card Deep Dive – Part 1 – Introduction to Smart Cards

I know a little bit about SIM cards / USIM cards / ISIM Cards.
Enough to know I don’t know very much about them at all.

So throughout this series of posts of unknown length, I’ll try and learn more and share what I’m learning, citing references as much as possible.

So where to begin? I guess at the start,

A supposedly brief history of Smart Cards

There are two main industries that have driven the development and evolution of smart cards – telecom & banking / finance, both initially focused on the idea that carrying cash around is unseemly.

This planet has – or rather had – a problem, which was this: most of the people living on it were unhappy for pretty much of the time. Many solutions were suggested for this problem, but most of these were largely concerned with the movement of small green pieces of paper, which was odd because on the whole it wasn’t the small green pieces of paper that were unhappy.

Douglas Adams – The Hitchhiker’s Guide to the Galaxy

When the idea of Credit / Debit Cards were first introduced the tech was not electronic, embossed letters on the card were fed through that clicky-clacky-transfer machine (Google tells me this was actually called the “credit card imprinter”) and the card details imprinted onto carbon copy paper.

Customers wanted something faster, so banks delivered magnetic strip cards, where the card data could be read even more quickly. But as the security conscious among you will be aware, storing data on a magnetic strip so it can be read by any reader also means it can be copied by any reader, and therefore duplicated really easily – something the banks quickly realised.

To combat this, card readers typically had a way to communicate back to a central bank computer. The central computer verified the PIN entered by the customer was correct, and confirmed that the customer had enough money in their balance for the transaction and that it wasn’t too suspicious. This was, as you would imagine in the late 1980s / early 1990s, rather difficult to achieve. A reliable (and cheap) connection back to a central bank computer wasn’t always a given, nor instant, and so this was still very much open to misuse.

“Carders” emerged, buying, selling and capturing credit card details. After writing someone else’s fraudulently obtained card details onto a blank card, they could go on a spending spree for a brief period of time, racking up a giant debt that wasn’t reconciled against the central computer until later – by which time the card had been thrown away and replaced with another.

I know what you’re thinking – I come to this blog for ramblings about Telecommunications, not the history of the banking sector. So let’s get onto telco;

The telecom sector faced similar issues, at the time mobile phones were in their infancy, and so Payphones were how people made calls when out and about.

A phone call from a payphone in Australia has sat at about $0.40 for a long time, not a huge amount, but enough you’d always want to be carrying some change if you wanted to make calls. Again, an inconvenience for customers as coins are clunky, and an inconvenience for operators as collecting the coins from tens of thousands of payphones is expensive.

Telcos around the world trialled solutions, including cards with magnetic strips containing the balance of the card, but again people quickly realised that you could record the contents of the magnetic stripe while the card had a full balance, use all the balance on the card, and then write back the data you stored earlier with a full balance.

So two industries each facing the same issue: it’s hard to securely process payments offline in a way that can’t be abused.

Enter the smart card – a tiny computer in a card that the terminal (Payphone or Credit Card Reader) interacts with, but the card is very much in charge.

When used in a payphone, the caller inserts the smart card and dials the number, and dialog goes something like this (We’ll assume Meter Pulses are 40c worth):

Payphone: “Hey SmartCard, how much credit do you have on you?”

Smart Card: “I have $1.60 balance”

*Payphone ensures card has enough credit for the first meter pulse, and begins listening for Meter Pulses*

*When a meter pulse received:*

Payphone: “Please deduct $0.40 from your Balance”

Smart Card: “Ok, you have $1.20 remaining”

This process repeats for each meter pulse (Payphone metering is a discussion for another day) until all the credit has been used / Balance is less than 1 meter pulse charge.

While anyone could ask the smart card “Hey SmartCard, how much credit do you have on you?” it would only return the balance, and if you told the smart card “I used $1 credit, please deduct it” like the payphone did, you’d just take a dollar off the credit stored on the card.

Saying “Hey SmartCard set the balance to $1,000,000” would result in a raised eyebrow from the SmartCard who rejects the request.

After all – It’s a smart card. It has the capability to do that.

So in the telecom sector single use smart cards were rolled out, programmed in the factory with a set dollar value of credit, sold at that dollar value and thrown away when depleted.

The banking industry saw even more potential: the balance could be stored on the card, and the PIN could be verified by the card. The user needs to know the correct PIN, as does the smart card, but the terminal doesn’t need to know it, nor does it need to talk back to a central bank computer all the time – just every so often, so the user gets the bill.

It worked much the same way, although before allowing a deduction to be made from the balance of the card, a user would have to enter their PIN which was verified by the card before allowing the transaction.

Eventually these worlds collided (sort of), both wanting much the same thing from smart cards. So the physical characteristics, (rough) interface specs and basic communications protocol were agreed on, and what eventually became ISO/IEC 7816 was settled upon.

Any card could be read by any terminal, and it was up to the systems implementer (banks and telcos initially) what the card did and what the terminal did.

Active RFID entered the scene and there wasn’t even a need for a physical connection to the card, but the interaction was the same. We won’t really touch on the RFID side, but all of this goes for most active RFID cards too.

Enter Software

Now that the card was a defined standard, all that really mattered was the software on the card. Banks installed bank card software on their cards, while telcos installed payphone card software on theirs.

But soon other uses emerged, ID cards could provide a verifiable and (reasonably) secure way to verify the card’s legitimacy, public transport systems could store commuter’s fares on the card, and vending machines, time card clocks & medical records could all jump on the bandwagon.

These were all just software built on the smart card platform.

Hello SIM Cards

An early version of the smart card was used in the German C-Netz cellular network to authenticate subscribers, working in “mobile” phones and also payphones.

After that, the first SIM cards came into the public sphere in 1991 with GSM, as a way to allow a subscriber’s subscription to be portable between devices. This was standardised by ETSI to become the SIM cards still used in GSM networks, and evolved into the USIM used in 3G/4G/5G networks.

Names of Smart Cards & Readers

To make life a bit easier I thought I’d collate all the names for smart cards and readers that are kind of different but used interchangeably depending on the context.

Smart Card | Terminal
UICC (Universal Integrated Circuit Card) – Standards name for Smart Card | Card Reader (Generic)
SIM (Mobile Telco application running on UICC) | Phone (Telco)
USIM (Mobile Telco application running on UICC) | SIM Slot (Telco)
Credit / Debit / EFTPOS Card (Banking) | UE (Telco)
Java Card (Type of Smart Card OS) | EFTPOS Terminal (Banking)
Phone Card (Telco / Payphone) |

And then…

From here we’ll look at various topics:

  • Introduction to Smart Cards (This post)
  • Meet & Greet (The basics of Smart Cards & their File System)
  • APDUs and Hello Card (How terminals interact with smart cards)
  • (Interacting with real life cards using Smart Card readers and SIM cards)
  • Mixing It Up (Changing values on Cards)

Other topics we may cover are Javacard and Global Platform, creating your own smart card applications, a deeper look at the different Telco apps like SIM/USIM/ISIM, OTA Updates for cards / Remote File Management (RFM), and developing for SimToolkit.


Cell Broadcast in LTE

Recently I’ve been wrapping my head around Cell Broadcast in LTE, and thought I’d share my notes on 3GPP TS 29.168 (the SBc-AP spec).

The interface between the MME and the Cell Broadcast Center (CBC) is the SBc interface, which has two types of “Elementary Procedures”:

  • Class 1 Procedures are of the request – response nature (Request followed by a Success or Failure response)
  • Class 2 Procedures do not get a response; they are informational and one-way (acknowledged by SCTP, but not with an additional SBc message).

SCTP is used as the transport layer, with the CBC establishing a point to point connection to the MME over SCTP (Unicast only) on port 29168 with SCTP Payload Protocol Identifier 24.

The SCTP associations between the MME and the CBC should normally remain up – meaning the SCTP association / transport connection is up all the time, and not just brought up when needed.
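Just to illustrate the transport side (and only the transport side – the SBc-AP payload itself is ASN.1 encoded and not shown here), a rough Python sketch of a CBC bringing up an SCTP association towards an MME’s SBc port might look like this. The MME address is a placeholder, and this assumes Linux kernel SCTP support:

# A rough sketch of the transport layer only: opening a one-to-one style SCTP
# association to an MME's SBc port (29168) using the Linux kernel SCTP stack.
# The MME address is a placeholder, and the actual SBc-AP signalling
# (ASN.1 encoded, carried with SCTP Payload Protocol Identifier 24) is not shown.
import socket

MME_ADDRESS = ("10.0.0.10", 29168)   # placeholder MME / SBc endpoint

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_SCTP)
sock.connect(MME_ADDRESS)            # the SCTP association comes up here
print("SCTP association established to", MME_ADDRESS)

# In a real CBC this association stays up permanently, and WRITE-REPLACE
# WARNING REQUESTs etc. would be sent over it as SCTP messages with PPID 24.
sock.close()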

Elementary Procedures

Write-Replace Warning (Class 1 Procedure)

The purpose of the Write-Replace Warning procedure is to start or overwrite the broadcasting of a warning message, as defined in 3GPP TS 23.041 [14].

The Write-Replace Warning procedure is initiated by a WRITE-REPLACE WARNING REQUEST sent by the CBC to the MMEs, containing the emergency message to be broadcast and parameters such as the TACs to broadcast to, the severity level, etc.

If successful, a WRITE-REPLACE WARNING RESPONSE is sent back by the MME to the CBC, along with information as to where the message was sent out. Cell Broadcast messages are unacknowledged by UEs, meaning it’s not possible to confirm whether a UE has actually received the message.

The request includes the message identifier and serial number, list of TAIs, repetition period, number of broadcasts requested, warning type, and of course, the warning message contents.
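To make that a little more concrete, here’s a rough sketch of those fields laid out as a Python dict. The values are invented for illustration, and on the wire the message is ASN.1 encoded rather than anything like this:

# Illustrative only: the IEs listed above with made-up values. The real
# WRITE-REPLACE WARNING REQUEST is ASN.1 encoded on the wire.
write_replace_warning_request = {
    "message_identifier": 4371,            # identifies the warning message type
    "serial_number": 0x3000,               # changes when the message is replaced
    "list_of_tais": [                      # which Tracking Areas to broadcast in
        {"plmn": "00101", "tac": 1234},
        {"plmn": "00101", "tac": 1235},
    ],
    "repetition_period": 60,               # how often to repeat the broadcast
    "number_of_broadcasts_requested": 10,  # how many times to send it
    "warning_type": "earthquake",          # category / severity of the warning
    "warning_message_contents": "Earthquake warning - move away from buildings",
}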

Stop Warning Procedure (Class 1 Procedure)

The Stop Warning Procedure, initiated by a STOP WARNING REQUEST and answered with a STOP WARNING RESPONSE, requests that the MME tell the eNodeBs to stop broadcasting the warning message in their SIBs.

It includes the TAIs of the cells this should apply to, and the message identifier.

Error Indication (Class 2)

The ERROR INDICATION is used to indicate an error (duh). It contains Cause and Criticality IEs and can be sent by either the MME or the CBC.

Write-Replace Warning Indication (Class 2)

The WRITE-REPLACE WARNING INDICATION is sent by the MME to give the CBC information about where the warning broadcast was actually scheduled, in scenarios where this isn’t carried in the WRITE-REPLACE WARNING RESPONSE.

PWS Restart (Class 2)

The PWS RESTART INDICATION is used to list the eNodeBs / cells that have become available or have restarted since the CBC message was sent, and so have no warning message data – for example eNodeBs that have just come back online during a period when all the other cells are sending Cell Broadcast messages.

It returns the Restarted-Cell-List IE, containing the Global eNB ID and List of TAIs of the restarted / reconnected cells.

PWS Failure Indication (Class 2)

The PWS FAILURE INDICATION is essentially the reverse of the PWS RESTART INDICATION, indicating which eNodeBs are no longer available. These cells may continue to send Cell Broadcast messages, as the MME has essentially not been able to tell them to stop.

Contains a list of Failed cells (eNodeBs) with the Global-eNodeB-ID of each.