Category Archives: SDM

If you like Pina Coladas, and service the control plane – Intro to NRF in 5GC

The Network Repository Function plays matchmaker to all the elements in our 5G Core.

For our 5G Service-Based-Architecture (SBA) we use Service Based Interfaces (SBIs) to communicate between Network Functions. Sometimes a Network Function acts as a server for these interfaces (aka “Service Producer”) and sometimes it acts as a client on these interfaces (aka “Service Consumer”).

For service consumers to be able to find service producers (Clients to be able to find servers), we need a directory mechanism for clients to be able to find the servers to serve their needs, this is the role of the NRF.

With every Service Producer registering to the NRF, the NRF has knowledge of all the available Service Producers in the network, so when a Service Consumer NF comes along (Like an AMF looking for UDM), it just queries the NRF to get the details of who can serve it.

Basic Process – NRF Registration

In order to be found, a service producer NF has to register with the NRF, so the NRF has enough info on the service-producer to be able to recommend it to service-consumers.

This is all the basic info, the Service Based Interfaces (SBIs) that this NF serves, the PLMN, and the type of NF.

The NRF then stores this information in a database, ready to be found by SBI Service Consumers.

This is achieved by the Service Producing NF sending a HTTP2 PUT to the NRF, with the message body containing all the particulars about the services it offers.

Simplified example of an SMSc registering with the NRF in a 5G Core

Basic Process – NRF Discovery

With an NRF that has a few SBI Service Producers registered in it, we can now start querying it from SBI Service Consumers, to find SBI Service Producers.

The SBI Service Consumer looking for a SBI Service Producer, queries the NRF with a little information about itself, and the SBI Service Producer it’s looking for.

For example a SMF looking for a UDM, sends a request like:

http://[::1]:7777/nnrf-disc/v1/nf-instances?requester-nf-type=SMF&target-nf-type=UDM

To the NRF, and the NRF responds with SBI Service Producing NFs that match in JSON body of the response.

SMSF being found by the AMF using the NRF

More Info

I’ve written in a more technical detail on the NRF in this post, you can learn about setting up Open5Gs NRF in this post, and keep tuned for a lot more content on 5GC!

Backing up and Restoring Open5GS

You may find you need to move your Open5GS deployments from one server to another, or split them between servers.
This post covers the basics of migrating Open5GS config and data between servers by backing up and restoring it elsewhere.

The Database

Open5GS uses MongoDB as the database for the HSS and PCRF. This database contains all our SDM data, like our SIM Keys, Subscriber profiles, PCC Rules, etc.

Backup Database

To backup the MongoDB database run the below command (It doesn’t need sudo / root to run):

mongodump -o Open5Gs_"`date +"%d-%m-%Y"`"

You should get a directory called Open5Gs_todaysdate, the files in that directory are the output of the MongoDB database.

Restore Database

If you copy the backup we just took (the directory named Open5Gs_todaysdate) to the new server, you can restore the complete database by running:

mongorestore Open5Gs_todaysdate

This restores everything in the database, including profiles and user accounts for the WebUI,

You may instead just restore the Subscribers table, leaving the Profiles and Accounts unchanged with:

mongorestore Open5Gs_todaysdate/open5gs/subscribers.bson -c subscribers -d open5gs

The database schema used by Open5GS changed earlier this year, meaning you cannot migrate directly from an old database to a new one without first making a few changes.

To see if your database is affected run:

mongo open5gs --eval 'db.subscribers.find({"__v" : 0}).toArray()' | grep "imsi" | wc -l

Which will let you know how many subscribers are using the old database type. If it’s anything other than 0 running this Python script will update the database as required.

Once you have installed Open5GS onto the new server you’ll need to backup the data from the old one, and restore it onto the new one.

The Config Files

The text based config files define how Open5Gs will behave, everything from IP Addresses to bind on, to the interfaces and PLMN.

Again, you’ll need to copy them from the old server to the new, and update any IP Addresses that may change between the two.

On the old server run:

cp -r /etc/open5gs /tmp/

Then copy the “open5gs” folder to the new server into the /etc/ directory.

If you’re also changing the IP Address you’re binding on, you’ll need to update that in the YAML files.

Bringing Everything Online

Finally you’ll need to restart all the services,

sudo systemctl start open5gs-*

Run a basic health check to ensure the services are running,

ps aux | grep open5gs-

Should list all the running Open5Gs services,

And then check the logs to ensure everything is working as expected,

tail -f /var/log/open5gs/*.log

Jaffa Cakes explain the nuances between Centralized vs Decentralized Online Charging in 3GPP Networks

While reading through the 3GPP docs regarding Online Charging, there’s a concept that can be a tad confusing, and that’s the difference between Centralized and Non-Centralized Charging architectures.

The overall purpose of online charging is to answer that deceptively simple question of “does the user have enough credit for this action?”.

In order to answer that question, we need to perform rating and unit determination.

Rating

Rating is just converting connectivity credit units into monetary units.

If you go to the supermarket and they have boxes of Jaffa Cakes at $2.50 each, they have rated a box of Jaffa Cakes at $2.50.

1 Box of Jaffa Cakes rated at $2.50 per box

In a non-snack-cake context, such as 3GPP Online Charging, then we might be talking about data services, for example $1 per GB is a rate for data.
Or for a voice calls a cost per minute to call a destination, such as is $0.20 per minute for a local call.

Rating is just working out the cost per connectivity unit (Data or Minutes) into a monetary cost, based on the tariff to be applied to that subscriber.

Unit Determination

The other key piece of information we need is the unit determination which is the calculation of the number of non-monetary units the OCS will offer prior to starting a service, or during a service.

This is done after rating so we can take the amount of credit available to the subscriber and calculate the number of non-monetary units to be offered.

Converting Hard-Currency into Soft-Snacks

In our rating example we rated a box of Jaffa Cakes at $2.50 per box. If I have $10 I can go to the shops and buy 4x boxes of Jaffa cakes at $2.50 per box. The cashier will perform unit determination and determine that at $2.50 per box and my $10, I can have 4 boxes of Jaffa cakes.

Again, steering away from the metaphor of the hungry author, Unit Determination in a 3GPP context could be determining how many minutes of talk time to be granted.
Question: At $0.20 per minute to a destination, for a subscriber with a current credit of $20, how many minutes of talk time should they be granted?
Answer: 100 minutes ($20 divided by $0.20 per minute is 100 minutes).

Or to put this in a data perspective,
Question: Subscriber has $10 in Credit and data is rated at $1 per GB. How many GB of data should the subscriber be allowed to use?
Answer: 10GB.

Putting this Together

So now we understand rating (working out the conversion of connectivity units into monetary units) and unit determination (determining the number of non-monetary units to be granted for a given resource), let’s look at the the Centralized and Decentralized Online Charging.

Centralized Rating

In Centralized Rating the CTF (Our P-GW or S-CSCF) only talk about non-monetary units.
There’s no talk of money, just of the connectivity units used.

The CTFs don’t know the rating information, they have no idea how much 1GB of data costs to transfer in terms of $$$.

For the CTF in the P-GW/PCEF this means it talks to the OCS in terms of data units (data In/out), not money.

For the CTF in the S-CSCF this means it only ever talks to the OCS in voice units (minutes of talk time), not money.

This means our rates only need to exist in the OCS, not in the CTF in the other network elements. They just talk about units they need.

De-Centralized Rating

In De-Centralized Rating the CTF performs the unit conversion from money into connectivity units.
This means the OCS and CTF talk about Money, with the CTF determining from that amount of money granted, what the subscriber can do with that money.

This means the CTF in the S-CSCF needs to have a rating table for all the destinations to determine the cost per minute for a call to a destination.

And the CTF in the P-GW/PCEF has to know the cost per octet transferred across the network for the subscriber.

In previous generations of mobile networks it may have been desirable to perform decentralized rating, as you can spread the load of calculating our the pricing, however today Centralized is the most common way to approach this, as ensuring the correct rates are in each network element is a headache.

Centralized Unit Determination

In Centralized Unit Determination the CTF tells the OCS the type of service in the Credit Control Request (Requested Service Units), and the OCS determines the number of non-monetary units of a certain service the subscriber can consume.

The CTF doesn’t request a value, just tells the OCS the service being requested and subscriber, and the OCS works out the values.

For example, the S-CSCF specifies in the Credit Control Request the destination the caller wishes to reach, and the OCS replies with the amount of talk time it will grant.

Or for a subscriber wishing to use data, the P-GW/PCEF sends a Credit Control Request specifying the service is data, and the OCS responds with how much data the subscriber is entitled to use.

De-Centralized Unit Determination

In De-Centralized Unit Determination, the CTF determines how many units are required to start the service, and requests these units from the OCS in the Credit Control Request.

For a data service,the CTF in the P-GW would determine how many data units it is requesting for a subscriber, and then request that many units from the OCS.

For a voice call a S-CSCF may request an initial call duration, of say 5 minutes, from the OCS. So it provides the information about the destination and the request for 300 seconds of talk time.

Session Charging with Unit Reservation (SCUR)

Arguably the most common online charging scenario is Session Charging with Unit Reservation (SCUR).

SCUR relies on reserving an amount of funds from the subscriber’s balance, so no other services can those funds and translating that into connectivity units (minutes of talk time or data in/out based on the Requested Session Unit) at the start of the session, and then subsequent requests to debit the reserved amount and reserve a new amount, until all the credit is used.

This uses centralized Unit Determination and centralized Rating.

Let’s take a look at how this would look for the CTF in a P-GW/PCEF performing online charging for a subscriber wishing to use data:

  1. Session Request: The subscriber has attached to the network and is requesting service.
  2. The CTF built into the P-GW/PCEF sends a Credit Control Request: Initial Request (As this subscriber has just attached) to the OCS, with Requested Service Units (RSU) of data in/out to the OCS.
  3. The OCS performs rating and unit determination, and according to it’s credit risk policies, and a whole lot of other factors, comes back with an amount of data the subscriber can use, and reserves the amount from the account.
    (It’s worth noting at this point that this is not necessarily all of the subscriber’s credit in the form of data, just an amount the OCS is willing to allocate. More data can be requested once this allocated data is used up.)
  4. The OCS sends a Credit Control Answer back to our P-GW/PCEF. This contains the Granted Service Unit (GSU), in our case the GSU is data so defines much data up/down the user can transfer. It also may include a Validity Time (VT), which is the number of seconds the Credit Control Answer is valid for, after it’s expired another Credit Control Request must be sent by the CTF.
  5. Our P-GW/PCEF processes this, starts measuring the data used by the subscriber for reporting later, and sets a timer for the Validity Time to send another CCR at that point.
    At this stage, our subscriber is able to start using data.
  1. Some time later, either when all the data allocated in the Granted Service Units has been consumed, or when the Validity Time has expired, the CTF in the P-GW/PCEF sends another Credit Control Request: Update, and again includes the RSU (Requested Service Units) as data in/out, and also a USU (Used Service Units) specifying how much data the subscriber has used since the first Credit Control Answer.
  2. The OCS receives this information. It compares the Used Session Units to the Granted Session Units from earlier, and with this is able to determine how much data the subscriber has actually used, and therefore how much credit that equates to, and debit that amount from the account.
    With this information the OCS can reserve more funds and allocate another GSU (Granted Session Unit) if the subscriber has the required balance. If the subscriber only has a small amount of credit left the FUI (Final Unit Indication AVP) is set to determine this is all the subscriber has left in credit, and if this is exhausted to end the session, rather than sending another Credit Control Request.
  3. The Credit Control Answer with new GSU and the FUI is sent back to the P-GW/PCEF
  4. The P-GW/PCEF allows the session to continue, again monitoring used traffic against the GSU (Granted Session Units).
  1. Once the subscriber has used all the data in the Granted Session Units, and as the last CCA included the Final Unit Indicator, the CTF in the P-GW/PCEF knows it can’t just request more credit in the form of a CCR Update, so cuts of the subscribers’s session.
  2. The P-GW/PCEF then sends a Credit Control Request: Termination Request with the final Used Service Units to the OCS.
  3. The OCS debits the used service units from the subscriber’s balance, and refunds any unused credit reservation.
  4. The OCS sends back a Credit Control Answer which may include the CI value for Credit Information, to denote the cost information which may be passed to the subscriber if required.

EIR in 5G Networks (N5g-eir_EquipmentIdentityCheck)

Today, we’re going to look at one of the simplest Service Based Interfaces in the 5G Core, the Equipment Identity Register (EIR).

The purpose of the EIR is very simple – When a subscriber connects to the network it’s Permanent Equipment Identifier (PEI) can be queried against an EIR to determine if that device should be allowed onto the network or not.

The PEI is the IMEI of a phone / device, with the idea being that stolen phones IMEIs are added to a forbidden list on the EIR, and prohibited from connecting to the network, making them useless, in turn making stolen phones harder to resell, deterring mobile phone theft.

In reality these forbidden-lists are typically either country specific or carrier specific, meaning if the phone is used in a different country, or in some cases a different carrier, the phone’s IMEI is not in the forbidden-list of the overseas operator and can be freely used.

The dialog goes something like this:

AMF: Hey EIR, can PEI 49-015420-323751-8 connect to the network?
EIR: (checks if 49-015420-323751-8 in forbidden list - It's not) Yes.

or

AMF: Hey EIR, can PEI 58-241992-991142-3 connect to the network?
EIR: (checks if 58-241992-991142-3 is in forbidden list - It is) No.

(Optionally the SUPI can be included in the query as well, to lock an IMSI to an IMEI, which is a requirement in some jurisdictions)

As we saw in the above script, the AMF queries the EIR using the N5g-eir_EquipmentIdentityCheck service.

The N5g-eir_EquipmentIdentityCheck service only offers one operation – CheckEquipmentIdentity.

It’s called by sending an HTTP GET to:

http://{apiRoot}/n5g-eir-eic/v1/equipment-status

Obviously we’ll need to include the PEI (IMEI) in the HTTP GET, which means if you remember back to basic HTTP GET, you may remember means you have to add ?attribute=value&attribute=value… for each attribute / value you want to share.

For the CheckEquipmentIdentity operation, the PEI is a mandatory parameter, and optionally the SUPI can be included, this means to query our PEI (The IMSI of the phone) against our EIR we’d simply send an HTTP GET to:

AMF: HTTP GET http://{apiRoot}/n5g-eir-eic/v1/equipment-status?pei=490154203237518
EIR: 200 (Body EirResponseData: status "WHITELISTED")

And how it would look for a blacklisted IMEI:

AMF: HTTP GET http://{apiRoot}/n5g-eir-eic/v1/equipment-status?pei=490154203237518
EIR: 404 (Body EirResponseData: status "BLACKLISTED")

Because it’s so simple, the N5g-eir_EquipmentIdentityCheck service is a great starting point for learning about 5G’s Service Based Interfaces.

You can find all the specifics in 3GPP TS 29.511 – Equipment Identity Register Services; Stage 3

Credit Control Request / Answer call flow in IMS Charging

Basics of EPC/LTE Online Charging (OCS)

Early on as subscriber trunk dialing and automated time-based charging was introduced to phone networks, engineers were faced with a problem from Payphones.

Previously a call had been a fixed price, once the caller put in their coins, if they put in enough coins, they could dial and stay on the line as long as they wanted.

But as the length of calls began to be metered, it means if I put $3 of coins into the payphone, and make a call to a destination that costs $1 per minute, then I should only be allowed to have a 3 minute long phone call, and the call should be cutoff before the 4th minute, as I would have used all my available credit.

Conversely if I put $3 into the Payphone and only call a $1 per minute destination for 2 minutes, I should get $1 refunded at the end of my call.

We see the exact same problem with prepaid subscribers on IMS Networks, and it’s solved in much the same way.

In LTE/EPC Networks, Diameter is used for all our credit control, with all online charging based on the Ro interface. So let’s take a look at how this works and what goes on.

Generic 3GPP Online Charging Architecture

3GPP defines a generic 3GPP Online charging architecture, that’s used by IMS for Credit Control of prepaid subscribers, but also for prepaid metering of data usage, other volume based flows, as well as event-based charging like SMS and MMS.

Network functions that handle chargeable services (like the data transferred through a P-GW or calls through a S-CSCF) contain a Charging Trigger Function (CTF) (While reading the specifications, you may be left thinking that the Charging Trigger Function is a separate entity, but more often than not, the CTF is built into the network element as an interface).

The CTF is a Diameter application that generates requests to the Online Charging Function (OCF) to be granted resources for the session / call / data flow, the subscriber wants to use, prior to granting them the service.

So network elements that need to charge for services in realtime contain a Charging Trigger Function (CTF) which in turn talks to an Online Charging Function (OCF) which typically is part of an Online Charging System (AKA OCS).

For example when a subscriber turns on their phone and a GTP session is setup on the P-GW/PCEF, but before data is allowed to flow through it, a Diameter “Credit Control Request” is generated by the Charging Trigger Function (CTF) in the P-GW/PCEF, which is sent to our Online Charging Server (OCS).

The “Credit Control Answer” back from the OCS indicates the subscriber has the balance needed to use data services, and specifies how much data up and down the subscriber has been granted to use.

The P-GW/PCEF grants service to the subscriber for the specified amount of units, and the subscriber can start using data.

This is a simplified example – Decentralized vs Centralized Rating and Unit Determination enter into this, session reservation, etc.

The interface between our Charging Trigger Functions (CTF) and the Online Charging Functions (OCF), is the Ro interface, which is a Diameter based interface, and is common not just for online charging for data usage, IMS Credit Control, MMS, value added services, etc.

3GPP define a reference online-charging interface, the Ro interface, and all the application-specific interfaces, like the Gy for billing data usage, build on top of the Ro interface spec.

Basic Credit Control Request / Credit Control Answer Process

This example will look at a VoLTE call over IMS.

When a subscriber sends an INVITE, the Charging Trigger Function baked in our S-CSCF sends a Diameter “Credit Control Request” (CCR) to our Online Charging Function, with the type INITIAL, meaning this is the first CCR for this session.

The CCR contains the Service Information AVP. It’s this little AVP that is where the majority of the magic happens, as it defines what the service the subscriber is requesting. The main difference between the multitude of online charging interfaces in EPC networks, is just what the service the customer is requesting, and the specifics of that service.

For this example it’s a voice call, so this Service Information AVP contains a “IMS-Information” AVP. This AVP defines all the parameters for a IMS phone call to be online charged, for a voice call, this is the called-party, calling party, SDP (for differentiating between voice / video, etc.).

It’s the contents of this Service Information AVP the OCS uses to make decision on if service should be granted or not, and how many service units to be granted. (If Centralized Rating and Unit Determination is used, we’ll cover that in another post)
The actual logic, relating to this decision is typically based on the the rating and tariffing, credit control profiles, etc, and is outside the scope of the interface, but in short, the OCS will make a yes/no decision about if the subscriber should be granted access to the particular service, and if yes, then how many minutes / Bytes / Events should be granted.

In the received Credit Control Answer is received back from our OCS, and the Granted-Service-Unit AVP is analysed by the S-CSCF.
For a voice call, the service units will be time. This tells the S-CSCF how long the call can go on before the S-CSCF will need to send another Credit Control Request, for the purposes of this example we’ll imagine the returned value is 600 seconds / 10 minutes.

The S-CSCF will then grant service, the subscriber can start their voice call, and start the countdown of the time granted by the OCS.

As our chatty subscriber stays on their call, the S-CSCF approaches the limit of the Granted Service units from the OCS (Say 500 seconds used of the 600 seconds granted).
Before this limit is reached the S-CSCF’s CTF function sends another Credit Control Request with the type UPDATE_REQUEST. This allows the OCS to analyse the remaining balance of the subscriber and policies to tell the S-CSCF how long the call can continue to proceed for in the form of granted service units returned in the Credit Control Answer, which for our example can be 300 seconds.

Eventually, and before the second lot of granted units runs out, our subscriber ends the call, for a total talk time of 700 seconds.

But wait, the subscriber been granted 600 seconds for our INITIAL request, and a further 300 seconds in our UPDATE_REQUEST, for a total of 900 seconds, but the subscriber only used 700 seconds?

The S-CSCF sends a final Credit Control Request, this time with type TERMINATION_REQUEST and lets the OCS know via the Used-Service-Unit AVP, how many units the subscriber actually used (700 seconds), meaning the OCS will refund the balance for the gap of 200 seconds the subscriber didn’t use.

If this were the interface for online charging of data, we’d have the PS-Information AVP, or for online charging of SMS we’d have the SMS-Information, and so on.

The architecture and framework for how the charging works doesn’t change between a voice call, data traffic or messaging, just the particulars for the type of service we need to bill, as defined in the Service Information AVP, and the OCS making a decision on that based on if the subscriber should be granted service, and if yes, how many units of whatever type.

Diameter – Insert Subscriber Data Request / Response

While we’ve covered the Update Location Request / Response, where an MME is able to request subscriber data from the HSS, what about updating a subscriber’s profile when they’re already attached? If we’re just relying on the Update Location Request / Response dialog, the update to the subscriber’s profile would only happen when they re-attach.

We need a mechanism where the HSS can send the Request and the MME can send the response.

This is what the Insert Subscriber Data Request/Response is used for.

Let's imagine we want to allow a subscriber to access an additional APN, or change an AMBR values of an existing APN;

We'd send an Insert Subscriber Data Request from the HSS, to the MME, with the Subscription Data AVP populated with the additional APN the subscriber can now access.

Beyond just updating the Subscription Data, the Insert Subscriber Data Request/Response has a few other funky uses.

Through it the HSS can request the EPS Location information of a Subscriber, down to the TAC / eNB ID serving that subscriber. It’s not the same thing as the GMLC interfaces used for locating subscribers, but will wake Idle UEs to get their current serving eNB, if the Current Location Request is set in the IDR Flags.

But the most common use for the Insert-Subscriber-Data request is to modify the Subscription Profile, contained in the Subscription-Data AVP,

If the All-APN-Configurations-Included-Indicator is set in the AVP info, then all the existing AVPs will be replaced, if it’s not then everything specified is just updated.

The Insert Subscriber Data Request/Response is a bit novel compared to other S6a requests, in this case it’s initiated by the HSS to the MME (Like the Cancel Location Request), and used to update an existing value.

PS Data Off

Imagine a not-too distant future, one without flying cars – just one where 2G and 3G networks have been switched off.

And the imagine a teenage phone user, who has almost run out of their prepaid mobile data allocation, and so has switched mobile data off, or a roaming scenario where the user doesn’t want to get stung by an unexpectedly large bill.

In 2G/3G networks the Circuit Switched (Voice & SMS) traffic was separate to the Packet Switched (Mobile Data).

This allowed users to turn of mobile data (GPRS/HSDPA), etc, but still be able to receive phone calls and send SMS, etc.

With LTE, everything is packet switched, so turning off Mobile Data would cut off VoLTE connectivity, meaning users wouldn’t be able to make/recieve calls or SMS.

In 3GPP Release 14 (2017) 3GPP introduced the PS Data Off feature.

This feature is primarily implemented on the UE side, and simply blocks uplink user traffic from the UE, while leaving other background IP services, such as IMS/VoLTE and MMS, to continue working, even if mobile data is switched off.

The UE can signal to the core it is turning off PS Data, but it’s not required to, so as such from a core perspective you may not know if your subscriber has PS Data off or not – The default APN is still active and in the implementations I’ve tried, it still responds to ICMP Pings.

IMS Registration stays in place, SMS and MMS still work, just the UE just drops the requests from the applications on the device (In this case I’m testing with an Android device).

What’s interesting about this is that a user may still find themselves consuming data, even if data services are turned off. A good example of this would be push notifications, which are sent to the phone (Downlink data). The push notification will make it to the UE (or at least the TCP SYN), after all downlink services are not blocked, however the response (for example the SYN-ACK for TCP) will not be sent. Most TCP stacks when ignored, try again, so you’ll find that even if you have PS Data off, you may still use some of your downlink data allowance, although not much.

The SIM EF 3GPPPSDATAOFF defines the services allowed to continue flowing when PS Data is off, and the 3GPPPSDATAOFFservicelist EF lists which IMS services are allowed when PS Data is off.

Usually at this point, I’d include a packet capture and break down the flow of how this all looks in signaling, but when I run this in my lab, I can’t differentiate between a PS Data Off on the UE and just a regular bearer idle timeout… So have an irritating blinking screenshot instead…

ENUM – DNS based Call Routing

DNS is commonly used for resolving domain names to IP Addresses, and is often described as being like “the phone book of the Internet”.

So what’s the phone book of phone books?

The answer, is (kind of) DNS. With the aid of E.164 number to URI mapping (ENUM), DNS can be used to resolve phone numbers into SIP URIs to route the traffic to.

So what is ENUM?

ENUM allows us to bypass the need for a central switch for routing calls to numbers, and instead, through a DNS lookup, resolve a phone number into a reachable SIP URI that is the ultimate destination for the traffic.

Imagine you want to call a company, you dial the phone number for that company, your phone does a DNS query against the phone number, which returns the SIP URI of the company’s PBX, and your phone sends the SIP INVITE directly to the company’s PBX, with no intermediary party carrying the call.

3GPP have specified ENUM as the prefered mechanism for resolving phone numbers into SIP addresses, and while it’s widespread adoption on the public Internet is still in its early days (See my post on The Sad story of ENUM in Australia) it is increasingly common in IMS networks and inside operator networks.

IETF defined RFC 6116 for “The E.164 to Uniform Resource Identifiers (URI) Dynamic Delegation Discovery System (DDDS) Application (ENUM)”, which defines how the system works.

So how does ENUM actually work?

ENUM allow us to lookup a phone number on a DNS server and find the SIP URI a server that will handle traffic for the phone number, but it’s a bit more complicated than the A or AAAA records you’d use to resolve a website, ENUM relies on NAPTR records.

Let’s look at the steps involved in taking an E.164 number and knowing where to send it.

Step 1 – Reverse the Numbers

We read phone numbers from left to right.

This is because historically the switch needs to get all the long-distance routing sorted first. The switch has to route your call to the exchange that serves that subscriber, which is what all the area codes and prefixes assigned to areas are all about (Throwback to SZU for any old Telco buffs).

For an E.164 number you’ve got a Country Code, Area Code and then the Subscriber Number. The number gets more specific as it goes along.

But getting more specific as you go along is the opposite how how DNS works, millions of domains share the .com suffix, and the unique / specific part is the bits before that.

So the first step in the ENUM process is to reverse the phone number, so let’s take phone number (03) 5550 0912, which in E.164 is +61 3 5550 0912.

As the spaces in the phone numbers are there for the humans, we’ll drop all of them and reverse the number, as DNS is more specific right-to-left, so we end up with

2.1.9.0.0.5.5.5.3.1.6

Step 2 – Add the Suffix

The ITU ENUM specifies the suffix e164.arpa be assigned for public ENUM entries. Private ENUM deployments may use their own suffix, but to make life simple I’m going to use e164.arpa as if it were public.

So we’ll append the e164.arpa domain onto our reversed and formatted E.164 phone number:

2.1.9.0.0.5.5.5.3.1.6.e164.arpa

Step 3 – Query it

Next we’ll run a Naming Authority Pointer (NAPTR) query against the domain, to get back a list of records for that number.

DNS is a big topic, and NAPTR and SRV takes up a good chunk of it, but what you need to know is that by using NAPTR we’re not limited to just a single response, we could have a weighted pool of servers handling traffic for this phone number, and be able to control load through the use of NAPTR, amongst other things.

Of course, if our phone can query the public NAPTR records, then so can anyone else, so we can just use a tool like Dig to query the record ourselves,

dig @10.0.1.252 -t naptr 2.1.9.0.0.5.5.5.3.1.6.e164.arpa

In the answers section I’ve setup this DNS server to only return a single response, with the regex SIP URI to use, in my case that’s sip:[email protected]

You’ll obviously need to replace the DNS server with your DNS server, and the query with the reversed and formatted version of the E.164 number you wish to query.

Step 4 – Send SIP traffic

After looking at the NAPTR records returned and using the weight and priority to determine which server/s to send to first, our phone forwards an INVITE to the URI returned in the NAPTR record.

How to interpret the returned results?

The first thing to keep in mind when working with ENUM is multiple records being returned is supported, and even encouraged.

NAPTR results return 7 fields, which define how it should be handled.

The host part is fairly obvious, and defines the host / DNS entry we’re talking about.

The Service defines what type of service this is. ENUM can be expanded beyond just voice, for example you may want to also return an email address or IM address as well as a SIP Address on an ENUM query, which you can do. By default voice uses the “E2U+sip” service to identify SIP URIs to route calls to, so in this context that’s what we’re interested in, but keep in mind there are other types out there,

Example ENUM query against a phone number showing other types of services (Email & Web)

The Order simply defines the order in which the rules are to be parsed. Lower numbers are processed first, if no matches then the next lowest, and so on until the highest number is reached.

The Pref is the processing preference. For load balancing 50/50 between two sites say a Melbourne and Sydney site, we’d return two results, with the same Order, and the same Pref, would see traffic split 50/50 between the two sites.
We could split this further, a Pref value of 10 for Melbourne, 10 for Sydney, 5 for Brisbane and 5 for Perth would see 33% of calls route to Melbourne, 33% of calls route to Sydney, 16.5% of calls route to Brisbane and 16.5% of calls route to Perth.
This is because we’d have a total preference value of 30, and the individual preference for each entry would work out as the fraction of the total (ie Pref 10 out of 30 = 10/30 or 33.3%).

The Flags denote the type of record we’re going to get, for most ENUM traffic this is going to be set to U, to denote a SIP URI with Regex.

The regexp field contains our SIP URI in the form of a Regular expression, which can include pattern matching and replacement. This is most commonly used to fill in the phone number into the SIP URI, for example instead of hardcoding the phone number into the response, we could use a Regular expression to fill in the requested number into the SIP URI.

!(^.*$)!sip:+1\[email protected]!

ENUM sounds great, how do I get it?

Here’s the tricky part.

If you’re looking to implement ENUM for an internal network, great, I’ll have some more posts here over the next few weeks covering off configuration of a DNS server to support ENUM lookups, and using Kamailio to lookup ENUM routes.

In terms of public ENUM, while many carriers are using ENUM inside their networks, public adoption of ENUM in most markets has been slow, for a number of reasons.

Many incumbent operators have been reluctant to embrace public ENUM as their role as an operator would be relegated to that of a Domain registrar.
Additionally, there’s real security risks involved in moving to ENUM – opening your phone system up to the world to accept inbound calls from anywhere. This could lead to DOS-style attacks of flooding phone numbers with automatically generated traffic, privacy risks and even less validation in terms of caller ID trust.

RIPE maintains the EnumData.org website listing the status of ENUM for each country / region.

Telephony binary-coded decimal (TBCD) in Python with Examples

Chances are if you’re reading this, you’re trying to work out what Telephony Binary-Coded Decimal encoding is. I got you.

Again I found myself staring at encoding trying to guess how it worked, reading references that looped into other references, in this case I was encoding MSISDN AVPs in Diameter.

How to Encode a number using Telephony Binary-Coded Decimal encoding?

First, Group all the numbers into pairs, and reverse each pair.

So a phone number of 123456, becomes:

214365

Because 1 & 2 are swapped to become 21, 3 & 4 are swapped to become 34, 5 & 6 become 65, that’s how we get that result.

TBCD Encoding of numbers with an Odd Length?

If we’ve got an odd-number of digits, we add an F on the end and still flip the digits,

For example 789, we add the F to the end to pad it to an even length, and then flip each pair of digits, so it becomes:

87F9

That’s the abbreviated version of it. If you’re only encoding numbers that’s all you’ll need to know.

Detail Overload

Because the numbers 0-9 can be encoded using only 4 bits, the need for a whole 8 bit byte to store this information is considered excessive.

For example 1 represented as a binary 8-bit byte would be 00000001, while 9 would be 00001001, so even with our largest number, the first 4 bits would always going to be 0000 – we’d only use half the available space.

So TBCD encoding stores two numbers in each Byte (1 number in the first 4 bits, one number in the second 4 bits).

To go back to our previous example, 1 represented as a binary 4-bit word would be 0001, while 9 would be 1001. These are then swapped and concatenated, so the number 19 becomes 1001 0001 which is hex 0x91.

Let’s do another example, 82, so 8 represented as a 4-bit word is 1000 and 2 as a 4-bit word is 0010. We then swap the order and concatenate to get 00101000 which is hex 0x28 from our inputted 82.

Final example will be a 3 digit number, 123. As we saw earlier we’ll add an F to the end for padding, and then encode as we would any other number,

F is encoded as 1111.

1 becomes 0001, 2 becomes 0010, 3 becomes 0011 and F becomes 1111. Reverse each pair and concatenate 00100001 11110011 or hex 0x21 0xF3.

Special Symbols (#, * and friends)

Because TBCD Encoding was designed for use in Telephony networks, the # and * symbols are also present, as they are on a telephone keypad.

Astute readers may have noticed that so far we’ve covered 0-9 and F, which still doesn’t use all the available space in the 4 bit area.

The extended DTMF keys of A, B & C are also valid in TBCD (The D key was sacrificed to get the F in).

Symbol4 Bit Word
*1 0 1 0
#1 0 1 1
a1 1 0 0
b1 1 0 1
c1 1 1 0

So let’s run through some more examples,

*21 is an odd length, so we’ll slap an F on the end (*21F), and then encoded each pair of values into bytes, so * becomes 1010, 2 becomes 0010. Swap them and concatenate for our first byte of 00101010 (Hex 0x2A). F our second byte 1F, 1 becomes 0001 and F becomes 1111. Swap and concatenate to get 11110001 (Hex 0xF1). So *21 becomes 0x2A 0xF1.

And as promised, some Python code from PyHSS that does it for you:

    def TBCD_special_chars(self, input):
        if input == "*":
            return "1010"
        elif input == "#":
            return "1011"
        elif input == "a":
            return "1100"
        elif input == "b":
            return "1101"
        elif input == "c":
            return "1100"      
        else:
            print("input " + str(input) + " is not a special char, converting to bin ")
            return ("{:04b}".format(int(input)))


    def TBCD_encode(self, input):
        print("TBCD_encode input value is " + str(input))
        offset = 0
        output = ''
        matches = ['*', '#', 'a', 'b', 'c']
        while offset < len(input):
            if len(input[offset:offset+2]) == 2:
                bit = input[offset:offset+2]    #Get two digits at a time
                bit = bit[::-1]                 #Reverse them
                #Check if *, #, a, b or c
                if any(x in bit for x in matches):
                    new_bit = ''
                    new_bit = new_bit + str(TBCD_special_chars(bit[0]))
                    new_bit = new_bit + str(TBCD_special_chars(bit[1]))    
                    bit = str(int(new_bit, 2))
                output = output + bit
                offset = offset + 2
            else:
                bit = "f" + str(input[offset:offset+2])
                output = output + bit
                print("TBCD_encode output value is " + str(output))
                return output
    

    def TBCD_decode(self, input):
        print("TBCD_decode Input value is " + str(input))
        offset = 0
        output = ''
        while offset < len(input):
            if "f" not in input[offset:offset+2]:
                bit = input[offset:offset+2]    #Get two digits at a time
                bit = bit[::-1]                 #Reverse them
                output = output + bit
                offset = offset + 2
            else:   #If f in bit strip it
                bit = input[offset:offset+2]
                output = output + bit[1]
                print("TBCD_decode output value is " + str(output))
                return output

The PLMN Problem for Private LTE / 5G

So it’s the not to distant future and the pundits vision of private LTE and 5G Networks was proved correct, and private networks are plentiful.

But what PLMN do they use?

The PLMN (Public Land Mobile Network) ID is made up of a Mobile Country Code + Mobile Network Code. MCCs are 3 digits and MNCs are 2-3 digits. It’s how your phone knows to connect to a tower belonging to your carrier, and not one of their competitors.

For example in Australia (Mobile Country Code 505) the three operators each have their own MCC. Telstra as the first licenced Mobile Network were assigned 505/01, Optus got 505/02 and VHA / TPG got 505/03.

Each carrier was assigned a PLMN when they started operating their network. But the problem is, there’s not much space in this range.

The PLMN can be thought of as the SSID in WiFi terms, but with a restriction as to the size of the pool available for PLMNs, we’re facing an IPv4 exhaustion problem from the start if we’re facing an explosion of growth in the space.

Let’s look at some ways this could be approached.

Everyone gets a PLMN

If every private network were to be assigned a PLMN, we’d very quickly run out of space in the range. Best case you’ve got 3 digits, so only space for 1,000 networks.

In certain countries this might work, but in other areas these PLMNs may get gobbled up fast, and when they do, there’s no more. New operators will be locked out of the market.

Loaner PLMNs

Carriers already have their own PLMNs, they’ve been using for years, some kit vendors have been assigned their own as well.

If you’re buying a private network from an existing carrier, they may permit you to use their PLMN,

Or if you’re buying kit from an existing vendor you may be able to use their PLMN too.

But what happens then if you want to move to a different kit vendor or another service provider? Do you have to rebuild your towers, reconfigure your SIMs?

Are you contractually allowed to continue using the PLMN of a third party like a hardware vendor, even if you’re no longer purchasing hardware from them? What happens if they change their mind and no longer want others to use their PLMN?

Everyone uses 999 / 99

The ITU have tried to preempt this problem by reallocating 999/99 for use in Private Networks.

The problem here is if you’ve got multiple private networks in close proximity, especially if you’re using CBRS or in close proximity to other networks, you may find your devices attempting to attach to another network with the same PLMN but that isn’t part of your network,

Mobile Country or Geographical Area Codes
Note from TSB
Following the agreement on the Appendix to Recommendation ITU-T E.212 on “shared E.212 MCC 999 for internal use within a private network” at the closing plenary of ITU-T SG2 meeting of 4 to 13 July 2018, upon the advice of ITU-T Study Group 2, the Director of TSB has assigned the Mobile Country Code (MCC) “999” for internal use within a private network. 

Mobile Network Codes (MNCs) under this MCC are not subject to assignment and therefore may not be globally unique. No interaction with ITU is required for using a MNC value under this MCC for internal use within a private network. Any MNC value under this MCC used in a network has
significance only within that network. 

The MNCs under this MCC are not routable between networks. The MNCs under this MCC shall not be used for roaming. For purposes of testing and examples using this MCC, it is encouraged to use MNC value 99 or 999. MNCs under this MCC cannot be used outside of the network for which they apply. MNCs under this MCC may be 2- or 3-digit.

(Recommendation ITU-T E.212 (09/2016))

The Crystal Ball?

My bet is we’ll see the ITU allocate an MCC – or a range of MCCs – for private networks, allowing for a pool of PLMNs to use.

When deploying networks, Private network operators can try and pick something that’s not in use at the area from a pool of a few thousand options.

The major problem here is that there still won’t be an easy way to identify the operator of a particular network; the SPN is local only to the SIM and the Network Name is only present in the NAS messaging on an attach, and only after authentication.

If you’ve got a problem network, there’s no easy way to identify who’s operating it.

But as eSIMs become more prevalent and BIP / RFM on SIMs will hopefully allow operators to shift PLMNs without too much headache.

Diameter Agents

Let’s take a look at each of the common Diameter agent variants in use today:

Diameter Relay Agent / Diameter Routing Agent (DRA)

This is the simplest of the Diameter agents, but also probably the most common. The Diameter Relay agent does not look at the contents of the AVPs, it just routes messages based on the Application ID or Destination realm.

A Diameter Relay Agent does not change any AVPs except routing AVPs.

DRAs are transaction aware, but not dialog aware. This means they know if the Diameter request made it to the destination, but have no tracking of getting a response.

DRAs are common as a central hub for all Diameter hub in a network. This allows for a star topology where every Diameter service connects to a central DRA (typically two DRAs for redundancy) for a central place to manage Diameter routing, instead of having to do a full-mesh topology, which would be a nightmare on larger networks.

I recently wrote about creating a simple but unstable DRA with Kamailio.

Diameter Edge Agent

A Diameter Edge Agent is a special DRA that sits on the border between two networks and acts as a gateway between them.

Imagine a roaming exchange scenario, where each operator has to expose their core Diameter servers or DRAs to all the other operators they have roaming agreements with. Like we saw with the DRA to do a full-mesh style connection arrangement would be a mess, and wouldn’t allow internal changes inside the network without significant headaches.

Instead by putting a Diameter Edge Agent at the edge of the network, the operators who wish to access our Diameter information for roaming, only need to connect to a single point, and we can change whatever we like on the inside of the network, adding and removing servers, without having to update our roaming information (IR 21).

We can also strictly enforce security policies on rate limits and admission control, centrally, for all connections in from other operators.

Diameter Proxy Agent

The Diameter Proxy Agent does everything a DRA does, and more!

The Diameter Proxy Agent is application aware, meaning it can decode the AVPs and make decisions based upon the contents of the AVPs. It’s also able to edit / add / delete AVPs and Sub-AVPs.

These are useful for interconnect scenarios where you might need to re-write the value of an AVP, or translate a realm etc, on a Diameter request/response journey.

Diameter Translation Agent

Diameter Translation agents are used for translating between protocols, for example Diameter into MAP for GSM authentication, or into HTTP for 5G authentication.

For 5GC a new network element – the “Binding Support Function” (BSF) is introduced to translate between HTTP for 5G and Diameter for LTE, however this can be thought of as another Diameter Translation Agent.

IMS Routing with iFCs

SIP routing is complicated, there’s edge cases, traffic that can be switched locally and other traffic that needs to be proxied off to another Proxy or Application server. How can you define these rules and logic in a flexible way, that allows these rules to be distributed out to multiple different network elements and adjusted on a per-subscriber basis?

Enter iFCs – The Initial Filter Criteria.

iFCs are XML encoded rules to define which servers should handle traffic matching a set of rules.

Let’s look at some example rules we might want to handle through iFCs:

  • Send all SIP NOTIFY, SUBSCRIBE and PUBLISH requests to a presence server
  • Any Mobile Originated SMS to an SMSc
  • Calls to a specific destination to a MGC
  • Route any SIP INVITE requests with video codecs present to a VC bridge
  • Send calls to Subscribers who aren’t registered to a Voicemail server
  • Use 3rd party registration to alert a server that a Subscriber has registered

All of these can be defined and executed through iFCs, so let’s take a look,

iFC Structure

iFCs are encoded in XML and typically contained in the Cx-user-data AVP presented in a Cx Server Assignment Answer response.

Let’s take a look at an example iFC and then break down the details as to what we’re specifying.

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeCNF>1</ConditionTypeCNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

Each rule in an iFC is made up of a Priority, TriggerPoint and ApplicationServer.

So for starters we’ll look at the Priority tag.
The Priority tag allows us to have multiple-tiers of priority and multiple levels of matching,
For example if we had traffic matching the conditions outlined in this rule (TriggerPoint) but also matching another rule with a lower priority, the lower priority rule would take precedence.

Inside our <TriggerPoint> tag contains the specifics of the rules and how the rules will be joined / matched, which is what we’ll focus on predominantly, and is followed by the <ApplicationServer> which is where we will route the traffic to if the TriggerPoint is matched / triggered.

So let’s look a bit more about what’s going on inside the TriggerPoint.

Each TriggerPoint is made up of Service Point Trigger (SPTs) which are individual rules that are matched or not matched, that are either combined as logical AND or logical OR statements when evaluated.

By using fairly simple building blocks of SPTs we can create a complex set of rules by joining them together.

Service Point Triggers (SPTs)

Let’s take a closer look at what goes on in an SPT.
Below is a simple SPT that will match all SIP requests using the SIP MESSAGE method request type:

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>

So as you may have guessed, the <Method> tag inside the SPT defines what SIP request method we’re going to match.

But Method is only one example of the matching mechanism we can use, but we can also match on other attributes, such as Request URI, SIP Header, Session Case (Mobile Originated vs Mobile Terminated) and Session Description such as SDP.

Or an example of a SPT for anything Originating from the Subscriber utilizing the <SessionCase> tag inside the SPT.

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <SessionCase>0</SessionCase>
        </SPT>

Below is another SPT that’s matching any requests where the request URI is sip:[email protected] by setting the <RequestURI> tag inside the SPT:

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <RequestURI>sip:[email protected]</RequestURI>
        </SPT>

We can match SIP headers, either looking for the existence of a header or the value it is set too,

        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <SIPHeader>
              <Header>To</Header>
              <Content>"Nick"</Content>
            </SIPHeader>
        </SPT>

Having <Header> will match if the header is present, while the optional Content tag can be used to match

In terms of the Content this is matched using Regular Expressions, but in this case, not so regular regular expressions. 3GPP selected Extended Regular Expressions (ERE) to be used (IEEE POSIX) which are similar to the de facto standard PCRE Regex, but with a few fewer parameters.

Condition Negated

The <ConditionNegated> tag inside the SPT allows us to do an inverse match.

In short it will match anything other than what is specified in the SPT.

For example if we wanted to match any SIP Methods other than MESSAGE, setting <ConditionNegated>1</ConditionNegated> would do just that, as shown below:

        <SPT>
            <ConditionNegated>1</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>

And another example of ConditionNegated in use, this time we’re matching anything where the Request URI is not sip:[email protected]:

        <SPT>
            <ConditionNegated>1</ConditionNegated>
            <Group>0</Group>
            <RequestURI>sip:[email protected]</RequestURI>
        </SPT>

Finally the <Group> tag allows us to group together a group of rules for the purpose of evaluating.
We’ll go into it more in in the below section.

ConditionTypeCNF / ConditionTypeDNF

As we touched on earlier, <TriggerPoints> contain all the SPTs, but also, very importantly, specify how they will be interpreted.

SPTs can be joined in AND or OR conditions.

For some scenarios we may want to match where METHOD is MESSAGE and RequestURI is sip:[email protected], which is different to matching where the METHOD is MESSAGE or RequestURI is sip:[email protected].

This behaviour is set by the presence of one of the ConditionTypeCNF (Conjunctive Normal Form) or ConditionTypeDNF (Disjunctive Normal Form) tags.

If each SPT has a unique number in the GroupTag and ConditionTypeCNF is set then we evaluate as AND.

If each SPT has a unique number in the GroupTag and ConditionTypeDNF is set then we evaluate as OR.

Let’s look at how the below rule is evaluated as AND as ConditionTypeCNF is set:

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeCNF>1</ConditionTypeCNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

This means we will match if the method is MESSAGE and Session Case is 0 (Mobile Originated) as each SPT is in a different Group which leads to “and” behaviour.

If we were to flip to ConditionTypeDNF each of the SPTs are evaluated as OR.

<InitialFilterCriteria>
    <Priority>10</Priority>
    <TriggerPoint>
        <ConditionTypeDNF>1</ConditionTypeDNF>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>0</Group>
            <Method>MESSAGE</Method>
        </SPT>
        <SPT>
            <ConditionNegated>0</ConditionNegated>
            <Group>1</Group>
            <SessionCase>0</SessionCase>
        </SPT>
    </TriggerPoint>
    <ApplicationServer>
        <ServerName>sip:smsc.mnc001.mcc001.3gppnetwork.org:5060</ServerName>
        <DefaultHandling>0</DefaultHandling>
    </ApplicationServer>
</InitialFilterCriteria>

This means we will match if the method is MESSAGE and Session Case is 0 (Mobile Originated).

Where this gets a little bit more complex is when we have multiple entries in the same Group tag.

Let’s say we have a trigger point made up of:

<SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
<SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

<SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

How would this be evaluated?

If we use ConditionTypeDNF every SPT inside the same Group are matched as AND, and SPTs with distinct are matched as OR.

Let’s look at our example rule evaluated as ConditionTypeDNF:

<ConditionTypeDNF>1</ConditionTypeDNF>
  <SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
  <SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

  <SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

This means the two entries in Group 1 are evaluated as AND – So Method is message and Session Case is 0, OR the header “P-Some-Header” is present.

Let’s do another one, this time as ConditionTypeCNF:

<ConditionTypeCNF>1</ConditionTypeCNF>
  <SPT><Method>MESSAGE</Method><Group>1</Group></SPT>
  <SPT><SessionCase>0</SessionCase><Group>1</Group></SPT> 

  <SPT><Header>P-Some-Header</Header><Group>2</Group></SPT> 

This means the two entries in Group 1 are evaluated as OR – So Method is message OR Session Case is 0, AND the header “P-Some-Header” is present.

Diameter Droplets – The Flow-Description AVP and IPFilterRules

When it comes to setting up dedicated bearers, the Flow-Description AVP is perhaps the most important,

The specially encoded string (IPFilterRule) in the FlowDescription AVP is what our P-GW (Ok, our PCEF) uses to create Traffic Flow Templates to steer certain types of traffic down Dedicated Bearers.

So let’s take a look at how we can lovingly craft an artisanal Flow-Description.

The contents of the AVP are technically not a string, but a IPFilterRule.

IPFilterRules are actually defined in the Diameter Base Protocol (IETF RFC 6733), where we can learn the basics of encoding them,

Which are in turn based loosely off the ipfw utility in BSD.

They take the format:

action dir proto from src to dst

The action is fairly simple, for all our Dedicated Bearer needs, and the Flow-Description AVP, the action is going to be permit. We’re not blocking here.

The direction (dir) in our case is either in or out, from the perspective of the UE.

Next up is the protocol number (proto), as defined by IANA, but chances are you’ll be using 17 (UDP) or 6 (TCP) in most scenarios.

The from value is followed by an IP address with an optional subnet mask in CIDR format, for example from 10.45.0.0/16 would match everything in the 10.45.0.0/16 network.
Following from you can also specify the port you want the rule to apply to, or, a range of ports,
For example to match a single port you could use 10.45.0.0/16 1234 to match anything on port 1234, but we can also specify ranges of ports like 10.45.0.0/16 0 – 4069 or even mix and match lists and single ports, like 10.45.0.0/16 5060, 1000-2000

Protip: using any is the same as 0.0.0.0/0

Like the from, the to is encoded in the same way, with either a single IP, or a subnet, and optional ports specified.

And that’s it!

Keep in mind that Flow-Descriptions are typically sent in pairs as a minimum, as you want to match the traffic into and out of the network (not just one way), but often there can be quite a few sent, in order to match all the possible traffic that needs to be matched that may be across multiple different subnets, etc.

There is an optional Options parameter that allows you to set things like to only apply the rule to open TCP sessions, fragmentation, etc, although I’ve not seen this implemented in the wild.

Example IP filter Rules

permit in 6 from 10.98.254.0/24 5061 to 10.98.0.0/24 5060
permit out 6 from 10.98.254.0/24 5060 to 10.98.0.0/24 5061

permit in 6 from any 80 to 172.16.1.1 80
permit out 6 from 172.16.1.1 80 to any 80

permit in 17 from 10.98.254.0/24 50000-60100 to 10.98.0.0/24 50000-60100
permit out 17 from 10.98.254.0/24 50000-60100 to 10.98.0.0/24 50000-60100

permit in 17 from 10.98.254.0/24 5061, 5064 to 10.98.0.0/24  5061, 5064
permit out 17 from 10.98.254.0/24 5061, 5064 to 10.98.0.0/24  5061, 5064

permit in 17 from 172.16.0.0/16 50000-60100, 5061, 5064 to 172.16.0.0/16  50000-60100, 5061, 5064
permit out 17 from 172.16.0.0/16 50000-60100, 5061, 5064 to 172.16.0.0/16  50000-60100, 5061, 5064

For more info see:

RFC 6773 – Diameter Base Protocol – IP Filter Rule

3GPP TS 29.214 section 5.3.8 Flow-Description AVP

Open5Gs Logo

Open5Gs Database Schema Change

As Open5Gs has introduced network slicing, which led to a change in the database used,

Alas many users had subscribers provisioned in the old DB schema and no way to migrate the SDM data between the old and new schema,

If you’ve created subscribers on the old schema, and now after the updates your Subscriber Authentication is failing, check out this tool I put together, to migrate your data over.

The Open5Gs Python library I wrote has also been updated to support the new schema.

A very unstable Diameter Routing Agent (DRA) with Kamailio

I’d been trying for some time to get Kamailio acting as a Diameter Routing Agent with mixed success, and eventually got it working, after a few changes to the codebase of the ims_diameter_server module.

It is rather unstable, in that if it fails to dispatch to a Diameter peer, the whole thing comes crumbling down, but incoming Diameter traffic is proxied off to another Diameter peer, and Kamailio even adds an extra AVP.

Having used Kamailio for so long I was really hoping I could work with Kamailio as a DRA as easily as I do for SIP traffic, but it seems the Diameter module still needs a lot more love before it’ll be stable enough and simple enough for everyone to use.

I created a branch containing the fixes I made to make it work, and with an example config for use, but use with caution. It’s a long way from being production-ready, but hopefully in time will evolve.

https://github.com/nickvsnetworking/kamailio/tree/Diameter_Fix

SIM Card Sniffing with Wireshark

I never cease to be amazed as to what I can do with Wireshark.

While we’re working with Smart Card readers and SIM cards, capturing and Decoding USB traffic to see what APDUs are actually being sent can be super useful, so in this post we’ll look at how we can use Wireshark to sniff the USB traffic to view APDUs being sent to smart cards from other software.

For the purposes of this post I’ll be reading the SIM cards with pySim, but in reality it’ll work with any proprietary SIM software, allowing you to see what’s actually being said to the card by your computer.

If you want to see what’s being sent between your phone and SIM card, the Osmocom SIMtrace is the device for you (And yes it also uses Wireshark for viewing this data!).

Getting your System Setup

We’ve got to get some permissions setup,

sudo adduser $USER wireshark
sudo dpkg-reconfigure wireshark-common

Followed by a reboot to take effect, then we’ll run these two commands, which will need to be run each time we want to capture USB traffic:

modprobe usbmon
sudo setfacl -m u:$USER:r /dev/usbmon*

Ok, that’s all the prerequisites sorted, next we need to find the bus and device ID of our smart card reader,

We can get this listed with

lsusb

Here you can see I have a Smart Card reader on Bus 1 device 03 and another on Bus 2 device 10.

The reader I want to use is the “SCM Microsystems, Inc. SCR35xx USB Smart Card Reader” so I’ll jott down Bus 2 device 10. Yours will obviously be different, but you get the idea.

Finding the USB traffic in Wireshark

Next we’ll fire up Wireshark, if you’ve got your permissions right and followed along, you should see a few more interfaces starting with usbmonX in the capture list.

Because the device I want to capture from is on Bus 2, we’ll select usbmon2 and start capturing,

As you can see we’ve got a bit of a firehose of data, and we only care about device 10 on bus 2, so let’s filter for that.

So let’s generate some data and then filter for it, to generate some data I’m going to run pySim-read to read the data on a smart card that’s connected to my PC, and then filter to only see traffic on that USB device,

In my case as the USB device is 10 it’s got two sub addresses, so I’ll filter for USB Bus 2, device 10 sub-address 1 and 2, so the filter I’ll use is:

usb.addr=="2.10.1" or usb.addr=="2.10.2"

But this doesn’t really show us much, so let’s tell Wireshark this is PCSC/UCCID data to decode it as such;

So we’ll select some of this traffic -> Decode as -> USBCCID

Still not seeing straight APDUs, so let’s tell Wireshark one more bit of information – That we want to decode this information as GSM SIM data;

Again, we’ll select the data part of the USBCCID traffic -> Decode As -> GSM_SIM

And bingo, just like that we can now filter by gsm_sim and see the APDUs being sent / received.

Wireshark is pretty good at decoding what is going on, SELECT, has all the File IDs populated for 3GPP SIM specification. (last year I submitted a patch to include the latests 5G EFs for decoding).

I’ve found this super useful for seeing what commercial software is doing to read cards, and to make it easy to reproduce myself.

SIM / Smart Card Deep Dive – Part 4 – Interacting with Cards IRL

This is part 3 of an n part tutorial series on working with SIM cards.

So in our last post we took a whirlwind tour of what an APDU does, is, and contains.

Interacting with a card involves sending the APDU data to the card as hex, which luckily isn’t as complicated as it seems.

While reading what the hex should look like on the screen is all well and good, actually interacting with cards is the name of the game, so that’s what we’ll be doing today, and we’ll start to abstract some of the complexity away.

Getting Started

To follow along you will need:

  • A Smart Card reader – SIM card / Smart Card readers are baked into some laptops, some of those multi-card readers that read flash/SD/CF cards, or if you don’t have either of these, they can be found online very cheaply ($2-3 USD).
  • A SIM card – No need to worry about ADM keys or anything fancy, one of those old SIM cards you kept in the draw because you didn’t know what to do with them is fine, or the SIM in our phone if you can find the pokey pin thing. We won’t go breaking anything, promise.

You may end up fiddling around with the plastic adapters to change the SIM form factor between regular smart card, SIM card (standard), micro and nano.

USB SIM / Smart Card reader supports all the standard form factors makes life a lot easier!

To keep it simple, we’re not going to concern ourselves too much with the physical layer side of things for interfacing with the card, so we’ll start with sending raw APDUs to the cards, and then we’ll use some handy libraries to make life easier.

PCSC Interface

To abstract away some complexity we’re going to use the industry-standard PCSC (PC – Smart Card) interface to communicate with our SIM card. Throughout this series we’ll be using a few Python libraries to interface with the Smart Cards, but under the hood all will be using PCSC to communicate.

pyscard

I’m going to use Python3 to interface with these cards, but keep in mind you can find similar smart card libraries in most common programming languages.

At this stage as we’re just interfacing with Smart Cards, our library won’t have anything SIM-specific (yet).

We’ll use pyscard to interface with the PCSC interface. pyscard supports Windows and Linux and you can install it using PIP with:

pip install pyscard

So let’s get started by getting pyscard to list the readers we have available on our system:

#!/usr/bin/env python3
from smartcard.System import *
print(readers())

Running this will output a list of the readers on the system:

Here we can see the two readers that are present on my system (To add some confusion I have two readers connected – One built in Smart Card reader and one USB SIM reader):

(If your device doesn’t show up in this list, double check it’s PCSC compatible, and you can see it in your OS.)

So we can see when we run readers() we’re returned a list of readers on the system.

I want to use my USB SIM reader (The one identified by Identiv SCR35xx USB Smart Card Reader CCID Interface 00 00), so the next step will be to start a connection with this reader, which is the first in the list.

So to make life a bit easier we’ll store the list of smart card readers and access the one we want from the list;

#!/usr/bin/env python3
from smartcard.System import *
r = readers()
connection = r[0].createConnection()
connection.connect()

So now we have an object for interfacing with our smart card reader, let’s try sending an APDU to it.

Actually Doing something Useful

Today we’ll select the EF that contains the ICCID of the card, and then we will read that file’s binary contents.

This means we’ll need to create two APDUs, one to SELECT the file, and the other to READ BINARY to get the file’s contents.

We’ll set the instruction byte to A4 to SELECT, and B0 to READ BINARY.

Table of Instruction bytes from TS 102 221

APDU to select EF ICCID

The APDU we’ll send will SELECT (using the INS byte value of A4 as per the above table) the file that contains the ICCID.

Each file on a smart card has been pre-created and in the case of SIM cards at least, is defined in a specification.

For this post we’ll be selecting the EF ICCID, which is defined in TS 102 221.

Information about EF-ICCID from TS 102 221

To select it we will need it’s identifier aka File ID (FID), for us the FID of the ICCID EF is 2FE2, so we’ll SELECT file 2FE2.

Going back to what we learned in the last post about structuring APDUs, let’s create the APDU to SELECT 2FE2.

CodeMeaningValue
CLAClass bytes – Coding optionsA0 (ISO 7816-4 coding)
INSInstruction (Command) to be calledA4 (SELECT)
P1Parameter 1 – Selection Control (Limit search options)00 (Select by File ID)
P2Parameter 1 – More selection options04 (No data returned)
LcLength of Data 02 (2 bytes of data to come)
DataFile ID of the file to Select2FE2 (File ID of ICCID EF)

So that’s our APDU encoded, it’s final value will be A0 A4 00 04 02 2FE2

So let’s send that to the card, building on our code from before:

#!/usr/bin/env python3
from smartcard.System import *
from smartcard.util import *
r = readers()
connection = r[0].createConnection()
connection.connect()

print("Selecting ICCID File")
data, sw1, sw2 = connection.transmit(toBytes('00a40004022fe2'))
print("Returned data: " + str(data))
print("Returned Status Word 1: " + str(sw1))
print("Returned Status Word 2: " + str(sw2))

If we run this let’s have a look at the output we get,

We got back:

Selecting ICCID File
 Returned data: []
 Returned Status Word 1: 97
 Returned Status Word 2: 33

So what does this all mean?

Well for starters no data has been returned, and we’ve got two status words returned, with a value of 97 and 33.

We can lookup what these status words mean, but there’s a bit of a catch, the values we’re seeing are the integer format, and typically we work in Hex, so let’s change the code to render these values as Hex:

#!/usr/bin/env python3
from smartcard.System import *
from smartcard.util import *
r = readers()
connection = r[0].createConnection()
connection.connect()

print("Selecting ICCID File")
data, sw1, sw2 = connection.transmit(toBytes('00a40004022fe2'))
print("Returned data: " + str(data))
print("Returned Status Word 1: " + str(hex(sw1)))
print("Returned Status Word 2: " + str(hex(sw2)))

Now we’ll get this as the output:

Selecting ICCID File
Returned data: []
Returned Status Word 1: 0x61
Returned Status Word 2: 0x1e

So what does this all mean?

Well, there’s this handy website with a table to help work this out, but in short we can see that Status Word 1 has a value of 61, which we can see means the command was successfully executed.

Status Word 2 contains a value of 1e which tells us that there are 30 bytes of extra data available with additional info about the file. (We’ll cover this in a later post).

So now we’ve successfully selected the ICCID file.

Keeping in mind with smart cards we have to select a file before we can read it, so now let’s read the binary contents of the file we selected;

The READ BINARY command is used to read the binary contents of a selected file, and as we’ve already selected the file 2FE2 that contains our ICCID, if we run it, it should return our ICCID.

If we consult the table of values for the INS (Instruction) byte we can see that the READ BINARY instruction byte value is B0, and so let’s refer to the spec to find out how we should format a READ BINARY instruction:

CodeMeaningValue
CLAClass bytes – Coding optionsA0 (ISO 7816-4 coding)
INSInstruction (Command) to be calledB0 (READ BINARY)
P1Parameter 1 – Coding / Offset00 (No Offset)
P2Parameter 2 – Offset Low00
LeHow many bytes to read0A (10 bytes of data to come)

We know the ICCID file is 10 bytes from the specification, so the length of the data to return will be 0A (10 bytes).

Let’s add this new APDU into our code and print the output:

#!/usr/bin/env python3
from smartcard.System import *
from smartcard.util import *
r = readers()
connection = r[0].createConnection()
connection.connect()

print("Selecting ICCID File")
data, sw1, sw2 = connection.transmit(toBytes('00a40000022fe2'))
print("Returned data: " + str(data))
print("Returned Status Word 1: " + str(hex(sw1)))
print("Returned Status Word 2: " + str(hex(sw2)))

And we have read the ICCID of the card.

Phew.

That’s the hardest thing we’ll need to do over.

From now on we’ll be building the concepts we covered here to build other APDUs to get our cards to do useful things. Now you’ve got the basics of how to structure an APDU down, the rest is just changing values here and there to get what you want.

In our next post we’ll read a few more files, write some files and delve a bit deeper into exactly what it is we are doing.

Want more? 
You can also get the weekly posts on the blog by Connecting on LinkedIn, following me on Twitter, or Subscribing via RSS.
Information stored on USIM / SIM Card for LTE / EUTRAN / EPC - K key, OP/OPc key and SQN Sequence Number

Confidentiality Algorithms in 3GPP Networks: MILENAGE, XOR & Comp128

We’ve covered a fair bit on authentication in 3GPP networks, SIM cards, HSS / AuC, etc, but never actually looked at the Confidentiality Algorithms in use,

LTE USIM Authentication - Mutual Authentication of the Network and Subscriber

While we’ve already covered the inputs required by the authentication elements of the core network (The HSS in LTE/4G, the AuC in UMTS/3G and the AUSF in 5G) to generate an output, it’s worth noting that the Confidentiality Algorithms used in the process determines the output.

This means the Authentication Vector (Also known as an F1 and F1*) generated for a subscriber using Milenage Confidentiality Algorithms will generate a different output to that of Confidentiality Algorithms XOR or Comp128.

To put it another way – given the same input of K key, OPc Key (or OP key), SQN & RAND (Random) a run with Milenage (F1 and F1* algorithm) would yield totally different result (AUTN & XRES) to the same inputs run with a simple XOR.

Technically, as operators control the network element that generates the challenges, and the USIM that responds to them, it is an option for an operator to implement their own Confidentiality Algorithms (Beyond just Milenage or XOR) so long as it produced the same number of outputs. But rolling your own cryptographic anything is almost always a terrible idea.

So what are the differences between the Confidentiality Algorithms and which one to use?
Spoiler alert, the answer is Milenage.

Milenage

Milenage is based on AES (Originally called Rijndael) and is (compared to a lot of other crypto implimentations) fairly easy to understand,

AES is very well studied and understood and unlike Comp128 variants, is open for anyone to study/analyse/break, although AES is not without shortcomings, it’s problems are at this stage, fairly well understood and mitigated.

There are a few clean open source examples of Milenage implementations, such as this C example from FreeBSD.

XOR

It took me a while to find the specifications for the XOR algorithm – it turns out XOR is available as an alternate to Milenage available on some SIM cards for testing only, and the mechanism for XOR Confidentiality Algorithm is only employed in testing scenarios, not designed for production.

Instead of using AES under the hood like Milenage, it’s just plan old XOR of the keys.

Osmocom have an implementation of this in their CN code, you can find here.

Defined under 3GPP TS 34.108 8.1.2.1

Comp128

Comp128 was originally a closed source algorithm, with the maths behind it not publicly available to scrutinise. It is used in GSM A3 and A5 functions, akin to the F1 and F1* in later releases.

Due to its secretive nature it wasn’t able to be studied or analysed prior to deployment, with the idea that if you never said how your crypto worked no one would be able to break it. Spoiler alert; public weaknesses became exposed as far back as 1998, which led to Toll Fraud, SIM cloning and eventually the development of two additional variants, with the original Comp128 renamed Comp128-1, and Comp128-2 (stronger algorithm than the original addressing a few of its flaws) and Comp128-3 (Same as Comp128-2 but with a 64 bit long key generated).

Into the Future & 5G later releases

As options beyond just USIM authentication become available for authentication in 5G SA networks, additional algorithms can be used beyond EAP and AKA, but at the time of writing only TLS has been added. 5G adds SUCI and SUPI which provide a mechanism to keep the private identifier (IMSI) away from prying eyes (or antenna), which I’ve detailed in this post.

SIM / Smart Card Deep Dive – Part 3 – APDUs and Hello Card

In our last post we covered the file system structure of a smart card and the basic concepts of communication with cards. In this post we’ll look at what happens on the application layer, and how to interact with a card.

For these examples I’ll be using SIM cards, because admit it, you’ve already got a pile sitting in a draw, and this is a telco blog after all. You won’t need the ADM keys for the cards, we’ll modify files we’ve got write access to by default.

Commands & Instructions

So to do anything useful with the card we need issue commands / instructions to the card, to tell it to do things. Instructions like select this file, read it’s contents, update the contents to something else, verify my PIN, authenticate to the network, etc.

The term Command and Instruction are used somewhat interchangeably in the spec, I realise that I’ve done the same here to make it just as confusing, but instruction means the name of the specific command to be called, and command typically means the APDU as a whole.

The “Generic Commands” section of 3GPP TS 31.101 specifies the common commands, so let’s take a look at one.

The creatively named SELECT command/instruction is used to select the file we want to work with. In the SELECT command we’ll include some parameters, like where to find the file, so some parameters are passed with the SELECT Instruction to limit the file selection to a specific area, etc, the length of the file identifier to come, and the identifier of the file.

The card responds with a Status Word, returned by the card, to indicate if it was successful. For example if we selected a file that existed and we had permission to select, we’d get back a status word indicating the card had successfully selected the file. Status Words are 2 byte responses that indicate if the instruction was successful, but also the card has data it wants to send to the terminal as a result of the instruction, how much data the terminal should expect.

So if we just run a SELECT command, telling the card to select a file, we’ll get back a successful response from the card with a data length. Next need to get that data from the card. As the card can’t initiate communication, the GET RESPONSE instruction is sent to the card to get the data from the card, along with the length of the data to be returned.

The GET RESPONSE instruction/command is answered by the card with an APDU containing the data the card has to send, and the last 2 bytes contain the Status Word indicating if it was successful or not.

APDUs

So having covered the physical and link layers, we now move onto the Application Layer – where the magic happens.

Smart card communications is strictly master-slave based when it comes to the application layer.

The terminal sends a command to the card, which in turn sends back a response. Command -> Response, Command -> Response, over and over.

These commands are contained inside APplication Data Units (APDUs).

So let’s break down a simple APDU as it appears on the wire, so to speak.

The first byte of our command APDU is taken up with a header called the class byte, abbreviated to CLA. This specifies class coding, secure messaging options and channel options.

In the next byte we specify the Instruction for the command, that’s the task / operation we want the card to perform, in the spec this is abbreviated to INS.

The next two bytes, called P1 & P2 (Parameter 1 & Parameter 2) specify the parameters of how the instruction is to be to be used.

Next comes Lc – Length of Command, which specifies the length of the command data to follow,

Data comes next, this is instruction data of the length specified in Lc.

Finally an optional Le – Length of expected response can be added to specify how long the response from the card should be.

Crafting APDUs

So let’s encode our own APDU to send to a card, for this example we’ll create the APDU to tell the card to select the Master File (MF) – akin to moving to the root directory on a *nix OS.

For this we’ll want a copy of ETSI TS 102 221 – the catchily named “Smart cards; UICC-Terminal interface; Physical and logical characteristics” which will guide in the specifics of how to format the command, because all the commands are encoded in hexadecimal format.

So here’s the coding for a SELECT command from section 11.1.1.1 “SELECT“,

For the CLA byte in our example we’ll indicate in our header that we’re using ISO 7816-4 encoding, with nothing fancy, which is denoted by the byte A0.

For the next but we’ve got INS (Instruction) which needs to be set to the hex value for SELECT, which is represented by the hex value A4, so our second byte will have that as it’s value.

The next byte is P1, which specifies “Selection Control”, the table in the specification outlines all the possible options, but we’ll use 00 as our value, meaning we’ll “Select DF, EF or MF by file id”.

The next byte P2 specifies more selection options, we’ll use “First or only occurrence” which is represented by 00.

The Lc byte defines the length of the data (file id) we’re going to give in the subsequent bytes, we’ve got a two byte File ID so we’ll specify 2 (represented by 02).

Finally we have the Data field, where we specify the file ID we want to select, for the example we’ll select the Master File (MF) which has the file ID ‘3F00‘, so that’s the hex value we’ll use.

So let’s break this down;

CodeMeaningValue
CLAClass bytes – Coding optionsA0 (ISO 7816-4 coding)
INSInstruction (Command) to be calledA4 (SELECT)
P1Parameter 1 – Selection Control (Limit search options)00 (Select by File ID)
P2Parameter 1 – More selection options00 (First occurrence)
LcLength of Data 02 (2 bytes of data to come)
DataFile ID of the file to Select3F00 (File ID of master file)

So that’s our APDU encoded, it’s final value will be A0 A4 00 00 02 3F00

So there we have it, a valid APDU to select the Master File.

In the next post we’ll put all this theory into practice and start interacting with a real life SIM cards using PySIM, and take a look at the APDUs with Wireshark.

SIM / Smart Card Deep Dive – Part 2 – Meet & Greet

Layer 1 – Pinout and Connections

Before we can get all excited about talking to cards, let’s look at how we interface with them on a physical level.

For “Classic” smart cards interface is through the fingernail sized contacts on the card.

As you’d expect there’s a VCC & Ground line for powering the card, a clock input pin for clocking it and a single I/O pin.

ISO/IEC 7816-3 defines the electrical interface and transmission protocols.

The pins on the terminal / card reader are arranged so that when inserting a card, the ground contact is the first contact made with the reader, this clever design consideration to protect the card and the reader from ESD damage.

Operating Voltages

When Smart Cards were selected for use in GSM for authenticating subscribers, all smart cards operated at 5v. However as mobile phones got smaller, the operating voltage range became more limited, the amount of space inside the handset became a premium and power efficiency became imperative. The 5v supply for the SIM became a difficult voltage to provide (needing to be buck-boosted) so lower 3v operation of the cards became a requirement, these cards are referred to as “Class B” cards. This has since been pushed even further to 1.8v for “Class C” cards.

If you found a SIM from 1990 it’s not going to operate in a 1.8v phone, but it’s not going to damage the phone or the card.

The same luckily goes in reverse, a card designed for 1.8v put into a phone from 1990 will work just fine at 5v.

This is thanks to the class flag in the ATR response, which we’ll cover later on.

Clocks

As we’re sharing one I/O pin for TX and RX, clocking is important for synchronising the card and the reader. But when smart cards were initially designed the clock pin on the card also served as the clock for the micro controller it contained, as stable oscillators weren’t available in such a tiny form factor. Modern cards implement their own clock, but the clock pin is still required for synchronising the communication.

I/O Pin

The I/O pin is used for TX & RX between the terminal/phone/card reader and the Smart Card / SIM card. Having only one pin means the communications is half duplex – with the Terminal then the card taking it in turns to transmit.

Reset Pin

Resets the card’s communications with the terminal.

Filesystem

So a single smart card can run multiple applications, the “SIM” is just an application, as is USIM, ISIM and any other applications on the card.

These applications are arranged on a quasi-filesystem, with 3 types of files which can be created, read updated or deleted. (If authorised by the card.)

Because the file system is very basic, and somewhat handled like a block of contiguous storage, you often can’t expand a file – when it is created the required number of bytes are allocated to it, and no more can be added, and if you add file A, B and C, and delete file B, the space of file B won’t be available to be used until file C is deleted.

This is why if you cast your mind back to when contacts were stored on your phone’s SIM card, you could only have a finite number of contacts – because that space on the card had been allocated for contacts, and additional space can no longer be allocated for extra contacts.

So let’s take a look at our 3 file types:

MF (Master File)

The MF is like the root directory in Linux, under it contains all the files on the card.

DF (Dedciated File)

An dedicated file (DF) is essentially a folder – they’re sometimes (incorrectly) referred to as Directory Files (which would be a better name).

They contain one or more Elementary Files (see below), and can contain other DFs as well.

Dedicated Files make organising the file system cleaner and easier. DFs group all the relevant EFs together. 3GPP defines a dedicated file for Phonebook entries (DFphonebook), MBMS functions (DFtv) and 5G functions (DF5gs).

We also have ADFs – Application Dedicated Files, for specific applications, for example ADFusim contains all the EFs and DFs for USIM functionality, while ADFgsm contains all the GSM SIM functionality.

The actual difference with an ADF is that it’s not sitting below the MF, but for the level of depth we’re going into it doesn’t matter.

DFs have a name – an Application Identifier (AID) used to address them, meaning we can select them by name.

EF (Elementary File)

Elementary files are what would actually be considered a file in Linux systems.

Like in a Linux file systems EFs can have permissions, some EFs can be read by anyone, others have access control restrictions in place to limit who & what can access the contents of an EF.

There are multiple types of Elementary Files; Linear, Cyclic, Purse, Transparent and SIM files, each with their own treatment by the OS and terminal.

Most of the EFs we’ll deal with will be Transparent, meaning they ##

ATR – Answer to Reset

So before we can go about working with all our files we’ll need a mechanism so the card, and the terminal, can exchange capabilities.

There’s an old saying that the best thing about standards is that there’s so many to choose, from and yes, we’ve got multiple variants/implementations of the smart card standard, and so the card and the terminal need to agree on a standard to use before we can do anything.

This is handled in a process called Answer to Reset (ATR).

When the card is powered up, it sends it’s first suggestion for a standard to communicate over, if the terminal doesn’t want to support that, it just sends a pulse down the reset line, the card resets and comes back with a new offer.

If the card offers a standard to communicate over that the terminal does like, and does support, the terminal will send the first command to the card via the I/O line, this tells the card the protocol preferences of the terminal, and the card responds with it’s protocol preferences. After that communications can start.

Basic Principles of Smart Cards Communications

So with a single I/O line to the card, it kind of goes without saying the communications with the card is half-duplex – The card and the terminal can’t both communicate at the same time.

Instead a master-slave relationship is setup, where the smart card is sent a command and sends back a response. Command messages have a clear ending so the card knows when it can send it’s response and away we go.

Like most protocols, smart card communications is layered.

At layer 1, we have the physical layer, defining the operating voltages, encoding, etc. This is standardised in ISO/IEC 7816-3.

Above that comes our layer 2 – our Link Layer. This is also specified in ISO/IEC 7816-3, and typically operates in one of two modes – T0 or T1, with the difference between the two being one is byte-oriented the other block-oriented. For telco applications T0 is typically used.

Our top layer (layer 7) is the application layer. We’ll cover the details of this in the next post, but it carries application data units to and from the card in the form of commands from the terminal, and responses from the card.

Coming up Next…

In the next post we’ll look into application layer communications with cards, the commands and the responses.