CAMEL is primarily focused on charging for Voice & SMS services, as data generally uses Diameter, so it’s voice and SMS we’ll focus on.
CAMEL is spoken between the MSC (gsmSSF) and the OCS (gsmSCF).
Basic Call State Model
CAMEL is closely related to the Intelligent Network stuff on the 1980s, and steals a lot of it’s ideas from there, unfortunately if you’re to read the CAMEL standard it also implies you were involved in IN stuff and had been born at that point, alas I was neither.
So the key to understanding CAMEL is the Basic Call State Model (BCSM) which is a model of all the different states a call can be in, such as ringing, answered, abandoned, call failed, etc, etc.
Over CAMEL, our OCS can be told by the MSC when a certain event happens; the MSC can tell the OCS, that the call has changed state. For example a BCSM event might indicate the call has hung up, is ringing, cancelled, etc.
Below is the list of all the valid BCSM states:
List of BCSM states for events
Basic MO Call with CAMEL
Our subscriber makes an outbound call.
Based on the data the MSC has in it from the HLR, it knows that we should use CAMEL for this call, and it has the SCCP Address of the OCS (gsmSCF) it needs to send the CAMEL messages to.
So the MSC sends an InitialDP message to the OCS (via it’s Global Title Address) to Authorize the call that the user is trying to make.
This is like any other Authorization step for an OCS, which allows the OCS to authorize the call by checking the subscriber is valid, check if they’re allowed to call that destination and they’ve got the balance to do so, etc.
initialDP message from an MSC to an OCS
The initialDP (Initial Detection Point) is telling our OCS all about the call event that’s being requested, who’s calling, what number they’ve dialed, where they are in the network (of note especially if they’re roaming), etc, etc.
Generally the OCS also uses this message as a chance to subscribe to BCSM Events using RequestReportBCSMEventArg so the OCS will get notified by the MSC when the state of the call changes. This means the MSC will tell us when the state of the call changes; events like the call getting answered, disconnected, etc. This is critical so we know when the call gets answered and hung-up, so we can charge correctly.
In the below example, as well as sending the Continue and RequestReportBCSMEventArg the OCS is also setting the ChargingArgs for this call, so the MSC knows who to charge (the caller) set via sendingSide and that the MSC must send an Apply Charging Report (ACR) messages every 300 units (1 unit = 100 ms, so a value of 300 = 300 x 100 milliseconds = 30 seconds) so the OCS keeps track of what’s going on.
continue sent by the OCS to the MSC, also including reportBCSMEvent and applyCharging messages
Or in a slightly less appropriate analogy but easier to understand for SIP folks, the InitialDP is sent for INVITE and the 180 RINGING is sent once the continue message is received.
Call is Answered
So at this stage our call can start to ring.
As we’ve subscribed to BCSM events in our last message, the MSC is going to tell us when the call gets answered or the call times out, is abandoned or the sun burns out.
The MSC provides this info a eventReportBCSM, which is very simple and just tells us the event that’s been triggered, in the example below, the call was answered.
eventReportBCSM from MSC to OCS
These eventReportBCSM are informational from the MSC to the OCS, so the OCS doesn’t need to send anything back, but the OCS does need to mark the call as answered so it can start timing the call.
At this stage, the call is connected and our two parties are talking, but our MSC has been told it needs to send us applyChargingReports every 30 seconds (due to the value of 300 in maxCallPeriodDuration) after the call was connected, so the MSC sends the OCS it’s first applyChargingReport 30 seconds after the call was answered:
applyChargingReport sent by the MSC to the OCS every reporting period
We can calculate the duration of the call so far based on the time of the eventReportBCSM, then the OCS must make a decision of if it should allow the call to continue or not.
For simplicity’s sake, let’s imagine we’re still got a balance in the OCS and the OCS wants the call to continue, the OCS send back an applyCharging message to the MSC in response, and includes the current allowed maxCallPeriodDuration, keeping in mind the value is x100 and in nanoseconds (so this is 30 seconds).
applyCharging from the OCS back to the MSC
Perfect, our call is good to go for another 30 more seconds, son in 30 seconds we’ll get another ACR messages from MSC to the OCS to keep it abreast of what’s going on.
Now one of two things is going to happen, either subscriber is going to burn through all of their minutes, and get their call cutoff, or the call will end while they’ve still got balance, let’s look at both scenarios.
Normal Hangup Scenario
When the call ends, we get an applyChargingReport from the MSC to the OCS.
As we’ve subscribed to reportBCSMEvent we get both the applyChargingReport with legActive: False` so we know the call has hungup, and we’ve got an event report to tell us more about the event, in this case a hangup from the Originating Side.
reportBCSMEvent and applyChargingReport Sent by the MSC to the OCS to indicate the call has ended, note the legActive flag is now false
Lastly the OCS confirms by sending a releaseCall to the MSC, to indicate all legs should now terminate.
releaseCall Sent by OCS to MSC at the very end
So that’s it!
Obviously there are other flows, such as running out of balance mid-call, rejecting a call, SMS and PBX / VPN services that rely on CAMEL, but hopefully you now understand the basics of how CAMEL based charging looks and works.
If you’re looking for a CAMEL capable OCS or a CAMEL to Diameter or API gateway, get in touch!
When Dickens wrote of Doctor Manette in the 1859, I doubt his intention was to write about the repeating history of RAN fronthaul standards – but I can’t really say for sure.
Setting the Scene
Our story starts with introducing CPRI (Common Public Radio Interface) interface, having been imprisoned in the Bastille of vendor lock in for the better part of twenty years.
Think of CPRI is less of a hard interoperable standard and more like how the Italian and French languages are both derived from Latin; it doesn’t mean that the two languages are the same, but they’ve got the same root and may share some common words and structures.
In practice this means that taking an Ericsson Radio and plugging it into a Huawei Baseband simply won’t work – With CPRI you must use the same vendor for the Baseband and the Radios.
“Nuts to this” the industry said after being stuck locked between the same radios and baseband for years; we should create a standard so we can mix and match between radio vendors, and even standardize some other stuff that’s been bothering us, so we’ll have a happy world of interoperability.
A world with interoperable fronthaul
With kit created that followed this standard, we’d be able to take components from vendor A, B & C, and fit them together like Lego, saving you some money along the way and giving you’ve got a working solution made of “best of breed” components, where everything is interoperable.
Omnitouch Lego base stations, which also fit together like Lego – Part of the Omnitouch Network Services “swag” from 2024
So the industry created a group to chart a path for a better tomorrow by standardizing these interfaces.
The group had many industry heavyweights like Nokia, NEC, LG, ZTE and Samsung joining.
The key benefits espoused on their website:
An open market will substantially reduce the development effort and costs that have been traditionally associated with creating new base station product ranges. The availability of off-the-shelf base station modules will enable manufacturers to focus their development efforts on creating further added value within the base station, encouraging greater innovation and more cost-effective products. Furthermore, as product development cycles will be reduced, new base station functions will become available on the market more quickly.
Mission statement of the group
In addition to being able to mix and match radios and basebands from different vendors, the group defined standards for centralized baseband, and interoperable standards, to allow a multi-vendor ecosystem to flourish.
And here’s the plot twist – The text above, was not written about OpenRAN, and it was not written about the benefits of eCPRI.
It was written about Open Base Station Architecture Initiative (OBSAI) and it was written 22 years ago.
*record screech sound*
Standards War you’ve never heard of: OBSAI vs CPRI
When OBSAI was defined it was not without competition; there was another competing fronthaul standard; that’s right, the mustache twirling lowlife from earlier in the story – CPRI.
Supported by Huawei, Nortel, NEC & Ericsson (among others), CPRI took a “gentle parenting” approach to the standards world, in contrast to OBSAI. Instead of telling all the vendors to agree on an interoperable front haul standard, CPRI just encouraged everyone to implement what their heart told them and what felt right to them.
As it happened, the industry favored the CPRI approach.
If a vendor wanted to add a new “word” in their CPRI “language” to add a new feature, they just went ahead and added it – It didn’t require anyone else to agree with them or changes to a common standard used by the industry, vendors could talk to the kit they made how they wanted.
CPRI has been the defacto non-standard used by all the big kit vendors for the past ~10 years.
The Death of OBSAI & the Birth of OpenRAN’s eCPRI
Why haven’t you heard of OBSAI? Why didn’t the OBSAI standard just serve as the basis for eCPRI – After all the last OBSAI release was less than 5 years before TIP started working on eCPRI publicly.
Is no more. It has ceased to be.
Did a schism over “uplink performance improvement” options lead to “irreconcilable differences” between parties leading to the breakup of the OBSAI group?
Nope.
Customers (MNOs) didn’t buy OBSAI based equipment in measurably larger quantities than CPRI kit. That’s it.
This meant the vendors invested less in paying teams to further develop the standards, the OBSAI group met less frequently, and in the end, member vendors didn’t bother adding support for OBSAI to new equipment and just used the easier and more flexible CPRI option instead.
At some point someone just stopped paying for the domain renewal and that was it, OBSAI was no more.
This is how the standards body ends, not with a bang, but with a whimper.
T.S. Elliot’s writings on the death of obsai
Those who do not learn from history…
The goals of the OBSAI Group and OpenRAN working groups are almost identical, so what lessons did Marconi, Motorola and Alcatel learn as members of OBSAI that other vendors could learn about OpenRAN strategy?
There are no mentions of OBSAI in any of the information published by OpenRAN advocates, and I’m wondering if folks aren’t aware that history tends to repeat and are ignorant to what came before it, or they’re just not learning lessons from the past?
So what can the OpenRAN industry learn from OBSAI?
Being a nerd, I started detailing the technical challenges, but that’s all window dressing; The biggest hurdle facing CPRI vs eCPRI are the same challenges OBSAI vs CPRI faced a decade prior:
To be relevant, OpenRAN kit has to be demonstrably better than what we have today AND provide a tangible cost saving.
OBSAI failed at achieving this, and so failed to meet it’s other more noble goals.
[At the time of writing this at least] I’d contend that neither of those two criteria have been met by OpenRAN.
What does the future hold for OpenRAN?
Looking into the crystal ball, will OpenRAN and eCPRI go the way of OBSAI, or will someone keep the OpenRAN dream alive?
Today, we’re still seeing the MNOs continue to provide tokenistic investment in OpenRAN. But being a cynic, I’d say the MNOs are feigning interest in OpenRAN products because it’s advantageous for them to do so.
The threat of OpenRAN has proven to be a great stick to beat the traditional vendors with to force them to lower their prices.
Think about the $14 billion USD Ericsson deal with AT&T, if chucking a few million at OpenRAN pilots / trials lead to AT&T getting even a 0.1% reduction in what they’re paying Ericsson, then the numbers would have worked out well in AT&Ts favor.
From the MNOs perspective, the cost to throw the odd pilot or trial to a hungry OpenRAN vendor to keep them on the hook is negligible, but the threat of OpenRAN provides leverage and bargaining power every time it’s contract renewal time with the big RAN vendors.
Already we’ve seen all the traditional RAN vendors move to neutralize this threat by introducing “OpenRAN compatible” equipment and talking up their commitment to openness.
This move by the RAN vendors takes this sting out of the OpenRAN threat, and means MNOs won’t have much reason to continue supporting OpenRAN.
This leaves the remaining OpenRAN vendors like Miss Havisham, forever waiting in their proverbial wedding dresses, having being left at the altar.
Okay, I’m mixing my Dickens’ references here, but it was too good not to.
Appendix
I’ve been enjoying writing more analysis than just technical content, let me know if this is something you’re interested in seeing more of.
I’ve been involved in two big OpenRAN integration projects, both of which went poorly and probably tainted my perspective. Enough time has passed to probably write up how it all went with the vendor names removed, but that’s a post for another time!
This is the next post in my series on SS7, and today we’re taking a look at SCCP the Signalling Connection Control Part (SCCP).
High Level
Global Title uses the routing features from SCCP, which is another layer on top of MTP3.
SCCP allows us to route on more than just point code, instead we can route based on two new fields, Subsystem Number and Global Title.
Subsystem Number is the type of system we are looking to reach, ie an HLR, MSC, CAMEL Gateway, etc.
The Global Title generally looks like an E.164 formatted phone number, and often it is just that.
Somewhere along the chain (typically at the end of it) an STP somewhere needs to perform Global Title Translation to analyse the SCCP header (Subsystem Number, Point Code & Global Title) and finally turn that into a single point code to route the MTP3 message to.
The advantage of this is we are no longer just limited to routing messages based on Point Code.
This is how the international SS7 Network used for roaming is structured and addressed – All using Global Title rather than Point Codes.
The need for SCCP
For starters, after all this talk of MTP3 and Point Codes, why the need to add SCCP?
Let’s go back in time and look at the motivators…
1. Address space is finite
Point codes are great, and when we’ve spoken about them before, I’ve compared them to IPv4 address, but rather than ranging from 0.0.0.0 to 255.255.255.255 (32 bits on IPv4) international signaling point codes range from 0.0.0 to 7.255.7 (14 bits).
The problem with IPv4’s 32 bit addresses is they run out. The problem with the ITU International Signaling Point Codes is that they too, are a limited resource with only 16,383 possible ISPCs.
~700 operators worldwide each with ~100 network elements would be 70k point codes to address them all – That’s not going to fit into our 16k possible Point Codes.
Global Title fixes this, because we’re able to use E.164 phone number ranges (which are plentiful) for addressing, we’re still not at IPv6 levels of address space, but pretty hefty.
2. Service Discovery by Subsystem
Now imagine you’re a VLR looking to find an HLR. The VLR and the HLR are both connected to an STP, but how does the VLR know where to reach the HLR?
One option would be to statically set every route for the Point Code of every HLR into every possible VLR and visa-versa, but that gets messy fast.
What if the VLR could just send a request to the STP and indicate that the request needs to be routed to any HLR, and the STP takes care of finding a SS7 node capable of handling the request, much a Diameter Routing Agent routes based on Application ID.
SCCP’s “Subsystem Number” routing can handle this as we can route based on SSN.
3. Service Discovery by MSISDN
Having an SMS destined to a given MSISDN requires the SMSc to know where to route it.
Likewise an MSC wanting to call a given number.
There’s a lot of MSISDN ranges. Like a lot. Like every phone mobile number.
Having every a table on every SSP/SCP in the network know where every MSISDN range is in the world and what point code to go through to reach it is not practical.
Instead, being able to have the SCP/SSPs (like our MSC or SMSc) send all off-net traffic to an STP frees us the individual SCP/SSPs from this role; they just forward it to their connected STP.
Our STP can analyse the destination MSISDN and make these routing decisions for us, using Global Title Translation based on rules in the Global Title Table on the STP.
For example by adding each of the domestic / national MSISDN ranges/prefixes into the Global Title Table on the STP (along with the corresponding point code to route each one to), the STP can look at the destination MSISDN in the message and forward to the STP for the correct operator.
Likewise a route can match anything where the Global Title address is outside of the local country and send it to an international signaling provider.
Global title takes care of this as we can route based on a phone number.
4. Tokenistic Security
By “Hiding” network elements behind Global Titles, you don’t expose as much information about your internal network, and the only way people can “find” your network elements would be scanning through all the possible addresses in your (publicly advertised) Global Title range (wardialing is back baby!).
But the phrases “Security” and “SS7” don’t really belong together…
The SCCP Header
The SCCP header has a Called Party and a Calling Party, and this is where the magic happens.
These can be made up for any number of 3 parts:
Global Title Address
Subsystem Number
Point Code
We can route on any combination of these.
To indicate we’re using SCCP, we set the Signaling Indicator bit in the M3UA / MTP3 message to SCCP:
Great, now we can look at our SCCP header.
It looks like there’s a lot going on, but we can see the calling and called party (888888888 is called by 9999999999) with the Subsystem number set (888888888 is called for subsystem HLR, from 999999999 which is a VLR).
The closest TCP/IP analogy I can think of here is that of port numbers, there’s still an IP (Point code) but the port number allows us to specify multiple applications that run at a higher layer. This analogy falls down when we consider that the Point Code is generally set to that of your STP, not the final STP.
For this to work, we’ve got to have at least one Signaling Transfer Point in the flow, where we send the request to.
Somewhere (generally at the end of the chain of STPs), an STP is going to perform Global Title Translation.
What does this look like? Well let’s have a look at my GT table for the example above, in my lab network, I’ve got two nodes attached (via M3UA but could equally be on MTP3 links), my test MAP client where I’m originating this traffic, and an SMS Firewall, I can see they’re both up here:
Now knowing this I need to setup my SCCP routing for Global Title. In the screenshot above, the Called Party was 888888888 with Subsystem Number 7. Inside the SCCP request, there’s a few other fields, the Translation Type we have set to 0, Global Title Indicator is 4 (route on Global Title), while Numbering Plan Indicator is 1 (ISDN) and Nature of Address Indicator is 4 (International).
So on my Cisco ITP I define a GTT Selector to target traffic with these values, Translation Type is 0, Global Title Indicator is 4, Number Plan is 1 and the Nature of Address Indicator is 4.
So we’d define a Global Title Translation selector like the one below to match this traffic:
But that’s only matching the group of traffic, it’s not going to match based on the actual SCCP Called Party. So now I need to define a translation for each Global Title address (Called /Calling party) or prefix I want to route, I’ve setup anything starting with 888 to route to the `SMSFirewall` ASP endpoint.
I could stop here and my request addressed to 888888888 would make it to the SMSFirewall ASP, but the response never would, like in all SS7 routing, we need to define the return route translation too, which is what I’ve done for 999999 to route to the TestClient.
Lastly I’ve added a wildcard route, this means if this STP doesn’t know how to resolve a GT address matching the rules in the top line, it’ll forward the request to the STP at point code 1.2.3 – This is how you’d do your connection to an IPX / Signaling exchange.
Debugging this can be a massive pain in the backside, but if you enable logging you can see when GT rules are not matched, like in the example below.
If your network is quiet enough, it’s sometimes easier to just make your rules based on what you observe failing to route.
So with those routes in place, when we send a request with the Global Title called party starting with 8888888 it’s routed to M3UA ASP SMSFirewall, which handles the request, and then sends the response back to the MAPClient M3UA ASP.
The other day I was facing an issue with our SMSc inter-working with another operator via MAP.
Our SRI-for-SM responses were relayed back to their nodes, but it was like it couldn’t parse the message.
I got some “known good” traffic to compare this against to work out what we’re doing wrong.
The difference in the two examples below is subtle, but it’s there – On the example on the left (failing) we are including an msc-number in the locationInfoWithLMSI field, while on the right we’ve got a “Network Node Number”.
“Okay” I thought to myself, we’re just doing something wrong with the encoding of the MAP body, so I did my usual diff trick from Wireshark:
Oddly in the raw form both these values decode the same, if I feed the values on the left into our decoder, and then encode, I get the values on the right, with the exact same hex body – and this is all ASN.1; so there’s very little room for error anyway.
So what gives? Why does Wireshark show one MAP body differently to the other, with the same hex bytes?
Well, the issue is not within my GSM MAP body and the content I include there, but rather in the TCAP layer above it, specifically the `application-context-name`.
We’re indicating support for GSM MAP v3 (0.4.0.0.1.0.20.3), while the other operator is using MAPv2 ( 0.4.0.0.1.0.20.2 ) even though their IR.21 indicates it should be GSM MAP v3.
Now our SRI-for-SM responses are based on the GSM MAP version received, rather than reported as supported, and we don’t have to deal with this for handling requests anymore.
Hello Nick, thank you for the article. What is the use of the OPc key to be derived from OP key ? Why can’t it just be a random key like Ki ?
It’s a super good question, and something I see a lot of operators get “wrong” from a security best practices perspective.
Refresher on OP vs OPc Keys
The “OP Key” is the “operator” key, and was (historically) common for an operator.
This meant all SIMs in the network had a common OP Key, and each SIM had a unique Ki/K key.
The SIM knew both, and the HSS only needed to know what the Ki was for the SIM, as they shared a common OP Key (Generally you associate an index which translates to the OP Key for that batch of SIMs but you get the idea).
But having common key material is probably not the best idea – I’m sure there was probably some reason why using a common key across all the SIMs seemed like a good option, and the K / Ki key has always been unique, so there was one unique key per SIM, but previously, OP was common.
Over time, the issues with this became clear, so the OPc key was introduced. OPc is derived from mushing the K & OP key together. This means we don’t need to expose / store the original OP key in the SIM or the HSS just the derived OPc key output.
This adds additional security, if the Ki for a SIM were to be exposed along with the OP for that operator, that’s half the entropy lost. Whereas by storing the Ki and OPc you limit the blast radius if say a single SIMs data was exposed, to only the data for that particular SIM.
This is how most operators achieve this today; there is still a common OP Key, locked away in a vault alongside the recipe for Coca-cola and the moon landing set.
But his OP Key is no longer written to the SIMs or stored in the HSS.
Instead, during the personalization process (The bit in manufacturing where SIMs get the unique data written to them (The IMSI & keys)) a derived OPc key is written to the card itself, and to the output files the operator then loads into their HSS/HLR/AuC.
This is not my preferred method for handling key material however, today we get our SIM manufacturers to randomize the OP key for every card and then derive an OPc from that.
This means we have two unique keys for each SIM, and even if the Ki and OP were to become exposed for a SIM, there is nothing common between that SIM, and the other SIMs in the network.
Do we want our Ki to leak? No. Do we want an OP Key to leak? No. But if we’ve got unique keys for everything we minimize the blast radius if something were to happen – Just minimizes the risk.
How does one encode / interpret the value of this AVP / IE was the question I set out to answer.
TS 29.274 says:
For the encoding of this information element see 3GPP TS 32.298
TS 32.298 says:
The functional requirements for the Charging Characteristics as well as the profile and behaviour bits are further defined in normative Annex A of TS 32.251
TS 32.251 Annex A says:
The Charging Characteristics parameter consists of a string of 16 bits designated as Behaviours (B), freely defined by Operators, as shown in TS 32.298 [51]. Each bit corresponds to a specific charging behaviour which is defined on a per operator basis, configured within the PCN and pointed when bit is set to “1” value.
After a few circular references I found this is imported from 32.298.
Finally we find some solid answers hidden away in TS 132 215, under the Charging Characteristics Profile index.
Charging Characteristics consists of a string of 16 bits designated as Profile (P) and Behaviour (B), shown in Figure 4. The first four bits (P) shall be used to select different charging trigger profiles, where each profile consists of the following trigger sets:
S-CDR: activate/deactivate CDRs, time limit, volume limit, maximum number of charging conditions, tariff times;
G-CDR: same as SGSN, plus maximum number of SGSN changes;
M-CDR: activate/deactivate CDRs, time limit, and maximum number of mobility changes;
SMS-MO-CDR: activate/deactivate CDRs;
SMS-MT-CDR: active/deactivate CDRs.
The Charging Characteristics field allows the operator to apply different kind of charging methods in the CDRs. A subscriber may have Charging Characteristics assigned to his subscription. These characteristics can be supplied by the HLR to the SGSN as part of the subscription information, and, upon activation of a PDP context, the SGSN forwards the charging characteristics to the GGSN on the Gn / Gp reference point according to the rules specified in Annex A of TS 32.251 [11].
This information can be used by the GSNs to activate CDR generation and control the closure of the CDR or the traffic volume containers (see clause 5.1.2.2.23) and is included in CDRs transmitted to nodes handling the CDRs via the Ga reference point. It can also be used in nodes handling the CDRs (e.g., the CGF or the billing system) to influence the CDR processing priority and routing.
These functions are accomplished by specifying the charging characteristics as sets of charging profiles and the expected behaviour associated with each profile.
The interpretations of the profiles and their associated behaviours can be different for each PLMN operator and are not subject to standardisation. In the present document only the charging characteristic formats and selection modes are specified.
The functional requirements for the Charging Characteristics as well as the profile and behaviour bits are further defined in normative Annex A of TS 32.251 [11], including the definitions of the trigger profiles associated with each CDR type.
The format of charging characteristics field is depicted in Figure 4. Px (x =0..3) refers to the Charging Characteristics Profile index. Bits classified with a “B” may be used by the operator for non-standardised behaviour (see Annex A of TS 32.251 [11]).
Right, well hopefully next time someone goes looking for this info you’ll find it a bit more easily than I did!
SGs-AP which is used for CSFB & SMS doesn’t span network borders (you can’t roam with SGs-AP), and with SMSoIP out of the question, that gave us the option of MAP or Diameter, so we picked Diameter.
This introduces the S6c and SGd Diameter interfaces, in the diagrams below Orange is the Home Network (HPMN) and the Green is the Visited Network (VPMN).
The S6c interface is used between the SMSc and the HSS, in order to retrieve the routing information. This like the SRI-for-SM in MAP.
The SGd interface is used between the MME serving the UE and the SMSc, and is used for actual delivery of the MO/MT messages.
I haven’t shown the Diameter Routing Agents in these diagrams, but in reality there would be a DRA on the VPLMN and a DRA on the HPMN, and probably a DRA in the IPX between them too.
The Attach
The attach looks like a regular roaming attach, the MME in the Visited PMN sends an Update Location Request to the HSS, so the HSS knows the MME that is serving the subscriber.
S6a Update Location Request to indicate the MME serving the Subscriber
The Mobile Terminated SMS Flow
Now we introduce the S6c interface and the SGd interfaces.
When the Home SMSc has a message to send to the subscriber (Mobile Terminated SMS) it runs a the Send-Routing-Info-for-SM-Request (SRR) dialog to the HSS.
The Send-Routing-Info-for-SM-Answer (SRA) back from the HSS contains the info on the MME Diameter Host name and Diameter Realm serving the subscriber.
S6t – Send-Routing-Info-for-SM request to get the MME serving the subscriber
With this info, we can now craft a Diameter Request that will get sent to the MME serving the subscriber, containing the SMS PDU to send to the UE.
SGd MT-Forward-Short-Message to deliver Mobile Terminated SMS to the serving MME
We make sure it’s sent to the correct MME by setting the Destination-Host and Destination-Realm in the Diameter request.
Here’s how the request looks from the SMSc towards our DRA:
As you can see the Destination Realm and Destination-Host is set, as is the User-Name set to the IMSI of the UE we want to send the message to.
And down the bottom you can see the SMS-TPDU, the same as it’s been all the way back since GSM days.
The Mobile Originated SMS Flow
The Mobile Originated flow is even simpler, because we don’t need to look up where to route it to.
The MME receives the MO SMS from the UE, and shoves it into a Diameter message with Application ID set to SGd and Destination-Realm set to the HPMN Realm.
When the message reaches the DRA in the HPMN it forwards the request to an SMSc and then the Home SMSc has the message ready to roll.
Android, being open source, allows us to see how this logic works, and it’s important for operators to understand this logic, as it’s what dictates the behavior in many scenarios.
It’s important to note that I’m not covering Apple here, this information is not publicly available to share for iOS devices, so I won’t be sharing anything on this – Apple has their own ecosystem to handle emergency calling, if you’re from an operator and reading this, I’d suggest getting in touch with your Apple account manager to discuss it, they’re always great to work with.
The Android Open Source Project has an “emergency number database”. This database has each of the emergency phone numbers and the corresponding service, for each country.
This file can be read at packages/services/Telephony/ecc/input/eccdata.txt on a phone with engineering mode.
Let’s take a look what’s in mainline Android for Australia:
I recently ended up with a few Commscope RF combiners from a cell site, they’re not on frequencies that are of any use to us, so, let’s see what’s inside.
The units on the bench are Commscope Diplexer units, these ones allow you to put a signal between 694-862Mhz, and another signal between 880-960Mhz, on the same RF feeder up the tower.
It’s a nifty trick from the days where radio units lived at the bottom of the tower, but now with Remote Radio Units, and Active Antenna Units, it’s becoming increasingly uncommon to have radio units in the site hut, and more common to just run DC & fibre up the tower and power a radio unit right next to the antenna – This is especially important for higher frequencies where of course the feeder loss is greater.
Diplexer unit before it is maimed…
Anywho, that’s about all I know of them, after the liberal application of chemicals to remove the stickers and several burns from a heat gun, we started to get the unit open, to show the zillion adjustment bolts, and finely machined parts.
A lot of screwsBonus TMA
Thanks to Oliver for offering up the bench space when I rocked to up to their house with some stuff to pull apart.
Advanced Mobile Location (AML) is being rolled out by a large number of mobile network operators to provide accurate caller location to emergency services, so how does it work, what’s going on and what do you need to know?
Recently we’ve been doing a lot of work on emergency calling in IMS, and meeting requirements for NG-112 / e911, etc.
This led me to seeing my first Advanced Mobile Location (AML) SMS in the wild.
For those unfamiliar, AML is a fancy text message that contains the callers location, accuracy, etc, that is passed to emergency services when you make a call to emergency services in some countries.
It’s sent automatically by your handset (if enabled) when making a call to an emergency number, and it provides the dispatch operator with your location information, including extra metadata like the accuracy of the location information, height / floor if known, and level of confidence.
Google has their own version of AML called ELS, which they claim is supported on more than 99% of Android phones (I’m unclear on what this means for Harmony OS or other non-Google backed forks of Android), and Apple support for AML starts from iOS 11 onwards, meaning it’s supported on iPhones from the iPhone 5S onards,.
Call Flow
When a call is made to the PSAP based on the Emergency Calling Codes set on the SIM card or set in the OS, the handset starts collecting location information. The phone can pull this from a variety of sources, such as WiFi SSIDs visible, but the best is going to be GPS or one of it’s siblings (GLONASS / Galileo).
Once the handset has a good “lock” of a location (or if 20 seconds has passed since the call started) it bundles up all of this information the phone has, into an SMS and sends it to the PSAP as a regular old SMS.
The routing from the operator’s SMSc to the PSAP, and the routing from the PSAP to the dispatcher screen of the operator taking the call, is all up to implementation. For the most part the SMS destination is the emergency number (911 / 112) but again, this is dependent on the country.
Inside the SMS
To the user, the AML SMS is not seen, in fact, it’s actually forbidden by the standard to show in the “sent” items list in the SMS client.
On the wire, the SMS looks like any regular SMS, it can use GSM7 bit encoding as it doesn’t require any special characters.
Each attribute is a key / value pair, with semicolons (;) delineating the individual attributes, and = separating the key and the value.
If you’ve got a few years of staring at Wireshark traces in Hex under your belt, then this will probably be pretty easy to get the gist of what’s going on, we’ve got the header (A”ML=1″) which denotes this is AML and the version is 1.
After that we have the latitude (lt=), longitude (lg=), radius (rd=), time of positioning (top=), level of confidence (lc=), positioning method (pm=) with G for GNSS, W for Wifi signal, C for Cell or N for a position was not available, and so on.
AML outside the ordinary
Roaming Scenarios
If an emergency occurs inside my house, there’s a good chance I know the address, and even if I don’t know my own address, it’s probably linked to the account holder information from my telco anyway.
AML and location reporting for emergency calls is primarily relied upon in scenarios where the caller doesn’t know where they’re calling from, and a good example of this would be a call made while roaming.
If I were in a different country, there’s a much higher likelihood that I wouldn’t know my exact address, however AML does not currently work across borders.
The standard suggests disabling SMS when roaming, which is not that surprising considering the current state of SMS transport.
Without a SIM?
Without a SIM in the phone, calls can still be made to emergency services, however SMS cannot be sent.
That’s because the emergency calling standards for unauthenticated emergency calls, only cater for
This is a limitation however this could be addressed by 3GPP in future releases if there is sufficient need.
HTTPS Delivery
The standard was revised to allow HTTPS as the delivery method for AML, for example, the below POST contains the same data encoded for use in a HTTP transaction:
Implementation of this approach is however more complex, and leads to little benefit.
The operator must zero-rate the DNS, to allow the FQDN for this to be resolved (it resolves to a different domain in each country), and allow traffic to this endpoint even if the customer has data disabled (see what happens when your handset has PS Data Off ), or has run out of data.
Due to the EU’s stance on Net Neutrality, “Zero Rating” is a controversial topic that means most operators have limited implementation of this, so most fall back to SMS.
Other methods for sharing location of emergency calls?
In some upcoming posts we’ll look at the GMLC used for E911 Phase 2, and how the network can request the location from the handset.
There’s old joke about standards that the great thing about standards there’s so many to choose from.
SMS wasn’t there from the start of GSM, but within a year of the inception of 2G we had SMS, and we’ve had SMS, almost totally unchanged, ever since.
In a recent Twitter exchange, I was asked, what’s the best way to transport SMS? As always the answer is “it depends” so let’s take a look together at where we’ve come from, where we are now, and how we should move forward.
How we got Here
Between 2G and 3G SMS didn’t change at all, but the introduction of 4G (LTE) caused a bit of a rethink regarding SMS transport.
Early builders of LTE (4G) networks launched their 4G offerings without 4G Voice support (VoLTE), with the idea that networks would “fall back” to using 2G/3G for voice calls.
This meant users got fast data, but to make or receive a call they relied on falling back to the circuit switched (2G/3G) network – Hence the name Circuit Switched Fallback.
Falling back to the 2G/3G network for a call was one thing, but some smart minds realised that if a phone had to fall back to a 2G/3G network every time a subscriber sent a text (not just calls) – And keep in mind this was ~2010 when SMS traffic was crazy high; then that would put a huge amount of strain on the 2G/3G layers as subs constantly flip-flopped between them.
The SGs-AP interface has two purposes; One, It can tell a phone on 4G to fallback to 2G/3G when it’s got an incoming call, and two; it can send and receive SMS.
SMS traffic over this interface is sometimes described as SMS-over-NAS, as it’s transported over a signaling channel to the UE.
This also worked when roaming, as the MSC from the 2G/3G network was still used, so SMS delivery worked the same when roaming as if you were in the home 2G/3G network.
Enter VoLTE & IMS
Of course when VoLTE entered the scene, it also came with it’s own option for delivering SMS to users, using IP, rather than the NAS signaling. This removed the reliance on a link to a 2G/3G core (MSC) to make calls and send texts.
This was great because it allowed operators to build networks without any 2G/3G network elements and build a fully standalone LTE only network, like Jio, Rakuten, etc.
In roaming scenarios, S8 Home Routing for VoLTE enabled SMS to be handled when roaming the same way as voice calls, which made SMS roaming a doddle.
4G SMS: SMS over IP vs SMS over NAS
So if you’re operating a 4G network, should you deliver your SMS traffic using SMS-over-IP or SMS-over-NAS?
Generally, if you’ve been evolving your network over the years, you’ve got an MSC and a 2G/3G network, you still may do CSFB so you’ve probably ended up using SMS over NAS using the SGs-AP interface. This method still relies on “the old ways” to work, which is fine until a discussion starts around sunsetting the 2G/3G networks, when you’d need to move calling to VoLTE, and SMS over NAS is a bit of a mess when it comes to roaming.
Greenfield operators generally opt for SMS over IP from the start, but this has its own limitations; SMS over IP is has awful efficiency which makes it unsuitable for use with NB-IoT applications which are bandwidth constrained, support for SMS over IP is generally limited to more expensive chipsets, so the bargain basement chips used for IoT often don’t support SMS over IP either, and integration of VoLTE comes with its own set of challenges regarding VoLTE enablement.
5G enters the scene (Nsmsf_SMService)
5G rolled onto the scene with the opportunity to remove the SMS over NAS option, and rely purely on SMS over IP (IMS); forcing the industry to standardise on an option alas this did not happen.
This added another option for SMS delivery dependent on the access network used, and the Nsmsf_SMService interface does not support roaming.
Of course if you are using Voice over NR (VoNR) then like VoLTE, SMS is carried in a SIP message to the IMS, so this negates the need for the Nsmsf_SMService.
2G/3G Shutdown – Diameter to replace SGs-AP (SGd)
With the 2G/3G shutdown in the US operators who had up until this point been relying on SMS-over-NAS using the SGs-AP interface back to their MSCs were forced to make a decision on how to route SMS traffic, after the MSCs were shut down.
This landed with SMS-over-Diameter, where the 4G core (MME) communicates over Diameter with the SMSc.
This has adoption by all the US operators, but we’re not seeing it so widely deployed in the rest of the world.
State of Play
Option
Conditions
Notes
MAP
2G/3G Only
Relies on SS7 signaling and is very old Supports roaming
SGs-AP (SMS-over-NAS)
4G only relies on 2G/3G
Needs an MSC to be present in the network (generally because you have a 2G/3G network and have not deployed VoLTE) Supports limited roaming
SMS over IP (IMS)
4G / 5G
Not supported on 2G/3G networks Relies on a IMS enabled handset and network Supports roaming in all S8 Home Routed scenarios Device support limited, especially for IoT devices
Diameter SGd
4G only / 5G NSA
Only works on 4G or 5G NSA Better device support than 4G/5G Supports roaming in some scenarios
Nsmsf_SMService
5G standalone only
Only works on 5GC Doesn’t support roaming
The convoluted world of SMS delivery options
A Way Forward:
While the SMS payload hasn’t changed in the past 31 years, how it is transported has opened up a lot of potential options for operators to use, with no clear winner, while SMS revenues and traffic volumes have continued to fall.
For better or worse, the industry needs to accept that SMS over NAS is an option to use when there is no IMS, and that in order to decommission 2G/3G networks, IMS needs to be embraced, and so SMS over IP (IMS) supported in all future networks, seems like the simple logical answer to move forward.
And with that clear path forward, we add in another wildcard…
But, when you’ve only got a finite resource of bandwidth, and massive latencies to contend with, the all-IP architecture of IMS (VoLTE / VoNR) and it’s woeful inefficiency starts to really sting.
Of course there are potential workarounds here, Robust Header Correction (ROHC) can shrink this down, but it’s still going to rely on the 3 way handshake of TCP, TCP keepalive timers and IMS registrations, which in turn can starve the radio resources of the satellite link.
Even with SMS over 30 years old, we can still expect it to be a part of networks for years to come, even as WhatsApp / iMessage, etc, offer enhanced services. As to how it’s transported and the myriad of options here, I’m expecting that we’ll keep seeing a multi-transport mix long into the future.
For simple, cut-and-dried 4G/5G only network, IMS and SMS over IP makes the most sense, but for anything outside of that, you’ve got a toolbox of options for use to make a solution that best meets your needs.
Short one, The other day I needed to add a Network Appearance on an SS7/SS7 M3UA linkset.
Network Appearances on M3UA links are kinda like a port number, in that they allow you to distinguish traffic to the same point code, but handled by different logical entities.
When I added the NA parameter on the Linkset nothing happened.
If you’re facing the same you’ll need to set:
cs7 multi-instance
In the global config (this is the part I missed).
Then select the M3UA linkset you want to change and add the network-appearance parameter:
network-appearance 10
And bingo, you’ll start seeing it in your M3UA traffic:
If you’ve ever worked in roaming, you’ll probably have had the misfortune of dealing with Transferred Account Procedures aka TAP files.
It’s used for billing a 2G GSM call right up to 5G data usage, if you use a service while roaming, somewhere in the world there’s a TAP file with your usage in it.
A brief history of TAP
TAP was originally specified by the GSMA in 1991 as a standard CDR interchange format between operators, for use in roaming scenarios.
Notice I said GSMA – Not 3GPP – This means there’s no 3GPP TS docs for this, it’s defined by the industry lobby group’s members, rather than the standards body.
So what does this actually mean? Well, if you’re MNO A and a customer from MNO B roams into your network, all the calls, SMS and data consumed by the roaming subscriber from MNO B will need to be billed to MNO B, by you, MNO A.
If a network operator wants to get paid for traffic used on their network by roaming subscribers, they’d better send out a TAP file to the roamer’s home network.
TAP is the file format generated my MNO A and sent to MNO B, containing all the usage charges that subscribers from MNO B have racked up while roaming into your network.
These are broken down into “Transactions” (CDRs), for events like making a call, connecting a PDN session and consuming data, or sending a text.
In the beginning of time, GSM provided only voice calling service. This meant that the only services a subscriber could consume while roaming was just making/receiving voice calls which were billed at the end of each month. – This meant billing was equally simple, every so often the visisted network would send the TAP files for the voice calls made by subscribers visited other networks, to the home networks, which would markup those charges, and add them onto the monthly invoice for each subscriber who was roaming.
But of course today, calling accounts for a tiny amount of usage on the network, but this happened gradually while passing through the introduction of SMS, CAMEL services, prepaid services, mobile data, etc. For all these services that could be offered, the TAP format had to evolve to handle each of these scenarios.
As we move towards a flat IP architecture, where voice calls and SMS sent while roaming are just data, TAP files for 4G and 5G networks only need to show data transactions, so the call objects, CAMEL parameters and SMS objects are all falling by the wayside.
What’s inside a TAP File
TAP uses the most beloved of formats – ASN1 to encode the data. This means it is strictly formatted and rigidly specified.
Each file contains a Sequence Number which is a monotonically increasing number, which allows the receiver to know if any files have been missed between the file that’s being currently parsed, an the previous file.
They also have a recipient and sender TADIG code, which is a code allocated by GSMA that uniquely identifies the sender and the recipient of the file.
The TAP records exist in one of two common format, Notification Records and transferBatch records.
These files are exchanged between operators, in practice this means “Dumped on an FTP server as agreed between the two”.
TAP Notification Records
Notifications are the simplest of TAP records and are used when there aren’t any CDRs for roaming events during the time period the TAP file covers.
These are essentially blank TAP files generated by the visited network to let the home network know it’s still there, but there are no roaming subs consuming services in that period.
Notification files are really simple, let’s take a look as one shown as JSON:
When we have services to bill and records to charge, that’s when instead we generate a transferBatch record.
It looks something like this:
There’s a lot going on in here, so let’s break it down section by section.
accountingInfo
The accountingInfo section specifies the currency, exchange rate parameters.
Keep in mind a TAP record generated by an operator in the US, would use USD, while the receiver of the file may be a European MNO dealing in EUR.
This gets even more complicated if you’re dealing with more obscure currencies where an intermediary currency is used, that’s where we bring in SDRs (“Special Drawing Right”) that map to the dollar value to be charged, kinda – the roaming agreement defines how many SDRs are in a dollar, in the example below we’re not using any, but you do see it.
When it comes to numbers and decimal places, TAP doesn’t exactly make it easy.
Significant Digits are defined by counting the first number before the decimal point and all the numbers to the right of the decimal point, so for example the number 1.234 would be 4 significant digits (1 digit before the decimal point and 3 digits after it).
Decimal Places are not actually supported for the Value fields in the TAP file. This is tricky because especially today when roaming tariffs are quite low, these values can be quite small, and we need to represent them as an integer number. TAP defines decimal places as the number of digits after the decimal place.
When it comes to the maximum number of decimal places, this actually impacts the maximum number we can store in the field – as ASN1 strictly enforce what we put in it.
The auditControlInfo section contains the number of CDRs (callEventDetailsCount) contained in the TAP file, the timestamp of the first and last CDR in the file, the total charge and any tax charged.
All of the currency information was provided in the accountingInfo so this is just giving us our totals.
A CDR has 30 days from the time it was generated / service consumed by the roamer, to be baked into a TAP file. After this we can no longer charge for it, so it’s important that the earliestCallTimeStamp is not more than 30 days before the fileCreationTimeStamp seen in batchControlInfo.
batchControlInfo
The batchControlInfo section specifies the time the TAP file became available for transfer, the time the file was created (usually the same), the sequence number and the sender / recipient TADIG codes.
As mentioned earlier, we track sequence number so the receiver can know if a TAP file has been missed; for example if you’ve got TAP file 1 and TAP file 3 comes in, you can determine you’ve missed TAP file 2.
Now we’re getting to the meat & potatoes of our TAP record, the CDRs themselves.
In LTE networks these are just records of data consumption, so let’s take a look inside the gprsCall records under callEventDetails:
In the gprsBasicCallInformation we’ve got as the name suggests the basic info about the data usage event. The time when the session started, the charging ID, the IMSI and the MSISDN of the subscriber to charge, along with their IP and the APN used.
Next up we have the gprsLocationInformation – rates and tariffs may be set based on the location of the subscriber, so we need to identify the area the sub was using the services to select correct tariff / rate for traffic in this destination.
The recEntity is the index number of the SGW / PGW used for the transaction (more on that later).
Next we have the gprsServiceUsed which, again as the name suggests, details the services used and the charge.
chargeDetailList contains the charged data (Made up of dataVolumeIncoming + dataVolumeOutgoing) and the cost.
The chargeableUnits indicates the actual data consumed, however most roaming agreements will standardise on some level of rounding, for example rounding up to the nearest Kilobyte (1024 bytes), so while a sub may consume 1025 bytes of data, they’d be billed for 2045 bytes of data. The data consumed is indicated in the chargeableUnits which indicates how much data was actually consumed, before any rounding policies where applied, while the amount that is actually charged (When taking into account rounding policies) isindicated inside Charged Units.
In the example below data usage is rounded up to the nearest 1024 bytes, 134390 bytes rounds up to the nearest 1024 gives you 135168 bytes.
As this is data we’re talking bytes, but not all bytes are created equal!
VoLTE traffic, using a QCI1 bearer is more valuable than QCI 9 cat videos, and TAP records take this into account in the Call Type Groups, each of which has a different price – Call Type Level 1 indicates the type of traffic, for S8 Home Routed LTE Traffic this is 10 (HGGSN/HP-GW), while Call Type Level 2 indicates the type of traffic as mapped to QCI values:
So Call Type Level 2 set to 20 indicates that this is “20 Unspecified/default LTE QCIs”, and Call Type Level 3 can be set to any value based on a defined inter-operator tariff.
recEntityType 7 means a PGW and contains the IP of the PGW in the Home PLMN, while recEntityType 8 means SGW and is the SGW in the Visited PLMN.
So this means if we reference recEntityCode 2 in a gprsCall, that we’re referring to an SGW at 1.2.3.5.
Lastly also got the utcTimeOffsetInfo to indicate the timezones used and assign a unique code to it.
Using the Records
We as humans? These records aren’t meant for us.
They’re designed to be generated by the Visited PLMN and sent to to the home PLMN, which ingests it and pays the amount specified in the time agreed.
Generally this is an FTP server that the TAP records get dumped into, and an automated bank transfer job based on the totals for the TAP records.
Testing of the TAP records is called “TADIG Testing” and it’s something we’ll go into another day, but in essence it’s validating that the output and contents of the files meet what both operators think is the contract pricing and specifications.
So that’s it! That’s what’s in a TAP record, what it does and how we use it!
GSMA are introducing BCE – Billing & Charging Evolution, a new standard, designed to last for the next 30+ years like TAP has. It’s still in its early days, but that’s the direction the GSMA has indicated it would like to go.
If you’ve ever received an SMS from your operator, and the sender was the Operator name for example, you may be left wondering how it’s done.
In IMS you’d think this could be quite simple – You’d set the From header to be the name rather than the MSISDN, but for most SMSoIP deployments, the From header is ignored and instead the c header inside the SMS body is used.
So how do we get it to show text?
Well the TP-Originating address has the “Type of Number” (ToN) field which is typically set to International/National, but value 5 allows for the Digits to instead be alphanumeric characters.
GSM 7 bit encoding on the text in the TP-Originating Address digits and presto, you can send SMS to subscribers where the message shows as From an alphanumeric source.
On Android SMSs received from alphanumeric sources cannot be responded to (“no more “DO NOT REPLY TO THIS MESSAGE” at the end of each text), but on iOS devices you can respond, but if I send an SMS from “Nick” the reply from the subscriber using the iPhone will be sent to MSISDN 6425 (Nick on the telephone keypad).
Unstructured Supplementary Service Data or “USSD” is the stack used in Cellular Networks to offer interactive text based menus and systems to Subscribers.
If you remember topping up your mobile phone credit via a text menu on your flip phone, there’s a good chance that was USSD*.
For a period, USSD Services provided Sporting Scores, Stock Prices and horoscopes on phones and networks that were not enabled for packet data.
Unlike plain SMS-PP, USSD services are transaction stateful, which means that there is a session / dialog between the subscriber and the USSD gateway that keeps track of the session and what has happened in the session thus far.
T-Mobile website from 2003 covering the features of their USSD based product at the time
Today USSD is primarily used in the network at times when a subscriber may not have balance to access packet data (Internet) services, so primarily is used for recharging with vouchers.
Osmocom’s HLR (osmo-hlr) has an External USSD interface to allow you to define the USSD logic in another entity, for example you could interface the USSD service with a chat bot, or interface with a billing system to manage credit.
Using the example code provided I made a little demo of how the service could be used:
Communication between the USSD Gateway and the HLR is MAP but carried GSUP (Rather than the full MTP3/SCCP/TCAP layers that traditionally MAP stits on top of), and inside the HLR you define the prefixes and which USSD Gateway to route them to (This would allow you to have multiple USSD gateways and route the requests to them based on the code the subscriber sends).
(I had hoped to make a Python example and actually interface it with some external systems, but another day!)
The signaling is fairly straight forward, when the subscriber kicks off the USSD request, the HLR calls a MAP Invoke operation for “processUnstructuredSS-Request”
Unfortunately is seems the stock Android does not support interactive USSD. This is exposed in the Android SDK so applications can access USSD interfaces (including interactive USSD) but the stock dialer on the few phones I played with did not, which threw a bit of a spanner in the works. There are a few apps that can help with this however I didn’t go into any of them.
(or maybe they used SIM Toolkit which had a similar interface)
This is part of a series of posts looking into SS7 and Sigtran networks. We cover some basic theory and then get into the weeds with GNS3 based labs where we will build real SS7/Sigtran based networks and use them to carry traffic.
Having a direct Linkset from every Point Code to every other Point Code in an SS7 network isn’t practical, we need to rely on routing, so in this post we’ll cover routing between Point Codes on our STPs.
Let’s start in the IP world, imagine a router with a routing table that looks something like this:
Simple IP Routing Table
192.168.0.0/24 out 192.168.0.1 (Directly Attached)
172.16.8.0/22 via 192.168.0.3 - Static Route - (Priority 100)
172.16.0.0/16 via 192.168.0.2 - Static Route - (Priority 50)
10.98.22.1/32 via 192.168.0.3 - Static Route - (Priority 50)
We have an implicit route for the network we’re directly attached to (192.168.0.0/24), and then a series of static routes we configure. We’ve also got two routes to the 172.16.8.0/22 subnet, one is more specific with a higher priority (172.16.8.0/22 – Priority 100), while the other is less specific with a lower priority (172.16.0.0/16 – Priority 50). The higher priority route will take precedence.
This should look pretty familiar to you, but now we’re going to take a look at routing in SS7, and for that we’re going to be talking Variable Length Subnet Masking in detail you haven’t needed to think about since doing your CCNA years ago…
Why Masking is Important
A route to a single Point Code is called a “/14”, this is akin to a single IPv4 address being called a “/32”.
We could setup all our routing tables with static routes to each point code (/14), but with about 4,000 international point codes, this might be a challenge.
Instead, by using Masks, we can group together ranges of Point Codes and route those ranges through a particular STP.
This opens up the ability to achieve things like “Route all traffic to Point Codes to this Default Gateway STP”, or to say “Route all traffic to this region through this STP”.
Individually routing to a point code works well for small scale networking, but there’s power, flexibility and simplification that comes from grouping together ranges of point codes.
Information Overload about Point Codes
So far we’ve talked about point codes in the X.YYY.Z format, in our lab we setup point codes like 1.2.3.
This is not the only option however…
Variants of SS7 Point Codes
IPv4 addresses look the same regardless of where you are. From Algeria to Zimbabwe, IPv4 addresses look the same and route the same.
In SS7 networks that’s not the case – There are a lot of variants that define how a point code is structured, how long it is, etc. Common variants are ANSI, ITU-T (International & National variants), ETSI, Japan NTT, TTC & China.
The SS7 variant used must match on both ends of a link; this means an SS7 node speaking ETSI flavoured Point Codes can’t exchange messages with an ANSI flavoured Point Code.
Well, you can kinda translate from one variant to another, but requires some rewriting not unlike how NAT does it.
ITU International Variant
For the start of this series, we’ll be working with the ITU International variant / flavour of Point Code.
ITU International point codes are 14 bits long, and format is described as 3-8-3. The 3-8-3 form of Point code just means the 14 bit long point code is broken up into three sections, the first section is made up of the first 3 bits, the second section is made up of the next 8 bits then the remaining 3 bits in the last section, for a total of 14 bits.
So our 14 bit 3-8-3 Point Code looks like this in binary form:
If you’re dealing with multiple vendors or products,you’ll see some SS7 Point Codes represented as decimal (2067), some showing as 1-2-3 codes and sometimes just raw binary. Fun hey?
So why does the binary part matter? Well the answer is for masks.
To loop back to the start of this post, we talked about IP routing using a network address and netmask, to represent a range of IP addresses. We can do the same for SS7 Point Codes, but that requires a teeny bit of working out.
As an example let’s imagine we need to setup a route to all point codes from 3-4-0 through to 3-6-7, without specifying all the individual point codes between them.
Firstly let’s look at our start and end point codes in binary:
100-00000100-000 = 3-004-0 (Start Point Code)
100-00000110-111 = 3-006-7 (End Point Code)
Looking at the above example let’s look at how many bits are common between the two,
100-00000100-000 = 3-004-0 (Start Point Code)
100-00000110-111 = 3-006-7 (End Point Code)
The first 9 bits are common, it’s only the last 5 bits that change, so we can group all these together by saying we have a /9 mask.
When it comes time to add a route, we can add a route to 3-4-0/9 and that tells our STP to match everything from point code 3-4-0 through to point code 3-6-7.
The STP doing the routing it only needs to match on the first 9 bits in the point code, to match this route.
SS7 Routing Tables
Now we have covered Masking for roues, we can start putting some routes into our network.
In order to get a message from one point code to another point code, where there isn’t a direct linkset between the two, we need to rely on routing, which is performed by our STPs.
This is where all that point code mask stuff we just covered comes in.
Let’s look at a diagram below,
Let’s look at the routing to get a message from Exchange A (SSP) on the bottom left of the picture to Exchange E (SSP) with Point Code 4.5.3 in the bottom right of the picture.
Exchange A (SSP) on the bottom left of the picture has point code 1.2.3 assigned to it and a Linkset to STP-A. It has the implicit route to STP-A as it’s got that linkset, but it’s also got a route configured on it to reach any other point code via the Linkset to STP-A via the 0.0.0/0 route which is the SS7 equivalent of a default route. This means any traffic to any point code will go to STP-A.
From STP-A we have a linkset to STP-B. In order to route to the point codes behind STP-B, STP-A has a route to match any Point Code starting with 4.5.X, which is 4.5.0/11. This means that STP-A will route any Point Code between 4.5.1 and 4.5.7 down the Linkset to STP-B.
STP-B has got a direct connection to Exchange B and Exchange E, so has implicit routes to reach each of them.
So with that routing table, Exchange A should be able to route a message to Exchange E.
But…
Return Routing
Just like in IP routing, we need return routing. while Exchange A (SSP) at 1.2.3 has a route to everywhere in the network, the other parts of the network don’t have a route to get to it. This means a request from 1.2.3 can get anywhere in the network, but it can’t get a response back to 1.2.3.
So to get traffic back to Exchange A (SSP) at 1.2.3, our two Exchanges on the right (Exchange B & C with point codes 4.5.6 and 4.5.3) will need routes added to them. We’ll also need to add routes to STP-B, and once we’ve done that, we should be able to get from Exchange A to any point code in this network.
There is a route missing here, see if you can pick up what it is!
So we’ve added a default route via STP-B on Exchange B & Exchange E, and added a route on STP-B to send anything to 1.2.3/14 via STP-A, and with that we should be able to route from any exchange to any other exchange.
One last point on terminology – when we specify a route we don’t talk in terms of the next hop Point Code, but the Linkset to route it down. For example the default route on Exchange A is 0.0.0/0 via STP-A linkset (The linkset from Exchange A to STP-A), we don’t specify the point code of STP-A, but just the name of the Linkset between them.
Back into the Lab
So back to the lab, where we left it was with linksets between each point code, so each Country could talk to it’s neighbor.
Let’s confirm this is the case before we go setting up routes, then together, we’ll get a route from Country A to Country C (and back).
So let’s check the status of the link from Country B to its two neighbors – Country A and Country C. All going well it should look like this, and if it doesn’t, then stop by my last post and check you’ve got everything setup.
So let’s add some routing so Country A can reach Country C via Country B. On Country A STP we’ll need to add a static route. For this example we’ll add a route to 7.7.1/14 (Just Country C).
That means Country A knows how to get to Country C. But with no return routing, Country C doesn’t know how to get to Country A. So let’s fix that.
We’ll add a static route to Country C to send everything via Country B.
CountryC#conf t
Enter configuration commands, one per line. End with CNTL/Z.
CountryC(config)#cs7 route-table system
CountryC(config)#update route 0.0.0/0 linkset ToCountryB
*Jan 01 05:37:28.879: %CS7MTP3-5-DESTSTATUS: Destination 0.0.0 is accessible
So now from Country C, let’s see if we can ping Country A (Ok, it’s not a “real” ICMP ping, it’s a link state check message, but the result is essentially the same).
By running:
CountryC# ping cs7 1.2.3
*Jan 01 06:28:53.699: %CS7PING-6-RTT: Test Q.755 1.2.3: MTP Traffic test rtt 48/48/48
*Jan 01 06:28:53.699: %CS7PING-6-STAT: Test Q.755 1.2.3: MTP Traffic test 100% successful packets(1/1)
*Jan 01 06:28:53.699: %CS7PING-6-RATES: Test Q.755 1.2.3: Receive rate(pps:kbps) 1:0 Sent rate(pps:kbps) 1:0
*Jan 01 06:28:53.699: %CS7PING-6-TERM: Test Q.755 1.2.3: MTP Traffic test terminated.
We can confirm now that Country C can reach Country A, we can do the same from Country A to confirm we can reach Country B.
But what about Country D? The route we added on Country A won’t cover Country D, and to get to Country D, again we go through Country B.
This means we could group Country C and Country D into one route entry on Country A that matches anything starting with 7-X-X,
For this we’d add a route on Country A, and then remove the original route;
Of course, you may have already picked up, we’ll need to add a return route to Country D, so that it has a default route pointing all traffic to STP-B. Once we’ve done that from Country A we should be able to reach all the other countries:
CountryA#show cs7 route
Dynamic Routes 0 of 1000
Routing table = system Destinations = 3 Routes = 3
Destination Prio Linkset Name Route
---------------------- ---- ------------------- -------
4.5.6/14 acces 1 ToCountryB avail
7.0.0/3 acces 5 ToCountryB avail
CountryA#ping cs7 7.8.1
*Jan 01 07:28:19.503: %CS7PING-6-RTT: Test Q.755 7.8.1: MTP Traffic test rtt 84/84/84
*Jan 01 07:28:19.503: %CS7PING-6-STAT: Test Q.755 7.8.1: MTP Traffic test 100% successful packets(1/1)
*Jan 01 07:28:19.503: %CS7PING-6-RATES: Test Q.755 7.8.1: Receive rate(pps:kbps) 1:0 Sent rate(pps:kbps) 1:0
*Jan 01 07:28:19.507: %CS7PING-6-TERM: Test Q.755 7.8.1: MTP Traffic test terminated.
CountryA#ping cs7 7.7.1
*Jan 01 07:28:26.839: %CS7PING-6-RTT: Test Q.755 7.7.1: MTP Traffic test rtt 60/60/60
*Jan 01 07:28:26.839: %CS7PING-6-STAT: Test Q.755 7.7.1: MTP Traffic test 100% successful packets(1/1)
*Jan 01 07:28:26.839: %CS7PING-6-RATES: Test Q.755 7.7.1: Receive rate(pps:kbps) 1:0 Sent rate(pps:kbps) 1:0
*Jan 01 07:28:26.843: %CS7PING-6-TERM: Test Q.755 7.7.1: MTP Traffic test terminated.
So where to from here?
Well, we now have a a functional SS7 network made up of STPs, with routing between them, but if we go back to our SS7 network overview diagram from before, you’ll notice there’s something missing from our lab network…
So far our network is made up only of STPs, that’s like building a network only out of routers!
In our next lab, we’ll start adding some SSPs to actually generate some SS7 traffic on the network, rather than just OAM traffic.
SMS by default uses the GSM-7 bit alphabet, thanks to the fact each letter is only 7 bits long, this means you can cram 160 characters into a 140 byte message body.
However, this 7-bit alphabet is, well, limited, because it’s 7 bits long it means we can only have 128 different combinations of these bits, or to put it another way, with only 128 different unique combinations of these bits, we can only define 128 characters.
You have the standard 26 latin alphabet characters that Sesame Street drilled into you, some characters with accents, digits, and a limited set of symbols.
The GSM 7 bit alphabet does not include is character sets and symbols common for non-English written languages.
Shift Tables
To deal with this 3GPP introduced “National Language Shift Tables”, which are enable a sort of find-and-replace approach to the 7-bit alphabet, where certain characters that are unused in one alphabet, take the value of characters from the local alphabet.
So if you want to send the character Ğ (Found in the Turkish and Azerbaijani alphabets) you’d select the Turkish language Shift table, that replaces the capital G (71) with Ğ.
Of course you need to have two things to do this, you need the Language Shift Table to tell you what local-language letters replace what default letters, and a mechanism to state that you’re using a language shift table.
3GPP define the National Language Shift tables in TS 23.038, where you can lookup the character you want to encode, so you know what 7 bit value it uses, for example our character Ğ is 1000111 in the 7-bit alphabet.
Next we need to indicate that we don’t want 1000111 in the 7-bit alphabet to be rendered as “G”, we want to use the “Turkish National Language Single Shift Table” which will render it as “Ğ”. We do this in the User Data Header of the SMS Body, the same way we’d indicate that an SMS is a concatenated SMS.
But by adding a header in the User Data Header of the SMS Body, we eat into the space we can use to send the message body, with a single User Data Header indicating that the Turkish National Language Single Shift Table is being used, we go from a maximum of 160 characters without the User Data Header, to 134 characters.
I’ve shared a lot more information on the User Data Header in this post on Concatenated SMS, should you be interested.
No matter how many shift tables you define, you’re not going to cover all of these in a 7-bit alphabet.
So all this encoding falls to 💩 when someone adds an Emoji.
The “😀” Emoji, represented as U+1F600 in Unicode, can be encoded as 0xF09F9880 in UTF-8 or 0xD83DDE00 in UTF-16.
So in 3GPP Networks, when you need more than 128 characters to work with, and when shift tables won’t cut the mustard, you can change the encoding used to use the International Standards Organisations’ “Universal coded character set 2” (UCS-2).
Unfortunately UCS-2 never really took off, but luckily it overlaps with UTF-16 character set, which is a lot more common.
So if you’ve got a “😀” Emoji in your SMS body the encoding of the message will be changed from GSM-7 to use a different encoding -UTF-16 / UCS2.
SMS Body showing TP-DCS character set is UCS2 / UTF-16 as Emojis are present
There’s a catch here, if you’re moving from a 7-bit alphabet to a 16 bit alphabet, you’re going to have a lot less space to work with.
A single SMS contains 1120 bits for the user data (The actual message).
With GSM-7 bit encoding, each letter takes up 7 bits, so 1120÷7 gives us 160 characters.
With UTF-16/UCS2 encoding, each letter takes up 16 bits so 1120÷16 only give us 70 characters.
So what happens next?
Often when Emojis are used, as our message is now limited to 70 characters concatenated messages are used, which takes a further 8 bytes of our message body if concatenated messages are used, further limiting the message length.
Most people think of 160 characters as the length of an SMS. But the payload is actually 140 bytes, but with better encoding 1 character doesn’t require 1 byte.
The above paragraph is exactly 160 characters. It would fit into a standard SMS.
By using the GSM 7 bit alphabet, you can cram 16 characters into 140 bytes (octets) of space, which is kind of cool.
140 bytes of data containing 160 characters of text
You’d think if you took a 160 character SMS, and concatenated it onto another 160 character SMS, you’d get a total of 320 characters, right (160+160=320)? Alas it’s not that simple.
In order to achieve the concatenation of messages in a way that’s transparent to the users (rather than a series of SMSes coming through one-after-the-ther) a User-Data Header (TP-User-Data-Header-Indicator aka TP-UHDI) is added to the TP-User Data of the TPDU (the part that actually contains the user message).
This User-Data Header takes up 7 bytes, which with GSM encoding robs us of 6 characters from the message length. (Not a typo, GSM7 encoding does not mean 1 character = 1 byte, hence we can get 160 characters into 140 bytes of space) So a two SMS concatenated message would only allow 268 characters to be sent (134 characters + 134 characters).
Let’s take a look at this header that’s robbing us of message length, but enabling us to concatenate messages.
For starters, the information about how many parts in the concatenated message, and what part number this one is, is located in the message body, hence robbing us of characters.
But we only know about the presence of this header being in the message body because the SMS-SUBMIT TPDU has the TP-UDHI flag (TP-User-Data-Header-Indicator) set, so we know the User Data is prefixed with the User-Data-Header.
Now if we have a look in the TP-User-Data we can see the User-Data Header, this can actually carry a few different payloads, but in our case, it’s carrying the Concatenated Short Messages IE, which tells us the message identifier (unique per single-but-multi-part message, the number of parts in the message (in this case 2) and the part number this is (part 1 of 2).
First part of a two part SMS
Now the phone has indicated this is a multipart message, the length of the data is still 160, but the length of the actual message is now limited to 134 characters with GSM7 encoding.
The encoding isn’t as bad as you might expect: 1st byte indicates the total length of the User Data Headers (After this the actual user data begins), 2nd byte is the IE identifier, for Concatenated Short Messages, this is 00, 3rd byte is the length of the Concatenated Short Messages IE, 4th byte is the message identifier in hex, 5th byte is the number of message parts in hex (So up to 255 message parts) 6th byte is the message part number, to aid in putting it back together in order.
3GPP TS 23.040 – 9.2.3.24 TP-User Data (TP-UD) – Encoding of User Data Header and generic IEConcatenated short message IE encoding
So what we end up with is a header inside our user payload, advising that this is a concatenated SMS, the message identifier, the number of parts in the message, and the part number of this particular message.
Last part of two part SMS
The SMSc on receipt of these has to spool them back out to the destination with the same message part number, and same headers in place.
The phone receiving the SMS has to wait for all the parts to come through and then reassemble before rendering to the user.
So that’s how concatenated SMS works. While this may seem convoluted and silly in a world where transfering more than 140 bytes of data is trivial, SMS was introduced in the early 1990s, and in theory at least, a user with a phone that supported SMS purchased when SMS was introduced, should still be able to interwork with phones today.
We’ve covered SMS in the past, but MMS is a different kettle of fish.
Let’s look at how the call flow goes, when Bob wants to send a picture to Alice.
Before Bob sends the MMS, his phone will have to be setup with the correct settings to send MMS. Sometimes this is done manually, for others it’s done through the Carrier provisioning SMS that preloads the settings, and for others it’s baked in based on the Android Carrier settings XML,
APN settings for Telstra in Australia for MMS
It’s made up of the APN to send MMS traffic over, the MMSC address (Multimedia Message Switching Center) and often an MMS proxy and port combination for where the traffic will actually go.
Message Flow – Bob to MMSC (Mobile Originated MMS)
Bob opens his phone, creates a new message to Alice, selects the picture (or other multimedia filetype) to send to her and hits the send button.
For starters, MMS has a file size limit, like MTU it’s not advertised, so you don’t know if you’ve hit it, so rather like MTU is a “lowest has the highest success of getting through” rule. So Bob’s phone will most likely scale the image down to fit inside 300K.
Next Bob’s phone knows it has an MMS to send, for this is opens up a new bearer on the MMS APN, typically called MMS, but configured in the phone by Bob.
Why use a separate APN for sending 300K of MMS traffic? Once upon a time mobile data was expensive. By having a separate APN just for MMS traffic (An APN that could do nothing except send / receive MMS) allowed easier billing / tariffing of data, as MMS traffic was sent over a APN which was unmetered.
After the bearer is setup on the MMS APN, Bob’s phone begins crafting a HTTP 1.1 Post to be sent to the MMSC. The content type of this request will be application/vnd.wap.mms-message and the body of the HTTP post will be made up of MMS Message Encapsulation, with the body containing the picture he wants to send to Alice.
Note: Historically Wireless Session Protocol (WSP) was used in lieu of HTTP. These clients would now need a WAP gateway to translate into HTTP.
This HTTP Post is then sent to the MMSC Address, or, if present, the MMSC Proxy address. This traffic is sent over the MMS APN that we just brought up.
HTTP POST Headers for the MO MMS MessageMMS Message Encapsulation from MO MMS Message
The MMSC receives this information, and then, if all was successful, responds with a 200 OK,
200 OK response to MO MMS Message
So now the MMSC has the information from Bob, let’s flip over to Alice.
Message Flow – MMSC to Alice (Mobile Terminated MMS)
For the purposes of simplicity, we’re going to rule out the MMSC from doing clever things like converting the media, accepting email (SMPP) as MMS, etc, etc. Instead we’re going to assume Alice and Bob are on the same Network, and our MMSC is just doing store-and-forward.
The MMSC will look at the To address in the MMS Message Encapsulation of the request Bob sent, to determine that this message is destined for Alice.
The MMSC will load the media content (photo) sent by Bob destined for Alice and serve it via HTTP. The MMSC generates a random URL to serve it this particular file on, with each MMS the MMSC handles being assigned a random URL containing the media content.
Next the MMSC will need to tell Alice’s phone, that she has an MMS waiting for her. This is done by generating an SMS to send to Alice’s phone,
The user-data of this SMS is the Wireless Session Protocol with the method PUSH – Aka WAP Push.
SMS alerting the user of an MMS waiting for delivery
This specially encoded SMS is parsed by the Alice’s phone, which tells the her there is an MMS message waiting for her.
On some operating systems this is pulled automatically, on others, users need to select “Download” to actually get the file.
The UE then just runs an HTTP get to the address in the X-Mms-Content-Location: Header to pull the multimedia content that Bob sent.
HTTP GET from Alice’s Phone / UE to retrieve MMS sent by Bob (MT-MMS)
All going well the URL is valid and Alice’s phone retrieves the message, getting a 200 OK back from the server with the message content.
HTTP Response (200 OK) for MT-MMS, sent by the MMSC to Alice’s phone with the MMS Body
So now Alice’s phone has the MMS content and renders it on the screen, Alice can see the Photo Bob sent her.
Lastly Alice’s phone sends a HTTP POST again to the MMSC, this time indicating the message status is “Retrieved”,
And to close everything off the MMSC confirms receipt of the Retrieved status with a 200 OK, and we are done.
What didn’t we cover?
So that’s a basic MMS message flow, but there’s a few parts we didn’t cover.
The overall architecture beyond just the store-and forward behaviour, charging and authentication we didn’t cover. So let’s look at each of these points.
Overall Architecture
What we just covered what what’s defined as the MM1 interface.
There’s obviously a stack of other interfaces, such as for charging, messaging between MMSC/Carriers, subscriber locating / user database, etc.
Charging
MMSCs would typically have a connection to trigger charging events / credit-control events prior to processing the message.
For online charging the Ro interface can be used, as you would for IMS charging events.
3GPP 3GPP TS 32.270 covers the charging architecture for online/offline charging for MMS.
Authentication
Unfortunately authentication was a bit of an afterthought for the MMS standard, and can be done several different ways.
The most common is to correlate the IP Address on the MMS APN against a subscriber.
Want more telecom goodness?
I have a good old fashioned RSS feed you can subscribe to.