Tag Archives: VoLTE

Best Practices for SGW & PGW Deployment Architectures for Roaming

The S8 Home Routing approach for LTE roaming works really well. As more and more operators switch off their legacy circuit-switched 2G/3G networks and shift to LTE & VoLTE for roaming, we’re seeing more and more S8-HR deployments.

When LTE was being standardised in 2008, Local Breakout (LBO) and S8 Home Routing were both considered options for how roaming may look. Fast forward to today, and S8 Home routing is the only way roaming is done for modern deployments.

In light of this, we’ve developed some “best practices” for an “all S8 Home Routed” world that I thought I’d share.

The Basics

When roaming, the SGW in the Visited Network sends user traffic back to the PGW in the Home Network.

This means Online/Offline charging, IMS, PCRF, etc, is all done in the Home PLMN. As long as data packets can get from the SGW in the Visited PLMN to the PGW in the Home PLMN, and authentication flows from the Visited MME to the HSS in the Home PLMN, you’re golden.

The Constraints

Of course real networks don’t look as simple as this, in reality a roaming scenario for a visited network has a lot more nodes, which need to be

Building Distributed Packet Core & IMS

Virtualization (VNF / CNF) has led operators away from “big iron” hardware for Packet Core & IMS nodes, towards software based solutions, which in turn offer a lot more flexibility.

Best practice for User Plane design is to keep the latency down by bringing the user plane closer to the user (the idea of “Edge” UPFs in 5GC is a great example of this), and the move away from “big iron” in central locations for SGW and PGW nodes has been the trend for the past decade.

So to achieve these goals in the networks we build, we geographically distribute the core network.

This means we’ve got quite a few S-GW, P-GW, MME & HSS instances across the network.
There are some real advantages to this approach:

From a redundancy perspective this allows us to “spread the load” and build far more resilient networks. A network with 20 smaller HSS instances spread around the country, is far more resilient than 2 massive ones, regardless of how many power feeds or redundant disks it may have.

This allows us to be more resource efficient. MNOs have always provisioned excess capacity to cater for the loss of a node. If we have 2 MMEs serving a country, then each node has to have at least 50% capacity free, so if one MME were to fail, the other MME could handle the additional load from its dead friend. This is costly for resources. Having 20 MMEs means each MME only has to have 5% capacity free to handle the loss of one MME in the pool.
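The headroom numbers above fall out of some simple arithmetic. Here’s a minimal sketch, assuming load is spread evenly across the pool and we only plan for losing one node at a time (the function name is just for illustration):

```python
def required_free_fraction(pool_size: int) -> float:
    """Fraction of each node's capacity that must stay free so the
    surviving nodes can absorb the load of one failed node."""
    if pool_size < 2:
        raise ValueError("need at least two nodes to survive a failure")
    # The load carried by n nodes must still fit on n-1 nodes, so each
    # node can run at most (n-1)/n utilised, which leaves 1/n free.
    return 1 / pool_size


for n in (2, 20):
    print(f"{n} MMEs -> {required_free_fraction(n):.0%} headroom per MME")
```

If you plan for more than one simultaneous failure, or your load isn’t evenly balanced, the numbers will be less flattering, but the trend is the same.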

It also forces our infrastructure teams to manage infrastructure “as cattle” rather than pets. These boxes don’t get names and aren’t lovingly crafted; they’re automatically spun up and destroyed without thinking about it.

For security, we only use internal IP addresses for the nodes in our packet core. This provides another layer of protection for the “crown jewels” of our network, so no one messing with BGP filtering can accidentally open the flood gates to our core, as one US operator learned when a GGSN left open to the world led to the private information of 100 million customers being leaked.

What this all adds to, is of course, the end user experience.
For the end subscriber / customer, they get a better experience thanks to the reduced latency the connection provides, better uptime and faster call setup / SMS delivery, and less cost to deliver services.

I love this approach and could proselytise about it all day, but in a roaming context this presents some challenges.

The distributed networks we build are in a constant state of flux: new capacity is being provisioned in some areas, nodes are being decommissioned in others, and our core nodes are only reachable on internal IPs, so wouldn’t be reachable by roaming networks.

Our Distributed-Core Roaming Solution

To resolve this we’ve taken a novel approach, we’ve deployed a pair of S-GWs we call the “Roaming SGWs”, and a pair of P-GWs we call the “Roaming PGWs”, these do have public IPs, and are dedicated for use only by roaming traffic.

We really like this approach for a few reasons:

It allows us to be really flexible and do what we want inside the network, without impacting roaming customers or operators who use our network for roaming. All the benefits I described from the distributed architectures can still be realised.

From a security standpoint, only these SGW/PGW pairs have public IPs; all the others are on internal IPs. This is good for security: our core network is the ‘crown jewels’ of the network and we only expose an edge to other providers. Even though IPX networks are supposed to be secure, one of the largest IPX providers had their systems breached for 5 years before it was detected, so being almost as distrustful of IPX traffic as Internet traffic is a good thing.
This allows us to put these PGWs / SGWs at the “edge” of our network, and keep all our MMEs, as well as our on-net PGW and SGWs, on internal IPs, safe and secure inside our network.

For charging on the SGWs, we only need to worry about collecting CDRs from one set of SGWs (to go into the TAP files we use to bill the other operators), rather than running around hoovering up SGW CDRs from large numbers of Serving Gateways, which may get blown away and replaced without warning.

Of course, there is a latency angle to this, for international roaming, the traffic has to cross the sea / international borders to get to us. By putting it at the edge we’re seeing increased MOS on our calls, as the traffic is as close to the edge of the network as can be.

Caveat: Increased S11 Latency on Core Network sites over Satellite

This is probably not relevant to most operators, but some of our core network sites are fed only by satellite, and the move to this architecture shifted something: Rather than having latency on the S8 interface from the SGW to the PGW due to the satellite hop, we’ve got latency between the MME and the SGW due to the satellite hop.

It just shifts where in the chain the latency lies, but it did lead to us having to boost some timers in the MME, and adjust out-of-sequence delivery detection, on what had always been an internal interface previously.

Evolution to 5G Standalone Roaming

This approach aligns to the Home Routed options for 5G-SA roaming; UPF chaining means that the roaming traffic can still be routed, as seems to be the way the industry is going.

SA roaming is in its infancy, without widely deployed SA networks, we’re not going to see common roaming using SA for a good long while, but I’ll be curious to see if this approach becomes the de facto standard going forward.

Where to from here?

We’re pretty happy with this approach in the networks we’ve been building.

So far it’s made IREG testing easier as we’ve got two fixed points the IPX needs to hit (The DRAs and the SGWs) rather than a wide range of networks.

Operators with a vast number of APNs they need to drop into different VRFs may have to do some traffic engineering here – Our operations are generally pretty flat, but I can see where this may present some challenges for established operators shifting their traffic.

I’d be keen to hear if other operators are taking this approach and whether they’ve run into any issues. If you can see any issues in this, feel free to drop a comment below.

How do you know if they’re roaming? Charging challenges in IMS for Roamers

I got an email the other day asking a simple question:

How do I know if a subscriber is VoLTE roaming or not when they send an SMS to charge for it?

My immediate reaction was to look at the SIP headers, P-Access-Network-Info will tell you where the subscriber is located, end of.

Right?

Well not quite, this will tell the SMSc the location of the subscriber sending the SMS. If the PLMN in the P-Access-Network-Info != the home PLMN, the sub is roaming.
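As a rough sketch of that comparison: for E-UTRAN access, my understanding of TS 24.229 is that the utran-cell-id-3gpp parameter starts with the MCC followed by the MNC, so a prefix check against the home PLMN is enough to spot a roamer. The header values and the home PLMN (50501) below are made up for illustration, and a prefix check won’t tell a 2-digit MNC from a 3-digit MNC that shares the same leading digits:

```python
from typing import Optional


def pani_cell_id(pani: str) -> Optional[str]:
    """Pull the utran-cell-id-3gpp value out of a P-Access-Network-Info
    header value, or return None if the parameter is absent."""
    # The first element is the access type (e.g. 3GPP-E-UTRAN-FDD),
    # everything after it is a ;-separated list of parameters.
    for param in pani.split(";")[1:]:
        key, _, value = param.strip().partition("=")
        if key.lower() == "utran-cell-id-3gpp":
            return value.strip().strip('"')
    return None


def is_roaming(pani: str, home_plmn: str) -> bool:
    """True if the serving cell's PLMN differs from the home PLMN.
    Assumes the cell ID begins with MCC + MNC (see caveat above)."""
    cell_id = pani_cell_id(pani)
    if cell_id is None:
        raise ValueError("no utran-cell-id-3gpp in P-Access-Network-Info")
    return not cell_id.startswith(home_plmn)


# Hypothetical header values, home PLMN 505/01
print(is_roaming("3GPP-E-UTRAN-FDD;utran-cell-id-3gpp=5050100010000a01", "50501"))
print(is_roaming("3GPP-E-UTRAN-FDD;utran-cell-id-3gpp=2341500010000a01", "50501"))
```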

But does this information get passed to the OCS / OFCS?

The SMSc uses “Event based charging” to perform credit control, so let’s have a look at what AVPs are present in the Credit Control Request from the SMSc:

Hmm, the SMS-Information AVP (2000) contains a bunch of information about the SMS being sent, but I don’t see anything about the location of the sender in there.

Originator-Interface is just set to “SIP”, of course in a 2G/3G roaming scenario the Originator-SCCP-Address would be that of the Visited PLMN, but for us it is our SCCP address.

Maybe the standard allows for an additional optional AVP in the SMS-Information-AVP we’re missing? Let’s check TS 32.299:

Nope.

So how to deal with this?

While the standards aren’t totally clear on this, we added an IMS-Info AVP and inside that populated the Access-Network-Information directly from the SIP header, and then picked that off inside our OCS in order to apply the correct rules.

Kamailio Bytes: Stripping SIP Multipart Bodies

For some inbound calls (such as some IMS emergency calls) you’ll get MIME Multipart Media Encapsulation as the SIP body, with the Content-Type set to:

Content-Type: multipart/mixed;boundary=968f194ab800ab27

If you’re used to dealing with SIP, you’d expect to see:

Content-Type: application/sdp

This Content-Type of multipart/mixed is totally valid in SIP; in fact RFC 5621 (Message Body Handling in the Session Initiation Protocol (SIP)) details the use of MIME in SIP, and the Geolocation extension uses this, as we see below from a 911 call example.

But while this extension is standardised, and having your SIP body contain multipart MIME is legal, not everything supports this, including the FreeSWITCH bridge module, which just appends a new SDP body into the MIME multipart.

Side note: I noticed the FreeSWITCH bridge function just appends the new body to the multipart MIME, leaving the original SDP:

Okay, so how do we replace the MIME Multipart SIP body with a standard SDP?

Well, with Kamailio’s SDP Ops Module, it’s fairly easy:

# If the body is multipart then strip it and replace with a single part
if (has_body("multipart/mixed")) {
	xlog("This has a multipart body\n");
	# Keep only the application/sdp part of the multipart body
	if (filter_body("application/sdp")) {
		# The body is now plain SDP, so fix up the Content-Type to match
		remove_hf("Content-Type");
		append_hf("Content-Type: application/sdp\r\n");
	} else {
		xlog("Body part application/sdp not found\n");
	}
}

I’ve written about using SDPops to modify SDP before.

And with that we’ll take a SIP message like the one shown on the left, and when relayed, end up with the message on the right:

Simple fix, but saved me having to fix the fault in FreeSWITCH.

Android and Emergency Calling

In the last post we looked at emergency calling when roaming, and I mentioned that there are databases on the handsets for emergency numbers, to allow for example, calling 999 from a US phone, with a US SIM, roaming into the UK.

Android, being open source, allows us to see how this logic works, and it’s important for operators to understand this logic, as it’s what dictates the behavior in many scenarios.

It’s important to note that I’m not covering Apple here, this information is not publicly available to share for iOS devices, so I won’t be sharing anything on this – Apple has their own ecosystem to handle emergency calling, if you’re from an operator and reading this, I’d suggest getting in touch with your Apple account manager to discuss it, they’re always great to work with.

The Android Open Source Project has an “emergency number database”. This database has each of the emergency phone numbers and the corresponding service, for each country.

This file can be read at packages/services/Telephony/ecc/input/eccdata.txt on a phone with engineering mode.

Let’s take a look at what’s in mainline Android for Australia:

You can check ECC for countries from the database on the AOSP repo.

This is one of the ways handsets know what codes represent emergency calling codes in different countries, alongside the values set in the SIM and provided by the visited network.

Tales from the Trenches: mode-set in AMR

This one was a bit of a head scratcher for me, but I’m always glad to learn something new.

The handset made a VoLTE call, and its SDP offer shows it can support AMR and AMR-WB:

        Media Attribute (a): rtpmap:116 AMR-WB/16000/1
        Media Attribute (a): fmtp:116 mode-set=0,1,2,3,4,5,6,7,8;mode-change-capability=2;max-red=220
        Media Attribute (a): rtpmap:118 AMR/8000/1
        Media Attribute (a): fmtp:118 mode-set=0,1,2,3,4,5,6,7;mode-change-capability=2;max-red=220
        Media Attribute (a): rtpmap:111 telephone-event/16000
        Media Attribute (a): fmtp:111 0-15

Okay, that’s pretty normal, I can see we have the mode-set parameter defined, which indicates what modes the handset supports for each codec.

In our problem scenario, the Media Gateway that the call was sent to responded with this SDP answer:

        Media Description, name and address (m): audio 24504 RTP/AVP 118 110
        Media Attribute (a): rtpmap:118 AMR/8000
        Media Attribute (a): fmtp:118 mode-set=7
        Media Attribute (a): rtpmap:110 telephone-event/8000
        Media Attribute (a): fmtp:110 0-15
        Media Attribute (a): ptime:20
        Media Attribute (a): sendrecv
        [Generated Call-ID: FA163E564B37-f4d-98f56700-735d25-65357ee0-9c488]

But we got an error about no available codecs and the call dropped, what gives?

Both sides support AMR (Only the phone supports AMR-WB), and the Media Gateway, as the answerer, supports mode-set 7, which is supported by the UE, so we should be good?

Well, not quite:

If mode-set is specified, it MUST be abided, and frames encoded with modes outside of the subset MUST NOT be sent in any RTP payload or used in codec mode requests. If not present, all codec modes are allowed for the payload type.

RFC 4867 – RTP Payload Format for AMR and AMR-WB

Okay, I get it, the answerer (media gateway) only supports mode 7, but the UE supports all the modes, so we should be fine right?

Well, no.

Section 8.3.1 in the RFC goes on to say in the Offer-Answer Model Considerations:

The parameter [mode-set] is bi-directional, i.e., the restricted set applies to media both to be received and sent by the declaring entity. If a mode set was supplied in the offer, the answerer SHALL return the mode-set unmodified or reject the payload type. However, the answerer is free to choose a mode-set in the answer only if no mode-set was supplied in the offer for a unicast two-peer session.

And there is our problem, and why the call is getting rejected.

The Media Gateway (the answerer in this scenario) is sending back the mode-set it supports (7) but as the UE / handset (offerer) included the mode-set, the Media Gateway should either respond with the same mode set (if it supported all the requested modes) or reject it.

Instead we’re seeing the Media Gateway respond with only the mode-set it supports, which it should not do: the Media Gateway should either return the same mode-set (unmodified / unchanged) or reject the payload type.
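If you want to catch this sort of thing in traces automatically, the rule is simple enough to check in a few lines. This is a minimal sketch of the offer/answer rule quoted above, taking the fmtp parameter strings from the offer and answer as input; the function names are my own and it ignores every other AMR parameter:

```python
from typing import FrozenSet, Optional


def mode_set(fmtp: str) -> Optional[FrozenSet[int]]:
    """Return the mode-set from an AMR fmtp parameter string,
    or None if no mode-set parameter is present."""
    for item in fmtp.split(";"):
        key, _, value = item.strip().partition("=")
        if key == "mode-set":
            return frozenset(int(mode) for mode in value.split(","))
    return None


def answer_mode_set_ok(offer_fmtp: str, answer_fmtp: str) -> bool:
    """Check the answer against the RFC 4867 rule: if the offer carried
    a mode-set, the answer must return it unmodified."""
    offered = mode_set(offer_fmtp)
    if offered is None:
        # No mode-set in the offer, so the answerer is free to choose
        return True
    return mode_set(answer_fmtp) == offered


offer = "mode-set=0,1,2,3,4,5,6,7;mode-change-capability=2;max-red=220"
answer = "mode-set=7"
print(answer_mode_set_ok(offer, answer))
```

Run against the offer and answer from this call, the check fails, which matches what the handset did.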

And boom, another ticket to another vendor…

Tales from the Trenches: The issue with Emergency Calling URNs in IMS Networks

A lot of countries have a single point of contact for emergency services; in Europe you’d call 112 in an emergency, 000 in Australia or 911 in the US. Calling this number in the country will get you the emergency services.

This means a caller can order an ambulance for smoke inhalation, and the fire brigade, in one call.

But that’s not the case in every country; many countries don’t have one number for the emergency services, they’ve got multiple; a phone number for police, a different number for fire brigade and a different number for an ambulance.

For example, Brazil uses 193 as the emergency number for the fire department, the police can be reached at 190 or 191 depending on whether it’s general or road policing, and medical emergencies are covered by 192. Other countries have similar setups.

This is all well and good if you’re in Brazil, and you call 192 for an ambulance, the phone sends a SIP INVITE with a Request URI of sip:[email protected], because we can put a rule into our E-CSCF to say if the number is 192 to route it to the answer point for ambulances – But that’s not often the case on emergency calls.

In IMS, handsets generally detect the number dialed is on the Emergency Calling Code (ECC) list from the USIM Card.

The use of the ECC list means the phone knows this is an emergency call, and this is really important. For countries that use AML this can trigger the sending of the AML SMS, and Emergency Calls should always be allowed to be made, even without credit, a valid SIM card, or even a SIM in the phone at all.

But this comes with a cost; when a user dials 911, the phone doesn’t (generally) send a call to sip:[email protected] like it would with any other dialled number, but rather the SIP INVITE is sent to urn:service:sos which will be routed to the PSAP by the E-CSCF. When a call comes through to these URNs they’re given top priority in the network.

This is all well and good in a country where it doesn’t matter which emergency service you called, because all emergency calls route to a single PSAP, but in a country with multiple numbers, it’s really important that when you call an ambulance, your call doesn’t get routed to animal control.

That means the phone has to look at what emergency number you’ve dialed, and map the URN it sends the call to to match what you’ve actually requested.

Recently we’ve been helping an operator in a country with a numbering plan like this, and we’ve been finding the limits of the standards here.
So let’s start by looking at what the standards state:

IMS Emergency Calling is governed by TS 103.479 which in turn delegates to IETF RFC 5031, but for the calling number to URN translation, it’s pretty quiet.

Let’s look at what RFC 5031 allows for URNs:

  • urn:service:sos.ambulance
  • urn:service:sos.animal-control
  • urn:service:sos.fire
  • urn:service:sos.gas
  • urn:service:sos.marine
  • urn:service:sos.mountain
  • urn:service:sos.physician
  • urn:service:sos.poison
  • urn:service:sos.police

The USIM’s Emergency Calling Codes EF would be the perfect source of this data; for each emergency calling code defined, you’ve got a flag to indicate what it’s for, here’s what we’ve got available on the SIM Card:

  • Bit 1 Police
  • Bit 2 Ambulance
  • Bit 3 Fire Brigade
  • Bit 4 Marine Guard
  • Bit 5 Mountain Rescue
  • Bit 6 manually initiated eCall
  • Bit 7 automatically initiated eCall
  • Bit 8 is spare and set to “0”

So these could be mapped pretty easily you’d think, so if the call is made to an Emergency Calling Code flagged with Bit 5 (Mountain Rescue), the URN would go to urn:service:sos.mountain.
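To show how simple that mapping could be, here’s a sketch of what a handset (or an E-CSCF, if the flags were passed along) might do. It assumes Bit 1 is the least significant bit of the service category byte and that the eCall bits are ignored; the table and function names are my own, not taken from any standard or handset implementation:

```python
from typing import List

# Service category bit number -> RFC 5031 service URN
SERVICE_CATEGORY_URNS = {
    1: "urn:service:sos.police",
    2: "urn:service:sos.ambulance",
    3: "urn:service:sos.fire",
    4: "urn:service:sos.marine",
    5: "urn:service:sos.mountain",
}


def urns_for_category(category: int) -> List[str]:
    """Map an ECC service category byte to the matching URNs.
    Assumption: Bit 1 is the least significant bit of the byte."""
    urns = [
        urn
        for bit, urn in SERVICE_CATEGORY_URNS.items()
        if category & (1 << (bit - 1))
    ]
    # Nothing flagged (or only eCall bits set): fall back to generic SOS
    return urns or ["urn:service:sos"]


print(urns_for_category(0b00000010))  # ECC flagged as Ambulance
print(urns_for_category(0b00010000))  # ECC flagged as Mountain Rescue
```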

Alas from our research, we’ve found most OEMs send calls to the generic urn:service:sos, regardless of the dialled number and the ECC flags that are set on the SIM for that number.

One of the big chip vendors sends calls to an ECC flagged as Ambulance to urn:service:sos.fire, which is totally infuriating, and we’ve had to put a rule in our E-CSCF to handle this if the User Agent is set to one of their phones.

Is there room for improvement here? For sure! Emergency calling is super important, and time is of the essence, while animal control can probably transfer you to an ambulance, an emergency is by very nature time sensitive, and any time wasted can lead to worse outcomes.

While carrier bundles from the OEMs can handle this, the global ability to take any phone, from any country and call an emergency number is so important, that relying on a country-by-country approach here won’t suffice.

What could we do as an industry to address this?

Acknowledging that not all countries have a single point of contact for emergency service, introducing a simple mechanism in the UE SIP message to indicate what number (Emergency Calling Code) the user actually dialled would be invaluable here.

URNs are important, but knowing the dialed number when it comes to PSAP routing, is so important – This wouldn’t even need to be its own SIP header, it could just be thrown into the Contact header as another parameter.

Highly developed markets are often the first to embrace new tech (for us this means VoLTE and VoNR), but this means that these issues seen by less developed markets won’t appear until long after the standard has been set in stone, and often countries like this aren’t at the table of the standards bodies to discuss such requirements.

This easy, reasonable update to the standard, has the potential to save lives, and next time this comes up in a working group I’ll be advocating for a change.

Playing back AMR streams from Packet Captures

The other day I found myself banging my head on the table to diagnose an issue with Ringback tone on an SS7 link and the IMS.

On the IMS side, no RBT was heard, but I could see the Media Gateway was sending RTP packets to the TAS, and the TAS was sending it to the UE, but was there actual content in the RTP packets or was it just silence?

If this was PCM / G711 we’d be able to just playback in Wireshark, but alas we can’t do this for the AMR codec.

Filter the RTP stream out in Wireshark

Inside Wireshark I filtered each of the audio streams in one direction (one for the A-Party audio and one for the B-Party audio)

Then I needed to save each of the streams as a separate PCAP file (Not PCAPng).

Turn into AMR File

With the audio stream for one direction saved, we can turn it into an AMR file, using Juan Noguera (Spinlogic)’s AMR Extractor tool.

Clone the Repo from git, and then in the same directory run:

python3 pcap_parser.py -i AMR_B_Leg.pcap -o AMR_B_Leg.3ga

Playback with VLC / Audacity

I was able to play the file with VLC, and load it into Audacity to easily see that yes, the Ringback Tone was present in the AMR stream!

A look at Advanced Mobile Location SMS for Emergency Calls

Advanced Mobile Location (AML) is being rolled out by a large number of mobile network operators to provide accurate caller location to emergency services, so how does it work, what’s going on and what do you need to know?

Recently we’ve been doing a lot of work on emergency calling in IMS, and meeting requirements for NG-112 / e911, etc.

This led me to seeing my first Advanced Mobile Location (AML) SMS in the wild.

For those unfamiliar, AML is a fancy text message that contains the caller’s location, accuracy, etc, that is passed to emergency services when you make a call to emergency services in some countries.

It’s sent automatically by your handset (if enabled) when making a call to an emergency number, and it provides the dispatch operator with your location information, including extra metadata like the accuracy of the location information, height / floor if known, and level of confidence.

The standard is primarily driven by EENA, and, being backed by the European Union, it’s got almost universal handset support.

Google has their own version of AML called ELS, which they claim is supported on more than 99% of Android phones (I’m unclear on what this means for Harmony OS or other non-Google backed forks of Android), and Apple support for AML starts from iOS 11 onwards, meaning it’s supported on iPhones from the iPhone 5S onwards.

Call Flow

When a call is made to the PSAP based on the Emergency Calling Codes set on the SIM card or set in the OS, the handset starts collecting location information. The phone can pull this from a variety of sources, such as WiFi SSIDs visible, but the best is going to be GPS or one of its siblings (GLONASS / Galileo).

Once the handset has a good “lock” of a location (or if 20 seconds has passed since the call started) it bundles up all of this information the phone has, into an SMS and sends it to the PSAP as a regular old SMS.

The routing from the operator’s SMSc to the PSAP, and the routing from the PSAP to the dispatcher screen of the operator taking the call, is all up to implementation. For the most part the SMS destination is the emergency number (911 / 112) but again, this is dependent on the country.

Inside the SMS

To the user, the AML SMS is not seen, in fact, it’s actually forbidden by the standard to show in the “sent” items list in the SMS client.

On the wire, the SMS looks like any regular SMS, it can use GSM7 bit encoding as it doesn’t require any special characters.

Each attribute is a key / value pair, with semicolons (;) delineating the individual attributes, and = separating the key and the value.

Below is an example of an AML SMS body:

A"ML=1;lt=+54.76397;lg=-0.18305;rd=50;top=20130717141935;lc=90;pm=W;si=123456789012345;ei=1234567890123456;mcc=234;mnc=30;ml=128

If you’ve got a few years of staring at Wireshark traces in hex under your belt, it will probably be pretty easy to get the gist of what’s going on: we’ve got the header (A"ML=1) which denotes this is AML and the version is 1.

After that we have the latitude (lt=), longitude (lg=), radius (rd=), time of positioning (top=), level of confidence (lc=), positioning method (pm=) with G for GNSS, W for WiFi signal, C for Cell or N where a position was not available, and so on.
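Because it’s just key / value pairs, pulling an AML body apart takes very little code. Here’s a minimal sketch using the example message above; parse_aml is a name I’ve made up, it does no validation beyond checking for the header, and the final check assumes ml= carries the total message length (which is my reading of the spec, and matches this example):

```python
AML_EXAMPLE = (
    'A"ML=1;lt=+54.76397;lg=-0.18305;rd=50;top=20130717141935;'
    "lc=90;pm=W;si=123456789012345;ei=1234567890123456;mcc=234;mnc=30;ml=128"
)


def parse_aml(body: str) -> dict:
    """Split an AML SMS body into a dict of attribute -> value strings."""
    fields = {}
    for attribute in body.split(";"):
        key, separator, value = attribute.partition("=")
        if separator:
            fields[key.strip()] = value.strip()
    # The header attribute doubles as the AML marker and version
    if 'A"ML' not in fields:
        raise ValueError("not an AML message")
    return fields


aml = parse_aml(AML_EXAMPLE)
print(float(aml["lt"]), float(aml["lg"]), aml["pm"])
# Assumption: ml= is the length of the whole message in characters
print(int(aml["ml"]) == len(AML_EXAMPLE))
```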

AML outside the ordinary

Roaming Scenarios

If an emergency occurs inside my house, there’s a good chance I know the address, and even if I don’t know my own address, it’s probably linked to the account holder information from my telco anyway.

AML and location reporting for emergency calls is primarily relied upon in scenarios where the caller doesn’t know where they’re calling from, and a good example of this would be a call made while roaming.

If I were in a different country, there’s a much higher likelihood that I wouldn’t know my exact address, however AML does not currently work across borders.

The standard suggests disabling SMS when roaming, which is not that surprising considering the current state of SMS transport.

Without a SIM?

Without a SIM in the phone, calls can still be made to emergency services, however SMS cannot be sent.

That’s because the emergency calling standards for unauthenticated emergency calls, only cater for

This is a limitation; however, it could be addressed by 3GPP in future releases if there is sufficient need.

HTTPS Delivery

The standard was revised to allow HTTPS as the delivery method for AML, for example, the below POST contains the same data encoded for use in a HTTP transaction:

v=3&device_number=%2B447477593102&location_latitude=55.85732&location_longitude=-4.26325&location_time=1476189444435&location_accuracy=10.4&location_source=GPS&location_certainty=83&location_altitude=0.0&location_floor=5&device_model=ABC+ABC+Detente+530&device_imei=354773072099116&device_imsi=234159176307582&device_os=AOS&cell_carrier=&cell_home_mcc=234&cell_home_mnc=15&cell_network_mcc=234&cell_network_mnc=15&cell_id=0213454321
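On the receiving side this is a plain URL-encoded form body, so the Python standard library can decode it as-is. A minimal sketch using the example POST body above (parse_aml_post is just an illustrative name, and a real endpoint would want validation on top of this):

```python
from urllib.parse import parse_qs

AML_POST_BODY = (
    "v=3&device_number=%2B447477593102&location_latitude=55.85732"
    "&location_longitude=-4.26325&location_time=1476189444435"
    "&location_accuracy=10.4&location_source=GPS&location_certainty=83"
    "&location_altitude=0.0&location_floor=5&device_model=ABC+ABC+Detente+530"
    "&device_imei=354773072099116&device_imsi=234159176307582&device_os=AOS"
    "&cell_carrier=&cell_home_mcc=234&cell_home_mnc=15"
    "&cell_network_mcc=234&cell_network_mnc=15&cell_id=0213454321"
)


def parse_aml_post(body: str) -> dict:
    """Decode a URL-encoded AML POST body into a flat dict."""
    # keep_blank_values so empty fields such as cell_carrier survive
    parsed = parse_qs(body, keep_blank_values=True)
    # Each key appears once here, so take the first value of each list
    return {key: values[0] for key, values in parsed.items()}


location = parse_aml_post(AML_POST_BODY)
print(location["device_number"], location["location_source"])
```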

Implementation of this approach is however more complex, and leads to little benefit.

The operator must zero-rate the DNS, to allow the FQDN for this to be resolved (it resolves to a different domain in each country), and allow traffic to this endpoint even if the customer has data disabled (see what happens when your handset has PS Data Off), or has run out of data.

Due to the EU’s stance on Net Neutrality, “Zero Rating” is a controversial topic that means most operators have limited implementation of this, so most fall back to SMS.

Other methods for sharing location of emergency calls?

In some upcoming posts we’ll look at the GMLC used for E911 Phase 2, and how the network can request the location from the handset.

Further Reading

https://eena.org/knowledge-hub/documents/aml-specifications-requirements/

VoLTE / IMS – Analysis Challenge

It’s challenge time, this time we’re going to be looking at an IMS PCAP, and answering some questions to test your IMS analysis chops!

Here’s the packet capture:

Easy Questions

  • What QCI value is used for the IMS bearer?
  • What is the registration expiry?
  • What is the E-UTRAN Cell ID the Subscriber is served by?
  • What is the AMBR of the IMS APN?

Intermediate Questions

  • Is this the first or subsequent registration?
  • What is the Integrity-Key for the registration?
  • What is the FQDN of the S-CSCF?
  • What Nonce value is used and what does it do?
  • What P-CSCF Addresses are returned?
  • What time would the UE need to re-register by in order to stay active?
  • What is the AA-Request in #476 doing?
  • Who is the OEM of the handset?
  • What is the MSISDN associated with this user?

Hard Questions

  • What port is used for the ESP data?
  • Which encryption algorithm and integrity algorithm are used?
  • How many packets are sent over the ESP tunnel to the UE?
  • Where should SIP SUBSCRIBE requests get routed?
  • What’s the model of phone?

The answers for each question are on the next page, let me know in the comments how you went, and if there’s any tricky ones!

Improving WiFi Calling quality for WiFi Operators

I had a question recently on LinkedIn regarding how to preference Voice over WiFi traffic so that a network engineer operating the WiFi network can ensure the best quality of experience for Voice over WiFi.

Voice over WiFi is underpinned by the ePDG – Evolved Packet Data Gateway (this is a fancy IPsec tunnel we authenticate to using the SIM to drop our traffic into the P-CSCF over an unsecured connection). To someone operating a WiFi network, the question is how do we prioritise the traffic to the ePDGs and profile it?

ePDGs can be easily discovered through a simple DNS lookup, once you know the Mobile Network Code and Mobile Country code of the operators you want to prioritise, you can find the IPs really easily.

ePDG addresses take the form epdg.epc.mncXXX.mccYYY.pub.3gppnetwork.org so let’s look at finding the IPs for each of these for the operators in a country:

The first step is nailing down the mobile network code and mobile country codes of the operators you want to target, Wikipedia is a great source for this information.
Here in Australia we have the Mobile Country Code 505 and the big 3 operators all support Voice over WiFi, so let’s look at how we’d find the IPs for each.
Telstra has mobile network code (MNC) 01, in 3GPP DNS we always pad network codes to 3 digits, so that’ll be 001, and the mobile country code (MCC) for Australia is 505.
So to find the IPs for Telstra we’d run an nslookup for epdg.epc.mnc001.mcc505.pub.3gppnetwork.org – The list of IPs that are returned, are the IPs you’ll see Voice over WiFi traffic going to, and the IPs you should provide higher priority to:

For the other big operators in Australia epdg.epc.mnc002.mcc505.pub.3gppnetwork.org will get you Optus and epdg.epc.mnc003.mcc505.pub.3gppnetwork.org will get you VHA.

The same rules apply in other countries, you’d just need to update the MNC/MCC to match the operators in your country, do an nslookup and prioritise those IPs.
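If you’re maintaining this list for more than a couple of operators, it’s worth scripting. Here’s a minimal sketch that builds the ePDG FQDN from the MCC / MNC and resolves it with the system resolver; the function names are my own, and epdg_addresses needs working DNS so it’s only shown as a commented usage line:

```python
import socket
from typing import List


def epdg_fqdn(mcc: str, mnc: str) -> str:
    """Build the ePDG FQDN, zero-padding MNC and MCC to 3 digits."""
    return f"epdg.epc.mnc{int(mnc):03d}.mcc{int(mcc):03d}.pub.3gppnetwork.org"


def epdg_addresses(mcc: str, mnc: str) -> List[str]:
    """Resolve the ePDG FQDN and return the unique IPs (v4 and v6)."""
    results = socket.getaddrinfo(epdg_fqdn(mcc, mnc), None)
    # Each result is (family, type, proto, canonname, sockaddr),
    # and sockaddr[0] is the IP address as a string
    return sorted({result[4][0] for result in results})


print(epdg_fqdn("505", "01"))
# Usage (requires network access):
# print(epdg_addresses("505", "01"))
```

Running this on a schedule and diffing the output is one way to handle the maintenance of the list.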

Generally these IPs are pretty static, but there will need to be a certain level of maintenance required to keep this list up to date by rechecking.

Happy WiFi Calling!

Verify Android Signing Certificate for ARA-M Carrier Privileges in App

Part of the headache when adding the ARA-M Certificate to a SIM is getting the correct certificate loaded.

The below command prints the signing certificate digests, including the SHA-1 digest we need to load as the App ID on the SIM card’s ARA-M or ARA-F applet:

apksigner verify --verbose --print-certs "yourapp.apk"

You can then flash this onto the SIM with PySIM:

pySIM-shell (MF/ADF.ARA-M)> aram_store_ref_ar_do --aid FFFFFFFFFFFF --device-app-id 40b01d74cf51bfb3c90b69b6ae7cd966d6a215d4 --android-permissions 0000000000000001 --apdu-always

Authenticating Fixed Line Subscribers into IMS

We recently added support in PyHSS for fixed line SIP subscribers to attach to the IMS.

Traditional telecom operators are finding their fixed line network to be a bit of a money pit, something they’re required to keep operating to meet regulatory obligations, but the switches are sitting idle 99% of the time. As such we’re seeing more and more operators move fixed line subs onto their IMS.

This new feature means we can use PyHSS to serve as the brains for a fixed network, as well as for mobile, but there’s one catch – How we authenticate subscribers changes.

Most banks of line cards in legacy telecom switches, or IP Phones, don’t have SIM slots to allow us to authenticate, so instead we’re forced to fall back to what they do support.

Unfortunately for the most part, what is supported by these IP phones or telecom switches is SIP MD5 Digest Authentication.

The Nonce is generated by the HSS and put into the Multimedia-Authentication-Answer, along with the subscriber’s password and sent in the clear to the S-CSCF.

Subscriber with Password made up of all 1's MAA response from HSS for Digest-MD5 Auth

The HSS then generates the Multimedia-Auth Answer: it generates a nonce (in the 3GPP-SIP-Authenticate / 609 AVP) and sends the Subscriber’s password in the 3GPP-SIP-Authorization (610) AVP back to the S-CSCF.

I would have thought a better option would be for the HSS to generate the Nonce and Digest, and then the S-CSCF to just send the Nonce to the Sub and compare the returned Digest from the Sub against the expected Digest from the HSS, but it would limit flexibility (realm adaptation, etc) I guess.

The UE/UA (I guess it’s a UA in this context as it’s not a mobile) then generates its own Digest from the Nonce and sends it back to the S-CSCF via the P-CSCF.

The S-CSCF compares the received Digest response against the one it generated, and if the two match, the sub is authenticated and allowed to attach onto the network.
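For anyone who hasn't looked inside Digest auth before, here's a rough Python sketch of the calculation both the UA and the S-CSCF perform, following the MD5 scheme from RFC 2617. The function names are mine, and a real S-CSCF would pull the realm, URI, nonce count and so on from the SIP headers rather than taking them as arguments.

```python
import hashlib
import hmac


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode()).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    nc=None, cnonce=None, qop=None) -> str:
    """Compute the Digest 'response' value per RFC 2617 (MD5 algorithm)."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # who is authenticating
    ha2 = _md5_hex(f"{method}:{uri}")                 # what is being requested
    if qop:
        # qop=auth mixes in the nonce count and the client nonce
        return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


def is_authenticated(expected: str, received: str) -> bool:
    """Compare the locally computed digest against the one the UA sent."""
    return hmac.compare_digest(expected.lower(), received.lower())
```

The S-CSCF side would call digest_response() with the password it got from the HSS and the method "REGISTER", then hand the result and the UA's response to is_authenticated().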

IMS iFC – SPT Session Cases

Mostly just reference material for me:

Possible values:

  • 0 (ORIGINATING_SESSION)
  • 1 (TERMINATING_REGISTERED)
  • 2 (TERMINATING_UNREGISTERED)
  • 3 (ORIGINATING_UNREGISTERED)

In the past I had my iFCs setup to look for the P-Access-Network-Info header to know if the call was coming from the IMS, but it wasn’t foolproof – Fixed line IMS subs didn’t have this header.

            <TriggerPoint>
                <ConditionTypeCNF>1</ConditionTypeCNF>
                <SPT>
                    <ConditionNegated>0</ConditionNegated>
                    <Group>0</Group>
                    <Method>INVITE</Method>
                    <Extension></Extension>
                </SPT>
                <SPT>
                    <ConditionNegated>0</ConditionNegated>
                    <Group>1</Group>
                    <SIPHeader>
                      <Header>P-Access-Network-Info</Header>
                    </SIPHeader>
                </SPT>                
            </TriggerPoint>

But now I’m using the Session Cases to know if the call is coming from a registered IMS user:

        <!-- SIP INVITE Traffic from Registered Sub-->
        <InitialFilterCriteria>
            <Priority>30</Priority>
            <TriggerPoint>
                <ConditionTypeCNF>1</ConditionTypeCNF>
                <SPT>
                    <ConditionNegated>0</ConditionNegated>
                    <Group>0</Group>
                    <Method>INVITE</Method>
                    <Extension></Extension>
                </SPT>
                <SPT>
                    <Group>0</Group>
                    <SessionCase>0</SessionCase>
                </SPT>             
            </TriggerPoint>

SQN Sync in IMS Auth

So the issue was a head scratcher.

Everything was working on the IMS, then I go to bed, the next morning I fire up the test device and it just won’t authenticate to the IMS – The S-CSCF generated a 401 in response to the REGISTER, but the next REGISTER wouldn’t pass.

Wireshark just shows me this loop:

UE -> IMS: REGISTER
IMS -> UE: 401 Unauthorized (With Challenge)
UE -> IMS: REGISTER with response
IMS -> UE: 401 Unauthorized (With Challenge)
UE -> IMS: REGISTER with response
IMS -> UE: 401 Unauthorized (With Challenge)
UE -> IMS: REGISTER with response
IMS -> UE: 401 Unauthorized (With Challenge)

So what’s going on here?

IMS uses AKAv1-MD5 for Authentication, this is slightly different to the standard AKA auth used in cellular, but if you’re curious, we’ve covered both IMS Authentication and standard AKA based SIM Authentication in cellular networks before.

When we generate the vectors (for IMS auth and standard auth) one of the inputs to generate the vectors is the Sequence Number or SQN.

This SQN ticks over like an odometer for the number of times the SIM / HSS authentication process has been performed.

There is some leeway in the SQN – It may not always match between the SIM and the HSS and that’s to be expected.
When the MME sends an Authentication-Information-Request it can ask for multiple vectors so it’s got some in reserve for the next time the subscriber attaches, and that’s allowed.

Information stored on USIM / SIM Card for LTE / EUTRAN / EPC - K key, OP/OPc key and SQN Sequence Number

But there are limits to how far out our SQN can be, and for good reason – One of the key purposes for the SQN is to protect against replay attacks, where the same vector is replayed to the UE. So the SQN on the HSS can be ahead of the SIM (within reason), but it can’t be behind – Odometers don’t go backwards.

So the issue was with the SQN on the SIM being out of Sync with the SQN in the IMS, how do we know this is the case, and how do we fix this?

Well there is a resync mechanism so the SIM can securely tell the HSS what SQN it is currently using, so the HSS can update its SQN.

When verifying the AUTN, the client may detect that the sequence numbers between the client and the server have fallen out of sync.
In this case, the client produces a synchronization parameter AUTS, using the shared secret K and the client sequence number SQN.
The AUTS parameter is delivered to the network in the authentication response, and the authentication can be tried again based on authentication vectors generated with the synchronized sequence number.

RFC 3310: HTTP Digest Authentication using AKA

In our example we can tell the sub is out of sync as in our Multimedia Authentication Request we see the SIP-Authorization AVP, which contains the AUTS (client synchronization parameter) which the SIM generated and the UE sent back to the S-CSCF. Our HSS can use the AUTS value to determine the correct SQN.

SIP-Authorization AVP in the Multimedia Authentication Request means the SQN is out of Sync and this AVP contains the RAND and AUTS required to Resync

Note: The SIP-Authorization AVP actually contains both the RAND and the AUTS concatenated together, so in the above example the first 16 bytes (32 hex characters) are the RAND value, and the remaining 14 bytes (28 hex characters) are the AUTS value.

So the HSS gets the AUTS and from it is able to calculate the correct SQN to use.

Then the HSS just generates a new Multimedia Authentication Answer with a new vector using the correct SQN, sends it back to the IMS and presto, the UE can respond to the challenge normally.
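As a rough illustration of the first step on the HSS side, here's a Python sketch that pulls the resync token apart. It assumes the layout 3GPP TS 29.228 describes for the Cx interface (a 16 byte RAND followed by a 14 byte AUTS), and the function names are my own, not PyHSS's. Recovering the actual SQN from the AUTS also needs the Milenage f5* function with the subscriber's K and OP/OPc, which isn't shown here.

```python
RAND_LEN = 16  # bytes
AUTS_LEN = 14  # bytes: 6 byte concealed SQN + 8 byte MAC-S


def split_resync_token(sip_authorization: bytes) -> tuple[bytes, bytes]:
    """Split the SIP-Authorization AVP payload into (RAND, AUTS)."""
    if len(sip_authorization) != RAND_LEN + AUTS_LEN:
        raise ValueError(f"expected {RAND_LEN + AUTS_LEN} bytes, got {len(sip_authorization)}")
    return sip_authorization[:RAND_LEN], sip_authorization[RAND_LEN:]


def split_auts(auts: bytes) -> tuple[bytes, bytes]:
    """Split AUTS into (SQN_MS xor AK*, MAC-S) as laid out in 3GPP TS 33.102."""
    if len(auts) != AUTS_LEN:
        raise ValueError(f"expected {AUTS_LEN} bytes, got {len(auts)}")
    return auts[:6], auts[6:]
```

XORing the first 6 bytes with the output of f5* (computed from K and the RAND) gives the SIM's SQN, and the MAC-S lets the HSS check the resync request really came from the SIM.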

This feature is now fully implemented in PyHSS for anyone wanting to have a play with it and see how it all works.

And that friends, is how we do SQN resync in IMS!

SMS-over-IP Message Efficiency – K

Recently I read a post from someone talking about efficiency of USSD over IMS, and how crazy it was that such a small amount of data used so much overhead to get transferred across the network.

Having built an SMSc a while ago, my mind immediately jumped to SMS over IMS as being a great example of having so much overhead.

If we’re to consider sending the response “K” to a text message, how much overhead is there?

SMS PDU containing the message “K”

I’m using a common Qualcomm based smartphone, and here’s the numbers I’ve got from Wireshark when I send the message:

Transport Ethernet Header – 14 Bytes
Transport IP Header – 20 Bytes
Transport UDP Header – 8 Bytes
Transport GTP Header – 12 Bytes
User IP Header – 20 Bytes
IPsec ESP Header (For Gm interface protection) – 22 Bytes
Encapsulated UDP Header – 8 Bytes
SIP Headers – 707 Bytes
SMS Header – 16 Bytes
SMS Message Body “K” – 1 Byte

Overall SIP, ESP, GTP and Transport PCAP for SIP MESSAGE

That seems pretty bad in terms of efficiency, but let’s look at how that actually works out:

This means our actual message body makes up just 1 byte of 828 bytes, or 0.12% of the size of the overall payload.

Even combined with the SMS header (which contains all the addressing information needed to route an SMS) it’s still just on 2% of the overall message.
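If you want to check the arithmetic, or plug in the numbers from your own capture, a few lines of Python will do it. The byte counts are the ones listed above; the variable names are just for illustration.

```python
# Per-layer sizes in bytes, taken from the Wireshark capture described above
layers = {
    "Transport Ethernet header": 14,
    "Transport IP header": 20,
    "Transport UDP header": 8,
    "Transport GTP header": 12,
    "User IP header": 20,
    "IPsec ESP header": 22,
    "Encapsulated UDP header": 8,
    "SIP headers": 707,
    "SMS header": 16,
    "SMS message body": 1,
}

total = sum(layers.values())
body_pct = 100 * layers["SMS message body"] / total
sms_pct = 100 * (layers["SMS header"] + layers["SMS message body"]) / total

print(f"Total on the wire: {total} bytes")
print(f"Message body: {body_pct:.2f}% of the total")
print(f"SMS header + body: {sms_pct:.2f}% of the total")
```

The SIP headers alone account for 707 of those bytes, which is where most of the overhead comes from.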

So USSD efficiency isn’t great, but it’s not alone!

Kamailio I-CSCF – SRV Lookup Behaviour

Recently I had a strange issue I thought I’d share.

Using Kamailio as an Interrogating-CSCF, Kamailio was getting the S-CSCF details from the User-Authorization-Answer’s “Server-Name” (602) AVP.

The value was set to:

sip:scscf.mnc001.mcc001.3gppnetwork.org:5060

But the I-CSCF was only looking up A-Records for scscf.mnc001.mcc001.3gppnetwork.org, not using DNS-SRV.

The problem? I had configured the Server-Name in PyHSS as a full SIP URI including the port, which meant that Kamailio only looked up the A-Record, and did not do a DNS-SRV lookup for the domain.

Dropping the port number saw all those delicious SRV records being queried.

Something to keep in mind if you use S-CSCF pooling with a Kamailio based I-CSCF, if you want to use SRV records for load balancing / traffic sharing, don’t include the port, and if instead you want it to go to the specified host found by an A-record, include the port.
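This matches the server location rules in RFC 3263, where an explicit port in the URI skips the SRV step. A simplified Python sketch of that decision is below. It's my own illustration, not how Kamailio implements it, and it glosses over NAPTR and the transport parameter.

```python
import ipaddress


def lookup_strategy(sip_uri: str) -> str:
    """Return which DNS lookup a SIP URI would trigger: 'none', 'A/AAAA' or 'SRV'."""
    hostport = sip_uri.split(":", 1)[1]       # drop the "sip:" / "sips:" scheme
    hostport = hostport.split(";", 1)[0]      # drop any URI parameters
    hostport = hostport.rsplit("@", 1)[-1]    # drop any user part
    if hostport.startswith("["):
        return "none"                         # bracketed IPv6 literal, nothing to resolve
    host, _, port = hostport.partition(":")
    try:
        ipaddress.ip_address(host)
        return "none"                         # IPv4 literal, nothing to resolve
    except ValueError:
        pass
    # Explicit port: go straight to address records. No port: use SRV.
    return "A/AAAA" if port else "SRV"
```

So sip:scscf.mnc001.mcc001.3gppnetwork.org:5060 gives "A/AAAA", while the same URI without the port gives "SRV".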

SMS with Alphanumeric Source

Sending SMS with an alphanumeric String as the Source

If you’ve ever received an SMS from your operator, and the sender was the Operator name for example, you may be left wondering how it’s done.

In IMS you’d think this could be quite simple – You’d set the From header to be the name rather than the MSISDN, but for most SMSoIP deployments, the From header is ignored and instead the TP-Originating Address header inside the SMS body is used.

So how do we get it to show text?

Well the TP-Originating address has the “Type of Number” (ToN) field which is typically set to International/National, but value 5 allows for the Digits to instead be alphanumeric characters.

Use GSM 7 bit encoding on the text in the TP-Originating Address digits and presto, you can send SMS to subscribers where the message shows as From an alphanumeric source.
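Here's a small Python sketch of how that TP-Originating Address could be built. To keep it short it only accepts letters, digits and spaces, where the GSM 7 bit alphabet lines up with ASCII; a full implementation would need the complete GSM 03.38 character table. The function names are mine.

```python
import string

# Characters whose GSM 7 bit code equals their ASCII code (a deliberate subset)
_SAFE_CHARS = set(string.ascii_letters + string.digits + " ")


def pack_septets(text: str) -> bytes:
    """Pack 7 bit characters into octets, least significant bits first."""
    bits = 0
    for index, char in enumerate(text):
        bits |= ord(char) << (7 * index)
    octets = (7 * len(text) + 7) // 8
    return bits.to_bytes(octets, "little")


def encode_alphanumeric_oa(sender: str) -> bytes:
    """Build a TP-Originating Address with Type of Number 5 (alphanumeric)."""
    if not 1 <= len(sender) <= 11:
        raise ValueError("alphanumeric senders are limited to 11 characters")
    if not set(sender) <= _SAFE_CHARS:
        raise ValueError("only letters, digits and spaces are handled in this sketch")
    semi_octets = (7 * len(sender) + 3) // 4          # address length in useful semi-octets
    type_of_address = 0x80 | (0b101 << 4) | 0b0000    # ToN 5, numbering plan unknown = 0xD0
    return bytes([semi_octets, type_of_address]) + pack_septets(sender)
```

For the sender "Nick" this produces the bytes 07 D0 CE F4 78 0D: a length of 7 semi-octets, the 0xD0 type byte, then the four packed characters.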

On Android, SMSs received from alphanumeric sources cannot be responded to (no more “DO NOT REPLY TO THIS MESSAGE” at the end of each text). On iOS devices you can respond, but if I send an SMS from “Nick”, the reply from the subscriber using the iPhone will be sent to MSISDN 6425 (Nick on the telephone keypad).

FreeSWITCH, Kamailio & IMS Extensions

Recently I’ve been doing some work with FreeSWITCH as an IMS Conference Factory, I’ve written a bit about it before in this post on using FreeSWITCH with the AMR codec.

Pretty early on in my testing I faced a problem with subsequent in-dialog responses, like re-INVITEs used for holding the calls.

Every subsequent message, was getting a “420 Bad Extension” response from FreeSWITCH.

So what didn’t it like and why was FreeSWITCH generating 420 Bad Extension Responses to these subsequent messages?

Well, the “Extensions” FreeSWITCH is referring to are not extensions in the Telephony sense – as in related to the Dialplan, like an Extension Number to identify a user, but rather the Extensions (as in expansions) to the SIP Protocol introduced for IMS.

The re-INVITE contains a Require header with sec-agree which is a SIP Extension introduced for IMS, which FreeSWITCH does not have support for, and the re-INVITE says is required to support the call (Not true in this case).

Using a Kamailio based S-CSCF means it is easy to strip these Headers before forwarding the requests onto the Application Server, which is what I’ve done, and bingo, no more errors!
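I did the stripping in the Kamailio config, but the logic is simple enough to show in a few lines of Python for anyone doing it elsewhere. This is only a sketch of the idea: the function name is made up, it works on the raw message text, and it also checks Proxy-Require, which is my own addition on top of the Require header described above.

```python
def strip_option_tag(sip_message: str, tag: str = "sec-agree") -> str:
    """Remove one option tag from the Require / Proxy-Require headers of a SIP message."""
    head, separator, body = sip_message.partition("\r\n\r\n")
    kept_lines = []
    for line in head.split("\r\n"):
        name, colon, value = line.partition(":")
        if colon and name.strip().lower() in ("require", "proxy-require"):
            remaining = [t.strip() for t in value.split(",") if t.strip().lower() != tag]
            if not remaining:
                continue  # nothing left to require, so drop the whole header
            line = f"{name}: {', '.join(remaining)}"
        kept_lines.append(line)
    return "\r\n".join(kept_lines) + separator + body
```

Any other option tags in the header are left in place, so a Require of "100rel, sec-agree" becomes just "100rel".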

The Surprisingly Complicated World of SMS: Apple iPhone MT SMS

In iOS 15, Apple added support for iPhones to support SMS over IMS networks – SMSoIP. Previously iPhone users have been relying on CSFB / SMSoNAS (Using the SGs interface) to send SMS on 4G networks.

Getting this working recently led me to some issues that took me longer than I’d like to admit to work out the root cause of…

I was finding that when sending a Mobile Terminated SMS to an iPhone as a SIP MESSAGE, the iPhone would send back the 200 OK to confirm delivery, but it never showed up on the screen to the user.

The GSM A-I/F headers in an SMS PDU are used primarily for indicating the sender of an SMS (Some carriers are configured to get this from the SIP From header, but the SMS PDU is most common).

The RP-Destination Address is used to indicate the destination for the SMS, and on all the models of handset I’ve been testing with, this is set to the MSISDN of the Subscriber.

But some devices are really finicky about its contents. Case in point, Apple iPhones.

If you send a Mobile Terminated SMS to an iPhone, like the one below, the iPhone will accept and send back a 200 OK to this request.

The problem is it will never be displayed to the user… The message is marked as delivered, the phone has accepted it, it just hasn’t shown it…

SMS reports as delivered by the iPhone (200 OK back) but never gets displayed to the user of the phone as the RP-Destination Address header is populated

The fix is simple enough: if you set the RP-Destination Address header to 0, the message will be displayed to the user, but it still took me a shamefully long time to work out the problem.

RP-Destination Address set to 0 sent to the iPhone, this time it’ll get displayed to the user.

FreeSWITCH as an IMS Application Server

After getting AMR support in FreeSWITCH I set about creating an IMS Application Server for VoLTE / IMS networks using FreeSWITCH.

So in IMS what is an Application Server? Well, the answer is almost anything that’s not a CSCF.

An Application Server could handle your Voicemail, recorded announcements, a Conference Factory, or help interconnect with other systems (without using a BGCF).

I’ll be using mine as a simple bridge between my SIP network and the IMS core I’ve got for VoLTE, with FreeSWITCH transcoding between AMR and PCMA.

Setting up FreeSWITCH

You’ll need to set up FreeSWITCH as per your needs, so that’s however you want to use it.

This post won’t cover setting up FreeSWITCH, there’s plenty of good resources out there for that.

The only difference is when you install FreeSWITCH, you will want to compile with AMR Support, so that you can interact with mobile phones using the AMR codec, which I’ve documented how to do here.

Setting up your IMS

In order to get calls from the IMS to the Application Server, we need a way of routing the calls to the Application Server.

There are two standards-compliant ways to achieve this.

The first is to use ENUM to route the calls you want to send to the Application Server, to the application server.

If you want to go down that path using Kamailio as your IMS I’ve got a post on that topic here.

But this is a blunt instrument, after all, it’ll only ever be used at the start of the call, what if we want to send it to an AS because a destination can’t be reached and we want to play back a recorded announcement?

Well that’s where iFCs come into the picture. Through the use of Initial Filter Criteria, we’re able to route different types of SIP traffic, requests and responses, based on our needs. Again we can do this in Kamailio, with a little help from an HSS like PyHSS.