Even before 5G was released, the arms race to claim the “fastest” speeds on LTE, NSA and SA networks has continued, with pretty much every operator claiming a “first” or “fastest”.
I myself have the fastest 5G network available* but I thought I’d look at how big the values are we can put in for speed, these are the Maximum Bitrate Values (like AMBR) we can set on an APN/DNN, or on a Charging Rule.
*Measurement is of the fastest 5G network in an eastward facing office, operated by a person named Nick, in a town in Australia. Other networks operated by people other than those named Nick in eastward facing office outside of Australia were not compared.
The answer for Release 8 LTE is 4294967294 bytes per second, aka 4295 Mbps 4.295 Gbps.
Not bad, but why this number?
The Max-Requested-Bandwidth-DL AVP tells the PGW the max throughput allowed in bits per second. It’s a Unsigned32 so max value is 4294967294, hence the value.
But come release 15 some bright spark thought we may in the not to distant future break this barrier, so how do we go above this?
The answer was to bolt on another AVP – the “Extended-Max-Requested-BW-DL” AVP ( 554 ) was introduced, you might think that means the max speed now becomes 2x 4.295 Gbps but that’s not quite right – The units was shifted.
This AVP isn’t measuring bits per second it’s measuring kilobits per second.
So the standard Max-Requested-Bandwidth-DL AVP gives us 4.3 Gbps, while the Extended-Max-Requested-Bandwidth gives us a 4,295 Gbps.
We add the Extended-Max-Requested-Bandwidth AVP (4295 Gbps) onto the Max-Requested Bandwidth AVP (4.3 Gbps) giving us a total of 4,4299.3 Gbps.
Last year I purchased a cheap second hand Huawei macro base station – there’s lots of these on the market at the moment due to the fact they’re being replaced in many countries.
I’m using it in my lab environment, and as such the config I’ve got is very “bare bones” and basic. Keep in mind if you’re looking to deploy a Macro eNodeB in production, you may need more than just a blog post to get everything tuned and functioning properly…
In this post we’ll cover setting up a Huawei BTS3900 eNodeB from scratch, using the MML interface, without relying on the U2020 management tool.
Obviously the details I setup (IP Addressing, PLMN and RF parameters) are going to be different to what you’re configuring, so keep that in mind, where I’ve got my MME Addresses, site IDs, TACs, IP Addresses, RFUs, etc, you’ll need to substitute your own values.
A word on Cabinets
Typically these eNodeBs are shipped in cabinets, that contain the power supplies, alarm / environmental monitoring, power distribution, etc.
Early on in the setup process we’ll be setting the cabinet types we’ve got, and then later on we’ll tell the system what we have installed in which slots.
This is fine if you have a cabinet and know the type, but in my case at least I don’t have a cabinet manufactured by Huawei, just a rack with some kit mounted in it.
This is OK, but it leads to a few gotchas I need to add a cabinet (even though it doesn’t physically exist) and when I setup my RRUs I need to define what cabinet, slot and subrack it’s in, even though it isn’t in any. Keep this in mind as we go along and define the position of the equipment, that if you’re not using a real-world cabinet, the values mean nothing, but need to be kept consistent.
To begin we’ll need to setup the basics, by disabling DHCP and setting an local IP Address for the unit.
SET DHCPSW: SWITCH=DISABLE;
SET LOCALIP: IP="192.168.5.234", MASK="255.255.248.0";
Obviously your IP address details will be different. Next we’ll add an eNodeB function, the LMPT / UMPT can have multiple functions and multiple eNodeBs hosted on the same hardware, but in our case we’re just going to configure one:
Again, your eNodeB ID, location, site name, etc, are all going to be different, as will your location.
Next we’ll set the system to maintenance mode (MNTMODE), so we can make changes on the fly (this takes the eNB off the air, but we’re already off the air), you’ll need to adjust the start and end times to reflect the current time for the start time, and end time to be after you’re done setting all this up.
SET MNTMODE: MNTMode=INSTALL, ST=2013&09&20&15&00&00, ET=2013&09&25&15&00&00, MMSetRemark="NewSite Install";
Next we’ll set the operator details, this is the PLMN of the eNodeB, and create a new tracking area.
Next we’ll be setting and populating the cabinets I mentioned earlier. I’ll be telling the unit it’s inside a APM30 (Cabinet 0), and in Cabinet Number 0, Subrack 0, is a BBU3900.
//To modify the cabinet type, run the following command: ADD CABINET:CN=0,TYPE=APM30; //Add a BBU3900 subrack, run the following command: ADD SUBRACK:CN=0,SRN=0,TYPE=BBU3900; //To configure boards and RF datas, run the following commands:
And inside the BBU3900 there’s some cards of course, and each card has as slot, as per the drawing below.
In my environment I’ve got a LMPT in slot 7, and a LBBP in Slot 3. There’s a fan and a UPEU too, so: We’ll add a board in Slot No. 7, of type LMPT, We’ll add a board in Slot No. 3, of type LBBP working on FDD, We’ll add a fan board in Slot No. 16, and a UPEU in Slot No. 18.
Huawei publish design guides for which cards should be in which slots, the general rule is that your LMPT / UMPT card goes in Slot 7, with your BBP cards (UBBP or LBBP) in slots 3, then 2, then 1, then 0. Fans and UPEUs can only go in the slots designed to fit them, so that makes it a bit easier.
Next we’ll need to setup our RRUs, for this we’ll need to setup an RRU chain, which is the Huawei term for the CPRI links and add an RRU into it:
//Modify the reference signal power.
MOD PDSCHCFG: LocalCellId=1, ReferenceSignalPwr=-81;
//Add an operator for the cell.
ADD CELLOP: LocalCellId=0, TrackingAreaId=0;
//Activate the cell.
ACT CELL: LocalCellId=1;
The Binding Support Function is used in 4G and 5G networks to allow applications to authenticate against the network, it’s what we use to authenticate for XCAP and for an Entitlement Server.
Rather irritatingly, there are two BSF addresses in use:
If the ISIM is used for bootstrapping the FQDN to use is:
bsf.ims.mncXXX.mccYYY.pub.3gppnetwork.org
But if the USIM is used for bootstrapping the FQDN is
bsf.mncXXX.mccYYY.pub.3gppnetwork.org
You can override this by setting the 6FDA EF_GBANL (GBA NAF List) on the USIM or equivalent on the ISIM, however not all devices honour this from my testing.
We recently added support in PyHSS for fixed line SIP subscribers to attach to the IMS.
Traditional telecom operators are finding their fixed line network to be a bit of a money pit, something they’re required to keep operating to meet regulatory obligations, but the switches are sitting idle 99% of the time. As such we’re seeing more and more operators move fixed line subs onto their IMS.
This new feature means we can use PyHSS to serve as the brains for a fixed network, as well as for mobile, but there’s one catch – How we authenticate subscribers changes.
Most banks of line cards in a legacy telecom switches, or IP Phones, don’t have SIM slots to allow us to authenticate, so instead we’re forced to fallback to what they do support.
Unfortunately for the most part, what is supported by these IP phones or telecom switches is SIP MD5 Digest Authentication.
The Nonce is generated by the HSS and put into the Multimedia-Authentication-Answer, along with the subscriber’s password and sent in the clear to the S-CSCF.
The HSS then generates the the Multimedia-Auth Answer, it generates a nonce (in the 3GPP-SIP-Authenticate / 609 AVP) and sends the Subscriber’s password in the 3GPP-SIP-Authorization (610) AVP in response back to the S-CSCF.
I would have thought a better option would be for the HSS to generate the Nonce and Digest, and then the S-CSCF to just send the Nonce to the Sub and compare the returned Digest from the Sub against the expected Digest from the HSS, but it would limit flexibility (realm adaptation, etc) I guess.
The UE/UA (I guess it’s a UA in this context as it’s not a mobile) then generates its own Digest from the Nonce and sends it back to the S-CSCF via the P-CSCF.
The S-CSCF compares the received Digest response against the one it generated, and if the two match, the sub is authenticated and allowed to attach onto the network.
In the past I had my iFCs setup to look for the P-Access-Network-Info header to know if the call was coming from the IMS, but it wasn’t foolproof – Fixed line IMS subs didn’t have this header.
If you’ve ever worked in roaming, you’ll probably have had the misfortune of dealing with Transferred Account Procedures aka TAP files.
It’s used for billing a 2G GSM call right up to 5G data usage, if you use a service while roaming, somewhere in the world there’s a TAP file with your usage in it.
A brief history of TAP
TAP was originally specified by the GSMA in 1991 as a standard CDR interchange format between operators, for use in roaming scenarios.
Notice I said GSMA – Not 3GPP – This means there’s no 3GPP TS docs for this, it’s defined by the industry lobby group’s members, rather than the standards body.
So what does this actually mean? Well, if you’re MNO A and a customer from MNO B roams into your network, all the calls, SMS and data consumed by the roaming subscriber from MNO B will need to be billed to MNO B, by you, MNO A.
If a network operator wants to get paid for traffic used on their network by roaming subscribers, they’d better send out a TAP file to the roamer’s home network.
TAP is the file format generated my MNO A and sent to MNO B, containing all the usage charges that subscribers from MNO B have racked up while roaming into your network.
These are broken down into “Transactions” (CDRs), for events like making a call, connecting a PDN session and consuming data, or sending a text.
In the beginning of time, GSM provided only voice calling service. This meant that the only services a subscriber could consume while roaming was just making/receiving voice calls which were billed at the end of each month. – This meant billing was equally simple, every so often the visisted network would send the TAP files for the voice calls made by subscribers visited other networks, to the home networks, which would markup those charges, and add them onto the monthly invoice for each subscriber who was roaming.
But of course today, calling accounts for a tiny amount of usage on the network, but this happened gradually while passing through the introduction of SMS, CAMEL services, prepaid services, mobile data, etc. For all these services that could be offered, the TAP format had to evolve to handle each of these scenarios.
As we move towards a flat IP architecture, where voice calls and SMS sent while roaming are just data, TAP files for 4G and 5G networks only need to show data transactions, so the call objects, CAMEL parameters and SMS objects are all falling by the wayside.
What’s inside a TAP File
TAP uses the most beloved of formats – ASN1 to encode the data. This means it is strictly formatted and rigidly specified.
Each file contains a Sequence Number which is a monotonically increasing number, which allows the receiver to know if any files have been missed between the file that’s being currently parsed, an the previous file.
They also have a recipient and sender TADIG code, which is a code allocated by GSMA that uniquely identifies the sender and the recipient of the file.
The TAP records exist in one of two common format, Notification Records and transferBatch records.
These files are exchanged between operators, in practice this means “Dumped on an FTP server as agreed between the two”.
TAP Notification Records
Notifications are the simplest of TAP records and are used when there aren’t any CDRs for roaming events during the time period the TAP file covers.
These are essentially blank TAP files generated by the visited network to let the home network know it’s still there, but there are no roaming subs consuming services in that period.
Notification files are really simple, let’s take a look as one shown as JSON:
When we have services to bill and records to charge, that’s when instead we generate a transferBatch record.
It looks something like this:
There’s a lot going on in here, so let’s break it down section by section.
accountingInfo
The accountingInfo section specifies the currency, exchange rate parameters.
Keep in mind a TAP record generated by an operator in the US, would use USD, while the receiver of the file may be a European MNO dealing in EUR.
This gets even more complicated if you’re dealing with more obscure currencies where an intermediary currency is used, that’s where we bring in SDRs (“Special Drawing Right”) that map to the dollar value to be charged, kinda – the roaming agreement defines how many SDRs are in a dollar, in the example below we’re not using any, but you do see it.
When it comes to numbers and decimal places, TAP doesn’t exactly make it easy.
Significant Digits are defined by counting the first number before the decimal point and all the numbers to the right of the decimal point, so for example the number 1.234 would be 4 significant digits (1 digit before the decimal point and 3 digits after it).
Decimal Places are not actually supported for the Value fields in the TAP file. This is tricky because especially today when roaming tariffs are quite low, these values can be quite small, and we need to represent them as an integer number. TAP defines decimal places as the number of digits after the decimal place.
When it comes to the maximum number of decimal places, this actually impacts the maximum number we can store in the field – as ASN1 strictly enforce what we put in it.
The auditControlInfo section contains the number of CDRs (callEventDetailsCount) contained in the TAP file, the timestamp of the first and last CDR in the file, the total charge and any tax charged.
All of the currency information was provided in the accountingInfo so this is just giving us our totals.
A CDR has 30 days from the time it was generated / service consumed by the roamer, to be baked into a TAP file. After this we can no longer charge for it, so it’s important that the earliestCallTimeStamp is not more than 30 days before the fileCreationTimeStamp seen in batchControlInfo.
batchControlInfo
The batchControlInfo section specifies the time the TAP file became available for transfer, the time the file was created (usually the same), the sequence number and the sender / recipient TADIG codes.
As mentioned earlier, we track sequence number so the receiver can know if a TAP file has been missed; for example if you’ve got TAP file 1 and TAP file 3 comes in, you can determine you’ve missed TAP file 2.
Now we’re getting to the meat & potatoes of our TAP record, the CDRs themselves.
In LTE networks these are just records of data consumption, so let’s take a look inside the gprsCall records under callEventDetails:
In the gprsBasicCallInformation we’ve got as the name suggests the basic info about the data usage event. The time when the session started, the charging ID, the IMSI and the MSISDN of the subscriber to charge, along with their IP and the APN used.
Next up we have the gprsLocationInformation – rates and tariffs may be set based on the location of the subscriber, so we need to identify the area the sub was using the services to select correct tariff / rate for traffic in this destination.
The recEntity is the index number of the SGW / PGW used for the transaction (more on that later).
Next we have the gprsServiceUsed which, again as the name suggests, details the services used and the charge.
chargeDetailList contains the charged data (Made up of dataVolumeIncoming + dataVolumeOutgoing) and the cost.
The chargeableUnits indicates the actual data consumed, however most roaming agreements will standardise on some level of rounding, for example rounding up to the nearest Kilobyte (1024 bytes), so while a sub may consume 1025 bytes of data, they’d be billed for 2045 bytes of data. The data consumed is indicated in the chargeableUnits which indicates how much data was actually consumed, before any rounding policies where applied, while the amount that is actually charged (When taking into account rounding policies) isindicated inside Charged Units.
In the example below data usage is rounded up to the nearest 1024 bytes, 134390 bytes rounds up to the nearest 1024 gives you 135168 bytes.
As this is data we’re talking bytes, but not all bytes are created equal!
VoLTE traffic, using a QCI1 bearer is more valuable than QCI 9 cat videos, and TAP records take this into account in the Call Type Groups, each of which has a different price – Call Type Level 1 indicates the type of traffic, for S8 Home Routed LTE Traffic this is 10 (HGGSN/HP-GW), while Call Type Level 2 indicates the type of traffic as mapped to QCI values:
So Call Type Level 2 set to 20 indicates that this is “20 Unspecified/default LTE QCIs”, and Call Type Level 3 can be set to any value based on a defined inter-operator tariff.
recEntityType 7 means a PGW and contains the IP of the PGW in the Home PLMN, while recEntityType 8 means SGW and is the SGW in the Visited PLMN.
So this means if we reference recEntityCode 2 in a gprsCall, that we’re referring to an SGW at 1.2.3.5.
Lastly also got the utcTimeOffsetInfo to indicate the timezones used and assign a unique code to it.
Using the Records
We as humans? These records aren’t meant for us.
They’re designed to be generated by the Visited PLMN and sent to to the home PLMN, which ingests it and pays the amount specified in the time agreed.
Generally this is an FTP server that the TAP records get dumped into, and an automated bank transfer job based on the totals for the TAP records.
Testing of the TAP records is called “TADIG Testing” and it’s something we’ll go into another day, but in essence it’s validating that the output and contents of the files meet what both operators think is the contract pricing and specifications.
So that’s it! That’s what’s in a TAP record, what it does and how we use it!
GSMA are introducing BCE – Billing & Charging Evolution, a new standard, designed to last for the next 30+ years like TAP has. It’s still in its early days, but that’s the direction the GSMA has indicated it would like to go.
Everything was working on the IMS, then I go to bed, the next morning I fire up the test device and it just won’t authenticate to the IMS – The S-CSCF generated a 401 in response to the REGISTER, but the next REGISTER wouldn’t pass.
When we generate the vectors (for IMS auth and standard auth) one of the inputs to generate the vectors is the Sequence Number or SQN.
This SQN ticks over like an odometer for the number of times the SIM / HSS authentication process has been performed.
There is some leeway in the SQN – It may not always match between the SIM and the HSS and that’s to be expected. When the MME sends an Authentication-Information-Request it can ask for multiple vectors so it’s got some in reserve for the next time the subscriber attaches, and that’s allowed.
But there are limits to how far out our SQN can be, and for good reason – One of the key purposes for the SQN is to protect against replay attacks, where the same vector is replayed to the UE. So the SQN on the HSS can be ahead of the SIM (within reason), but it can’t be behind – Odometers don’t go backwards.
So the issue was with the SQN on the SIM being out of Sync with the SQN in the IMS, how do we know this is the case, and how do we fix this?
Well there is a resync mechanism so the SIM can securely tell the HSS what the current SQN it is using, so the HSS can update it’s SQN.
When verifying the AUTN, the client may detect that the sequence numbers between the client and the server have fallen out of sync. In this case, the client produces a synchronization parameter AUTS, using the shared secret K and the client sequence number SQN. The AUTS parameter is delivered to the network in the authentication response, and the authentication can be tried again based on authentication vectors generated with the synchronized sequence number.
In our example we can tell the sub is out of sync as in our Multimedia Authentication Request we see the SIP-Authorization AVP, which contains the AUTS (client synchronization parameter) which the SIM generated and the UE sent back to the S-CSCF. Our HSS can use the AUTS value to determine the correct SQN.
SIP-Authorization AVP in the Multimedia Authentication Request means the SQN is out of Sync and this AVP contains the RAND and AUTN required to Resync
Note: The SIP-Authorization AVP actually contains both the RAND and the AUTN concatenated together, so in the above example the first 32 bytes are the AUTN value, and the last 32 bytes are the RAND value.
So the HSS gets the AUTS and from it is able to calculate the correct SQN to use.
Then the HSS just generates a new Multimedia Authentication Answer with a new vector using the correct SQN, sends it back to the IMS and presto, the UE can respond to the challenge normally.
Misunderstood, under appreciated and more capable than people give it credit for, is our PCRF.
But what does it do?
Most folks describe the PCRF in hand wavy-terms – “it does policy and charging” is the answer you’ll get, but that doesn’t really tell you anything.
So let’s answer it in a way that hopefully makes some practical sense, starting with the acronym “PCRF” itself, it stands for Policy and Charging Rules Function, which is kind of two functions, one for policy and one for rules, so let’s take a look at both.
Policy
In cellular world, as in law, policy is the rules.
For us some examples of policy could be a “fair use policy” to limit customer usage to acceptable levels, but it can also be promotional packages, services like “free Spotify” packages, “Voice call priority” or “unmetered access to Nick’s Blog and maximum priority” packages, can be offered to customers.
All of these are examples of policy, and to make them work we need to target which subscribers and traffic we want to apply the policy to, and then apply the policy.
Charging Rules
Charging Rules are where the policy actually gets applied and the magic happens.
It’s where we take our policy and turn it into actionable stuff for the cellular world.
Let’s take an example of “unmetered access to Nick’s Blog and maximum priority” as something we want to offer in all our cellular plans, to provide access that doesn’t come out of your regular usage, as well as provide QCI 5 (Highest non dedicated QoS) to this traffic.
To achieve this we need to do 3 things:
Profile the traffic going to this website (so we capture this traffic and not regular other internet traffic)
Charge it differently – So it’s not coming from the subscriber’s regular balance
Up the QoS (QCI) on this traffic to ensure it’s high priority compared to the other traffic on the network
So how do we do that?
Profiling Traffic
So the first step we need to take in providing free access to this website is to filter out traffic to this website, from the traffic not going to this website.
Let’s imagine that this website is hosted on a single machine with the IP 1.2.3.4, and it serves traffic on TCP port 443. This is where IPFilterRules (aka TFTs or “Traffic Flow Templates”) and the Flow-Description AVP come into play. We’ve covered this in the past here, but let’s recap:
IPFilterRules are defined in the Diameter Base Protocol (IETF RFC 6733), where we can learn the basics of encoding them,
They take the format:
action dir proto from src to dst
The action is fairly simple, for all our Dedicated Bearer needs, and the Flow-Description AVP, the action is going to be permit. We’re not blocking here.
The direction (dir) in our case is either in or out, from the perspective of the UE.
Next up is the protocol number (proto), as defined by IANA, but chances are you’ll be using 17 (UDP) or 6 (TCP).
The from value is followed by an IP address with an optional subnet mask in CIDR format, for example from 10.45.0.0/16would match everything in the 10.45.0.0/16 network.
Following from you can also specify the port you want the rule to apply to, or, a range of ports.
Like the from, the tois encoded in the same way, with either a single IP, or a subnet, and optional ports specified.
And that’s it!
So let’s create a rule that matches all traffic to our website hosted on 1.2.3.4 TCP port 443,
permit out 6 from 1.2.3.4 443 to any 1-65535
permit out 6 from any 1-65535 to 1.2.3.4 443
All this info gets put into the Flow-Information AVPs:
With the above, any traffic going to/from 1.23.4 on port 443, will match this rule (unless there’s another rule with a higher precedence value).
Charging Actions
So with our traffic profiled, the next question is what actions are we going to take, well there’s two, we’re going to provide unmetered access to the profiled traffic, and we’re going to use QCI 4 for the traffic (because you’ll need a guaranteed bit rate bearer to access!).
Charging-Group for Profiled Traffic
To allow for Zero Rating for traffic matching this rule, we’ll need to use a different Rating Group.
Let’s imagine our default rating group for data is 10000, then any normal traffic going to the OCS will use rating group 10000, and the OCS will apply the specific rates and policies based on that.
Rating Groups are defined in the OCS, and dictate what rates get applied to what Rating Groups.
For us, our default rating group will be charged at the normal rates, but we can define a rating group value of 4000, and set the OCS to provide unlimited traffic to any Credit-Control-Requests that come in with Rating Group 4000.
This is how operators provide services like “Unlimited Facebook” for example, a Charging Rule matches the traffic to Facebook based on TFTs, and then the Rating Group is set differently to the default rating group, and the OCS just allows all traffic on that rating group, regardless of how much is consumed.
Inside our Charging-Rule-Definition, we populate the Rating-Group AVP to define what Rating Group we’re going to use.
Setting QoS for Profiled Traffic
The QoS Description AVP defines which QoS parameters (QCI / ARP / Guaranteed & Maximum Bandwidth) should be applied to the traffic that matches the rules we just defined.
As mentioned at the start, we’ll use QCI 4 for this traffic, and allocate MBR/GBR values for this traffic.
Putting it Together – The Charging Rule
So with our TFTs defined to match the traffic, our Rating Group to charge the traffic and our QoS to apply to the traffic, we’re ready to put the whole thing together.
So here it is, our “Free NVN” rule:
I’ve attached a PCAP of the flow to this post.
In our next post we’ll talk about how the PGW handles the installation of this rule.
One day recently I was messing with the XCAP server, trying to set the Call Forward timeout. In the process I triggered the UE to send a USSD request to the IMS.
Huh, I thought, “I wonder how hard it would be to build a USSD Gateway for our IMS?”, and this my friends, is the story of how I wasted a good chunk of my weekend trying (and failing) to add support for USSD.
You might be asking “Who still uses USSD?” – The use cases for USSD are pretty thin on the ground in this day and age, but I guess balance query, and uh…
But this is the story of what I tried before giving up and going outside…
Routing
First I’d need to get the USSD traffic towards the USSD Gateway, this means modifying iFCs. Skimming over the spec I can see the Recv-Info: header for USSD traffic should be set to “g.3gpp.ussd” so I knocked up an iFC to match that, and route the traffic to my dev USSD Gateway, and added it to the subscriber profile in PyHSS:
Easy peasy, now we have the USSD requests hitting our USSD Gateway.
The Response
I’ll admit that I didn’t jump straight to the TS doc from the start.
The first place I headed was Google to see if I could find any PCAPs of USSD over IMS/SIP.
And I did – Restcomm seems to have had a USSD product a few years back, and trawling around their stuff provided some reference PCAPs of USSD over SIP.
So the flow seemed pretty simple, SIP INVITE to set up the session, SIP INFO for in-dialog responses and a BYE at the end.
With all the USSD guts transferred as XML bodies, in a way that’s pretty easy to understand.
Being a Kamailio fan, that’s the first place I started, but quickly realised that SIP proxies, aren’t great at acting as the UAS.
So I needed to generate in-dialog SIP INFO messages, so I turned to the UAC module to generate the SIP INFO response.
My Kamailio code is super simple, but let’s have a look:
request_route {
xlog("Request $rm from $fU");
if(is_method("INVITE")){
xlog("USSD from $fU to $rU (Emergency number) CSeq is $cs ");
sl_reply("200", "OK Trying USSD Phase 1"); #Generate 200 OK
route("USSD_Response"); #Call USSD_Response route block
exit;
}
}
route["USSD_Response"]{
xlog("USSD_Response Route");
#Generate a new UAC Request
$uac_req(method)="INFO";
$uac_req(ruri)=$fu; #Copy From URI to Request URI
$uac_req(furi)=$tu; #Copy To URI to From URI
$uac_req(turi)=$fu; #Copy From URI to To URI
$uac_req(callid)=$ci; #Copy Call-ID
#Set Content Type to 3GPP USSD
$uac_req(hdrs)=$uac_req(hdrs) + "Content-Type: application/vnd.3gpp.ussd+xml\r\n";
#Set the USSD XML Response body
$uac_req(body)="<?xml version='1.0' encoding='UTF-8'?>
<ussd-data>
<language value=\"en\"/>
<ussd-string value=\"Bienvenido. Seleccione una opcion: 1 o 2.\"/>
</ussd-data>";
$uac_req(evroute)=1; #Set the event route to use on return replies
uac_req_send(); #Send it!
}
So the UAC module generates the 200 OK and sends it back.
“That was quick” I told myself, patting myself on the back before trying it out for the first time.
Huston, we have a problem – Although the Call-ID is the same, it’s not an in-dialog response as the tags aren’t present, this means our UE send back a 405 to the SIP INFO.
Right. Perhaps this is the time to read the Spec…
Okay, so the SIP INFO needs to be in dialog. Can we do that with the UAC module? Perhaps not…
But alas real life came back to rear its ugly head, and this adventure will have to continue another day…
Update: Thanks to a kindly provided PCAP I now know what I was doing wrong, and so we’ll soon have a follow up to this post named “Successes in cobbling together a USSD Gateway” just as soon as I have a weekend free.
NB-IoT introduces support for NIDD – Non-IP Data Delivery (NIDD) which is one of the cool features of NB-IoT that’s gaining more widespread adoption.
Let’s take a deep dive into NIDD.
The case against IP for IoT
In the over 40 years since IP was standardized, we’ve shoehorned many things onto IP, but IP was never designed or optimized for low power, low throughput applications.
For the battery life of an IoT device to be measured in years, it has to be very selective about what power hungry operations it does. Transmitting data over the air is one of the most power-intensive operations an IoT device can perform, so we need to do everything we can to limit how much data is sent, and how frequently.
Use Case – NB-IoT Tap
Let’s imagine we’re launching an IoT tap that transmits information about water used, as part of our revolutionary new “Water as a Service” model (WaaS) which removes the capex for residents building their own water treatment plant in their homes, and instead allows dynamic scaling of waterloads as they move to our new opex model.
If I turn on the tap and use 12L of water, when I turn off the tap, our IoT tap encodes the usage onto a single byte and sends the usage information to our rain-cloud service provider.
So we’re not constantly changing the batteries in our taps, we need to send this one byte of data as efficiently as possible, so as to maximize the battery life.
If we were to transport our data on TCP, well we’d need a 3 way handshake and several messages just to transmit the data we want to send.
Let’s see how our one byte of data would look if we transported it on TCP.
That sliver of blue in the diagram is our usage component, the rest is overhead used to get it there. Seems wasteful huh?
Sure, TCP isn’t great for this you say, you should use UDP! But even if we moved away from TCP to UDP, we’ve still got the IPv4 header and the UDP header wasting 28 bytes.
For efficiency’s sake (To keep our batteries lasting as long as possible) we want to send as few messages as possible, and where we do have to send messages, keep them very short, so IP is not a great fit here.
Enter NIDD – Non-IP Data Delivery.
Through NIDD we can just send the single hex byte, only be charged for the single hex byte, and only stay transmitting long enough to send this single byte of hex (Plus the NBIoT overheads / headers).
Compared to UDP transport, NIDD provides us a reduction of 28 bytes of overhead for each message, or a 96% reduction in message size, which translates to real power savings for our IoT device.
In summary – the more sending your device has to do, the more battery it consumes. So in a scenario where you’re trying to maximize power efficiency to keep your batter powered device running as long as possible, needing to transmit 28 bytes of wasted data to transport 1 byte of usable data, is a real waste.
Delivering the Payload
NIDD traffic is transported as raw hex data end to end, this means for our 1 byte of water usage data, the device would just send the hex value to be transferred and it’d pop out the other end.
To support this we introduce a new network element called the SCEF –Service Capability Exposure Function.
From a developer’s perspective, the SCEF is the gateway to our IoT devices. Through the RESTful API on the SCEF (T8 API), we can send and receive raw hex data to any of our IoT devices.
When one of our Water-as-a-Service Taps sends usage data as a hex byte, it’s the software talking on the T8 API to the SCEF that receives this data.
Data of course needs to be addressed, so we know where it’s coming from / going to, and T8 handles this, as well as message reliability, etc, etc.
This is a telco blog, so we should probably cover the MME connection, the MME talks via Diameter to the SCEF. In our next post we’ll go into these signaling flows in more detail.
If you’re wondering what the status of Open Source SCEF implementations are, then you may have already guessed I’m working on one!
Hopefully by now you’ve got a bit of an idea of how NIDD works in NB-IoT, and in our next posts we’ll dig deeper into the flows and look at some PCAPs together.
Having a central pair of Diameter routing agents allows us to drastically simplify our network, but what if we want to perform some translations on AVPs?
For starters, what is an AVP transformation? Well it’s simply rewriting the value of an AVP as the Diameter Request/Response passes through the DRA. A request may come into the DRA with IMSI xxxxxx and leave with IMSI yyyyyy if a translation is applied.
So why would we want to do this?
Well, what if we purchased another operator who used Realm X, and we use Realm Y, and we want to link the two networks, then we’d need to rewrite Realm Y to Realm X, and Realm X to Realm Y when they communicate, AVP transformations allow for this.
If we’re an MVNO with hosted IMSIs from an MNO, but want to keep just the one IMSI in our HSS/OCS, we can translate from the MNO hosted IMSI to our internal IMSI, using AVP transformations.
If our OCS supports only one rating group, and we want to rewrite all rating groups to that one value, AVP transformations cover this too.
There are lots of uses for this, and if you’ve worked with a bit of signaling before you’ll know that quite often these sorts of use-cases come up.
So how do we do this with freeDiameter?
To handle this I developed a module for passing each AVP to a Python function, which can then apply any transformation to a text based value, using every tool available to you in Python.
In the next post I’ll introduce rt_pyform and how we can use it with Python to translate Diameter AVPs.
What I typically refer to as Diameter interfaces / reference points, such as S6a, Sh, Sx, Sy, Gx, Gy, Zh, etc, etc, are also known as Applications.
Diameter Application Support
If you look inside the Capabilities Exchange Request / Answer dialog, what you’ll see is each side advertising the Applications (interfaces) that they support, each one being identified by an Application ID.
CER showing support for the 3GPP Zh Application-ID (Interface)
If two peers share a common Application-Id, then they can communicate using that Application / Interface.
For example, the above screenshot shows a peer with support for the Zh Interface (Spoiler alert, XCAP Gateway / BSF coming soon!). If two Diameter peers both have support for the Zh interface, then they can use that to send requests / responses to each other.
This is the basis of Diameter Routing.
Diameter Routing Tables
Like any router, our DRA needs to have logic to select which peer to route each message to.
For each Diameter connection to our DRA, it will build up a Diameter Routing table, with information on each peer, including the realm and applications it advertises support for.
Then, based on the logic defined in the DRA to select which Diameter peer to route each request to.
In its simplest form, Diameter routing is based on a few things:
Look at the DestinationRealm, and see if we have any peers at that realm
If we do then look at the DestinationHost, if that’s set, and the host is connected, and if it supports the specified Application-Id, then route it to that host
If no DestinationHost is specified, look at the peers we have available and find the one that supports the specified Application-Id, then route it to that host
Simplified Diameter Routing Table used by DRAs
With this in mind, we can go back to looking at how our DRA may route a request from a connected MME towards an HSS.
Let’s look at some examples of this at play.
The request from MME02 is for DestinationRealm mnc001.mcc001.3gppnetwork.org, which our DRA knows it has 4 connected peers in (3 if we exclude the source of the request, as we don’t want to route it back to itself of course).
So we have 3 contenders still for who could get the request, but wait! We have a DestinationHost specified, so the DRA confirms the host is available, and that it supports the requested ApplicationId and routes it to HSS02.
So just because we are going through a DRA does not mean we can’t specific which destination host we need, just like we would if we had a direct link between each Diameter peer.
Conversely, if we sent another S6a request from MME01 but with no DestinationHost set, let’s see how that would look.
Again, the request is from MME02 is for DestinationRealm mnc001.mcc001.3gppnetwork.org, which our DRA knows it has 3 other peers it could route this to. But only two of those peers support the S6a Application, so the request would be split between the two peers evenly.
Clever Routing with DRAs
So with our DRA in place we can simplify the network, we don’t need to build peer links between every Diameter device to every other, but let’s look at some other ways DRAs can help us.
Load Control
We may want to always send requests to HSS01 and only use HSS02 if HSS01 is not available, we can do this with a DRA.
Or we may want to split load 75% on one HSS and 25% on the other.
Both are great use cases for a DRA.
Routing based on Username
We may want to route requests in the DRA based on other factors, such as the IMSI.
Our IMSIs may start with 001010001xxx, but if we introduced an MVNO with IMSIs starting with 001010002xxx, we’d need to know to route all traffic where the IMSI belongs to the home network to the home network HSS, and all the MVNO IMSI traffic to the MVNO’s HSS, and DRAs handle this.
Inter-Realm Routing
One of the main use cases you’ll see for DRAs is in Roaming scenarios.
For example, if we have a roaming agreement with a subscriber who’s IMSIs start with 90170, we can route all the traffic for their subs towards their HSS.
But wait, their Realm will be mnc901.mcc070.3gppnetwork.org, so in that scenario we’ll need to add a rule to route the request to a different realm.
DRAs handle this also.
In our next post we’ll start actually setting up a DRA with a default route table, and then look at some more advanced options for Diameter routing like we’ve just discussed.
One slight caveat, is that mutual support does not always mean what you may expect. For example an MME and an HSS both support S6a, which is identified by Auth-Application-Id 16777251 (Vendor ID 10415), but one is a client and one is a server. Keep this in mind!
Answer Question 1: Because they make things simpler and more flexible for your Diameter traffic. Answer Question 2: With free software of course!
All about DRAs
But let’s dive a little deeper. Let’s look at the connection between an MME and an HSS (the S6a interface).
Direct Diameter link between two Diameter Peers
We configure the Diameter peers on MME1 and HSS01 so they know about each other and how to communicate, the link comes up and presto, away we go.
But we’re building networks here! N+1 redundancy and all that, so now we have two HSSes and two MMEs.
Direct Diameter link between 4 Diameter peers
Okay, bit messy, but that’s okay…
But then our network grows to 10 MMEs, and 3 HSSes and you can probably see where this is going, but let’s drive the point home.
Direct Diameter connections for a network with 10x MME and 3x HSS
Now imagine once you’ve set all this up you need to do some maintenance work on HSS03, so need to shut down the Diameter peer on 10 different MMEs in order to isolate it and deisolate it.
The problem here is pretty evident, all those links are messy, cumbersome and they just don’t scale.
If you’re someone with a bit of networking experience (and let’s face it, you’re here after all), then you’re probably thinking “What if we just had a central system to route all the Diameter messages?”
An Agent that could Route Diameter, a Diameter Routing Agent perhaps…
By introducing a DRA we build Diameter peer links between each of our Diameter devices (MME / HSS, etc) and the DRA, rather than directly between each peer.
With Diameter Routing Agent
Then from the DRA we can route Diameter requests and responses between them.
Let’s go back to our 10x MME and 3x HSS network and see how it looks with a DRA instead.
So much cleaner!
Not only does this look better, but it makes our life operating the network a whole lot easier.
Each MME sends their S6a traffic to the DRA, which finds a healthy HSS from the 3 and sends the requests to it, and relays the responses as well.
We can do clever load balancing now as well.
Plus if a peer goes down, the DRA detects the failure and just routes to one of the others.
If we were to introduce a new HSS, we wouldn’t need to configure anything on the MMEs, just add HSS04 to the DRA and it’ll start getting traffic.
Plus from an operations standpoint, now if we want to to take an HSS offline for maintenance, we just shut down the link on the HSS and all HSS traffic will get routed to the other two HSS instances.
In our next post we’ll talk about the Routing part of the DRA, how the decisions are made and all the nuances, and then in the following post we’ll actually build a DRA and start routing some traffic around!
If you work with IMS or Packet Core, there’s a good chance you need DNS to work, and it doesn’t always.
When I run traces, I’ve always found I get swamped with DNS traffic, UE traffic, OS monitoring, updates, etc, all combine into a big firehose – while my Wireshark filters for finding EPC and IMS traffic is pretty good, my achilles heel has always been filtering the DNS traffic to just get the queries and responses I want out of it.
Well, today I made that a bit better.
By adding this to your Wireshark filter:
dns contains 33:67:70:70:6e:65:74:77:6f:72:6b:03:6f:72:67:00
You’ll only see DNS Queries and Responses for domains at the 3gppnetwork.org domain.
This makes my traces much easier to read, and hopefully will do the same for you!
Bonus, here’s my current Wireshark filter for working EPC/IMS:
(diameter and diameter.cmd.code != 280) or (sip and !(sip.Method == "OPTIONS") and !(sip.CSeq.method == "OPTIONS")) or (smpp and (smpp.command_id != 0x00000015 and smpp.command_id != 0x80000015)) or (mgcp and !(mgcp.req.verb == "AUEP") and !(mgcp.rsp.rspcode == 500)) or isup or sccp or rtpevent or s1ap or gtpv2 or pfcp or (dns contains 33:67:70:70:6e:65:74:77:6f:72:6b:03:6f:72:67:00)
I build phone networks, and unfortunately, I’m not able to be everywhere at once.
This means sometimes I have to test things in networks I may not be within the coverage of.
To get around this, I’ve setup something pretty simple, but also pretty powerful – Remote test phones.
Using a Raspberry Pi, Intel NUC, or any old computer, I’m able to remotely control Android handsets out in the field, in the coverage footprint of whatever network I need.
This means I can make test calls, run speed testing, signal strength measurements, on real phones out in the network, without leaving my office.
Base OS
Because of some particularities with Wayland and X11, for this I’d steer clear of Ubuntu distributions, and suggest using Debian if you’re using x86 hardware, and Raspbian if you’re using a Pi.
Setup Android Debug Bridge (adb)
The base of this whole system is ADB, the Android Debug Bridge, which exposes the ability to remotely control an Android phone over USB.
You can also do this over WiFi, but I find for device testing, wired allows me to airplane mode a device or disable data, which I can’t do if the device is connected to ADB via WiFi.
There’s lot of info online about setting Android Debug Bridge up on your device, unlocking the Developer Mode settings, etc, if you’ve not done this before I’ll just refer you to the official docs.
Before we plug in the phones we’ll need to setup the software on our remote testing machine, which is simple enough:
Now we can plug in each of the remote phones we want to use for testing and run the command “adb devices” which should list the phones with connected to the machine with ADB enabled:
[email protected]:~$ adb devices
List of devices attached
ABCDEFGHIJK unauthenticated
LMNOPQRSTUV unauthenticated
You’ll get a popup on each device asking if you want to allow USB debugging – If this is going to be a set-and-forget deployment, make sure you tick “Always allow from this Computer” so you don’t have to drive out and repeat this step, and away you go.
Lastly we can run adb devices again to confirm everything is in the connected state
Scrcpy
scrcpy an open-source remote screen mirror / controller that allows us to control Android devices from a computer.
In our case we’re going to install with Snap (if you hate snaps as many folks do, you can also compile from source):
After SSHing into the box, we can just run scrcpy and boom, there’s the window we can interact with.
If you’ve got multiple devices connected to the same device, you’ll need to specify the ADB device ID, and of course, you can have multiple sessions open at the same time.
scrcpy -s 61771fe5
That’s it, as simple as that.
Tweaking
A few settings you may need to set:
I like to enable the “Show taps” option so I can see where my mouse is on the touchscreen and see what I’ve done, it makes it a lot easier when recording from the screen as well for the person watching to follow along.
You’ll probably also want to disable the lock screen and keep the screen awake
Some OEMs have an additonal tick box if you want to be able to interact with the device (rather than just view the screen), which often requires signing into an account, if you see this toggle, you’ll need to turn it on:
Ansible Playbook
I’ve had to build a few of these, so I’ve put an Ansible Playbook on Github so you can create your own.
If you’ve ever received an SMS from your operator, and the sender was the Operator name for example, you may be left wondering how it’s done.
In IMS you’d think this could be quite simple – You’d set the From header to be the name rather than the MSISDN, but for most SMSoIP deployments, the From header is ignored and instead the c header inside the SMS body is used.
So how do we get it to show text?
Well the TP-Originating address has the “Type of Number” (ToN) field which is typically set to International/National, but value 5 allows for the Digits to instead be alphanumeric characters.
GSM 7 bit encoding on the text in the TP-Originating Address digits and presto, you can send SMS to subscribers where the message shows as From an alphanumeric source.
On Android SMSs received from alphanumeric sources cannot be responded to (“no more “DO NOT REPLY TO THIS MESSAGE” at the end of each text), but on iOS devices you can respond, but if I send an SMS from “Nick” the reply from the subscriber using the iPhone will be sent to MSISDN 6425 (Nick on the telephone keypad).
This post is one of a series of packet capture analysis challenges designed to test your ability to understand what is going on in a network from packet captures. Download the Packet Capture and see how many of the questions you can answer from the attached packet capture.
The answers are at the bottom of this page, along with how we got to the answers.
This challenge focuses on the Evolved Packet Core, specifically the S1 and Diameter interfaces.
In Uplink messages from the eNodeB the EUTRAN-GCI field contains the Cell-ID of the eNodeB.
In this case the Cell-ID is 1.
Answer: What is the Tracking Area?
The tracking area is 123.
This information is available in the TAI field in the Uplink S1 messages.
Answer: Does the device attaching to the network support VoLTE?
No, the device does not support VoLTE.
There are a few ways we can get to this answer, and VoLTE support in the phone does not mean VoLTE will be enabled, but we can see the Voice Domain preference is set to CS Voice Only, meaning GSM/UMTS for voice calling.
This is common on cheaper handsets that do not support VoLTE.
Answer: What type of IP is the subscriber requesting for this PDN session? (IPv4/IPv6/Both)?
The subscriber is requesting an IPv4 address only.
We can see this in the ESM Message Container for the PDN Connectivity Request, the PDN type is “IPv4”.
Answer: What is the Diameter Application ID for S6a?
Answer: 16777251
This is shown for the Vendor-Specific-Application-Id AVP on an S6a message.
Answer: What is the Crytpo RES returned by the HSS, and what is the RES returned by the SIM/UE?
The RES (Response) and X-RES (Expected Response) Both are “dba298fe58effb09“, they do match, which means this subscriber was authenticated successfully.
In iOS 15, Apple added support for iPhones to support SMS over IMS networks – SMSoIP. Previously iPhone users have been relying on CSFB / SMSoNAS (Using the SGs interface) to send SMS on 4G networks.
Getting this working recently led me to some issues that took me longer than I’d like to admit to work out the root cause of…
I was finding that when sending a Mobile Termianted SMS to an iPhone as a SIP MESSAGE, the iPhone would send back the 200 OK to confirm delivery, but it never showed up on the screen to the user.
The GSM A-I/F headers in an SMS PDU are used primarily for indicating the sender of an SMS (Some carriers are configured to get this from the SIP From header, but the SMS PDU is most common).
The RP-Destination Address is used to indicate the destination for the SMS, and on all the models of handset I’ve been testing with, this is set to the MSISDN of the Subscriber.
But some devices are really finicky about it’s contents. Case in point, Apple iPhones.
If you send a Mobile Terminated SMS to an iPhone, like the one below, the iPhone will accept and send back a 200 OK to this request.
The problem is it will never be displayed to the user… The message is marked as delivered, the phone has accepted it it just hasn’t shown it…
SMS reports as delivered by the iPhone (200 OK back) but never gets displayed to the user of the phone as the RP-Destination Address header is populated
The fix is simple enough, if you set the RP-Destination Address header to 0, the message will be displayed to the user, but still took me a shamefully long time to work out the problem.
RP-Destination Address set to 0 sent to the iPhone, this time it’ll get displayed to the user.
To support Dedicated Bearers we first have to have a way of profiling the traffic, to classify the traffic as being the type we want to provide the Dedicated Bearer for.
The first step involves a request from an Application Function (AF) to the PCRF via the Rx interface.
The most common type of AF would be a P-CSCF. When a VoLTE call gets setup the P-CSCF requests that a dedicated bearer be setup for the IP Address and Ports involved in the VoLTE call, to ensure users get the best possible call quality.
But Application Functions aren’t limited to just VoLTE – You could also embed an Application Function into the server for an online game to enable a dedicated bearer for users playing that game, or a sports streaming app that detects when a user starts streaming sports and creates a dedicated bearer for that user to send the traffic down.
The request to setup a dedicated bearer comes in the form of a Diameter request message from the AF, using the Rx reference point, typically from the P-CSCF to the PCRF in the network in an “AA-Request”.
Of main interest in the AA-Request is the Media Component AVP, that contains all the details needed to identify the traffic flow.
Now our PCRF is in charge of policy, and know which P-GW is serving the required subscriber. So the PCRF takes this information and sends a Gx Re-Auth Request to the PCEF in the P-GW serving the subscriber, with a Charging Rule the PCEF in the P-GW needs to install, to profile and apply QoS to the bearer.
Charging Rule Definition’s Flow-Information AVPs showing the information needed to profile the traffic
The QoS Description AVP defines which QoS parameters (QCI / ARP / Guaranteed & Maximum Bandwidth) should be applied to the traffic that matches the rules we just defined.
QoS Information AVP showing requested QoS Parameters
The P-GW sends back a Gx Re-Auth Answer, and gets to work actually setting up these bearers.
With the rule installed on the PCEF, it’s time to get this new bearer set up on the UE / eNodeB.
The P-GW sends a GTPv2 “Create Bearer Request” to the S-GW which forwards it onto the MME, to setup / define the Dedicated Bearer to be setup on the eNodeB.
GTPv2 “Create Bearer Request” sent by the P-Gw to the S-GW forwarded from the S-GW to the MME
The MME translates this into an S1 “E-RAB Setup Request” which it sends to the eNodeB to setup,
S1 E-RAB Setup request showing the E-RAB to be setup
Assuming the eNodeB has the resources to setup this bearer, it provides the details to the UE and sets up the bearer, sending confirmation back to the MME in the S1 “E-RAB Setup Response” message, which the MME translates back into GTPv2 for a “Create Bearer Response”
All this effort to keep your VoLTE calls sounding great!
Want more telecom goodness?
I have a good old fashioned RSS feed you can subscribe to.