Number Pads – Calculator or Phone?

If you’re typing on a full size keyboard there’s a good chance that to your right, there’s a number pad.

The number 5 is in the middle – That’s to be expected, but is 1 in the top left or bottom left?

Being derived from an adding machine keypad, the number pad on a keyboard has a 1 will be in the bottom left, however in the 1950s when telephone keypads were being introduced, only folks who worked in accounting had adding machines.

So when it came time to work out the best layout, the result we have today was a determined through a stack of research and testing by Human Factors Engineering Department of Bell Labs who studied the most efficient layout of keys, and tested focus groups to find the layout that provided the best level of speed and accuracy.

That landed with the 1 in the top left, and that’s what we still have today.

Oddly ATM and Card terminals opted to use the telephone layout, rather than the adding machine layout, while number pads use the adding machine layout.

A few exceptions to this exist, for example the Telecom ComputerPhone (Aka the Merlin Tonto in the UK, or the New Zealand Post Office Computerphone, or the ICL One Per Desk) which is the keyboard as envisioned by the telephone company.

NBNco’s FTTN – What’s in the box?

Note: All information contained here is sourced from: Photos provided by NBNco’s press pages, Googling part numbers from these photos, and public costing information.

This post covers the specifics and capabilities of NBNco’s FTTN solution, and is the result of some internet sleuthing.

If some of the info in here is now out of date, I’d love to know, let me know in the comments or drop me an email and I’ll update it.

FTTN in Numbers

A total of 24,544 nodes have been deployed upon completion of roll out. Each node is provisioned with 384 subscriber ports.

The hardware has 10Gbps shared between the 384 subscriber lines, equating to 208Mbps per subscriber.

Construction costs were $2.311 billion and hardware costs were $1.513 billion,

For the hardware this equates to $61,644 per node or $160 per subscriber line connected (each node is provisioned with 384 ports)

Full cost for node including hardware, construction and provisioning is $244,150 per node, which is $635 per port.

To operate the FTTN infrastructure costs $709 million per year (Made up of costs such as power, equipment servicing and spares). This equates to $28k per node per annum, or $75 per subscriber. (This does not take into account other costs such as access to the copper, transmission network, etc, just the costs to have the unit powered on the footpath.)

Overview

Inside the FTTN cabinets is a Alcatel Lucent (now Nokia) ISAM 7330 cabinet mounted on it’s side,

On the inside left of the door is a optic fibre tray where the transmission links come into the cabinet,

On the extreme left is a custom panel. It contains I/Os that are fed to the 7330, such as door open sensor, battery monitoring, AC power in, SPD and breaker.

Connection to subscriber lines happens on a frame at the end of the cabinet.

Alcatel Lucent ISAM 7330 FTTN

NBN co’s nodes are made up of Alcatel Lucent (Now Nokia) ISAM (Intelligent Services Access Manager) 7330 FTTN rack mounted it’s side.

SlotTypeFunction
1GFC (General Facilities Card)Power and alarm management
2NT Slot (NANT-E Card)Main processing and transmission
3NTIO Slot (NDPS-C Card)VDSL vectoring number-crunching
4NT Slot (Free)Optional (Unused) backup Main processing and transmission
5-12LT (NDLT-F)48 Port VDSL Subscriber DSLAM Interfaces
Slot numbering is just counting L to R, ALU documentation has different numbering

First up is the GFC (General Facilities Card) which handles alarm input / output, and power distribution. This connects to the custom IO panel on the far left of the cabinet, meaning the on-board IO ports aren’t all populated as it’s handled by the IO panel. (More on that later)

Next up is the first NT slot, there are two on the 7330, but in NBN’s case, only one is used; the second can be used for redundancy if one of the cards were to fail, but it seems this option has not been selected. In the first and only populated NT slot is an NANT-E Card (Combined NT Unit E) which handles transmission and main processing.

All the ISAM NANT cards support support RIPv2! But only the NANT-E card also supports BGP – Interestingly they don’t have BGP on all the NANT cards?

To the right of that is the NTIO slot, which has a NDPS-C card, which handles the vector processing for VDSL.

Brief overview of Vectoring: By adding vectoring to DSL signals allows noise on subscriber loops to be modeled, and then cancelled out with an integrated anti-phase signal matching that of the noise.

The vectoring in VDSL relies on pretty complex number crunching as the DSLAM has to constantly process the vectoring coefficients which are different for each line and can change based on the conditions of the subscriber loop etc. To do this the NDPS-C has two roles;
The NDPS-C’s Vectoring Control Entity performs non-real time calculations of vectoring coefficients and handles joining and leaving of vectored VDSL2 lines.
While the NDPS-C’s Vectoring Processor performs the real time matrix calculations based on crosstalk correction samples for the VDSL symbols collected from the subscriber lines.
The NDPS-C has a Twinax connection to every second LT Card.

After the NTIO slot is the unused NT slot.

Finally we have the 8 LT slots for line cards, which for FTTN is using the NDLT-F are 48 port line cards.

The 8 card slots allows 384 subscriber lines per node.

These are the cards which the actual subscriber lines ultimately connect to. With 10Gbps available from the NT to the LTs, means each LT card with 48 subs so 208 Mbps per subscriber max theoretical throughput.

POTS overlay is supported, this allowed VF services coexisted on the same copper during the rollout. M / X pairs are no longer added inline on new connections. (More on that on cabling).

Power & Environment

The 7330 has a 40 amp draw at -48v would mean the unit consumes 1920w

The -48v supply is provided by 2x Eltek Flatpack2 rectifiers, each providing 1Kw each.

These can be configured to provide 1Kw with redundancy to protect against the failure of one of the Flatpack2 units, or 2Kw with no redundancy, which is what is used here.

On the extreme left is a custom panel. It contains alarm I/Os that are fed to the 7330, such as door open sensor, battery monitoring, etc.

It also is the input for AC power in, surge protection device and breakers.

I did have some additional information on the batteries used and the power calculations, however NBNco’s security team have asked that this be removed.

Cabling

Incoming transmission fiber comes in on NBNco’s green ribbon fibre, which terminates on a break out tray on the left hand side wall of the cabinet. Spliced inside the tray is a duplex LC pigtail for connecting the SMF to the 7330. I don’t have the specifics on the optics used.

Subscriber lines come in via an IDC distribution frame (Quante IDS) on the right hand side end of the cabinet, accessed through a seperate door.

This frame is referred to as the CCF – Cross Connect Frame.

There are two sets of blocks on the CCF, termination of ‘X’ and ‘C’ Pairs.

‘X’ Pairs are the VF Pairs (PSTN lines) connecting to the pillar where they are jumpered back to the ‘M’ pairs back to the serving exchange,

‘C’ Pairs are the pairs containing combined VDSL & VF services to to the pillar where they are jumpered to the ‘O’ pairs which run out to the customer’s premises,

Cisco ITP STP – Network Appearance

Short one,
The other day I needed to add a Network Appearance on an SS7/SS7 M3UA linkset.

Network Appearances on M3UA links are kinda like a port number, in that they allow you to distinguish traffic to the same point code, but handled by different logical entities.

When I added the NA parameter on the Linkset nothing happened.

If you’re facing the same you’ll need to set:

cs7 multi-instance

In the global config (this is the part I missed).

Then select the M3UA linkset you want to change and add the network-appearance parameter:

network-appearance 10

And bingo, you’ll start seeing it in your M3UA traffic:

Huawei BBU 3900 Architecture

Huawei BTS3900 eNB Configuration

Last year I purchased a cheap second hand Huawei macro base station – there’s lots of these on the market at the moment due to the fact they’re being replaced in many countries.

I’m using it in my lab environment, and as such the config I’ve got is very “bare bones” and basic. Keep in mind if you’re looking to deploy a Macro eNodeB in production, you may need more than just a blog post to get everything tuned and functioning properly…

In this post we’ll cover setting up a Huawei BTS3900 eNodeB from scratch, using the MML interface, without relying on the U2020 management tool.

Obviously the details I setup (IP Addressing, PLMN and RF parameters) are going to be different to what you’re configuring, so keep that in mind, where I’ve got my MME Addresses, site IDs, TACs, IP Addresses, RFUs, etc, you’ll need to substitute your own values.

A word on Cabinets

Typically these eNodeBs are shipped in cabinets, that contain the power supplies, alarm / environmental monitoring, power distribution, etc.

Early on in the setup process we’ll be setting the cabinet types we’ve got, and then later on we’ll tell the system what we have installed in which slots.

This is fine if you have a cabinet and know the type, but in my case at least I don’t have a cabinet manufactured by Huawei, just a rack with some kit mounted in it.

This is OK, but it leads to a few gotchas I need to add a cabinet (even though it doesn’t physically exist) and when I setup my RRUs I need to define what cabinet, slot and subrack it’s in, even though it isn’t in any. Keep this in mind as we go along and define the position of the equipment, that if you’re not using a real-world cabinet, the values mean nothing, but need to be kept consistent.

The Basics

Before we get started, familiarise yourself with the Huawei MML we’ll use for configuring the unit, and log into the Web UI and bring up an MML shell.

To begin we’ll need to setup the basics, by disabling DHCP and setting an local IP Address for the unit.

 SET DHCPSW: SWITCH=DISABLE;
 SET LOCALIP: IP="192.168.5.234", MASK="255.255.248.0";

Obviously your IP address details will be different.
Next we’ll add an eNodeB function, the LMPT / UMPT can have multiple functions and multiple eNodeBs hosted on the same hardware, but in our case we’re just going to configure one:

 ADD ENODEBFUNCTION: eNodeBFunctionName="LTE", ApplicationRef=1, eNodeBId=9527;
 SET NE: NENAME="HUAWEI", LOCATION="NewSite", DID="NewSite12345", SITENAME="NewSite1", USERLABEL="NewInitSite";
 ADD LOCATION: LOCATIONNAME="NewSite", GCDF=Degree, LATITUDEDEGFORMAT=0, LONGITUDEDEGFORMAT=0; 

Again, your eNodeB ID, location, site name, etc, are all going to be different, as will your location.

Next we’ll set the system to maintenance mode (MNTMODE), so we can make changes on the fly (this takes the eNB off the air, but we’re already off the air), you’ll need to adjust the start and end times to reflect the current time for the start time, and end time to be after you’re done setting all this up.

 SET MNTMODE: MNTMode=INSTALL, ST=2013&09&20&15&00&00, ET=2013&09&25&15&00&00, MMSetRemark="NewSite Install";

Next we’ll set the operator details, this is the PLMN of the eNodeB, and create a new tracking area.

 ADD CNOPERATOR: CnOperatorId=0, CnOperatorName="NickTest", CnOperatorType=CNOPERATOR_PRIMARY, Mcc="001", Mnc="01";
ADD CNOPERATORTA: TrackingAreaId=0, CnOperatorId=0, Tac=1;

Next we’ll be setting and populating the cabinets I mentioned earlier. I’ll be telling the unit it’s inside a APM30 (Cabinet 0), and in Cabinet Number 0, Subrack 0, is a BBU3900.

 //To modify the cabinet type, run the following command:
ADD CABINET:CN=0,TYPE=APM30;
//Add a BBU3900 subrack, run the following command:
ADD SUBRACK:CN=0,SRN=0,TYPE=BBU3900;
//To configure boards and RF datas, run the following commands:

And inside the BBU3900 there’s some cards of course, and each card has as slot, as per the drawing below.

In my environment I’ve got a LMPT in slot 7, and a LBBP in Slot 3. There’s a fan and a UPEU too, so:
We’ll add a board in Slot No. 7, of type LMPT,
We’ll add a board in Slot No. 3, of type LBBP working on FDD,
We’ll add a fan board in Slot No. 16, and a UPEU in Slot No. 18.

 ADD BRD:SN=7,BT=LMPT;
 ADD BRD:CN=0,SRN=0,SN=3,BT=LBBP,WM=TDD;
 ADD BRD:CN=0,SRN=0,SN=16,BT=FAN;
 ADD BRD:CN=0,SRN=0,SN=18,BT=UPEU;

Huawei publish design guides for which cards should be in which slots, the general rule is that your LMPT / UMPT card goes in Slot 7, with your BBP cards (UBBP or LBBP) in slots 3, then 2, then 1, then 0. Fans and UPEUs can only go in the slots designed to fit them, so that makes it a bit easier.

Next we’ll need to setup our RRUs, for this we’ll need to setup an RRU chain, which is the Huawei term for the CPRI links and add an RRU into it:

ADD RRUCHAIN:RCN=10,TT=CHAIN,BM=COLD,HSRN=70,HSN=0,HPN=0;

ADD RRU:CN=0,SRN=60,SN=0,TP=BRANCH,RCN=10,PS=0,RT=MPMU,RS=TDL,RXNUM=0,TXNUM=0;

With our RRU chains defined, we’ll need to setup our transport network to get the traffic back to the S-GW / MME:

SET ETHPORT: SN=7, SBT=BASE_BOARD, PA=COPPER, SPEED=AUTO, DUPLEX=AUTO;
ADD DEVIP: SN=7, SBT=BASE_BOARD, PT=ETH, PN=0, IP="10.10.10.67", MASK="255.255.255.0";
ADD IPRT: RTIDX=0, SN=7, SBT=BASE_BOARD, DSTIP="10.166.1.251", DSTMASK="255.255.255.255", RTTYPE=NEXTHOP, NEXTHOP="10.10.10.1"; 
ADD IPRT: RTIDX=1, SN=7, SBT=BASE_BOARD, DSTIP="10.4.3.3", DSTMASK="255.255.255.255", RTTYPE=NEXTHOP, NEXTHOP="10.10.10.1"; 
ADD IPRT: RTIDX=2, SN=7, SBT=BASE_BOARD, DSTIP="10.3.3.3", DSTMASK="255.255.255.255", RTTYPE=NEXTHOP, NEXTHOP="10.10.10.1";
ADD IPRT: RTIDX=3, SN=7, SBT=BASE_BOARD, DSTIP="10.60.60.60", DSTMASK="255.255.255.255", RTTYPE=NEXTHOP, NEXTHOP="10.10.10.1";
ADD OMCH: IP="10.10.10.67", MASK="255.255.255.0", PEERIP="10.166.1.251", PEERMASK="255.255.255.255", BEAR=IPV4, BRT=YES, RTIDX=0, BINDSECONDARYRT=NO, CHECKTYPE=NONE;
ADD VLANMAP: NEXTHOPIP="10.10.10.1", MASK="255.255.248.0", VLANMODE=SINGLEVLAN, VLANID=3721, SETPRIO=DISABLE; 
ADD SCTPTEMPLATE: SCTPTEMPLATEID=0, SWITCHBACKFLAG=ENABLE;
ADD SCTPHOST: SCTPHOSTID=0, IPVERSION=IPv4, SIGIP1V4="10.10.10.67", SIGIP1SECSWITCH=DISABLE, SIGIP2SECSWITCH=DISABLE, PN=2000, SCTPTEMPLATEID=0;
ADD SCTPPEER: SCTPPEERID=0, IPVERSION=IPv4, SIGIP1V4="10.3.3.3", SIGIP1SECSWITCH=DISABLE, SIGIP2SECSWITCH=DISABLE, PN=2000;
ADD USERPLANEHOST: UPHOSTID=0, IPVERSION=IPv4, LOCIPV4="10.10.10.67", IPSECSWITCH=DISABLE;
ADD EPGROUP: EPGROUPID=0;
ADD SCTPHOST2EPGRP: EPGROUPID=0, SCTPHOSTID=0; 
ADD SCTPPEER2EPGRP: EPGROUPID=0, SCTPPEERID=0;
ADD UPHOST2EPGRP: EPGROUPID=0, UPHOSTID=0;
ADD S1: S1Id=0, CnOperatorId=0, EpGroupCfgFlag=CP_UP_CFG, CpEpGroupId=0, UpEpGroupId=0;


We’ll need clocking and time as well, we’ll use NTP and GPS:

SET TIMESRC: TIMESRC=NTP; 
ADD NTPC: MODE=IPV4, IP="10.166.1.251", PORT=123, SYNCCYCLE=60, AUTHMODE=PLAIN; 
SET MASTERNTPS: MODE=IPV4, IP="10.166.1.251"; 
SET TZ: ZONET=GMT+0800, DST=NO;

ADD GPS: SRN=0, SN=7;
SET CLKMODE: MODE=MANUAL, CLKSRC=GPS, SRCNO=0;
SET CLKSYNCMODE:CLKSYNCMODE=TIME;

Next we’ll need to define a sector, sector equipment & cell, then link it to a sector equipment group:

ADD SECTOR:SECTORID=0,ANTNUM=2,ANT1CN=0,ANT1SRN=60,ANT1SN=255, ANT1N=R0A,ANT2CN=0,ANT2SRN=60,ANT2SN=255,ANT2N=R0B,CREATESECTOREQM=FALSE;

ADD SECTOREQM:SECTOREQMID=0,SECTORID=0,ANTNUM=2,ANT1CN=0, ANT1SRN=60,ANT1SN=255,ANT1N=R0A,ANTTYPE1=RXTX_MODE,ANT2CN=0,ANT2SRN=60,ANT2SN=255,ANT2N=R0B,ANTTYPE2=RXTX_MODE;

ADD CELL:LOCALCELLID=1,CELLNAME="CELL1",FREQBAND=41,ULEARFCNCFGIND=NOT_CFG,DLEARFCN=40340,ULBANDWIDTH=CELL_BW_N100,DLBANDWIDTH=CELL_BW_N100,CELLID=1,PHYCELLID=1,FDDTDDIND=CELL_TDD,SUBFRAMEASSIGNMENT=SA2,SPECIALSUBFRAMEPATTERNS=SSP5,ROOTSEQUENCEIDX=0,CUSTOMIZEDBANDWIDTHCFGIND=NOT_CFG,EMERGENCYAREAIDCFGIND=NOT_CFG,UEPOWERMAXCFGIND=NOT_CFG,MULTIRRUCELLFLAG=BOOLEAN_TRUE,MULTIRRUCELLMODE=MPRU_AGGREGATION, CPRICOMPRESSION=NORMAL_COMPRESSION,TXRXMODE=2T2R;

ADD EUSECTOREQMGROUP:LOCALCELLID=1,SECTOREQMGROUPID=1;
ADD EUSECTOREQMID2GROUP:LOCALCELLID=1,SECTOREQMGROUPID=1, SECTOREQMID=0;

Alright, now we can activate it:

//Modify the reference signal power.
MOD PDSCHCFG: LocalCellId=1, ReferenceSignalPwr=-81;

//Add an operator for the cell.
ADD CELLOP: LocalCellId=0, TrackingAreaId=0;

//Activate the cell.
ACT CELL: LocalCellId=1;

And lastly we can define some neighboring cells:

//Configure neighboring cells. 
ADD EUTRANINTERNFREQ: LocalCellId=1, DlEarfcn=3100, UlEarfcnCfgInd=NOT_CFG, CellReselPriorityCfgInd=NOT_CFG, SpeedDependSPCfgInd=NOT_CFG, MeasBandWidth=MBW100, PmaxCfgInd=NOT_CFG, QqualMinCfgInd=NOT_CFG;
ADD EUTRANEXTERNALCELL: Mcc="460", Mnc="02", eNodeBId=236, CellId=0, DlEarfcn=3100, UlEarfcnCfgInd=NOT_CFG, PhyCellId=236, Tac=33;
ADD EUTRANINTERFREQNCELL: LocalCellId=1, Mcc="460", Mnc="02", eNodeBId=236, CellId=0;

BSF Addresses

The Binding Support Function is used in 4G and 5G networks to allow applications to authenticate against the network, it’s what we use to authenticate for XCAP and for an Entitlement Server.

Rather irritatingly, there are two BSF addresses in use:

If the ISIM is used for bootstrapping the FQDN to use is:

bsf.ims.mncXXX.mccYYY.pub.3gppnetwork.org

But if the USIM is used for bootstrapping the FQDN is

bsf.mncXXX.mccYYY.pub.3gppnetwork.org

You can override this by setting the 6FDA EF_GBANL (GBA NAF List) on the USIM or equivalent on the ISIM, however not all devices honour this from my testing.

Will 5GC be used in Wireline Access? No. Here’s why.

One of the hyped benefits of a 5G Core Networks is that 5GC can be used for wired networks (think DSL or GPON) – In marketing terms this is called “Wireless Wireline Convergence” (5G WWC) meaning DSL operators, cable operators and fibre network operators can all get in on this sweet 5GC action and use this sexy 5G Core Network tech.

This is something that’s in the standards, and that the big kit vendors are pushing heavily in their marketing materials. But will it take off? And should operators of wireline networks (fixed networks) be looking to embrace 5GC?

Comparing 5GC with current wireline network technologies isn’t comparing apples to apples, it’s apples to oranges, and they’re different fruits.

At its heart, the 3GPP Core Networks (including 5G Core) address one particular use cases of the cellular industry: Subscriber mobility – Allowing a customer to move around the network, being served by different kit (gNodeBs) while keeping the same IP Address.

The most important function of 5GC is subscriber mobility.

This is achieved through the use of encapsulating all the subscriber’s IP data into a GTP (A protocol that’s been around since 2G first added data).

Do I need a 5GC for my Fixed Network?

Wireline networks are fixed. Subscribers don’t constantly move around the network. A GPON customer doesn’t need to move their OLT every 30 minutes to a new location.

Encapsulating a fixed subscriber’s traffic in GTP adds significant processing overhead, for almost no gain – The needs of a wireline network operator, are vastly different to the needs of a cellular core.

Today, you can take a /24 IPv4 block, route it to a DSLAM, OLT or CMTS, and give an IP to 254 customers – No cellular core needed, just a router and your access device and you’re done, and this has been possible for decades.
Because there’s no mobility the GTP encapsulation that is the bedrock for cellular, is not needed.

Rather than routing directly to Access Network kit, most fixed operators deploy BRAS systems used for fixed access. Like the cellular packet core, BRAS has been around for a very long time, with a massive install base and a sea of engineering experience in house, it meets the needs of the wireline industry who define its functions and roles along with kit vendors of wireline kit; the fixed industry working groups defined the BRAS in the same way the 3GPP and cellular industry working groups defined 5G Core.

I don’t forsee that we’ll see large scale replacement of BRAS by 5GC, for the same reason a wireless operator won’t replace their mobile core with a BRAS and PPPoE – They’re designed to meet different needs.

All the other features that have been added to the 3GPP Core Network functionality, like limiting speed, guaranteed throughput bearers, 5QI / QCI values, etc, are addons – nice-to-haves. All of these capabilities could be implemented in wireline networks today – if the business case and customer demand was there.

But what about slicing?

With dropping ARPUs across the board, additional services relating to QoS (“Network Slicing”) are being held up as the saving grace of revenues for cellular operators and 5G as a whole, however this has yet to be realized and early indications suggest this is not going to be anywhere near as lucrative as previously hoped.

What about cost savings?

In terms of cost-per-bit of throughput, the existing install base wireline operators have of heavy-metal kit capable of terabit switching and routing has been around for some time in fixed world, and is what most 5G Cores will connect to as their upstream anyway, so there won’t be any significant savings on equipment, power consumption or footprint to be gained.

Fixed networks transport the majority of the world’s data today – Wireline access still accounts for the majority of traffic volumes, so wireline kit handles a higher magnitude of throughput than it’s Packet Core / 5GC cousins already.

Cutting down the number of parts in the network is good though right?

If you’re operating both a Packet Core for Cellular, and a fixed network today, then you might think if you moved from the traditional BRAS architecture fore the wired network to 5GC, you could drop all those pesky routers and switches clogging up your CO, Exchanges and Data Centers.

The problem is that you still need all of those after the 5GC to be able to get the traffic anywhere users want to go. So the 5GC will still need all of that kit, all your border routers and peering routers will remain unchanged, as well as domestic transmission, MPLS and transport.

The parts required for operating fixed networks is actually pretty darn small in comparison to that of 5GC.

TL;DR?

While cellular vendors would love to sell their 5GC platform into fixed operators, the premise that they are willing to replace existing BRAS architectures with 5GC, is as unlikely in my view as 5GC being replaced by BRAS.

Tiny Pillars in the CAN

On the rare occasions I’m not tied to my desk, I’m out for a long run along some back roads somewhere.

Every now and then I come across these tiny telecom pillars for cross-connection (and don’t shoot at them) – I mostly find them around the edges of distribution areas.
I had some recollection that these were originally for trunk lines between exchanges (maybe there was some truth to this?), but some digging in old docs show these were just for interconnecting main or branch cables with distribution cables, in areas where the 600 and 1200 pair pillars / cabinets would be overkill.

They’re built like the 900/1800 pair cabinets, but just scaled down versions, supporting 1x 100 pair main cable, 1x 100 pair distribution cables and 2x 50 pair distribution cables.

It seems like these were largely decomed when NBN took over, leaving most with a big X sprayed on them.

While I was looking through the docs I also found reference to a 180 pair pillar, which looked very similar, but I’ve yet to see any of them left in the wild. Better keep running ’till I find one!

Authenticating Fixed Line Subscribers into IMS

We recently added support in PyHSS for fixed line SIP subscribers to attach to the IMS.

Traditional telecom operators are finding their fixed line network to be a bit of a money pit, something they’re required to keep operating to meet regulatory obligations, but the switches are sitting idle 99% of the time. As such we’re seeing more and more operators move fixed line subs onto their IMS.

This new feature means we can use PyHSS to serve as the brains for a fixed network, as well as for mobile, but there’s one catch – How we authenticate subscribers changes.

Most banks of line cards in a legacy telecom switches, or IP Phones, don’t have SIM slots to allow us to authenticate, so instead we’re forced to fallback to what they do support.

Unfortunately for the most part, what is supported by these IP phones or telecom switches is SIP MD5 Digest Authentication.

The Nonce is generated by the HSS and put into the Multimedia-Authentication-Answer, along with the subscriber’s password and sent in the clear to the S-CSCF.

Subscriber with Password made up of all 1's MAA response from HSS for Digest-MD5 Auth

The HSS then generates the the Multimedia-Auth Answer, it generates a nonce (in the 3GPP-SIP-Authenticate / 609 AVP) and sends the Subscriber’s password in the 3GPP-SIP-Authorization (610) AVP in response back to the S-CSCF.

I would have thought a better option would be for the HSS to generate the Nonce and Digest, and then the S-CSCF to just send the Nonce to the Sub and compare the returned Digest from the Sub against the expected Digest from the HSS, but it would limit flexibility (realm adaptation, etc) I guess.

The UE/UA (I guess it’s a UA in this context as it’s not a mobile) then generates its own Digest from the Nonce and sends it back to the S-CSCF via the P-CSCF.

The S-CSCF compares the received Digest response against the one it generated, and if the two match, the sub is authenticated and allowed to attach onto the network.

IMS iFC – SPT Session Cases

Mostly just reference material for me:

Possible values:

  • 0 (ORIGINATING_SESSION)
  • 1 TERMINATING_REGISTERED
  • 2 (TERMINATING_UNREGISTERED)
  • 3 (ORIGINATING_UNREGISTERED

In the past I had my iFCs setup to look for the P-Access-Network-Info header to know if the call was coming from the IMS, but it wasn’t foolproof – Fixed line IMS subs didn’t have this header.

            <TriggerPoint>
                <ConditionTypeCNF>1</ConditionTypeCNF>
                <SPT>
                    <ConditionNegated>0</ConditionNegated>
                    <Group>0</Group>
                    <Method>INVITE</Method>
                    <Extension></Extension>
                </SPT>
                <SPT>
                    <ConditionNegated>0</ConditionNegated>
                    <Group>1</Group>
                    <SIPHeader>
                      <Header>P-Access-Network-Info</Header>
                    </SIPHeader>
                </SPT>                
            </TriggerPoint>

But now I’m using the Session Cases to know if the call is coming from a registered IMS user:

        <!-- SIP INVITE Traffic from Registered Sub-->
        <InitialFilterCriteria>
            <Priority>30</Priority>
            <TriggerPoint>
                <ConditionTypeCNF>1</ConditionTypeCNF>
                <SPT>
                    <ConditionNegated>0</ConditionNegated>
                    <Group>0</Group>
                    <Method>INVITE</Method>
                    <Extension></Extension>
                </SPT>
                <SPT>
                    <Group>0</Group>
                    <SessionCase>0</SessionCase>
                </SPT>             
            </TriggerPoint>

Using fs_cli and ESL

If you work with FreeSWITCH there’s a good chance every time you do, you run fs_cli and attempt to read the firehose of data shown when making a call to make sense of what’s going on and why what you’re trying to do isn’t working.

But, if you are also using the Event Socket Language service built into FreeSWITCH (Which you totally should) either for programming FreeSWITCH behaviour, or for realtime charging with CGrateS, then you’ll find that fs_cli will fail to connect.

That’s because we’ve edited the event_socket.conf.xml file, and fs_cli uses the event socket to connect to FreeSWITCH as well.

But there’s a simple fix,

Create a new file in /etc/fs_cli.conf and populate it with the info needed to connect to your ESL session you defined in event_socket.conf.xml, so if this is is your

<configuration name="event_socket.conf" description="Socket Client">
  <settings>
    <param name="nat-map" value="false"/>
    <param name="listen-ip" value="10.98.0.76"/>
    <param name="listen-port" value="8021"/>
    <param name="password" value="mysupersecretpassword"/>
    <param name="apply-inbound-acl" value="NickACL"/>
  </settings>
</configuration>

Then your /etc/fs_cli.conf should look like:

[default]
; Put me in /etc/fs_cli.conf or ~/.fs_cli_conf
;overide any default options here
loglevel => 6
log-uuid => false


host     => 10.98.0.76
port     => 8021
password => mysupersecretpassword
debug    => 7

And that’s it, now you can run fs_cli and connect to the terminal once more!

How much computing power is in a SIM (And is it enough to get humans to the Moon?)

The first thing people learn about SIMs or the Smart Cards that the SIM / USIM app runs on, is that “There’s a little computer in the card”. So how little is this computer, and what’s the computing power in my draw full of SIMs?

So for starters the SIM manufacturers love their NDAs, so I can’t post the chip specifications for the actual cards in my draw, but here’s some comparable specs from a seller selling Java based smart cards online:

Specs for Smart Card

4K of RAM is 4069 bytes.
For comparison the Apollo Guidance Computer had 2048 words of RAM, but each “word” was 16 bits (two bytes), so actually this would translate to 4069 bytes so equal with one of these smart cards in terms of RAM – So the smart card above is on par with the AGC that took humans to the moon in terms of RAM, althhough the SIMs would be a wee bit larger if they were also using magnetic core memory like the AGC!

The Nintendo Entertainment System was powered by a MOS Technology 6502, it had access to 2K of RAM, two the Smart Card has twice as much RAM as the NES, so it could get you to the moon and play Super Mario Bros.

What about comparing Non-Volatile Memory (Storage)? Well, the smart card has 145KB of ROM / NVM, while Apollo flew with 36,864 words of RAM, each word is two bits to 73,728 Bytes, so roughly half of what the Smart Card has – Winner – Smart Card, again, without relying on core rope memory like AGC.

SIM cards are clocked kinda funkily so comparing processor speeds is tricky. Smart Cards are clocked off the device they connect to, which feeds them a clock signal via the CLK pin. The minimum clock speed is 1Mhz while the max is 5Mhz.

Now I’m somewhat of a hoarder when it comes to SIM Cards; in the course of my work I have to deal with a lot of SIMs…

Generally when we’re getting SIMs manufactured, during the Batch Approval Process (BAP) the SIM vendor will send ~25 cards for validation and testing. It’s not uncommon to go through several revisions. I probably do 10 of these a year for customers, so that’s 250 cards right there.

Then when the BAP is done I’ll get another 100 or so production cards for the lab, device testing, etc, this probably happens 3 times a year.

So that’s 550 SIMs a year, I do clean out every so often, but let’s call it 1000 cards in the lab in total.

In terms of ROM that gives me a combined 141.25 MB, I could store two Nintendo 64 games, or one Mini CD of data, stored across a thousand SIM cards – And you thought installing software from a few floppies was a pain in the backside, imagine accessing data from 1000 Smart Cards!

What about tying the smart cards together to use as a giant RAM BUS? Well our 1000 cards give us a combined 3.91 MB of RAM, well that’d almost be enough to run Windows 95, and enough to comfortably run Windows 3.1.

Practical do do any of this? Not at all, now if you’ll excuse me I think it’s time I throw out some SIMs…

The time Bell Labs brought the Statue of Liberty under its roof (Literally)

It’s 1986 and you’ve got a 31 tons of copper, in the form of a giant 46 meter tall statue, that’s looking a bit worse for wear.

The Statue of Liberty has had water pooling in some areas, causing areas of her copper skin to corrode, and in some cases wearing all the way through.

On the other side of the iron curtain (it’s still up after all) there are probably quite a number of folks experienced in looking after giant statues, but alas, you’re the US National Parks Service and seeking help from the Soviets is probably a bad look.

The statue is made of Copper, and who knows more about copper than the phone company, with a vast, vast network of copper lines spanning the country?

So the National Parks Service called upon Bell Labs to help.

The Bell Labs’ chemists assigned to the project quickly pointed out that just replacing the corroded copper with new copper would hardly blend in – You’d have the shiny brown copper colour in the new sections, which wouldn’t match the verdigris that occurs through the oxidation of the copper, which would take years to form. (When she was delivered, the statue had a copper colour like you’d see in Copper piping, not the green patina we see today.)

Bell Labs staff looked at artificially creating the patina with acid solutions, to speed up the process to match the new copper with the old, but it was found it may cause structural weak points.

John Franey who was a technical assistant working at Bell Labs’ Murray Hill laboratories must have looked up at the roof of their buildings, constructed in 1941, and thought “Well that looks pretty close…”, so the naturally patinaed roof of Bell Lab’s New Jersey campus was peeled up and sent off for patching the statue.

Modern day roof at Murray Hill now with the verdigris that’s had 40 years to form

Murray Hill got a shiny new copper roof to replace the old green one they’d just given up, and the particles of copper corrosion scraped off the dismantled roof of a Bell Labs were mixed with acetone into a special spray used as concealer on the statue’s skin.

In exchange, Bell Labs staff were given some of the copper plates removed from the statue, so they could study the natural corrosion process in copper, in various weather conditions, which in turn would lead to a better understanding of how to build and maintain their copper plant.

Sources

The Idea Factory – Book by Jon Gertner

New York Times: TECHNOLOGY; STATUE’S REPAIR AIDS RESEARCH – Stuart Diamond – Feb. 14, 1985

New York Times: BELL LAB SCIENTISTS WORKING AS LIBERTY’S ‘DERMATOLOGISTS’ – Marian H. Mundy – June 29, 1986

SSH into Cisco STPs

If it ain’t broke don’t fix is an addage that the telecom industry has well and truly applied to the SS7 space.

If you’ve got an SS7 network (especially one built on TDM links) the general philosophy is don’t touch it and hope to retire before it dies.

The Cisco STP (Internet Transfer Point) is a good example of this, and for that reason I still work on them.

But OpenSSH and standards have moved on, and SSHing into them these days requires some extra (insecure) parameters to access, so here they are:

ssh -oKexAlgorithms=+diffie-hellman-group1-sha1 -oHostKeyAlgorithms=+ssh-rsa -caes128-cbc [email protected]

Will get you into an Version 12.3(4r)T4 Cisco ITP. Be sure to run sho ver and marvel at that uptime!

Inside a 32×32 MIMO Antenna

For the past few months I’ve had a Band 78 NR active antenna unit sitting next to my desk.

It’s a very cool bit of kit that doesn’t get enough love, but I thought I’d pop open the radome and take a peek inside.

Individual antenna elements

What I found very interesting is that it’s not all antennas in there!

… 29, 30, 31, 32. Yup. Checks out.

There are the expected number of antennas (I mean if I opened it up and found 31 antennas I’d have been surprised) but they don’t take up the whole volume of the unit, only about half,

AAU with Radome reinstalled

Well, after that strip show, back to sitting in my office until I need to test something 5G SA again…

SQN Sync in IMS Auth

So the issue was a head scratcher.

Everything was working on the IMS, then I go to bed, the next morning I fire up the test device and it just won’t authenticate to the IMS – The S-CSCF generated a 401 in response to the REGISTER, but the next REGISTER wouldn’t pass.

Wireshark just shows me this loop:

UE -> IMS: REGISTER
IMS -> UE: 401 Unauthorized (With Challenge)
UE -> IMS: REGISTER with response
IMS -> UE: 401 Unauthorized (With Challenge)
UE -> IMS: REGISTER with response
IMS -> UE: 401 Unauthorized (With Challenge)
UE -> IMS: REGISTER with response
IMS -> UE: 401 Unauthorized (With Challenge)

So what’s going on here?

IMS uses AKAv1-MD5 for Authentication, this is slightly different to the standard AKA auth used in cellular, but if you’re curious, we’ve covered by IMS Authentication and standard AKA based SIM Authentication in cellular networks before.

When we generate the vectors (for IMS auth and standard auth) one of the inputs to generate the vectors is the Sequence Number or SQN.

This SQN ticks over like an odometer for the number of times the SIM / HSS authentication process has been performed.

There is some leeway in the SQN – It may not always match between the SIM and the HSS and that’s to be expected.
When the MME sends an Authentication-Information-Request it can ask for multiple vectors so it’s got some in reserve for the next time the subscriber attaches, and that’s allowed.

Information stored on USIM / SIM Card for LTE / EUTRAN / EPC - K key, OP/OPc key and SQN Sequence Number

But there are limits to how far out our SQN can be, and for good reason – One of the key purposes for the SQN is to protect against replay attacks, where the same vector is replayed to the UE. So the SQN on the HSS can be ahead of the SIM (within reason), but it can’t be behind – Odometers don’t go backwards.

So the issue was with the SQN on the SIM being out of Sync with the SQN in the IMS, how do we know this is the case, and how do we fix this?

Well there is a resync mechanism so the SIM can securely tell the HSS what the current SQN it is using, so the HSS can update it’s SQN.

When verifying the AUTN, the client may detect that the sequence numbers between the client and the server have fallen out of sync.
In this case, the client produces a synchronization parameter AUTS, using the shared secret K and the client sequence number SQN.
The AUTS parameter is delivered to the network in the authentication response, and the authentication can be tried again based on authentication vectors generated with the synchronized sequence number.

RFC 3110: HTTP Digest Authentication using AKA

In our example we can tell the sub is out of sync as in our Multimedia Authentication Request we see the SIP-Authorization AVP, which contains the AUTS (client synchronization parameter) which the SIM generated and the UE sent back to the S-CSCF. Our HSS can use the AUTS value to determine the correct SQN.

SIP-Authorization AVP in the Multimedia Authentication Request means the SQN is out of Sync and this AVP contains the RAND and AUTN required to Resync

Note: The SIP-Authorization AVP actually contains both the RAND and the AUTN concatenated together, so in the above example the first 32 bytes are the AUTN value, and the last 32 bytes are the RAND value.

So the HSS gets the AUTS and from it is able to calculate the correct SQN to use.

Then the HSS just generates a new Multimedia Authentication Answer with a new vector using the correct SQN, sends it back to the IMS and presto, the UE can respond to the challenge normally.

This feature is now fully implemented in PyHSS for anyone wanting to have a play with it and see how it all works.

And that friends, is how we do SQN resync in IMS!

HOMER API in Python

We’re doing more and more network automation, and something that came up as valuable to us would be to have all the IPs in HOMER SIP Capture come up as the hostnames of the VM running the service.

Luckily for us HOMER has an API for this ready to roll, and best of all, it’s Swagger based and easily documented (awesome!).

(Probably through my own failure to properly RTFM) I was struggling to work out the correct (current) way to Authenticate against the API service using a username and password.

Because the HOMER team are awesome however, the web UI for HOMER, is just an API client.

This means to look at how to log into the API, I just needed to fire up Wireshark, log into the Web UI via my browser and then flick through the packets for a real world example of how to do this.

Homer Login JSON body as seen by Wireshark

In the Login action I could see the browser posts a JSON body with the username and password to /api/v3/auth

{"username":"admin","password":"sipcapture","type":"internal"}

And in return the Homer API Server responds with a 201 Created an a auth token back:

Now in order to use the API we just need to include that token in our Authorization: header then we can hit all the API endpoints we want!

For me, the goal we were setting out to achieve was to setup the aliases from our automatically populated list of hosts. So using the info above I setup a simple Python script with Requests to achieve this:

import requests
s = requests.Session()

#Login and get Token
url = 'http://homer:9080/api/v3/auth'
json_data = {"username":"admin","password":"sipcapture"}
x = s.post(url, json = json_data)
print(x.content)
token = x.json()['token']
print("Token is: " + str(token))


#Add new Alias
alias_json = {
          "alias": "Blog Example",
          "captureID": "0",
          "id": 0,
          "ip": "1.2.3.4",
          "mask": 32,
          "port": 5060,
          "status": True
        }

x = s.post('http://homer:9080/api/v3/alias', json = alias_json, headers={'Authorization': 'Bearer ' + token})
print(x.status_code)
print(x.content)


#Print all Aliases
x = s.get('http://homer:9080/api/v3/alias', headers={'Authorization': 'Bearer ' + token})
print(x.json())

And bingo we’re done, a new alias defined.

We wrapped this up in a for loop for each of the hosts / subnets we use and hooked it into our build system and away we go!

With the Homer API the world is your oyster in terms of functionality, all the features of the Web UI are exposed on the API as the Web UI just uses the API (something I wish was more common!).

Using the Swagger based API docs you can see examples of how to achieve everything you need to, and if you ever get stuck, just fire up Wireshark and do it in the Homer WebUI for an example of how the bodies should look.

Thanks to the Homer team at QXIP for making such a great product!

Getting to know the PCRF for traffic Policy, Rules & Rating

Misunderstood, under appreciated and more capable than people give it credit for, is our PCRF.

But what does it do?

Most folks describe the PCRF in hand wavy-terms – “it does policy and charging” is the answer you’ll get, but that doesn’t really tell you anything.

So let’s answer it in a way that hopefully makes some practical sense, starting with the acronym “PCRF” itself, it stands for Policy and Charging Rules Function, which is kind of two functions, one for policy and one for rules, so let’s take a look at both.

Policy

In cellular world, as in law, policy is the rules.

For us some examples of policy could be a “fair use policy” to limit customer usage to acceptable levels, but it can also be promotional packages, services like “free Spotify” packages, “Voice call priority” or “unmetered access to Nick’s Blog and maximum priority” packages, can be offered to customers.

All of these are examples of policy, and to make them work we need to target which subscribers and traffic we want to apply the policy to, and then apply the policy.

Charging Rules

Charging Rules are where the policy actually gets applied and the magic happens.

It’s where we take our policy and turn it into actionable stuff for the cellular world.

Let’s take an example of “unmetered access to Nick’s Blog and maximum priority” as something we want to offer in all our cellular plans, to provide access that doesn’t come out of your regular usage, as well as provide QCI 5 (Highest non dedicated QoS) to this traffic.

To achieve this we need to do 3 things:

  • Profile the traffic going to this website (so we capture this traffic and not regular other internet traffic)
  • Charge it differently – So it’s not coming from the subscriber’s regular balance
  • Up the QoS (QCI) on this traffic to ensure it’s high priority compared to the other traffic on the network

So how do we do that?

Profiling Traffic

So the first step we need to take in providing free access to this website is to filter out traffic to this website, from the traffic not going to this website.

Let’s imagine that this website is hosted on a single machine with the IP 1.2.3.4, and it serves traffic on TCP port 443. This is where IPFilterRules (aka TFTs or “Traffic Flow Templates”) and the Flow-Description AVP come into play. We’ve covered this in the past here, but let’s recap:

IPFilterRules are defined in the Diameter Base Protocol (IETF RFC 6733), where we can learn the basics of encoding them,

They take the format:

action dir proto from src to dst

The action is fairly simple, for all our Dedicated Bearer needs, and the Flow-Description AVP, the action is going to be permit. We’re not blocking here.

The direction (dir) in our case is either in or out, from the perspective of the UE.

Next up is the protocol number (proto), as defined by IANA, but chances are you’ll be using 17 (UDP) or 6 (TCP).

The from value is followed by an IP address with an optional subnet mask in CIDR format, for example from 10.45.0.0/16 would match everything in the 10.45.0.0/16 network.

Following from you can also specify the port you want the rule to apply to, or, a range of ports.

Like the from, the to is encoded in the same way, with either a single IP, or a subnet, and optional ports specified.

And that’s it!

So let’s create a rule that matches all traffic to our website hosted on 1.2.3.4 TCP port 443,

permit out 6 from 1.2.3.4 443 to any 1-65535
permit out 6 from any 1-65535 to 1.2.3.4 443

All this info gets put into the Flow-Information AVPs:

With the above, any traffic going to/from 1.23.4 on port 443, will match this rule (unless there’s another rule with a higher precedence value).

Charging Actions

So with our traffic profiled, the next question is what actions are we going to take, well there’s two, we’re going to provide unmetered access to the profiled traffic, and we’re going to use QCI 4 for the traffic (because you’ll need a guaranteed bit rate bearer to access!).

Charging-Group for Profiled Traffic

To allow for Zero Rating for traffic matching this rule, we’ll need to use a different Rating Group.

Let’s imagine our default rating group for data is 10000, then any normal traffic going to the OCS will use rating group 10000, and the OCS will apply the specific rates and policies based on that.

Rating Groups are defined in the OCS, and dictate what rates get applied to what Rating Groups.

For us, our default rating group will be charged at the normal rates, but we can define a rating group value of 4000, and set the OCS to provide unlimited traffic to any Credit-Control-Requests that come in with Rating Group 4000.

This is how operators provide services like “Unlimited Facebook” for example, a Charging Rule matches the traffic to Facebook based on TFTs, and then the Rating Group is set differently to the default rating group, and the OCS just allows all traffic on that rating group, regardless of how much is consumed.

Inside our Charging-Rule-Definition, we populate the Rating-Group AVP to define what Rating Group we’re going to use.

Setting QoS for Profiled Traffic

The QoS Description AVP defines which QoS parameters (QCI / ARP / Guaranteed & Maximum Bandwidth) should be applied to the traffic that matches the rules we just defined.

As mentioned at the start, we’ll use QCI 4 for this traffic, and allocate MBR/GBR values for this traffic.

Putting it Together – The Charging Rule

So with our TFTs defined to match the traffic, our Rating Group to charge the traffic and our QoS to apply to the traffic, we’re ready to put the whole thing together.

So here it is, our “Free NVN” rule:

I’ve attached a PCAP of the flow to this post.

In our next post we’ll talk about how the PGW handles the installation of this rule.

Failures in cobbling together a USSD Gateway

One day recently I was messing with the XCAP server, trying to set the Call Forward timeout. In the process I triggered the UE to send a USSD request to the IMS.

Huh, I thought, “I wonder how hard it would be to build a USSD Gateway for our IMS?”, and this my friends, is the story of how I wasted a good chunk of my weekend trying (and failing) to add support for USSD.

You might be asking “Who still uses USSD?” – The use cases for USSD are pretty thin on the ground in this day and age, but I guess balance query, and uh…

But this is the story of what I tried before giving up and going outside…

Routing

First I’d need to get the USSD traffic towards the USSD Gateway, this means modifying iFCs. Skimming over the spec I can see the Recv-Info: header for USSD traffic should be set to “g.3gpp.ussd” so I knocked up an iFC to match that, and route the traffic to my dev USSD Gateway, and added it to the subscriber profile in PyHSS:

  <!-- SIP USSD Traffic to USSD-GW-->
        <InitialFilterCriteria>
            <Priority>25</Priority>
            <TriggerPoint>
                <ConditionTypeCNF>1</ConditionTypeCNF>
                <SPT>
                    <ConditionNegated>0</ConditionNegated>
                    <Group>1</Group>
                    <SIPHeader>
                      <Header>Recv-Info</Header>
                      <Content>"g.3gpp.ussd"</Content>
                    </SIPHeader>
                </SPT>                
            </TriggerPoint>
            <ApplicationServer>
                <ServerName>sip:ussdgw:5060</ServerName>
                <DefaultHandling>0</DefaultHandling>
            </ApplicationServer>
        </InitialFilterCriteria>

Easy peasy, now we have the USSD requests hitting our USSD Gateway.

The Response

I’ll admit that I didn’t jump straight to the TS doc from the start.

The first place I headed was Google to see if I could find any PCAPs of USSD over IMS/SIP.

And I did – Restcomm seems to have had a USSD product a few years back, and trawling around their stuff provided some reference PCAPs of USSD over SIP.

So the flow seemed pretty simple, SIP INVITE to set up the session, SIP INFO for in-dialog responses and a BYE at the end.

With all the USSD guts transferred as XML bodies, in a way that’s pretty easy to understand.

Being a Kamailio fan, that’s the first place I started, but quickly realised that SIP proxies, aren’t great at acting as the UAS.

So I needed to generate in-dialog SIP INFO messages, so I turned to the UAC module to generate the SIP INFO response.

My Kamailio code is super simple, but let’s have a look:

request_route {

        xlog("Request $rm from $fU");

        if(is_method("INVITE")){
                xlog("USSD from $fU to $rU (Emergency number) CSeq is $cs ");
                sl_reply("200", "OK Trying USSD Phase 1");      #Generate 200 OK
                route("USSD_Response"); #Call USSD_Response route block
                exit;
        }
}

route["USSD_Response"]{
        xlog("USSD_Response Route");
        #Generate a new UAC Request
        $uac_req(method)="INFO";
        $uac_req(ruri)=$fu;     #Copy From URI to Request URI
        $uac_req(furi)=$tu;     #Copy To URI to From URI
        $uac_req(turi)=$fu;     #Copy From URI to To URI
        $uac_req(callid)=$ci;   #Copy Call-ID
                                #Set Content Type to 3GPP USSD
        $uac_req(hdrs)=$uac_req(hdrs) + "Content-Type: application/vnd.3gpp.ussd+xml\r\n";
                                #Set the USSD XML Response body
        $uac_req(body)="<?xml version='1.0' encoding='UTF-8'?>
        <ussd-data>
                <language value=\"en\"/>
                <ussd-string value=\"Bienvenido. Seleccione una opcion: 1 o 2.\"/>
        </ussd-data>";
        $uac_req(evroute)=1;    #Set the event route to use on return replies
        uac_req_send();         #Send it!
}

So the UAC module generates the 200 OK and sends it back.

“That was quick” I told myself, patting myself on the back before trying it out for the first time.

Huston, we have a problem – Although the Call-ID is the same, it’s not an in-dialog response as the tags aren’t present, this means our UE send back a 405 to the SIP INFO.

Right. Perhaps this is the time to read the Spec…

Okay, so the SIP INFO needs to be in dialog. Can we do that with the UAC module? Perhaps not…

But the Transaction Module ™ in Kamailio exposes and option on the ctl API to generate an in-dialog UAC – this could be perfect…

But alas real life came back to rear its ugly head, and this adventure will have to continue another day…

Update: Thanks to a kindly provided PCAP I now know what I was doing wrong, and so we’ll soon have a follow up to this post named “Successes in cobbling together a USSD Gateway” just as soon as I have a weekend free.

Kamailio Bytes: Adding Prometheus + Grafana to Kamailio

I recently fell in love with the Prometheus + Grafana combo, and I’m including it in as much of my workflow as possible, so today we’ll be integrating this with another favorite – Kamailio.

Why would we want to integrate Kamailio into Prometheus + Grafana? Observability, monitoring, alerting, cool dashboards to make it look like you’re doing complicated stuff, this duo have it all!

I’m going to assume some level of familiarity with Prometheus here, and at least a basic level of understanding of Kamailio (if you’ve never worked with Kamailio before, check out my Kamailio 101 Series, then jump back here).

So what will we achieve today?

We’ll start with the simple SIP Registrar in Kamailio from this post, and we’ll add on the xhttp_prom module, and use it to expose some stats on the rate of requests, and responses sent to those requests.

So to get started we’ll need to load some extra modules, xhttp_prom module requires xhttp (If you’d like to learn the basics of xhttp there’s also a Kamailio Bytes – xHTTP Module post covering the basics) so we’ll load both.

xHTTP also has some extra requirements to load, so in the top of our config we’ll explicitly specify what ports we want to bind to, and set two parameters that control how Kamailio handles HTTP requests (otherwise you’ll not get responses for HTTP GET requests).

listen=tcp:0.0.0.0:9090
listen=tcp:0.0.0.0:5060
listen=udp:0.0.0.0:5060

http_reply_parse=yes
tcp_accept_no_cl=yes

Then where you load all your modules we’ll load xhttp and xhttp_prom, and set the basic parameters:

loadmodule "xhttp.so"
loadmodule "xhttp_prom.so"

# Define two counters and a gauge
modparam("xhttp_prom", "xhttp_prom_stats", "all")

By setting xhttp_prom module to expose all stats, this exposes all of Kamailio’s internal stats as counters to Prometheus – This means we don’t need to define all our own counters / histograms / gauges, instead we can use the built in ones from Kamailio. Of course we can define our own custom ones, but we’ll do that in our next post.

Lastly we’ll need to add an event route to handle HTTP requests to the /metrics URL:

event_route[xhttp:request] {
	xlog("Got a request!");
	xlog("$ru");
	$var(xhttp_prom_root) = $(hu{s.substr,0,8});
	if ($var(xhttp_prom_root) == "/metrics") {
			xlog("Called metrics");
			prom_dispatch();
			xlog("prom_dispatch() called");
			return;
	} else
		xhttp_reply("200", "OK", "text/html",
        		"<html><body>Wrong URL $hu</body></html>");
}

Restart, and browse to the IP of your Kamailio instance (mine is 10.01.23) port 9090 /metrics and you’ll get something like this:

Kamailio metrics endpoing used by Prometheus

That my friends, is the sort of data that Prometheus gobbles up, so let’s point Prometheus at it and see what data we get back.

Over on my Prometheus server I’ve edited /etc/prometheus/prometheus.yml to target our new Prometheus endpoint.

  - job_name: "kamailo"
    static_configs:
      - targets: ["10.0.1.23:9090"]  
    honor_timestamps: false

So how can we see this data? Well first off if we log into Prometheus we can see the data flowing in:

If we throw some SIP REGISTER traffic at our Kamailio instance and check on the kamailio_registrar_accepted_regs stat we can see our registrations.

After a few clicks in Grafana we can run some graphs for this data too:

So that’s it, Kamailio’s core stats are now exposed to Prometheus, and we can render this information in Grafana.

There’s a copy of the full code used here available in the Github, and in our next post we’ll look at defining our own metrics in Kamailio and then interacting with them.