Category Archives: Notes

Automatic Cell Planning with Atoll: Site Selection

One of the really neat features of automated RF planning tools like Forsk Atoll is that you can get the tool to automatically try out tweaks and see how each change impacts performance.

In the past you’d adjust something, run the simulation again, look at the results and compare them to what you had before.

Atoll’s ACP (Automatic Cell Planning) module allows you to automate this, and in most cases, it does a better job than I would!

Today we’ll look at Cell Site Selection in Atoll.

To begin with, we’ll limit the computation area down to a polygon we draw around the area in question.

In the Geo tab we’ll select Zones -> Computation Zone and select Edit

We’ll create a new Polygon and draw around the area we are going to analyze. You can automate this step based on population levels, etc, if you’ve got that data present.

So now we’ve set our computation area to the selection, but if we didn’t do this, we’d be computing for the whole world, and that might take a while…

Generating Candidate Sites

Atoll sucks at this; I’ve found that if your computation zone is set and it’s not a rectangle, bad things happen, so I’ve written a little script to generate candidates for me.
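
My actual script isn’t much smarter than this minimal sketch – generate a grid of points and write them out as a CSV for import (the bounding box, spacing and column names here are made-up example values, not something Atoll mandates):

import csv

# Sketch: generate a grid of candidate sites roughly every 500 m inside
# a bounding box, ready to import as Name / Longitude / Latitude columns.
# The coordinates and step below are arbitrary example values.
min_lat, max_lat = -37.90, -37.70
min_lon, max_lon = 144.80, 145.10
step = 0.005  # roughly 500 m of latitude

with open("candidates.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Name", "Longitude", "Latitude"])
    site_id = 0
    lat = min_lat
    while lat <= max_lat:
        lon = min_lon
        while lon <= max_lon:
            writer.writerow([f"Candidate_{site_id}", lon, lat])
            site_id += 1
            lon += step
        lat += step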

Creating a new ACP Job

From the Network tab, right click on ACP Automatic Cell Planning and select New

Optimization Tab

Before we can define the specifics of what we’re looking to plan or improve, we need to set some limits on the software itself and tell it what we’re looking to achieve.

The resolution defines how precise the results should be, and the iterations defines how many changes the software should run through.

The higher the number of iterations, the better the results, but it’s not linear – the improvement between 1,000 iterations and 1,000,000,000 iterations is typically pretty minor. This is because ACP works on a kind of “getting warmer” philosophy: it changes a value up or down, looks at the overall result, and if the result was better, keeps changing the value in that direction until it stops getting better.
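
As a toy illustration of that “getting warmer” loop – this is just a generic hill-climb sketch, not Atoll’s actual ACP algorithm:

# Nudge a parameter up or down, keep the change while the score improves,
# and stop once neither direction gets "warmer" (a local optimum).
def hill_climb(score, value, step=1.0, max_iterations=100):
    best = score(value)
    for _ in range(max_iterations):
        for candidate in (value + step, value - step):
            if score(candidate) > best:
                value, best = candidate, score(candidate)
                break
        else:
            break  # neither direction improved - stop searching
    return value, best

# e.g. find the downtilt (degrees) maximising a made-up coverage metric
print(hill_climb(lambda tilt: -(tilt - 4) ** 2, value=0.0))  # -> (4.0, -0.0)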

As I’m working in a fairly small area I’m going to set 100 iterations and a 50m resolution.

In the optimization tab we can also set constraints. For example, we’re looking at where to place cell sites in an area, and as far as Atoll is concerned, if we just throw hundreds of sites at the area we’ll get pretty good results – but the economics of that don’t work. So we set constraints; for site selection we may want to cap the maximum number of cell sites. As we are importing ~5k candidate locations, we probably don’t want to build 5k cell sites 20m apart, so set this to a reasonable number for your geography.

When using ACP for optimization, as we’ll see later on, we can also set cost constraints regarding the cost to make changes, but for now this is just going to pick the best cell site locations for us.

Objectives Tab

Next up we’ll need to set up Automatic Cell Planning’s objectives.

For ACP to be an effective tool we need to define what we’re looking for in terms of success. You can’t just throw it some values and say “Make it better” – we need to define which parameters we’re looking to improve. We do this by setting Objectives.

Your objectives are going to be based on your needs and wants, but for this example we’re building a greenfield network, so we want to offer coverage over an area, as well as good RSRP and RSRQ. For this post we’ll set the objective to Coverage of 95% of the Computation Zone, with a secondary objective of improving RSRP and RSRQ.

But today I’m modeling for coverage, so let’s set that:

As we’re planning for LTE we need to set the UE parameters; as I’m planning for a mobile network, I’ll need to set the service type and terminal.

Reconfiguration

Now we’ve defined the Objectives, it’s time to define which values ACP can mess with to try and achieve them. For some ACP runs you may be adjusting tilts or azimuths, swapping out antennas, etc, but today we’re looking for where to put cell sites to most effectively serve our target area.

Now we import our candidate list. This might be a list of potential towers you can use, or in my case, for something greenfield, I’m just importing a list of points on a map every X meters to find the best locations to place towers.

From the “Reconfiguration” tab, we’ll select “Setup” to add the sites we want to evaluate.

Atoll has “Automatic Candidate Positioning” which allows it to generate pins on the map, but I’ve not had any luck with it, instead I’m importing a list of candidates I’ve generated via a little Python script, so I’ll select “Import from File”.

Pick my file and set the parameters for importing the data like so.

Now we’ve got candidates for cell sites defined, we set the station template to populate and then we’re good to go.

Running ACP

Once you’ve tweaked all your ACP values as required, we can run the ACP job.

As ACP runs you’ll see a graph showing the objectives and the levels it needs to reach to satisfy them, this step can take a super dooper long time – Especially if your computation zone is large or your number of candidates is large.

Eventually (we’ll be a lot older and wearier) ACP will have completed, and we can check out the Optimization it’s created.

In my case the objectives failed to be met, but that’s OK for me.

Once it’s completed, the Changes tab outlines the recommended changes, and the Objectives tab outlines how this has performed against the criteria we set at the start. If we’re happy with the result, we can Commit the changes from the Commit tab to put them on the map.

With that done I weed out the sites in impractical locations, like the ones in the sea…

Now we’ve got the sites plugged in, the next thing we’ll start doing is optimizing them.

When we’re dealing with greenfield builds like we are today, the “Move to highest location within X Meters” function is super useful. If there’s a high point on a property, that’s where we want to build our tower, so this moves the tower to the highest point within the given radius.

One thing to note is this just plans our grid. It won’t adjust azimuths, downtilts, etc, in one operation. We need to use another ACP operation to achieve that, and that’s the content of a different post!

Call forwarding in SS7/ISUP

Had an interesting fault come across my desk the other day; calls were failing when the called party (an SSP we talk to via SS7/ISUP) had an exchange-based call forward in place.

In SIP, we can do call forwarding one of two ways: we can send a 302 Redirect, or we can generate a new SIP INVITE.

But in ISUP how is it done?

We’re a SIP based network, but we do talk some SS7/ISUP on the edges, and it was important that we handled this correctly.

I could see in the Address Complete Message (ACM) sent back to our network that there was redirection information here:

We would see the B party SSP release the call as soon as it sent this.

This made me wonder if we, as the originating network, were supposed to redirect to the new B party and send a new Initial Address Message?

After a lot of digging in the ITU Q.7xx docs (I’m nowhere near as fast at finding information in specs written prior to my birth as I am with the 3GPP specs) I found my answer – these headers are informational only; the B party SSP is meant to re-target the call, and send us an Alerting or Answer message when it’s done so.

Setting up TR-069 to manage Calix Endpoints

Recently one of our customers who’s got a large number of Calix E7 ONTs needed some help to automate some of the network management tasks to do with the CPEs.

We’d set up a TR-069 Auto Configuration Server (ACS) for the Calix RGs (the modems) so that we could manage the config parameters on the devices.

Setup was surprisingly easy: after installing some god-awful 90’s Java stuff to access Calix’s “CMS”, we pointed everything at our ACS (per the screenshot below) and presto, a few thousand CPEs were there ready to be centrally managed.

Basic CAMEL Charging Flow

CAMEL handles charging in 2G and 3G networks, much like Diameter handles charging in LTE.

CAMEL runs on top of SS7, specifically it sits on top of TCAP, which sits on top of SCCP, which can ride on M3UA or MTP3 (so it sits at the same layer as MAP).

CAMEL is primarily focused on charging for Voice & SMS services, as data generally uses Diameter, so it’s voice and SMS we’ll focus on.

CAMEL is spoken between the MSC (gsmSSF) and the OCS (gsmSCF).

Basic Call State Model

CAMEL is closely related to the Intelligent Network (IN) stuff of the 1980s, and steals a lot of its ideas from there. Unfortunately, reading the CAMEL standard rather assumes you were involved in IN stuff and had been born at that point – alas, I was neither.

So the key to understanding CAMEL is the Basic Call State Model (BCSM) which is a model of all the different states a call can be in, such as ringing, answered, abandoned, call failed, etc, etc.

Over CAMEL, our OCS can be told by the MSC when a certain event happens; the MSC can tell the OCS that the call has changed state. For example, a BCSM event might indicate the call has hung up, is ringing, was cancelled, etc.

Below is the list of all the valid BCSM states:

List of BCSM states for events

Basic MO Call with CAMEL

Our subscriber makes an outbound call.

Based on the data the MSC has in it from the HLR, it knows that we should use CAMEL for this call, and it has the SCCP Address of the OCS (gsmSCF) it needs to send the CAMEL messages to.

So the MSC sends an InitialDP message to the OCS (via its Global Title Address) to authorize the call that the user is trying to make.

This is like any other authorization step for an OCS: it allows the OCS to authorize the call by checking that the subscriber is valid, that they’re allowed to call that destination, that they’ve got the balance to do so, etc.

initialDP message from an MSC to an OCS

The initialDP (Initial Detection Point) is telling our OCS all about the call event that’s being requested, who’s calling, what number they’ve dialed, where they are in the network (of note especially if they’re roaming), etc, etc.

The OCS runs through its own checks to see if it wants to allow the call to proceed – checking if the subscriber has got enough balance, making unit reservations, etc – and if it does, the OCS sends back a Continue message to the MSC to allow the call to continue.

Generally the OCS also uses this message as a chance to subscribe to BCSM Events using RequestReportBCSMEventArg so the OCS will get notified by the MSC when the state of the call changes. This means the MSC will tell us when the state of the call changes; events like the call getting answered, disconnected, etc. This is critical so we know when the call gets answered and hung-up, so we can charge correctly.

In the below example, as well as sending the Continue and RequestReportBCSMEventArg, the OCS is also setting the ChargingArgs for this call, so the MSC knows who to charge (the caller, set via sendingSide) and that the MSC must send Apply Charging Report (ACR) messages every 300 units (1 unit = 100 ms, so a value of 300 = 300 x 100 milliseconds = 30 seconds) so the OCS can keep track of what’s going on.

continue sent by the OCS to the MSC, also including reportBCSMEvent and applyCharging messages
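
A quick sketch of that units arithmetic, since it’s easy to trip over:

# maxCallPeriodDuration / ACR intervals are expressed in units of 100 ms
def camel_units_to_seconds(units):
    return units * 100 / 1000.0

print(camel_units_to_seconds(300))  # 30.0 - an ACR every 30 seconds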

At this point the call can start to proceed – in ISUP terms, the InitialDP is triggered by the incoming Initial Address Message, and the Address Complete Message is only sent on after the continue comes back.

Or, in a slightly less appropriate analogy that’s easier for SIP folks to understand: the InitialDP is sent for the INVITE, and the 180 RINGING is sent once the continue message is received.

Call is Answered

So at this stage our call can start to ring.

As we’ve subscribed to BCSM events in our last message, the MSC is going to tell us when the call gets answered or the call times out, is abandoned or the sun burns out.

The MSC provides this info in an eventReportBCSM, which is very simple and just tells us the event that’s been triggered; in the example below, the call was answered.

eventReportBCSM from MSC to OCS

These eventReportBCSM messages are informational, sent from the MSC to the OCS, so the OCS doesn’t need to send anything back, but the OCS does need to mark the call as answered so it can start timing the call.

At this stage, the call is connected and our two parties are talking, but our MSC has been told it needs to send us applyChargingReports every 30 seconds (due to the value of 300 in maxCallPeriodDuration), so the MSC sends the OCS its first applyChargingReport 30 seconds after the call was answered:

applyChargingReport sent by the MSC to the OCS every reporting period

We can calculate the duration of the call so far based on the time of the eventReportBCSM; the OCS must then decide whether it should allow the call to continue or not.

For simplicity’s sake, let’s imagine we’ve still got balance in the OCS and the OCS wants the call to continue: the OCS sends back an applyCharging message to the MSC in response, including the currently allowed maxCallPeriodDuration – keeping in mind the value is in units of 100 milliseconds (so 300 is 30 seconds).

applyCharging from the OCS back to the MSC

Perfect, our call is good to go for another 30 seconds, so in 30 seconds we’ll get another ACR message from the MSC to the OCS to keep it abreast of what’s going on.

Now one of two things is going to happen: either the subscriber is going to burn through all of their minutes and get their call cut off, or the call will end while they’ve still got balance. Let’s look at both scenarios.

Normal Hangup Scenario

When the call ends, we get an applyChargingReport from the MSC to the OCS.

As we’ve subscribed to reportBCSMEvent, we get both the applyChargingReport with legActive: False, so we know the call has hung up, and an event report to tell us more about the event – in this case, a hangup from the originating side.

reportBCSMEvent and applyChargingReport Sent by the MSC to the OCS to indicate the call has ended, note the legActive flag is now false

Lastly the OCS confirms by sending a releaseCall to the MSC, to indicate all legs should now terminate.

releaseCall Sent by OCS to MSC at the very end

So that’s it!

Obviously there are other flows, such as running out of balance mid-call, rejecting a call, SMS and PBX / VPN services that rely on CAMEL, but hopefully you now understand the basics of how CAMEL based charging looks and works.

If you’re looking for a CAMEL capable OCS or a CAMEL to Diameter or API gateway, get in touch!

CGrateS time Metas

There are so many ways you can format time for things like Expiry or ActionPlans in CGrateS; this is mostly just a quick reference for me:

  • *asap (Now)
  • *now
  • *every_minute
  • *hourly
  • *monthly
  • *monthly_estimated
  • *yearly
  • *daily
  • *weekly
  • mo+1h2m
  • *always (What?)
  • *month_end
  • *month_end+1h2m
  • +20s
  • 1375212790
  • +24h
  • 2016-09-14T19:37:43.665+0000
  • 20160419210007.037
  • 31/05/2015 14:46:00
  • 08.04.2014 22:14:29
  • 20131023215149
  • “2013-12-30 15:00:01 +0430 +0430”
  • “2013-12-30 15:00:01 +0000 UTC”
  • “2013-12-30 15:00:01”

Stolen from: https://github.com/cgrates/cgrates/blob/8fec8dbca1f28436f8658dbcb8be9d03ee7ab9ee/utils/coreutils_test.go#L242
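
For context, a couple of hypothetical fragments showing where these strings end up (field names as per the SetActionPlan / SetActions examples in the ActionTriggers post elsewhere in these notes; the values are arbitrary examples):

# Hypothetical examples of where these time strings get used:
action_plan_timing = {"ActionsId": "Action_Monthly_Charge", "Time": "*monthly"}
signup_balance = {"BalanceType": "*monetary", "ExpiryTime": "+24h"}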

Enabling logging on Cisco ITP Signaling Transfer Point

Mostly just for my own notes, but when debugging SCCP translation on a Cisco ITP STP, this is probably obvious for folks who are more Cisco focused:

Enabling debug:

debug cs7 m3ua packet
debug cs7 m3ua all
debug cs7 sccp event ALL
debug cs7 sccp gtt-accounting
terminal monitor

Disabling debug:

no debug cs7 m3ua packet
no debug cs7 m3ua all
no debug cs7 sccp event ALL
no debug cs7 sccp gtt-accounting

GTPv2 Source Ports

Ask anyone in the industry and they’ll tell you that GTPv2-C (aka GTP-C) uses port 2123, and they’re right, kinda.

Per TS 129.274 the Create Session Request should be sent to port 2123, but the source port can be any port:

The UDP Source Port for a GTPv2 Initial message is a locally allocated port number at the sending GTP entity.

So this means that while the Destination Port is 2123, the source port is not always 2123.

So what about a response to this? Our Create Session Response must go where?

Create Session request coming from 166.x.y.z from a random port 36225
Going to the PGW on 172.x.y.z port 2123

The response goes to the same port the request came on, so for the example above, as the source port was 36225, the Create Session Response must be sent to port 36225.

Because:

The UDP Destination Port value of a GTPv2 Triggered message and for a Triggered Reply message shall be the value of the UDP Source Port of the corresponding message to which this GTPv2 entity is replying, except in the case of the SGSN pool scenario.

But that’s where the association ends.

So if our PGW wants to send a Create Bearer Request to the SGW, that’s an initial message, so must go to port 2123, even if the Create Session Request came from a random different port.
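
To make the port rule concrete, here’s a minimal sketch in Python – not a real GTP stack, and handle_initial_message is a hypothetical stand-in:

import socket

# Sketch of the GTPv2-C port rule: initial messages are sent TO port 2123,
# and the triggered (response) message goes back to whatever source port
# the request came FROM.
GTPC_PORT = 2123

def handle_initial_message(data: bytes) -> bytes:
    # Hypothetical stand-in: a real implementation would parse the GTPv2
    # header and build a proper triggered message.
    return data

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", GTPC_PORT))

while True:
    data, (peer_ip, peer_port) = sock.recvfrom(4096)
    # e.g. a Create Session Request from 166.x.y.z, source port 36225
    response = handle_initial_message(data)
    # The Create Session Response goes back to port 36225, not 2123
    sock.sendto(response, (peer_ip, peer_port))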

Installing Calix CMS Java tool on Ubuntu in 2025

Ah, another post in my “how to make software work that was made with Java in the 1990s” series – except Calix last updated this software in 2022, make of that what you will…

This time it’s the Calix Management System (CMS), the Java app for managing equipment in exchanges / COs from Calix.

On Ubuntu 24.04 LTS it requires JRE version 8:

sudo apt install openjdk-8-jre
sudo apt install execstack

With that installed, I could install CMS:

./install.bin LAX_VM /usr/lib/jvm/java-8-openjdk-amd64/bin/java

Then it came time to run it; I chose to install in my home directory, in a folder named “Calix” (the default).

First you’ve got to make their startup script executable:

~/Calix$ chmod +x Start\ CMS

Then we need to point it at the OpenJDK Java 8 binary; the simplest way is to just add the LAX_VM argument on startup:

~/Calix$ ./Start\ CMS LAX_VM /usr/lib/jvm/java-8-openjdk-amd64/bin/java

And you’re in.

TFTs & Create Bearer Requests

What is included in the Charging Rule on Gx ultimately turns into a Create Bearer Request on GTPv2-C.

But the mapping isn’t always obvious; today I got stuck on the difference between a Single remote port type and a Single local port type, thinking that the Packet Filter Direction in the TFT controlled this – it doesn’t – it’s controlled by the order of your Traffic Flow Template rule.

Input TFT:

"permit out 17 from any 50000 to any"

Leads to Packet filter component type identifier: Single remote port type

Whereas a TFT of:

permit out 17 from any to any 50000

Leads to Packet filter component type identifier: Single local port type (64)

Holiday Reading list 2024

As summer reaches full swing in Australia and the level of effort I put into blog posts wanes, here’s a list of books I’ve read this year or am yet to read.

I can’t imagine a telecom book club being super popular, but if you’ve got any recommendations for good telecom related reads, I’d love to hear them!

The End of Telecoms History – William Webb (Read)

I read this one this year. Webb is one of those folks whose paycheck doesn’t come from shilling hardware, and he’s been pretty good at making accurate predictions and soothsaying, even when what he says upsets some.

The launch of 5G pretty much played out exactly how one of his other books (The 5G Myth) predicted, and the premise of The End of Telecoms History is this: if we look at the data, which suggests that bandwidth growth will not continue unabated forever, what does that mean?

I’ve a feeling there are telecom execs quietly reading this book (while making sure that no one sees them reading it) and planning for a potential future in a world with enough bandwidth to satisfy demand, and what that would mean for their bottom lines and overall business models, even if outwardly everyone still claims the growth will continue forever.

The Iron Wire: A novel of the Adelaide to Darwin telegraph line – Garry Kilworth (Read)

A fun imagined romp about adventures in the bush while connecting a nation in the 19th century. The story is inspired by real-world events but is fictional; it’s a fun way to explore the topic and add bushrangers into the mix.

Rogers v. Rogers: The Battle for Control of Canada’s Telecom Empire – Alexandra Posadzki (Read)

Just finished this; I’ve worked with a lot of operators in the past, both big and small (the best ones are small), and it’s fascinating to understand at a board level how things get done in telecom giants, even if the Rogers family aren’t the best example of how to do this…

Chip War: The Fight for the World’s Most Critical Technology – Chris Miller (Read)

Without integrated circuits the telecom industry is back to relays and electromechanical switching (not that I’m against this).

Miller’s book outlines how we got to our current situation, and how the products coming out of TSMC and SMIC will shape the future of tech at a fundamental level.

How the World Was One – Arthur C Clarke (To Read)

Famed science fiction writer Arthur C Clarke had a penchant for scuba diving and communications (can relate) hence his interest in submarine telephony.

I read “Voice Across the Sea” a few years ago (in an actual paper book no less!) but this one is freely available as an eBook and I’m looking forward to reading it.

Introducing Elixir – Simon St. Laurent & J. David Eisenberg (Reading)

The dev team at Omnitouch are all about Elixir, and being an old dinosaur I figured I should at least learn the basics!

I’m still working my way through the book, with a folder of examples typed out from it (I can’t learn through copy / paste!). I’m enjoying it so far, even if I’m slower than I’d like.

Adventures in Innovation: Inside the Rise and Fall of Nortel – John Tyson

My first job was with Nortel, so I’ve got a bit of a soft spot for the former Canadian telecom behemoth, and never felt I’d had a satisfactory explanation as to where it all went wrong. I got this book expecting a bit more insight into the fall part, but it gave an interesting account of the design of things I’d never put much thought into before.

The Real Internet Architecture: Past, Present, and Future Evolution – Pamela Zave & Jennifer Rexford (To Read)

This came from a recommendation on Twitter; I know almost nothing about it other than that, but I’m keen to dig into this.

Burn Book: A Tech Love Story – Kara Swisher (Read)

A fun insight into the life and times of big tech.

The 6G Manifesto – William Webb

There’s a Simpsons’ scene where Lisa is buying an Al Gore book named “Sane Planning, Sensible Tomorrow” and says “I hope it’s as exciting as his other book, ‘Rational Thinking, Reasonable Future'”.

I can’t help but feel Webb’s books are kinda like this (in a good way).

Realism is so important; staying grounded in reality is critical. Operators who go chasing fairy tales of driving higher ARPUs with wacky ideas that have no business case or demand from end customers (and are generally pushed by vendors, rather than operators) will struggle to remain viable if they pour all their cash into things that won’t see a return. So I’m looking forward to reading some sane ideas on how to approach the unnecessary Gs.

Flash SMS Messages

Stumbled across these the other day, while messing around with some values on our SMSc.

Setting the Data Coding Scheme to 16 with GSM7 encoding flags the SMS as “Flash message”, which means it pops up on the screen of the phone on top of whatever the user is doing.

While reading a quality telecom blog – bam! – there’s the flash SMS popping up over whatever I was reading.

Oddly, while there’s plenty of info online about Flash SMS, the term doesn’t appear in the 3GPP specifications for SMS.

Turns out they still work, move over RCS and A2P, it’s all about Flash messages!

There’s no real secret to this other than to set the Data Coding Scheme to 16, which is GSM7 with Flash class set. That’s it.
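
If you want to see how that 16 decomposes, here’s a sketch of the bit math per 3GPP TS 23.038 (just illustrating the byte, not any particular SMSc’s config):

# DCS bit layout (TS 23.038): bit 4 = "message class present",
# bits 3-2 = alphabet (00 = GSM7), bits 1-0 = class (00 = class 0, "flash")
MESSAGE_CLASS_PRESENT = 0b00010000
ALPHABET_GSM7 = 0b00 << 2
CLASS_0_FLASH = 0b00

dcs = MESSAGE_CLASS_PRESENT | ALPHABET_GSM7 | CLASS_0_FLASH
print(dcs, hex(dcs))  # 16 0x10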

If you’re interested in the internal machinations of how SMS works, I’ve got a few posts on the topic. You can find a list of them here.

Obviously, to take advantage of this you’d need to be a network operator, or have access to the network you wish to deliver to. Recently more A2P providers have been filtering non-vanilla SMS traffic to weed out stuff like SMS OTA messages or SIM-specific messages, so there’s a good chance this may not work through A2P providers.

Mobile Network Code – 2 or 3 Digits?

Every mobile network broadcasts a Public Land Mobile Network code – aka a PLMN. This 6 digit value is used to identify the network (although this gets murky with shared codes and private networks, and when OEMs make them for codes they don’t own).

It’s made up of a Mobile Country Code followed by a Mobile Network code.

One of the guys at work asked a seemingly simple question: is the PLMN with MCC 505 and MNC 57 the same as MCC 505 MNC 057? It’s only 6 digits after all.

So is Mobile Network Code 57 the same as Mobile Network Code 057 in the PLMN code?

The answer is no, and it’s a massive pain in the butt.

All countries use 3 digit Mobile Country Codes, so Australia is 505. That part is easy.

The tricky part is that some countries (Like Australia) use 2 digit Mobile Network Codes, while others (Like the US) use 3 digit mobile network codes.

This means our 6 digit PLMN has to get padded when encoding a 2 digit Mobile Network Code, which is a pain, but it also means that by looking at the IMSI alone, you don’t know if the MNC is 2 or 3 digits – IMSI 5055710000001 could be parsed as MCC 505, MNC 571 or MCC 505, MNC 57.
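
A toy parser makes the ambiguity obvious – note the MNC length has to come from outside knowledge (the SIM’s EF_AD, or numbering-plan data); it can’t be derived from the IMSI itself:

# One IMSI, two possible parsings depending on the externally-known MNC length
def parse_imsi(imsi, mnc_length):
    mcc = imsi[0:3]
    mnc = imsi[3:3 + mnc_length]
    msin = imsi[3 + mnc_length:]
    return mcc, mnc, msin

imsi = "5055710000001"
print(parse_imsi(imsi, 2))  # ('505', '57', '10000001')
print(parse_imsi(imsi, 3))  # ('505', '571', '0000001')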

Why would you do this? Why would a regulator opt for one tenth the addressable space of network codes? I don’t know, and I haven’t been able to find an answer – if you know, please drop a comment, I’d love to know.

So how do we handle this?

There are files in the SIM profile to indicate the length of the MNC: the Administrative Data EF (EF_AD) on the SIM allows us to indicate if the MNC is 2 or 3 digits, and HPLMNwAct and the other *PLMN* EFs encode the PLMN as 6 digits, with or without padding, to allow differentiation.

That’s all well and good from a SIM perspective, but less useful for scenarios where you might be the Visited PLMN for example, and only see the IMSI of a Subscriber.

We worked on a project in a country that mixed both 2 digit and 3 digit Mobile Network Codes, under the same Mobile Country Code. Certain Qualcomm phones would do very very strange things, and it took us a long time and a lot of SIM OTA to resolve the issue, but that’s a story for another day…

private_data_dir on Ansible Runners called from Python

We’ve got a web-based front end in our CRM which triggers Ansible playbooks for provisioning customer services. This works really well, except I’d been facing a totally baffling issue of late.

Ansible plays (provisioning jobs) would have variables set that they’d inherited from other Ansible plays – essentially, if I set a variable in Play X and then ran Play Y, the variable would still be set.

I thought this was an issue with our database caching the play data and showing the results from a previous play – that wasn’t the case.

Then I thought our API that triggers this might be passing in extra variables that it had cached – also not the case.

In the end I ran the Ansible function call in its simplest possible form, with no API, no database, nothing but plain vanilla Ansible called from Python:

    # Run the actual playbook
    r = ansible_runner.run(
        private_data_dir='/tmp/',
        playbook=playbook_path,
        extravars=extra_vars,
        inventory=inventory_path,
        event_handler=event_handler_obj.event_handler,
        quiet=False
    )

And still I had the same issue of variables being inherited.

So what was the issue? Well, the title gives it away: the private_data_dir parameter creates a folder in that directory, called env/extravars, in which lives a JSON file holding all the vars from all the provisioning runs.

Removing the parameter from the Python function call resolved my issue, but did not give me back the half a day I spent debugging this…
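
Dropping the parameter was my fix, but if you do want a private_data_dir, a sketch of a safer pattern is to give every run a throwaway one (the playbook / inventory / vars values below are hypothetical stand-ins for the ones in the snippet above):

import tempfile
import ansible_runner

# Hypothetical inputs, matching the shape of the snippet above
playbook_path = "/opt/playbooks/provision.yml"
extra_vars = {"customer_id": 1234}
inventory_path = "/opt/playbooks/inventory.yml"

# Each run gets its own private_data_dir, so the env/extravars
# folder can't leak variables between provisioning runs
run_dir = tempfile.mkdtemp(prefix="ansible-run-")
r = ansible_runner.run(
    private_data_dir=run_dir,
    playbook=playbook_path,
    extravars=extra_vars,
    inventory=inventory_path,
)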

Tales from the Trenches – SMS Data Coding Scheme 0

The Data Coding Scheme (DCS or TP-DCS) header in an SMS body indicates what encoding is used in that message.

It means that if we’re using UCS-2 (UTF-16) special characters like emojis etc in our message, the phone knows to decode the data in the message body using UTF-16, because the Data Coding Scheme (DCS) header indicates the contents are encoded that way.

Likewise, if we’re not using any fancy characters in our message and the message is encoded as plain old GSM7, we set the DCS to 0 to indicate this is using GSM7.

From my experience, I’d always assumed that DCS 0 (Default) == GSM7, but today I learned that’s not always the case: some SMSc entities treat DCS 0 as Latin.

Let me explain why this is stupid and why I wasted a lot of time on this.

We can indicate that a message is encoded as Latin by setting the DCS to 0x03:

We cannot indicate that the message is encoded as GSM7 through anything other than the default alphabet (DCS 0).

Latin has its own encoding flag – if I wanted the message treated as Latin, I’d indicate that the message encoding is Latin in the DCS field!

I spent a bunch of time trying to work out why a customer was having issues getting messages to subscribers on another operator, and it turned out the other operator treats messages we send to them over SMPP with DCS 0 as Latin encoded, and then cracks the sads when trying to deliver them.

The above diff shows the message we sent (right), and the message they try to deliver (left).

Well, lesson learned…

Kamailio – NDB_Redis – undefined symbol: redisvCommand

I ran into this issue the other day while compiling Kamailio from source:

Jul 03 23:49:35 kamailio[305199]: ERROR: <core> [core/sr_module.c:608]: ksr_load_module(): could not open module </usr/local/lib64/kamailio/modules/ndb_redis.so>: /usr/local/lib64/kamailio/modules/ndb_redis.so: undefined symbol: redisvCommand
Jul 03 23:49:35 kamailio[305199]: CRITICAL: <core> [core/cfg.y:4024]: yyerror_at(): parse error in config file /etc/kamailio/kamailio.cfg, line 487, column 12-22: failed to load module

So what was going on?

Kamailio’s NDB Redis module relies on libhiredis, which was installed, but the path it installs to wasn’t in the LD_LIBRARY_PATH environment variable.

So I added /usr/lib/x86_64-linux-gnu to LD_LIBRARY_PATH, recompiled and presto, the error is gone!

export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
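
If you hit something similar, standard ldd is a quick way to spot which shared libraries the module can’t resolve before recompiling:

ldd /usr/local/lib64/kamailio/modules/ndb_redis.so | grep "not found"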

CGrateS – SERVER_ERROR: unexpected end of JSON input

I started seeing this error the other day when running CDRsv1.GetCDRs on the CGrateS API:

SERVER_ERROR: unexpected end of JSON input

It seemed related to certain CDRs in the cdrs table of StoreDB.

After some digging, I found the stupid simple problem:

I’d written too much data to extra_fields, leading MySQL to cut off the data midway through, meaning it couldn’t be reconstructed as JSON by CGrateS again.

Like the rounding issue I had, this wasn’t an issue with CGrateS but with MySQL.

Quick fix:

sudo mysql cgrates -e "ALTER TABLE cdrs MODIFY extra_fields LONGTEXT;"

And new entries can exceed the old length without being cut off.
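
To confirm the change took effect (plain MySQL, nothing CGrateS-specific):

sudo mysql cgrates -e "SHOW COLUMNS FROM cdrs LIKE 'extra_fields';"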

Virtualizing VMware to run inside Proxmox

Like a lot of companies, we’re moving away from VMware, and in our case, shifting to Proxmox.

But that doesn’t mean we can get away from VMware entirely; it’s more that it’s not our hypervisor of choice anymore, and that means shifting our dev environments and lab off VMware to Proxmox first.

So today I sat down to try and shift everything to Proxmox, while keeping the VMware based VMs accessible until they can slowly die of bitrot.

A sane person would probably utilize Proxmox’s fancy new tool for migrating VMs from VMware to Proxmox, and it’s great, but in our case at least, it required logging into each VM and remapping NICs, etc, which is tricky on boxes I don’t have access to – Plus we need to keep some VMware capability for testing / labbing stuff up.

So I decided to install Proxmox onto the bare metal servers, and then create a virtual machine inside the Proxmox stack to host a VMware ESXi instance.

I started off inside VMware (Before installing any Proxmox) by moving all the VMs onto a single physical disk, which I then removed from the server, so as to not accidentally format the one disk I didn’t want to format.

Next I nuked the server and set up the new stack with Proxmox, which is a doddle, and not something I’ll cover.

Then I loaded a VMware ISO into Proxmox and started setting up the VM.

Now, nested virtualization is a real pain in the behind.

VMware doesn’t like not being run on bare metal, and it took me a good long amount of time to find the hardware config that I could setup in Proxmox that VMware would accept.

Create the VM in the Web UI; I found using a SATA drive worked while SCSI failed, so create a SATA based LVM image to use, and mount the datastore ISO.

Then edit /etc/pve/qemu-server/your_id.conf and replace the netX, args, boot and ostype to match the below:

args: -cpu host,+invtsc,-vmx
boot: order=ide2;net0
cores: 32
cpu: host
ide2: local:iso/VMware-VMvisor-Installer-8.0U2b-23305546.x86_64.iso,media=cdrom,size=620274K
memory: 128000
meta: creation-qemu=8.1.5,ctime=1717231453
name: esxi
net0: vmxnet3=BC:24:11:95:8A:50,bridge=vmbr0
numa: 0
ostype: l26
sata0: ssd-3600gb:vm-100-disk-1,size=32G
scsihw: pvscsi
smbios1: uuid=6066a9cd-6910-4902-abc7-dfb223042630
sockets: 1
vmgenid: 6fde194f-9932-463e-b38a-82d2e7e1f2dd

Now you can go and start the VM, but once you’ve got the VMware splash screen, you’ll need to press Shift + O to enter the boot options.

At the runweasel cdromBoot prompt, append allowLegacyCPU=true – this will allow ESXi to use our (virtual) CPU.

Next up you’ll install VMware ESXi just like you’ve probably done 100 times before (is this the last time?), and once it’s done installing, power off – we’ll have to make a few changes to the VM definition file.

Then after install we need to change the boot order, by updating:

boot: order=sata0

And unmount the ISO:

ide2: none,media=cdrom

Now remember how I’d pulled the hard disk containing all the VMware VMs out so I couldn’t break it? Well, don’t drop that, because now we’re going to map that physical drive into the VM for VMware, so I can boot all those VMs.

I plugged in the drive and I used this to find the drive I’d just inserted:

fdisk -l

Which showed the drive I’d just added last, with its VMware file system.

So next we need to map this through to the VM we just created inside Proxmox, so VMware inside Proxmox can access the VMware file system on the disk filled with all our old VMware VMs.

VM. VM. VM. The word has lost all meaning to me at this stage.

We can see the mount point of our physical disk; in our case is /dev/sdc so that’s what we’ll pass through to the VM.

Here’s my final .conf file for the Proxmox VM:

args: -cpu host
balloon: 1024
boot: order=sata0
cores: 32
ide2: none,media=cdrom
kvm: 1
memory: 128000
meta: creation-qemu=8.1.5,ctime=1717231453
name: esxi
net0: vmxnet3=BC:24:11:95:8A:50,bridge=vmbr0
numa: 0
ostype: l26
sata0: ssd-3600gb:vm-100-disk-1,size=32G
sata1: /dev/sda
scsihw: pvscsi
smbios1: uuid=6066a9cd-6910-4902-abc7-dfb223042630
sockets: 1
vmgenid: 6fde194f-9932-463e-b38a-82d2e7e1f2dd

Now I can boot up the VM, log into VMware and behold, our disk is visible:

Last thing we need to do is get it mounted as a VMware Datastore so we can boot all those VMs.

For that, we’ll enable SSH on the ESXi box and SSH into it.

We’ll use the esxcfg-volume -l command to get the UUID of our drive

[root@localhost:~] esxcfg-volume -l
Scanning for VMFS-6 host activity (4096 bytes/HB, 1024 HBs).
VMFS UUID/label: 6513c05a-43db8c20-db0f-c81f66ea471a/FatBoy
Can mount: Yes
Can resignature: Yes
Extent name: t10.ATA_____QEMU_HARDDISK___________________________QM00007_____________:1 range: 0 - 1716735 (MB)

In my case the drive is aptly named “FatBoy”, so we’ll grab the UUID up to the slash, and then use that as the Mount parameter:

[root@localhost:~] esxcfg-volume -M 6513c05a-43db8c20-db0f-c81f66ea471a
Persistently mounting volume 6513c05a-43db8c20-db0f-c81f66ea471a

And now, if everything has gone well, after logging into the Web UI, you’ll see this:

Then the last step is going to be re-registering all the VMs; you can do this by hand, by selecting each .vmx file and adding it.

Alternately, if you’re lazy like me, I wrote a little script to do the same thing:

[root@localhost:~] cat load_vms3.sh

#!/bin/bash

# Datastore name
DATASTORE="FatBoi/"

# Log file to store the output
LOG_FILE="/var/log/register_vms.log"

# Clear the log file
> $LOG_FILE

echo "Starting VM registration process on datastore: $DATASTORE" | tee -a $LOG_FILE

# Check if datastore directory exists
if [ ! -d "/vmfs/volumes/$DATASTORE" ]; then
    echo "Datastore $DATASTORE does not exist!" | tee -a $LOG_FILE
    exit 1
fi

# Find all .vmx files in the datastore and register them
find /vmfs/volumes/$DATASTORE -type f -name "*.vmx" | while read VMX_PATH; do
    echo "Registering VM: $VMX_PATH" | tee -a $LOG_FILE
    vim-cmd solo/registervm "$VMX_PATH" | tee -a $LOG_FILE
done

echo "VM registration process completed." | tee -a $LOG_FILE

[root@localhost:~] sh load_vms3.sh

Now with all your VMs loaded, you should almost be ready to roll and power them all back on.

And here’s where I ran into another issue:

Luckily the Proxmox wiki had the answer:

For an Intel CPU here’s what you run on the Proxmox hypervisor:

echo "options kvm-intel nested=Y" > /etc/modprobe.d/kvm-intel.conf

But before we reboot the Hypervisor (Proxmox) we’ll have to reboot the VMware hypervisor too, because here’s something else to make you punch the screen:

Luckily we can fix this one globally.

SSH into the VMware box, edit the /etc/vmware/config.xml file, and add:

vhv.enable = "FALSE"

Which will disable the performance counters.

Now power off the VMware VM and reboot the Proxmox hypervisor. When it powers on again, Proxmox will allow nested virtualization, and when you power the VMware VM back on, you’ll have performance counters disabled – and then, you will be done.

Yeah, not a great use of my Saturday, but here we are…

DNS’ role in S8-Home Routing Roaming

S8 Home Routing is a really simple concept: the traffic goes from the SGW in the visited PLMN to the PGW in the home PLMN, so the PCRF, OCS/OFCS, IMS, IP addresses, etc, are all in the home network, and this avoids huge amounts of complexity.

But in order for this to work, the visited network MME needs to find the PGW of the home network, and with over 700 roaming networks in commercial use, each one with potentially hundreds of unique APNs each routing to a different PGW, this is a tricky proposition.

If you’ve configured your PGW peers statically on your MME, that’s fine, but it doesn’t scale very well – and if you add an MVNO who wants their own PGW for serving their APN, well, you’ll be adding some complexity there too. So what to do?

Well, the answer is DNS.

By taking the APN to be served, the home PLMN and the interface type desired, with some funky DNS queries, our MME can determine which PGW should be selected for a request.

Let’s take a look, for a UE from MNC XXX MCC YYY roaming into our network, trying to access the “IMS” APN.

Our MME knows the network code of the roaming subscriber from the IMSI is MNC XXX, MCC YYY, and that the UE is requesting the IMS APN.

So our MME crafts a DNS request for the NAPTR query for ims.apn.epc.mncXXX.mccYYY.3gppnetwork.org:
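
You can reproduce the same query by hand with dig (substituting a real MNC/MCC for the placeholders):

dig -t NAPTR ims.apn.epc.mncXXX.mccYYY.3gppnetwork.org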

Because the domain is epc.mncXXX.mccYYY.3gppnetwork.org it’s routed to the authoritative DNS server in the home network, which sends back the response:

We’ve got a few peers to pick from, so we need to filter this list of Answers to only those that are relevant to us.

First we filter by the Service tag, which for each listed peer shows what services that peer supports.

But since we’re looking for S8, we need to find a peer whose “Service” tag string contains:

x-3gpp-pgw:x-s8-gtp

We’re looking for two bits of info here, the presence of x-3gpp-pgw in the Service to indicate that this peer is a PGW and x-s8-gtp to indicate that this peer supports the S8 interface.

A service string like this:

x-3gpp-pgw:x-s5-gtp

Would be excluded, as it only supports S5, not S8 (even though they are largely the same interface, S8 is used in roaming).

It’s also not uncommon to see both services indicated as supported, in which case that peer could be selected too:

x-3gpp-pgw:x-s5-gtp:x-s8-gtp

(The answers in the screenshot include :x-gp which means the PGWs advertised are also co-located with a GGSN)

So with our answers whittled down to only those that meet our needs, we next use the Order and the Preference to pick our best candidate; this is the same as regular DNS selection logic.

For our chosen candidate, we’ve also got the Regex Replacement, which allows our original DNS request to be re-written, letting us point at a single peer.

In our answer, we see the original request ims.apn.epc.mncXXX.mccYYY.3gppnetwork.org is to be re-written to topon.lb1.pgw01.epc.mncXXX.mccYYY.3gppnetwork.org.

This is the FQDN of the PGW we should use.

Now we know the FQDN we should use, we just do an A record lookup (or AAAA record lookup if it’s IPv6) for the peer we’re targeting, to turn that FQDN into an IP address we can use.
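
Again, easy to reproduce by hand with dig, using the FQDN from the rewrite above:

dig -t A topon.lb1.pgw01.epc.mncXXX.mccYYY.3gppnetwork.org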

And then in comes the response:

So now our MME knows the IP of the PGW, and it can craft a Create Session Request with the F-TEID for the S8 interface set to the PGW IP we selected.

For more info on this TS 129.303 (Domain Name System Procedures) is the definitive doc, but the GSMA’s IR.88 “LTE and EPC Roaming Guidelines” provides a handy reference.

vCenter – Partition Folder Isolation

While vCenter doesn’t really do contexts / multi-tenants / VPCs like the hyperscalers, there are simple(ish) ways to do context separation inside VMware vCenter.

This means you can have a user who only has access to, say, a folder of VMs, but isn’t able to see VMs outside of that folder.

Create a new Role inside vCenter from Administration -> Roles -> Add

Give the role all Virtual Machine privileges:

Create a new account for the user (it can be in AD if you’re not using local accounts; this is just our lab, so I’ve created it as a local account). We’re using this account for Ansible, so we’ve used “Demo-Ansible-User” as the username.

Now create a folder for the group of VMs, or pick an existing folder you want to give access to.

Right click on the folder and select “Add Permission”.

We give permission to that user, with that role, on the folder, making sure to tick “Propagate to children” (I missed this step before and had to repeat it):

If you are using templates, make sure the template is either in the folder, or apply the same permission to the template by right clicking on it and selecting Add Permission, same as above.

Finally you should be able to log in as that user and see the template, and clone it from the web UI, or create VMs but only within that folder.

CGrateS – ActionTriggers

In our last post we looked at Actions and ActionPlans, and one of the really funky things we can do is set ActionPlans to trigger on a time schedule, or set ActionTriggers to fire on an event.

We’re going to build on the examples we had on the last post, so we’ll assume your code is up to the point where we’ve added a Signup Bonus to an account, using an ActionPlan we assigned when creating the account.

In this post, we’re going to create an action that charges $6, called “Action_Monthly_Charge“, and tie it to an ActionPlan called “ActionPlan_Monthly_Charge“, but to demo how this works rather than charging this Monthly, we’re going to charge it every minute.

Then with our balances ticking down, we’ll set up an ActionTrigger to trigger when the balance drops below $95, and alert us.

Defining the Monthly Charge Action

The Action for the Monthly charge will look much like the other actions we’ve defined, except the Identifier is *debit so we know we’re deducting from the balance, and we’ll log to the CDRs table too:

# Action to add a Monthly charge of $6
Action_Monthly_Charge = {
    "id": "0",
    "method": "ApierV1.SetActions",
    "params": [
        {
          "ActionsId": "Action_Monthly_Charge",
          "Actions": [
              {
                'Identifier': '*debit',
                'BalanceType': '*monetary',
               'Units': 6,
               'Id': 'Action_Monthly_Charge_Debit',
               'Weight': 70},
              {
                  "Identifier": "*log",
                  "Weight": 60,
                  'Id' : "Action_Monthly_Charge_Log"
              },
              {
                  "Identifier": "*cdrlog",
                  "BalanceId": "",
                  "BalanceUuid": "",
                  "BalanceType": "*monetary",
                  "Directions": "*out",
                  "Units": 0,
                  "ExpiryTime": "",
                  "Filter": "",
                  "TimingTags": "",
                  "DestinationIds": "",
                  "RatingSubject": "",
                  "Categories": "",
                  "SharedGroups": "",
                  "BalanceWeight": 0,
                  "ExtraParameters": "{\"Category\":\"^activation\",\"Destination\":\"Recurring Charge\"}",
                  "BalanceBlocker": "false",
                  "BalanceDisabled": "false",
                  "Weight": 80
              },
          ]}]}
pprint.pprint(CGRateS_Obj.SendData(Action_Monthly_Charge))

Next we’ll need to wrap this up into an ActionPlan – this is where some of the magic happens. Inside the ActionPlan we can set a once-off time, or a recurring time, kinda like Cron.

We’re setting the time to *every_minute so things will happen quickly while we watch – this action will get triggered every 60 seconds. In real life of course, for a monthly charge, we’d want to trigger this Action monthly, so we’d set this value to *monthly. If we wanted this to charge on the 2nd of the month we’d set the MonthDays to “2”, etc, etc.

# # Create ActionPlan using SetActionPlan to trigger the Action_Monthly_Charge
SetActionPlan_Daily_Action_Monthly_Charge_JSON = {
    "method": "ApierV1.SetActionPlan",
    "params": [{
        "Id": "ActionPlan_Monthly_Charge",
        "ActionPlan": [{
            "ActionsId": "Action_Monthly_Charge",
            "Years": "*any",
            "Months": "*any",
            "MonthDays": "*any",
            "WeekDays": "*any",
            "Time": "*every_minute",
            "Weight": 10
        }],
        "Overwrite": True,
        "ReloadScheduler": True
    }]
}
pprint.pprint(CGRateS_Obj.SendData(
    SetActionPlan_Daily_Action_Monthly_Charge_JSON))

Alright, but now what’s going to happen?

If you think the accounts will start getting debited every 60 seconds after applying this, you’d be wrong. We need to associate this ActionPlan with an Account first; this is how we control which accounts get which ActionPlans tied to them. To do this we’ll use the SetAccount API again, which we’ve been using to create accounts:

# Create the Account object inside CGrateS & assign ActionPlan_Signup_Bonus and ActionPlan_Monthly_Charge
Create_Account_JSON = {
    "method": "ApierV2.SetAccount",
    "params": [
        {
            "Tenant": "cgrates.org",
            "Account": str(Account),
            "ActionPlanIds": ["ActionPlan_Signup_Bonus", "ActionPlan_Monthly_Charge"],
            "ActionPlansOverwrite": True,
            "ReloadScheduler":True
        }
    ]
}
print(CGRateS_Obj.SendData(Create_Account_JSON))

So what’s going to happen if we run this?

Well, for starters the ActionPlan named “ActionPlan_Signup_Bonus” is going to be triggered; as its Timing is set to *asap, CGrateS will apply the corresponding Action (“Action_Add_Signup_Bonus“) right away, which will credit the account $99.

But a minute after that, we’ll trigger the ActionPlan named “ActionPlan_Monthly_Charge”, as its timing is set to *every_minute; when the Action “Action_Monthly_Charge” is triggered, it’s going to deduct $6 from the balance.

We can check this by using the GetAccount API:

# Get Account Info
pprint.pprint(CGRateS_Obj.SendData({'method': 'ApierV2.GetAccount', 'params': [
              {"Tenant": "cgrates.org", "Account": str(Account)}]}))

You should see a balance of $99 to start with, and then after 60 seconds, it should be down to $93, and so on.

{'error': None,
 'id': None,
 'result': {'ActionTriggers': None,
            'AllowNegative': False,
            'BalanceMap': {'*monetary': [{'Blocker': False,
                                          'Categories': {},
                                          'DestinationIDs': {},
                                          'Disabled': False,
                                          'ExpirationDate': '2023-11-17T14:57:20.71493633+11:00',
                                          'Factor': None,
                                          'ID': 'Balance_Signup_Bonus',
                                          'RatingSubject': '',
                                          'SharedGroups': {},
                                          'TimingIDs': {},
                                          'Timings': None,
                                          'Uuid': '3a896369-8107-4e32-bcef-2d078c981b8a',
                                          'Value': 99,
                                          'Weight': 1200}]},
            'Disabled': False,
            'ID': 'cgrates.org:Nick_Test_123',
            'UnitCounters': None,
            'UpdateTime': '2023-10-17T14:57:21.802521707+11:00'}}

Triggering Actions based on Balances with ActionTriggers

Okay, so we’ve set up recurring charges; now let’s get notified if the balance drops below $95. We’ll start, like we have before, by defining an Action – this one will log to the CDRs table, send an HTTP POST, and write to syslog:


#Define a new Action to send an HTTP POST
Action_HTTP_Notify_95 = {
    "id": "0",
    "method": "ApierV1.SetActions",
    "params": [
        {
          "ActionsId": "Action_HTTP_Notify_95",
          "Actions": [
              {
                  "Identifier": "*cdrlog",
                  "BalanceId": "",
                  "BalanceUuid": "",
                  "BalanceType": "*monetary",
                  "Directions": "*out",
                  "Units": 0,
                  "ExpiryTime": "",
                  "Filter": "",
                  "TimingTags": "",
                  "DestinationIds": "",
                  "RatingSubject": "",
                  "Categories": "",
                  "SharedGroups": "",
                  "BalanceWeight": 0,
                  "ExtraParameters": "{\"Category\":\"^activation\",\"Destination\":\"Balance dipped below $95\"}",
                  "BalanceBlocker": "false",
                  "BalanceDisabled": "false",
                  "Weight": 80
              },
              {
                  "Identifier": "*http_post_async",
                  "ExtraParameters": "http://10.177.2.135/95_remaining",
                  "ExpiryTime": "*unlimited",
                  "Weight": 700
              },
              {
                  "Identifier": "*log",
                  "Weight": 1200
              }
          ]}]}
pprint.pprint(CGRateS_Obj.SendData(Action_HTTP_Notify_95))

Now we’ll define an ActionTrigger to check if the balance is below $95 and trigger our newly created Action (“Action_HTTP_Notify_95“) when that condition is met:


#Define ActionTrigger
ActionTrigger_95_Remaining_JSON = {
    "method": "APIerSv1.SetActionTrigger",
    "params": [
        {
            "GroupID" : "ActionTrigger_95_Remaining",
            "ActionTrigger": 
                {
                    "BalanceType": "*monetary",
                    "Balance" : {
                        'BalanceType': '*monetary',
                        'ID' : "*default",
                        'BalanceID' : "*default",
                        'Value' : 95,
                        },
                    "ThresholdType": "*min_balance",
                    "ThresholdValue": 95,
                    "Weight": 10,
                    "ActionsID" : "Action_HTTP_Notify_95",
                },
            "Overwrite": True
        }
    ]
}
pprint.pprint(CGRateS_Obj.SendData(ActionTrigger_95_Remaining_JSON))

We’ve defined a ThresholdType of *min_balance, but we could equally set the ThresholdType to *max_balance, *balance_expired, or trigger when a certain Counter has been triggered enough times.

Adding an ActionTrigger to an Account

Again, like the ActionPlan we created before, the ActionTrigger we just created won’t be used until we associate it with an Account. For this we’ll use the AddAccountActionTriggers API, specifying the Account and the ActionTriggerID of the ActionTrigger we just created.


#Add ActionTrigger to Account 
Add_ActionTrigger_to_Account_JSON = {
    "method": "APIerSv1.AddAccountActionTriggers",
    "params": [
        {
            "Tenant": "cgrates.org",
            "Account": str(Account),
            "ActionTriggerIDs": ["ActionTrigger_95_Remaining"],
            "ActionTriggersOverwrite": True
        }
    ]
}
pprint.pprint(CGRateS_Obj.SendData(Add_ActionTrigger_to_Account_JSON))

If we run this all together, creating the account with “ActionPlan_Signup_Bonus” will give the account a $99 balance. But after 60 seconds, “ActionPlan_Monthly_Charge” will kick in, and every 60 seconds after that, until the balance drops below $95, at which point CGrateS will fire the ActionTrigger “ActionTrigger_95_Remaining”, sending the HTTP POST to the HTTP endpoint and writing the log entry:

We can check on this using the ApierV2.GetAccount method, where we’ll see the ActionTrigger we just defined.

Checking out the LastExecutionTime, we can see whether the ActionTrigger has been triggered or not.

So using this technique, we can notify a customer when they’ve used a certain amount of their balance, and we can lock out Accounts that have spent more than their allocated spend limit, by setting an Action that suspends the Account once it reaches a certain level. We can notify customers when balance expires, or when a certain number of counters has been triggered.

As always I’ve put all the code used in this example, from start to finish, up on GitHub.