IBM Message Gateway monitoring with Instana

Tony Hickman
6 min read · Jan 31, 2022

One of the projects I have worked on implemented IBM Message Gateway running within an OpenShift cluster. IBM Message Gateway comes with its own “WebUI” for management and monitoring, but I was asked whether it would be possible to monitor IBM Message Gateway using Instana. Always up for a challenge, I enlisted the skills of one of my colleagues in IBM Client Engineering, Alex Seymour, and together we set about exploring the “art of the possible”.

In the traditional approach of “divide and conquer”, Alex took on the challenge of how to present the monitoring information to Instana and I looked into how to get monitoring information from IBM Message Gateway.

First, let's look at the IBM Message Gateway question. After reading the “Knowledge Centre” documentation it was clear that there were a couple of options:

  1. Use the REST interface — This would give the richest access to the information being captured but would require polling of the REST API
  2. Consume the metrics published by the server by subscribing to the $SYS topic — This would use the pub/sub model which aligns with the purpose of MQTT and the IBM Message Gateway

As you can see, option 2 was the natural choice, and details of this “feed” can be found in the documentation here. Although simple, there are a couple of aspects of this approach to consider:

  1. Data is published every 2 seconds, which may be a bit too frequent, so some form of rate limiting when pushing to Instana could be needed
  2. A lot of the metrics are cumulative since the last restart, so reporting delta changes would require additional programming.
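
To give a feel for the “additional programming” in the second point, here is a minimal Python sketch of turning cumulative values into deltas. It is not part of what we built; the DeltaTracker class and its reset handling are assumptions made purely for illustration.

```python
from typing import Dict, Optional


class DeltaTracker:
    """Hypothetical helper: turns cumulative counters into per-sample deltas."""

    def __init__(self) -> None:
        # Last cumulative value seen for each metric key.
        self._last: Dict[str, float] = {}

    def delta(self, key: str, value: float) -> Optional[float]:
        """Return the change since the previous sample, or None for the first sample."""
        previous = self._last.get(key)
        self._last[key] = value
        if previous is None:
            return None
        if value < previous:
            # Assumption: a drop means the counter was reset (for example by a
            # server restart), so the new value is itself the change since then.
            return value
        return value - previous


tracker = DeltaTracker()
key = "messaging-server-0.Endpoint.DemoExt.MsgWrite"
print(tracker.delta(key, 4787586))  # first sample, nothing to compare against
print(tracker.delta(key, 4787600))  # change since the previous sample
```

The awkward cases are the first sample and a counter reset, which is why both are handled explicitly here.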

So, with the IBM Message Gateway requirement solved, let's look at what we found for the Instana end.

Firstly, out of the box there is no Instana sensor for IBM Message Gateway, so another approach needed to be used. After a read through of the Instana documentation and some guidance from our colleagues in Instana, we found that the simplest approach would be to use the generic “statsd” sensor, which is a base part of Instana. To use this sensor the Instana agent configuration needs to be updated. As we are running within an OpenShift cluster, this meant updating the instana-agent configmap in the instana-agent namespace to add the following:

com.instana.plugin.statsd:
  enabled: true
  ports:
    udp: 8125
    mgmt: 8126
  bind-ip: "0.0.0.0"
  flush-interval: 10

This enables the statsd sensor to run on every worker node, where it is accessible via that worker node's IP address. With this enabled we were able to send statsd events into Instana and see them tracked against the worker node which we targeted.
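
As an illustration of what “sending a statsd event” means on the wire, the following Python sketch sends a single gauge over UDP using the standard statsd line format `<name>:<value>|g`. The function names, the metric name and the 127.0.0.1 target are placeholders and not something from our setup; in practice the host would be a worker node IP and the port the udp port from the configuration above.

```python
import socket


def format_gauge(name: str, value: float) -> bytes:
    """Build a statsd gauge line of the form <name>:<value>|g."""
    return f"{name}:{value}|g".encode("ascii")


def send_gauge(name: str, value: float, host: str, port: int = 8125) -> None:
    """Send one gauge to a statsd listener over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_gauge(name, value), (host, port))


if __name__ == "__main__":
    # Placeholder target: replace with the IP of the worker node being targeted.
    send_gauge("demo.test.gauge", 42, "127.0.0.1")
```

Because UDP is connectionless, the sender gets no confirmation; the only way to check delivery is to look for the metric on the worker node in Instana.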

So we now had a way to get metrics from IBM Message Gateway and a way to push them into Instana, but we still needed to do more work. Firstly we needed to understand statsd in a bit more detail, and secondly we needed to figure out how to link the two environments.

In the case of statsd, things turned out to be really simple. The statsd protocol consists of a message which contains a label for the data, the data itself and an indicator of the nature of the data (counter, gauge etc.). For details see the statsd GitHub documentation. The labels don't have any hierarchy, so we needed to make sure that each label carries the right level of detail about what the metric represents. Fortunately the IBM Message Gateway metrics provide more of a structure. To start with, there is a topic-level split based on the metric's ObjectType, i.e. storage, memory, endpoint and topic. Next, the published payload provides details on the server node name and the name of the object emitting the metrics. For example:

{
"Version":"5.0.0.2",
"NodeName":"messaging-server-0",
"TimeStamp":"2022-01-31T13:59:16.201Z",
"ObjectType":"Endpoint",
"Name":"DemoExt",
"Interface":"All",
"Enabled":true,
"TotalConnections":8221,
"ActiveConnections":6,
"BadConnections":8199,
"MsgRead":70,
"MsgWrite":4787586,
"BytesRead":495619,
"BytesWrite":2031134005,
"LostMessageCount":0,
"WarnMessageCount":0,
"ResetTime":"2022-01-24T11:26:46.533Z"
}

Here you can see that the ObjectType is Endpoint and the NodeName is messaging-server-0; at a higher level, the topic name is $SYS/ResourceStatistics/Endpoint. So if the statsd label is defined as <NodeName>.<ObjectType>.<Name>.<Metric> we are in pretty good shape. As well as the label, the metric can be sent as a counter or a gauge: a counter is a value that is incremented or decremented by the metric, whereas a gauge is an absolute value that is replaced by the new metric value. For the IBM Message Gateway metrics Alex and I decided to stick with gauges for all of them.
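
To make the label format and the counter versus gauge distinction concrete, here is a small Python sketch. The apply_line function is a deliberately simplified toy model of a statsd receiver (a real one also flushes and resets counters on an interval), and the second reading of 8225 is made up for illustration. It suggests why gauges are a comfortable fit for values that are already cumulative totals.

```python
from typing import Dict


def statsd_label(node_name: str, object_type: str, name: str, metric: str) -> str:
    """Join the payload fields into <NodeName>.<ObjectType>.<Name>.<Metric>."""
    return ".".join([node_name, object_type, name, metric])


def apply_line(store: Dict[str, float], line: str) -> None:
    """Toy model of a receiver handling one '<label>:<value>|<type>' line."""
    label, rest = line.split(":", 1)
    value, kind = rest.split("|", 1)
    if kind == "c":
        # Counter: the value is added to what is already there.
        store[label] = store.get(label, 0.0) + float(value)
    elif kind == "g":
        # Gauge: the value replaces what is already there.
        store[label] = float(value)
    else:
        raise ValueError(f"unsupported metric type: {kind}")


label = statsd_label("messaging-server-0", "Endpoint", "DemoExt", "TotalConnections")

as_gauge: Dict[str, float] = {}
as_counter: Dict[str, float] = {}
for total in (8221, 8225):  # two successive cumulative readings (second is illustrative)
    apply_line(as_gauge, f"{label}:{total}|g")
    apply_line(as_counter, f"{label}:{total}|c")

print(as_gauge[label])    # 8225.0, the latest total
print(as_counter[label])  # 16446.0, the totals added together
```

Sent as a gauge the value stays equal to the latest total, whereas sent as a counter the already cumulative totals get summed again.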

After that, the only thing left was to build something to move the metrics from IBM Message Gateway into Instana. As is often the case, I once again turned to my old friend NodeRED. NodeRED has out-of-the-box support for MQTT, so subscribing to the $SYS topic would be easy, and after a scan of available nodes Alex found a very useful “statsd” node, node-red-contrib-statsd. With all the parts in place I created the following flow:

[Figure: IBM Message Gateway to Instana Flow]

In this flow, Initialise sets counters for the metric groups so that the flow only sends after “X” instances of a metric group have been received from IBM Message Gateway. For now the value is hard coded to 10; we plan to move that to an environment variable.
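
The counting idea can be sketched in a few lines of Python. This is not the code from the flow (which lives in NodeRED function nodes); the EveryNth class and its method name are hypothetical, and the choice to let the Nth message through and then restart the count is an assumption.

```python
from collections import defaultdict
from typing import DefaultDict


class EveryNth:
    """Hypothetical throttle: lets through one message in every n per metric group."""

    def __init__(self, n: int = 10) -> None:
        self.n = n
        # Number of messages seen per group since the last one that was sent.
        self._seen: DefaultDict[str, int] = defaultdict(int)

    def should_send(self, group: str) -> bool:
        self._seen[group] += 1
        if self._seen[group] >= self.n:
            self._seen[group] = 0  # start counting again for this group
            return True
        return False


throttle = EveryNth(n=3)
print([throttle.should_send("Endpoint") for _ in range(6)])
```

Each group (for example each ObjectType) keeps its own count, so a busy group does not starve a quiet one.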

The main body of the flow is pretty simple. For each metrics message received we…

  1. Parse the JSON message payload into an object
  2. Copy the NodeName and ObjectType from msg.payload into the root of the msg object
  3. Route the msg based on the ObjectType
  4. For each ObjectType, check whether this instance of the metrics should be processed; if not, null is returned and the flow ends, otherwise the msg object is returned and the flow continues
  5. Split the metric object in msg.payload into its elements
  6. For each element, check whether it is one of “Version”, “NodeName” or “TimeStamp”; if it is, return null to stop the flow, as we are not interested in these. For all others, set msg.topic to the name of the statsd metric and return msg to keep the flow processing
  7. The statsd node sends the metric to the configured statsd server, which in our case is a worker node IP address.
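
For readers who don't use NodeRED, steps 1, 2, 5 and 6 can be approximated in Python as below. This is a sketch rather than a port of the flow: the function name is hypothetical, routing and throttling (steps 3 and 4) are left out, and skipping non-numeric fields is an extra assumption made here so that only values usable as gauges are produced.

```python
import json
from typing import Iterator, Tuple, Union

Number = Union[int, float]

# Fields the flow drops because they are not metrics (step 6).
SKIP = {"Version", "NodeName", "TimeStamp"}


def to_statsd_messages(raw: str) -> Iterator[Tuple[str, Number]]:
    """Yield (label, value) pairs for one published metrics message."""
    payload = json.loads(raw)                  # step 1: parse the JSON
    node = payload["NodeName"]                 # step 2: pull out the naming fields
    object_type = payload["ObjectType"]
    # Assumption: some object types may not carry a Name, so it is optional here.
    parts = [node, object_type] + ([payload["Name"]] if "Name" in payload else [])
    prefix = ".".join(parts)
    for key, value in payload.items():         # step 5: split into elements
        if key in SKIP:
            continue
        # Assumption: keep numbers only (bool is excluded explicitly).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        yield f"{prefix}.{key}", value         # step 6: label goes into the topic


sample = json.dumps({
    "Version": "5.0.0.2",
    "NodeName": "messaging-server-0",
    "TimeStamp": "2022-01-31T13:59:16.201Z",
    "ObjectType": "Endpoint",
    "Name": "DemoExt",
    "Enabled": True,
    "ActiveConnections": 6,
    "MsgRead": 70,
})
for label, value in to_statsd_messages(sample):
    print(label, value)
```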

We tested the flow in a simple NodeRED environment that we installed in the namespace where IBM Message Gateway runs. Once we had proven it was working, Alex built an image containing the flow and connection details.

With everything in place and the metrics being routed by the NodeRED flow, we could look at the Instana environment. Looking at the worker node whose IP was configured in the statsd node in our flow, we could see the statsd metrics.

[Figure: statsd metrics in worker node]

Now that we had the metrics in Instana, it was a simple job to create a custom dashboard to show them.

[Figure: Basic Dashboard]

So that's it: we have the basics in place to allow IBM Message Gateway metrics to be monitored and presented via Instana. There are some more bits we need to look at:

  1. Passing the statsd target IP address in: we have it being passed as an environment variable, but the statsd node code needs to be updated to allow it to be passed in dynamically
  2. Implementing some level of past-state memory so we can present “delta” changes rather than the new cumulative value. This may be tricky in some cases
  3. Passing the “throttling” value in via an environment variable rather than hard coding it
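
As a rough idea of where the first and third points are heading, here is a Python sketch of reading those settings from the environment. The variable names STATSD_HOST, STATSD_PORT and SEND_EVERY are invented for this example and are not the names used in our image; the defaults simply mirror the values mentioned above.

```python
import os
from typing import Mapping, NamedTuple


class Settings(NamedTuple):
    statsd_host: str
    statsd_port: int
    send_every: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from environment variables (hypothetical names), with defaults."""
    return Settings(
        statsd_host=env.get("STATSD_HOST", "127.0.0.1"),  # worker node IP in practice
        statsd_port=int(env.get("STATSD_PORT", "8125")),  # udp port of the statsd sensor
        send_every=int(env.get("SEND_EVERY", "10")),      # the "throttling" value
    )


print(load_settings({"STATSD_HOST": "10.0.0.12", "SEND_EVERY": "5"}))
```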

Tony Hickman

I've worked for IBM all of my career and am an avid technologist who is keen to get his hands dirty. My role affords me this opportunity and I share what I can.