Monitoring DFW Heap Usage


When operating the NSX-v DFW at scale, VMware recommends keeping memory/heap usage on each host at or below roughly 80%. This often raises the question of how to monitor the memory/heap that the DFW is actually consuming per host.

As the NSX-v Distributed Firewall runs in the kernel of the vSphere hypervisor, it leverages vSphere memory heaps. To monitor memory usage per host, it is these vSphere heaps that need to be monitored.

To check heap usage on a vSphere host, you must use the undocumented vsish command.

You can read more about the vsish command on William’s blog here.

There are a number of different heaps used by the NSX-v Distributed Firewall, and for this blog post we will focus on NSX-v 6.2.3 and above. The following one-liner will display some of the relevant pieces of information.
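As a rough sketch of what such a one-liner could look like, here it is wrapped in a small shell function. It assumes the DFW heaps appear under the vsish node /system/heaps with "vsip" in their name, and that each heap exposes a stats node containing the fields named in the grep pattern. The function name dfw_heap_stats is just an illustrative choice, and the node path and field names should be checked against your own hosts.

```shell
# Sketch only: print selected stats for every heap whose name contains "vsip".
# Assumes the heaps live under /system/heaps and each has a "stats" node.
dfw_heap_stats() {
    vsish -e ls /system/heaps | grep vsip | while read -r heap; do
        heap=${heap%/}   # drop the trailing "/" that ls may append
        vsish -e get "/system/heaps/${heap}/stats" < /dev/null |
            grep -E 'Name|percent free of max size|number of failed allocations'
    done
}
```

Typing dfw_heap_stats on the host after defining the function prints the matching lines for each heap.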

This will give you output similar to the following:

As you can see from the output above, I’ve highlighted the lines that show the percentage free of each heap’s maximum size. This number should be above 20%.

Seeing the output on the command line is all well and good; however, the command has to be run interactively on every host, which is not very efficient when running tens, hundreds or even thousands of vSphere hosts.

What can we do about that? Well, what if we could write a script that extracts the “percent free of max size” for each of the heaps used by the NSX-v Distributed Firewall and sends it to your syslog server? That would be cool, wouldn’t it?

Here is a basic script that does just that.
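A minimal sketch of how such a script could be put together is below. It is not necessarily identical to heapmon.sh. It assumes the same vsish node layout and field names as described earlier (/system/heaps, a stats node per heap, "percent free of max size" and "number of failed allocations"). The helper names heap_field and heapmon, the logger tag and the key=value message format are all hypothetical choices made for illustration.

```shell
#!/bin/sh
# Sketch: report percent free and failed allocations for each vsip heap.

# Read vsish stats text on stdin and print the value of the "key:value" field $1.
heap_field() {
    sed -n "s/^ *$1: *\([^ ]*\).*/\1/p" | head -n 1
}

# Print one summary line per vsip heap and send the same line to syslog.
heapmon() {
    vsish -e ls /system/heaps | grep vsip | while read -r heap; do
        heap=${heap%/}   # drop the trailing "/" that ls may append
        stats=$(vsish -e get "/system/heaps/${heap}/stats" < /dev/null)
        pct=$(printf '%s\n' "$stats" | heap_field 'percent free of max size')
        failed=$(printf '%s\n' "$stats" | heap_field 'number of failed allocations')
        # Message layout is an assumption; adapt it to your log parser.
        msg="heap=${heap} pct_free_of_max=${pct} failed_allocs=${failed}"
        echo "$msg"
        logger -t heapmon "$msg"
    done
}

# Only run where vsish exists, i.e. on an ESXi host.
if command -v vsish >/dev/null 2>&1; then
    heapmon
fi
```

Keeping the field extraction in its own helper makes it easy to add further fields later without touching the loop.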

Save the script to somewhere on your vSphere host. For this example I will save it in /

Make sure to change the permissions on the file to allow execution – chmod +x heapmon.sh

If run interactively, it will show the heap name, heap id, percent free of max size & number of failed allocations.

/var/log/syslog.log will also contain the following log entries

If for some reason you haven’t set up any syslog targets on your vSphere hosts and need to do this manually, here are some quick commands you can use. Obviously you will need to replace the IP address with the IP/FQDN of your own syslog server.
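One way to do this with esxcli is sketched below, wrapped in a small function so the target is passed in once. The function name configure_syslog is hypothetical. The steps are to set the remote log host, reload the syslog service, then open the outbound syslog firewall ruleset.

```shell
# Sketch: point the host at a remote syslog target and open the firewall for it.
# $1 is the target, e.g. udp://<your-syslog-server>:514
configure_syslog() {
    target=$1
    esxcli system syslog config set --loghost="$target" &&
    esxcli system syslog reload &&
    esxcli network firewall ruleset set --ruleset-id=syslog --enabled=true &&
    esxcli network firewall refresh
}
```

For example, configure_syslog 'udp://192.0.2.10:514' would configure a UDP target, where 192.0.2.10 is only a placeholder address.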

To have this script run periodically, you can add it to the root user crontab, which is located at /var/spool/cron/crontabs/root

For my example I have configured the script to run every 6 hours with the following crontab entry
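As a sketch, an entry that fires at minute 0 of every sixth hour uses the schedule "0 */6 * * *". The helper below appends such an entry to a crontab file only if the script is not already listed. The function name add_heapmon_cron is hypothetical, and the script path is whatever location you saved the script to.

```shell
# Sketch: append a "run every 6 hours" entry for a script to a crontab file,
# unless the script already appears in it.
# $1 = crontab file, $2 = full path to the script
add_heapmon_cron() {
    crontab_file=$1
    script_path=$2
    entry="0 */6 * * * $script_path"
    grep -qF "$script_path" "$crontab_file" || echo "$entry" >> "$crontab_file"
}
```

On a host this could be called as add_heapmon_cron /var/spool/cron/crontabs/root followed by the path to your copy of the script.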

Once you edit the crontab for the root user, you need to restart the crond service. Detailed instructions for specific versions of vSphere can be found in the following article – VMware KB 1033346

Find the process id of crond

Kill the crond process

Start the BusyBox crond process
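The three steps above can be sketched as a single helper. The function name restart_crond is hypothetical. The pid file location and the crond start command differ between vSphere versions, so both are passed in as arguments rather than hard-coded; take the exact values for your version from the KB article.

```shell
# Sketch: restart crond given its pid file and the command that starts it.
# Usage: restart_crond PIDFILE CROND_COMMAND [ARGS...]
restart_crond() {
    pidfile=$1
    shift
    pid=$(cat "$pidfile") || return 1   # find the process id of crond
    kill "$pid" || return 1             # kill the running crond process
    "$@"                                # start crond again
}
```

On recent ESXi builds the call is likely to resemble restart_crond /var/run/crond.pid /usr/lib/vmware/busybox/bin/busybox crond, but treat both paths as assumptions to verify.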

Since this information is sent to the syslog.log file via the logger command, the logs will also appear in your syslog platform of choice. This is what they will look like in Log Insight by default.

Once this information is in Log Insight, it is possible to extract the fields in the syslog message so that they are searchable by Log Insight and you can then perform meaningful actions with them.

I’ve created a simple Log Insight Content Pack which has a sample dashboard and all the extracted fields already configured to be used as a starting point for setting up heap monitoring.

Download Link – SneakU vSIP Heap Monitoring Content Pack

Once it is installed and Log Insight is receiving some heap information, your logs should look similar to the following.

With Log Insight now understanding the format of the information, it is possible to create charts, queries and dashboards to display or alert on the data. Here is a screenshot of a sample dashboard widget that shows the minimum values seen for each vsip heap across each host for the past 24 hours.

Log Insight also allows an alert to be created based on the syslog data, which means you can set an alert for when any of the heaps drops below 20% free.

This post should provide a starting point on how to monitor DFW heap usage across your environment.
