Another tool we’ve gotten a fair amount of mileage out of is what we internally refer to as the Nagios Shell (ngsh). We have used Nagios since our early days (circa 2005), and it has served us very well in keeping an eye on our infrastructure. Over time, we started writing tools to poke and probe Nagios in one way or another. The end result of this process was a hodgepodge of tools that parsed status.dat and did other things they really shouldn’t. We lacked consistency across the toolset: some tools took forever to run (we have a decently large environment), and others failed in mysterious ways.
About a year and a half ago we decided to stop the madness, and were lucky enough to run across Mathias Kettner’s fantastic MK Livestatus module. We consolidated eight different tools into a single one, added richer functionality for querying Nagios, and put an end to the mysterious failures we had grown accustomed to living with, knowing all along that status.dat parsing was biting us.
We christened the new tool the Nagios Shell, since it was intended to run on the CLI, and it opened up an entirely new level of functionality and correctness in managing our environment.
The current incarnation consists of two scripts, ngsh (a shell script) and ngsq (a Python script), and requires that you build MK Livestatus into your Nagios instance. A new generation is in the works, one that replaces this mixture with a toolkit written entirely in Ruby and provides far more flexibility than the current one, including a RESTish interface so that Nagios can be controlled over HTTP (more on that soon).
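For readers who have not used MK Livestatus, here is a minimal sketch of what querying it from Python can look like, as opposed to parsing status.dat. This is not the ngsq source, just an illustration of the Livestatus query format. The function names `build_query` and `run_query` are made up for this example, and the socket path is an assumption that depends on how the module is configured in your Nagios instance.

```python
import json
import socket


def build_query(table, columns, filters=()):
    """Build a Livestatus query string that asks for a JSON response."""
    lines = ["GET " + table, "Columns: " + " ".join(columns)]
    lines += ["Filter: " + f for f in filters]
    lines.append("OutputFormat: json")
    # A blank line terminates the query.
    return "\n".join(lines) + "\n\n"


def run_query(query, socket_path="/var/lib/nagios/rw/live"):
    """Send a query to the Livestatus Unix socket and decode the JSON reply.

    The default socket_path is an assumption; use the path configured
    for the Livestatus broker module on your system.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(query.encode("utf-8"))
        # Signal that the request is complete so Livestatus sends its answer.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))


if __name__ == "__main__":
    # Ask for every service currently in the CRITICAL state (state 2).
    query = build_query(
        "services",
        ["host_name", "description", "plugin_output"],
        filters=["state = 2"],
    )
    for host, service, output in run_query(query):
        print(host, service, output)
```

Because the query is answered from Nagios's in-memory state, a tool built this way does not have to re-read and parse status.dat on every run, which is where the old scripts spent their time and picked up their odd failures.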
The README has some brief examples of usage, and soon the wiki will contain a roadmap of improvements.