NSClient++ (or nscp, as I tend to call it nowadays) aims to be a simple yet powerful and secure monitoring daemon for Windows operating systems. It is built for Nagios, but nothing in the daemon is actually Nagios-specific, and it could probably be integrated with little or no change into any monitoring software that supports running user tools for polling.
The daemon is structured as a simple NT service that loads plug-ins into an internal stack. The plug-ins can then request data (poll performance data) from the other plug-ins through the internal stack. As of now there are a few plug-ins for basic performance data collection.
NSClient++ can be extended in two ways: you can either write your own plug-in or execute an external script (as of now batch/exe/*). Writing your own plug-in is of course the most powerful way, but it requires knowledge of C++ or another language that can produce DLLs and interface with regular C programs.
NSClient++ comes with a few modules out of the box that do various checks. The modules and their potential uses are listed below.
This module has various disk-related checks, such as drive and directory usage, and hopefully more such checks in the future. Feel free to request checks that you feel would be good to have.
Command | Description
CheckFileSize | Check the size of one or more files or directories.
CheckDriveSize | Check the size of one or more drives.
This check does a recursive size calculation of the directory (or file) specified. A request has one or more options described in the table below.
Option | Values | Description
MaxWarn | Size GMKB | The maximum size the directory is allowed before a warning state is returned.
MaxCrit | Size GMKB | The maximum size the directory is allowed before a critical state is returned.
MinWarn | Size GMKB | The minimum size the directory is allowed before a warning state is returned.
MinCrit | Size GMKB | The minimum size the directory is allowed before a critical state is returned.
ShowAll | None | A Boolean flag to show the size of directories that are not in an alarm state. If this is not specified, only directories with an alarm state will be listed in the resulting string.
File | File or directory name | The name of the file or directory that should have its size calculated. Notice that large directory structures will take a long time to check.
File:<alias> | File or directory name | Same as the File option but using a short alias in the returned data.
The “Size GMKB” format is a simple way to specify large sizes: add a postfix describing the unit you want. Thus 1k is the same as 1024, 1m is the same as 1048576, and so on.
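To make the postfix rule concrete, here is a minimal Python sketch of how such a value could be turned into a byte count. The function name parse_size is hypothetical, and treating a value without a postfix as plain bytes is an assumption; the multipliers follow the 1k = 1024 and 1m = 1048576 examples above.

```python
# Multipliers taken from the text: 1k = 1024, 1m = 1048576, and so on.
_UNITS = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a "Size GMKB" string such as "4096M" into a byte count."""
    text = text.strip()
    if not text:
        raise ValueError("empty size")
    suffix = text[-1].upper()
    if suffix in _UNITS:
        return int(text[:-1]) * _UNITS[suffix]
    # Assumption: a value without a postfix is already in bytes.
    return int(text)
```

With this sketch, parse_size("1024M") and parse_size("1G") give the same number of bytes.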
Example:
CheckFileSize ShowAll MaxWarn=1024M MaxCrit=4096M File:WIN=c:\WINDOWS\*.*
Will return something along the lines of this:
WIN: 1G (2110962363B)|WIN:2110962363:1073741824:4294967296
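If you want to pick the returned line apart in your own tooling, something like the following Python sketch might do. It assumes the part after the “|” is alias:value:warn:crit, which is inferred from the example above (the last two numbers equal 1024M and 4096M in bytes) and is not a documented format. It only handles a single item, and the function name is hypothetical.

```python
def parse_check_result(line):
    """Split a result line into its message and the fields after the '|'."""
    message, _, perf = line.partition("|")
    # rsplit keeps any colon inside the alias intact (for example "c:").
    alias, value, warn, crit = perf.rsplit(":", 3)
    return {
        "message": message,
        "alias": alias,
        "value": int(value),
        "warn": int(warn),
        "crit": int(crit),
    }


result = parse_check_result(
    "WIN: 1G (2110962363B)|WIN:2110962363:1073741824:4294967296"
)
```

In this example the value lies between the warning and critical limits.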
Option | Values | Description
MaxWarn | Size GMKB% | The maximum size the drive is allowed before a warning state is returned.
MaxCrit | Size GMKB% | The maximum size the drive is allowed before a critical state is returned.
MinWarn | Size GMKB% | The minimum size the drive is allowed before a warning state is returned.
MinCrit | Size GMKB% | The minimum size the drive is allowed before a critical state is returned.
ShowAll | None | A Boolean flag to show the size of drives that are not in an alarm state. If this is not specified, only drives with an alarm state will be listed in the resulting string.
Drive | A drive letter | The letter of the drive to check. Notice that the drive has to be a fixed drive.
The “Size GMKB%” format is similar to “Size GMKB” but with the added option of specifying the value as a percentage of disk space. For example, 80% means a value of 80% of the total drive space.
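As a rough illustration of the percent form, the Python sketch below resolves a threshold against a known total drive size. The function name resolve_threshold and its total_bytes parameter are hypothetical, and rounding down to whole bytes is an assumption.

```python
_UNITS = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def resolve_threshold(text, total_bytes):
    """Turn a "Size GMKB%" value into bytes, given the total drive size."""
    text = text.strip()
    if text.endswith("%"):
        # A percentage is relative to the total drive space.
        return int(total_bytes * float(text[:-1]) / 100)
    suffix = text[-1].upper()
    if suffix in _UNITS:
        return int(text[:-1]) * _UNITS[suffix]
    return int(text)  # assumption: no postfix means bytes
```

For a drive of 1000 bytes, "80%" resolves to 800, while "1k" still resolves to 1024 regardless of the drive size.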
Simple module to check for errors in the system event log. This module is in an early stage and feedback would be appreciated.
Command | Description
CheckEventLog | Check for errors in the event log.
Yet to be written
A quick introduction though:
The first option is the log file to parse (Application, System, etc.).
Options have the following format:
warn.require.eventType=warning
<alert>.<action>.<key>=<value>
Where “alert” is either warning, critical, or all, depending on the type of alert to generate if this rule is matched.
Where “action” is either require or exclude: require means the rule must match for the alert to be generated, and exclude means the alert cannot be generated if the rule matches.
Where “key” is one of the following values:
eventType
eventSource
eventSourceRegexp
generatedBeforeDelta
generatedAfterDelta
writtenBeforeDelta
writtenAfterDelta
regexp
A sample is shown below:
Application critical.require.eventType=error truncate=1024 descriptions all.exclude.eventSourceRegexp=^(Win|Msi|NSClient\+\+|Userenv|ASP\.NET|LoadPerf|Outlook|Application E|NSClient).*
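The <alert>.<action>.<key>=<value> format can be split mechanically. The Python sketch below is only an illustration with a hypothetical function name. It does not validate the alert name, because the examples above use both “warn” and “critical”, and it says nothing about how the module itself parses options.

```python
def parse_filter(option):
    """Split "<alert>.<action>.<key>=<value>" into its four parts."""
    rule, sep, value = option.partition("=")
    parts = rule.split(".")
    if not sep or len(parts) != 3:
        raise ValueError("not a filter option: " + option)
    alert, action, key = parts
    if action not in ("require", "exclude"):
        raise ValueError("unknown action: " + action)
    return {"alert": alert, "action": action, "key": key, "value": value}
```

Only the first “=” separates the rule from the value, so a regular expression containing dots or further “=” characters ends up intact in the value.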
A module to check various system-related things. The commands and their potential uses are listed below.
Command | Description
checkCPU | Check CPU load
checkUpTime | Check system uptime
checkServiceState | Check state of a service
checkProcState | Check state of a process
checkMem | Check memory usage (page)
This check calculates an average of CPU usage over a specified period of time. The data is always collected in the background, and the buffer size and interval are configured with the CPUBufferSize and CheckResolution options. A request has one or more options described in the table below.
Option | Values | Description
warn | load in % | Load to go above to generate a warning.
crit | load in % | Load to go above to generate a critical state.
time | time | The time to calculate the average over.
nsclient | | Flag to make the plug-in run in nsclient compatibility mode.
Time can use any of the following postfixes. w=week, d=day, h=hour, m=minute and s=second.
Example:
checkCPU warn=80 crit=90 time=20m time=10s time=4
This will check the CPU load over 20 minutes, 10 seconds, and 4 “units” (the length of a unit depends on the current CheckInterval). If any of the loads is above 80% a warning state will be returned, and if any of the loads is above 90% a critical state will be returned.
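The time postfixes can be converted to seconds with a small lookup table. The Python sketch below is illustrative only: parse_time and its unit_seconds parameter are hypothetical names, and the length of a bare “unit” is left to the caller because it depends on configuration.

```python
# Postfixes from the text: w=week, d=day, h=hour, m=minute, s=second.
_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_time(text, unit_seconds=1):
    """Convert a time value such as "20m" into seconds."""
    text = text.strip()
    suffix = text[-1].lower()
    if suffix in _SECONDS:
        return int(text[:-1]) * _SECONDS[suffix]
    # A bare number counts "units"; their length is supplied by the caller.
    return int(text) * unit_seconds


periods = [parse_time(t) for t in ("20m", "10s", "4")]
```

For the example command, the three time options become 1200 seconds, 10 seconds, and 4 units.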
This check checks the uptime of a server, and if the uptime is less than the times given as arguments a warning or critical state is returned.
Option | Values | Description
warn | time | Minimum uptime to not generate a warning state.
crit | time | Minimum uptime to not generate a critical state.
nsclient | | Flag to make the plug-in run in nsclient compatibility mode.
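The uptime comparison can be sketched in a few lines of Python. The function name and the state strings are hypothetical, all times are assumed to be in seconds already, and letting the critical limit take precedence over the warning limit is an assumption.

```python
def uptime_state(uptime_seconds, warn_seconds, crit_seconds):
    """Return a state for an uptime measured against minimum uptimes."""
    if uptime_seconds < crit_seconds:
        return "CRITICAL"
    if uptime_seconds < warn_seconds:
        return "WARNING"
    return "OK"
```

A server that rebooted five minutes ago, checked with a warning limit of one hour and a critical limit of ten minutes, would come out as critical in this sketch.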
This check checks the state of one or more services on the system and generates a critical state if any service is not in the required state.
Option | Values | Description
ShowAll | | A flag to toggle if all service states should be listed.
ShowFail | (default) | A flag to indicate if only failed service states should be listed.
service=state | | A service name and the state the service should have. The state can be either started or stopped. If no state is given, started is assumed.
Example
checkServiceState showAll myService MyStoppedService=stopped
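To show how the service=state arguments and the “started” default fit together, here is a small Python sketch. The function names are hypothetical, and matching the flags case-insensitively is an assumption (the table says ShowAll while the example says showAll).

```python
def parse_service_args(args):
    """Return (show_all, expected) where expected maps service to state."""
    show_all = False
    expected = {}
    for arg in args:
        lowered = arg.lower()
        if lowered == "showall":
            show_all = True
        elif lowered == "showfail":
            show_all = False
        else:
            name, sep, state = arg.partition("=")
            # If no state is given, started is assumed.
            expected[name] = state.lower() if sep else "started"
    return show_all, expected


def failed_services(expected, actual):
    """List the services whose actual state differs from the expected one."""
    return [name for name, state in expected.items() if actual.get(name) != state]


show_all, expected = parse_service_args(
    "showAll myService MyStoppedService=stopped".split()
)
```

The same shape would fit the process=state arguments of checkProcState.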
This check checks the state of one or more processes on the system and generates a critical state if any process is not in the required state.
Option | Values | Description
ShowAll | | A flag to toggle if all process states should be listed.
ShowFail | (default) | A flag to indicate if only failed process states should be listed.
process=state | | A process name and the state the process should have. The state can be either started or stopped. If no state is given, started is assumed. The name is the name of the executable.
Example
checkProcState showAll my.exe quake.exe=stopped word.exe=started
This check checks the memory (page) usage and generates a state if the usage is above or below the given parameters.
Option | Values | Description
MaxWarn | Size GMKB% | The maximum size allowed before a warning state is returned.
MaxCrit | Size GMKB% | The maximum size allowed before a critical state is returned.
MinWarn | Size GMKB% | The minimum size allowed before a warning state is returned.
MinCrit | Size GMKB% | The minimum size allowed before a critical state is returned.
ShowAll | None | A Boolean flag to show size even if no state is returned (?).
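The four Max/Min limits can be read as two upper and two lower bounds. The Python sketch below shows one possible evaluation order; the function name, the state strings, the strict comparisons, and the rule that critical wins over warning are all assumptions.

```python
def evaluate(value, max_warn=None, max_crit=None, min_warn=None, min_crit=None):
    """Compare a measured value against optional upper and lower limits."""
    if max_crit is not None and value > max_crit:
        return "CRITICAL"
    if min_crit is not None and value < min_crit:
        return "CRITICAL"
    if max_warn is not None and value > max_warn:
        return "WARNING"
    if min_warn is not None and value < min_warn:
        return "WARNING"
    return "OK"
```

Limits that are not given are simply skipped, so a request with only MaxWarn and MaxCrit never triggers on a low value.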
A module that logs all messages to a file. If no logging module is loaded, no error messages will be logged, which makes it hard to find problems. I recommend using this module at least until NSClient++ becomes stable. Again, this is not a command handler module, so it has no commands.
This module accepts incoming NRPE connections and responds by executing various checks and returning their results. To use it you need check_nrpe or another NRPE client. This is similar to check_nt (NSClient) but much more flexible, and it supports encryption. The only drawback is that it lacks any authorization (something I hope will come within the next few months).
As this module can generate command handlers from configuration, it does have command handlers, but none are built in.
This module can add two types of command handlers.
First there are external command handlers that execute a separate program or script and simply return the output and return status from that. The other possibility is to create an alias for an internal command.
To add an external command you add a command definition under the “NRPE Handlers” section. A command definition has the following syntax:
command_name=/some/executable with some arguments
for instance:
test_batch_file=c:\test.bat foo $ARG1$ bar
The above example will, on an incoming “test_batch_file”, execute the c:\test.bat file and return the output as text and the return code as the Nagios status.
To add an internal command (alias is perhaps a better word), you add a command definition under the “NRPE Handlers” section. The command definition has the following syntax:
command_name=inject some_other_command with some arguments
for instance:
check_cpu=inject checkCPU warn=80 crit=90 5 10 15
The above example will, on an incoming “check_cpu”, execute the internal command “checkCPU” with the predefined arguments given in the command definition.
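The two kinds of definitions differ only in the “inject” prefix, which the following Python sketch uses to tell them apart. The function names are hypothetical, and the simple text replacement of $ARG1$, $ARG2$ and so on is an assumption about how arguments are filled in, not a description of the module.

```python
def parse_handler(definition):
    """Return (name, kind, command) for one "NRPE Handlers" line."""
    name, _, command = definition.partition("=")
    command = command.strip()
    if command.startswith("inject "):
        return name.strip(), "internal", command[len("inject "):].strip()
    return name.strip(), "external", command


def expand_args(command, args):
    """Replace $ARG1$, $ARG2$, ... with the given arguments (assumed rule)."""
    for index, value in enumerate(args, start=1):
        command = command.replace(f"$ARG{index}$", value)
    return command


alias = parse_handler("check_cpu=inject checkCPU warn=80 crit=90 5 10 15")
script = parse_handler(r"test_batch_file=c:\test.bat foo $ARG1$ bar")
```

Only the first “=” on the line separates the name from the command, so arguments such as warn=80 stay part of the command.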
A module to do simple system-related checks, such as CPU load, memory usage, and process and service status.
Yet to be written
Yet to be written
A simple module that shows an icon in the tray when the service is running. This module does not export any check commands.
NSClient++ comes with simple command line options for registering (and deregistering) the service, but it does not have a GUI installer.
Thus, to install the client you only need to copy the files to a directory of your choice and then run “NSClient++ /install”.
Before you start NSClient++ you need to configure the client. This is done by editing the configuration file (NSC.ini). The configuration file is a simple text file and is explained in detail under Configuration.
To install NSClient++ execute the following command:
NSClient++ /install
To uninstall NSClient++ execute the following command:
NSClient++ /uninstall
To start NSClient++ execute the following command:
NSClient++ /start
To stop NSClient++ execute the following command:
NSClient++ /stop
If you only wish to test it or debug the client you can use the following without installing it first.
NSClient++ /test
Configuration is fairly simple and straightforward. Open the configuration file in Notepad (or your favorite editor): “notepad <installation path>\NSC.ini”.
The file has sections (denoted by a section name in brackets) and key/value pairs (denoted by key=value). Thus it has the same syntax as pretty much any other INI file in Windows.
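Because the file is a plain INI file, it can be read with a standard INI parser. The Python sketch below uses the standard library configparser on an inline sample. The section name [NRPE] is an assumption made for illustration; “NRPE Handlers” and the option names are taken from this document.

```python
import configparser

# Inline sample; the [NRPE] section name is assumed for illustration.
SAMPLE = """\
[NRPE]
port=5666
allow_arguments=0

[NRPE Handlers]
check_cpu=inject checkCPU warn=80 crit=90 5 10 15
"""

# Interpolation is switched off so that values containing "%" stay literal.
parser = configparser.ConfigParser(interpolation=None)
parser.optionxform = str  # keep the case of keys as written
parser.read_string(SAMPLE)

port = parser.getint("NRPE", "port")
handlers = dict(parser["NRPE Handlers"])
```

To read a real file you would call parser.read with the path to NSC.ini instead of read_string.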
The sections are described in short below. The default configuration file has a lot of examples and comments, so make sure you change it before you use NSClient++, as some of the examples might be potential security issues.
This section has options for how logging is performed. First off, notice that for logging to make sense you need to enable the “FileLogger.dll” module, which logs all log data to a text file in the same directory as the NSClient++ binary. If you don’t enable any logging module, nothing will be logged.
The options you have available here are:
Option | Default value | Description
debug | 0 | A Boolean value that toggles if debug information should be logged or not. This can be either 1 or 0.
file | nsclient.log | The file to write log data to. If no directory is used this is relative to the NSClient++ binary.
This section configures the system tray module.
Option | Default value | Description
defaultCommand | … | A string that will be the default in the inject command dialog.
These are the configuration options for the NSClient module.
This is subject to change in the near future.
Option | Default value | Description
port | 12489 | The port to listen to.
password | | The password that incoming clients need to authorize themselves with.
allowed_hosts | | A list (comma separated) of hosts that are allowed to poll information from NSClient++.
use_ssl | 0 | Boolean value to toggle SSL encryption. This is not yet supported in any client I know of, but as the underlying structure (NRPE) supports it I thought I might add it in case someone wants to update check_nt to support SSL. Not implemented in this version.
This is configuration for the NRPE module that controls how the NRPE listener operates.
Option | Default value | Description
port | 5666 | The port to listen to.
allowed_hosts | | A list (comma separated) of hosts that are allowed to poll information from NSClient++.
use_ssl | 1 | Boolean value to toggle SSL encryption on the socket connection.
command_timeout | 60 | The maximum time in seconds that a command can execute (if it takes longer than this, execution will be aborted). NOTICE: this only affects external commands, not internal ones.
allow_arguments | 0 | A Boolean flag to determine if arguments are accepted on the incoming socket. If arguments are not accepted you can still use external commands that need arguments, but you have to define them in the NRPE handlers below. This is similar to the NRPE “dont_blame_nrpe” option.
allow_nasty_meta_chars | 0 | Allow NRPE execution to have “nasty” meta characters that might affect execution of external commands (things like > “ etc).
This is a list of handlers for NRPE execution. It can of course be used by any module (such as NSClient), but for historical reasons the handlers are located in this section, especially as the NRPE plug-in is the one that does the actual execution.
The handlers can have two different syntaxes:
Either “command[my_command]=/some/executable” or “my_command=/some/executable”. The latter is the preferred way as it is shorter.
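Both syntaxes name the same command, so a reader of the file can normalize the key before looking anything up. The Python sketch below shows one way to do that; the function name is hypothetical and the regular expression only covers the two forms shown above.

```python
import re

# Matches the longer "command[my_command]" form of a handler key.
_BRACKETED = re.compile(r"^command\[(?P<name>[^\]]+)\]$")


def handler_name(key):
    """Return the bare command name for either handler syntax."""
    key = key.strip()
    match = _BRACKETED.match(key)
    return match.group("name") if match else key
```

Both handler_name("command[my_command]") and handler_name("my_command") give the same name, so the rest of a script does not need to care which syntax was used.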
Here you can set various options to configure the System Check module.
Option | Default value | Description
CPUBufferSize | 1h | The time to store CPU load. This means you can get averaged values this far back in time. The downside is the buffer might use a lot of memory if the check resolution is high.
CheckResolution | 10 | Time between checks in 1/10 of seconds. A value of 10 means check every second, a value of 100 means check every 10 seconds, and so on.
CounterPageLimit | \\\\.\\Memory\\Commit Limit | Counter to use to check upper memory limit.
CounterPage | \\\\.\\Memory\\Committed Bytes | Counter to use to check current memory usage.
CounterUptime | \\\\.\\System\\System Up Time | Counter to use to check the uptime of the system.
CounterCPU | \\\\.\\Processor(_total)\\% Processor Time | Counter to use for CPU load.
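To get a feeling for how CPUBufferSize and CheckResolution interact, the Python sketch below estimates how many samples the buffer would hold. It assumes one sample is stored per check, which the text suggests but does not state, and the function name is hypothetical.

```python
def cpu_buffer_samples(buffer_seconds, check_resolution):
    """Estimate the number of stored samples (assumes one sample per check).

    check_resolution is in 1/10 of seconds, as in the table above.
    """
    # Integer arithmetic avoids rounding surprises from dividing by 0.1.
    return buffer_seconds * 10 // check_resolution
```

Under that assumption, the defaults (1h and 10) give 3600 samples, and raising CheckResolution to 100 reduces this to 360.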
This is a list of modules to load at startup. All the modules included in this list have to be NSClient++ modules located in the modules subdirectory. This is in effect the list of plug-ins that will be available while the service is running.
A good idea here is to disable all modules you don’t actually use, for two reasons: first, less code means fewer potential security holes, and second, fewer modules means less resource drain.