Tame Those Servers Behaving Badly
We test utilities that manage your essential servers and services.
No matter what kind of shop you have, servers proliferate—often quickly. A small shop might start with a single Web server and quickly find itself with three of them in a Web farm, plus primary and backup database servers, e-mail servers and so on.
As servers multiply, it becomes tougher to keep an eye out for problems. Many harried server administrators start by consolidating information from event logs and writing scripts that poll each server for data, but there are better, more proactive ways to monitor when servers begin to revolt. In fact, server monitoring is such a common need that you'll find dozens of commercial software packages available, with a wide variety of features and price points.
When choosing a solution, there are a couple of factors to consider beyond the obvious issue of price: architecture and how the system handles alarms.
In terms of architecture, your choices are either agent-based or agentless. Agent-based products install software on every monitored server, which increases the complexity of the installation and can pose problems in environments where software distribution is tightly controlled. But these types of solutions also allow for customization of monitored services. Agentless architectures work by querying public interfaces on the monitored machines. These products are usually easier to install and can have you monitoring services in no time.
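To make the agentless idea concrete, here is a minimal Python sketch of the kind of check such a product might run: it opens a TCP connection to a service port on the monitored machine and reports whether the connection succeeded. Nothing is installed on the target. This illustrates the general technique only, not how any product reviewed here is implemented; the function name `check_tcp` and the host name mentioned afterward are hypothetical.

```python
import socket


def check_tcp(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

A console could call something like `check_tcp("web01.example.com", 80)` on a fixed interval and flag the server when the result turns False. A check this simple only proves that the port answers. It cannot see inside the machine, which is the trade-off against agent-based designs.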
As for the alarm issue, you can simply be notified of any problems or have the software do some work for you. Just about any product will send e-mail and network notifications, but some will automatically restart a failed service or run a VBScript file to delete archives if a hard drive fills up.
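The difference between "notify only" and "notify and act" can be sketched in a few lines of Python. The example below measures free disk space and maps it to one of three responses. The threshold values, the function names and the response labels are all assumptions chosen for illustration; a real product would let you configure them and would run the actual notification or cleanup script.

```python
import shutil


def free_fraction(path: str) -> float:
    """Fraction of the volume holding `path` that is still free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def choose_response(free: float, warn_below: float = 0.20, act_below: float = 0.05) -> str:
    """Map a free-space fraction to a response label."""
    if free < act_below:
        # Nearly full: alert the admin and also run a cleanup action,
        # such as a script that deletes old archives.
        return "notify+cleanup"
    if free < warn_below:
        # Getting low: alert only.
        return "notify"
    return "ok"
```

Calling `choose_response(free_fraction("."))` evaluates the volume that holds the current directory.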
Products at a Glance
Alchemy Eye 6.01: From $595 for 5 nodes
GFI Network Server Monitor 5.0: From $375 for 5 servers
Heroix eQ Management Suite 3.0: $1,995 plus $395 per monitored server
Starting at $245
For this roundup, I put five monitoring utilities through their paces. They are representative of a much larger market, and they can serve as points of comparison as you evaluate other entrants in this crowded field.
Alchemy Eye
Alchemy Lab's Alchemy Eye offers a simple monitoring console that tracks all essential servers. You can group servers into color-coded folders or use the top-level All Servers folder to show everything at a glance. If any server in a folder is dead, the folder turns red, making it easy to spot services requiring investigation. A few clicks give the history of the affected server, to indicate whether the problem is new or chronic. Another view displays servers on a two-dimensional schematic network diagram, allowing IT to see how things are grouped geographically.
Setting up a new server is simple. One click of a toolbar button opens the Server Properties dialog box. Here you assign a name and check interval to the server, and choose from more than 30 types of checks to perform, including TCP/IP connections, standard e-mail protocols, network protocols, Oracle and SQL Server database monitoring, free disk space and event log monitoring.
Alchemy Eye uses an agentless architecture. Installation on the console computer is quick and easy; the entire download is barely two megabytes. The product is also easy to use. You can go from installation to your first monitored server in minutes; the step-by-step setup via dialog boxes is a breeze. If you're running the console on a server operating system you can install it as a service for continuous monitoring; otherwise, you can save and restore groups of servers to monitor from the default user interface.
Alchemy Eye also offers a wide range of ways to respond to server problems. It can send alerts to computers or e-mail addresses, play sounds, execute scripts, shut down and restart services, and more. It features a pluggable architecture with a documented API, letting third parties provide new types of monitors and new ways to respond to problems. The company maintains a page on its Web site with these plug-ins, which cover everything from mobile phone notification to process and memory monitoring. Alchemy Lab will also develop custom plug-ins. For long-term monitoring, the program can also generate a series of HTML reports.
Alchemy Eye has been under active development for years, and new releases come out on a roughly monthly basis. For a single fee you get lifetime registration and can monitor as many servers as you like, which makes Alchemy Eye a cost-effective choice for many shops. The company also offers a 30-day trial version.
Figure 1. Alchemy Eye tracks server response time as well as current status.
BirdsEye
BirdsEye, from Proteligent, is a client/server system that boasts some cross-platform capabilities. The server runs on Windows, while client systems can be Windows (2000, 2003 or XP), Red Hat Linux (7.2 or above) or Solaris 8.0. (Clients in this case are the systems that you want to monitor.) After installing the server, a discovery process locates other machines on the network. This was where I ran into my first issue with BirdsEye—its automatic discovery found computers in the range 220.127.116.116 through 18.104.22.1685 on my local subnet, but there are no such addresses. Fortunately, the manual discovery process worked, quickly finding my actual machines.
Once you've discovered computers, you must install the client software on the monitored machines. For Linux or Solaris, installation is a manual process. For Windows, you can automatically install and uninstall client software from the central BirdsEye Explorer, provided you have administrative rights to the machine.
From there, by default BirdsEye starts monitoring system logs, network connectivity, CPU, memory and hard drive space. You can also add a small list of applications to be monitored, including SMTP, HTTP, DNS and LDAP. At this point, the BirdsEye Explorer will quickly show a server's vital signs, including free disk space, memory usage and CPU usage.
Figure 2. BirdsEye tracks server status with a drill-down Web interface.
IT can choose who should be notified, by either e-mail or SMS message, when there's a problem. IT can also decide which types of events should trigger a notification, but not which servers notify which operators. You can, however, take an individual server and tell it not to send notifications at all. The warning and panic levels are also baked in and not adjustable.
One nice touch is the BirdsEye System Monitor. This is a second interface that shows the status of monitored servers, implemented as a continuously updating Web page with drilldown into the details. If you install the BirdsEye server on a system with IIS, the System Monitor interface is available to anyone on the network with a Web browser. This ability to monitor from anywhere gives the sysadmin much-appreciated flexibility. However, it can take quite a few clicks to find out exactly what's going on.
I like the System Monitor interface, and the product is very easy to install. But it's more limited in customization possibilities than the other choices I tested.
GFI Network Server Monitor
GFI Network Server Monitor is an agentless product that pulls together all monitored servers into a treeview-based interface. You can sort monitored servers into folders, or view all of the servers together at the top level. Right-clicking on a monitored server can put any particular monitor instantly on hold—a nice touch when troubleshooting.
Adding a new server is as easy as clicking a toolbar icon and filling in a property sheet. There are plenty of ways to monitor, including ICMP ping, standard Internet protocols, machine metrics such as CPU usage or disk space, and SQL Server, Oracle or ODBC database queries. It also has built-in monitors for things like Exchange server health. These are implemented in VBScript and use WMI to query the remote machine, so you'll need to be sure your WMI infrastructure is in good shape before using this tool.
If something goes wrong, GFI Network Server Monitor can take steps to fix the problem. It can reboot a machine, run a batch job or VBScript file, or restart services. Reporting is available in both HTML and CSV formats, though the list of reports is somewhat limited.
A handy feature is configuring dependencies between rules. You might use it to monitor a single server for free space, the presence of a service and network connectivity. If the network connectivity fails, you can avoid multiple notifications by specifying that the other two rules are dependent on that one.
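The logic behind rule dependencies is easy to sketch. In the Python example below, a failed rule is reported only if the rule it depends on is still passing, so a network outage produces one notification instead of three. The `Rule` class and `rules_to_notify` function are hypothetical names for illustration and do not reflect GFI's actual implementation or scripting interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Rule:
    name: str
    passed: bool                      # result of the most recent check
    depends_on: Optional[str] = None  # name of the rule this one depends on


def rules_to_notify(rules: List[Rule]) -> List[str]:
    """Return names of failed rules, skipping those whose parent rule also failed."""
    by_name: Dict[str, Rule] = {r.name: r for r in rules}
    notify = []
    for rule in rules:
        if rule.passed:
            continue
        parent = by_name.get(rule.depends_on) if rule.depends_on else None
        if parent is not None and not parent.passed:
            continue  # the parent's failure already explains this one
        notify.append(rule.name)
    return notify


# All three checks fail because the network is down; only one alert goes out.
rules = [
    Rule("network", passed=False),
    Rule("free-space", passed=False, depends_on="network"),
    Rule("service", passed=False, depends_on="network"),
]
print(rules_to_notify(rules))  # ['network']
```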
It's easy to extend Network Server Monitor with your own tests by writing custom VBScript files. The well-written manual contains a whole chapter explaining the interface and showing how to do this. With a bit of scripting knowledge, you can expand the usefulness of this tool considerably. You'll probably find yourself spending more time on customizing things—simply because you can.
Finally, although the software itself only runs on a single server, it supports remote access so users can run the Network Monitor Manager without being logged on to the monitoring server directly. These users can be read-only or read-write, which lets IT maintain a modicum of security over the monitored configuration.
Figure 3. GFI provides plenty of details along with status at a glance.
Heroix eQ Management Suite
Heroix eQ Management Suite is the most capable—and most complex—of the products I tested. In addition to a serious financial commitment, you'll also need some serious time for systems administrators to climb up the learning curve. In return, you'll get a system that can be customized almost endlessly, with strong, out-of-the-box monitoring capabilities.
Heroix eQ is strictly an agent-based product. You'll need to install the Management Console and Solutions Studio on a box from which you'll manage the network. Other servers get the Heroix eQ Agent. There's a Web-based interface for actually looking at the data. Somewhere on the network you'll also want a SQL Server box to store the large quantities of data that Heroix eQ can collect.
Heroix starts with sensors, which monitor some specific piece of information or potential error condition. It then adds solutions, which are whole sets of sensors for something like a Novell Netware server, an Exchange server, an Oracle server or your Active Directory installation. Heroix ships an entire development environment named Solutions Studio for creating and customizing solutions. In addition to the Windows-based solutions, it includes monitoring tools for Unix, Linux, Novell and OpenVMS servers.
The Web interface goes one step beyond the solutions to build Line of Business Views. You might have five servers and a batch of infrastructure pieces that are all critical to your e-commerce application. A Line of Business View might be used to summarize all of those solutions into a single icon, with drilldown to get at problems when and if they occur.
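One common way to collapse many component statuses into a single indicator is to show the most severe status among them. The Python sketch below assumes that approach; the `Status` levels, the `roll_up` function and the component names are hypothetical, and Heroix eQ may compute its views differently.

```python
from enum import IntEnum
from typing import Iterable


class Status(IntEnum):
    OK = 0
    WARNING = 1
    CRITICAL = 2


def roll_up(statuses: Iterable[Status]) -> Status:
    """Summarize component statuses as the most severe one; an empty view counts as OK."""
    return max(statuses, default=Status.OK)


# Hypothetical components behind an e-commerce application.
view = {
    "web01": Status.OK,
    "db01": Status.WARNING,
    "payment-gateway": Status.OK,
}
print(roll_up(view.values()).name)  # WARNING
```

With a worst-status roll-up the summary icon never hides a problem, at the cost of turning yellow or red for an issue in a single component.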
There are many other capabilities in this product: role-based security, on-demand Web reporting and graphing, remote installation, event log consolidation and forwarding, and more. If you commit to this product, you'll be going beyond simple server monitoring to a rich environment for tracking the business impact of any problems that may crop up. In fact, the job of systems administrator will be subtly different. Instead of watching servers directly, admins are likely to spend time setting up and modifying views for others to watch servers. Moving monitoring from an IT activity to a business activity is a winning situation all around.
Figure 4. Heroix eQ's Solutions Studio lets you create a way to monitor almost any service your IT heart desires.
ServerVision
ServerVision is Sunbelt's first foray into the monitoring market, and it has come up with a solution that is unique in several ways. The most innovative feature? There's no particular machine that's identified as the central repository of monitoring knowledge. ServerVision can be installed on as many servers as you like and any of these servers can display the status for any group of monitored machines in a single integrated MMC application.
Once installed, ServerVision sets up a default set of checks for things like errors in the event log, average CPU and memory load, available disk space, failed services and so on. You can adjust the alerting thresholds for all of these, but even without any customization you'll get a decent idea of your network's health.
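The pattern of sensible defaults with adjustable thresholds can be sketched as a table of default limits that per-server overrides are merged into. The metric names, the default values and the `breaches` function below are assumptions for illustration, not ServerVision's actual settings.

```python
from typing import Dict, List, Optional

# Hypothetical defaults: alert when a reading (in percent) exceeds the limit.
DEFAULT_LIMITS: Dict[str, float] = {"cpu": 90.0, "memory": 85.0, "disk_used": 90.0}


def breaches(readings: Dict[str, float],
             overrides: Optional[Dict[str, float]] = None) -> List[str]:
    """Return the sorted names of metrics whose readings exceed their limits."""
    # Overrides replace the defaults only for the metrics they name.
    limits = {**DEFAULT_LIMITS, **(overrides or {})}
    return sorted(m for m, v in readings.items() if m in limits and v > limits[m])


sample = {"cpu": 95.0, "memory": 50.0, "disk_used": 92.0}
print(breaches(sample))                 # ['cpu', 'disk_used']
print(breaches(sample, {"cpu": 99.0}))  # ['disk_used']
```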
An overall Status Summary List node in the console shows at a glance which machines have a problem, and the software can drill down from there. There's also a Web interface that duplicates the MMC functionality and is more readily available from non-Microsoft systems.
Figure 5. ServerVision makes it easy to see key indicators for connected systems.
It's simple to integrate multiple ServerVision Consoles into one. A discovery process finds ServerVision-enabled computers and adds them to the console. The console can also perform a remote installation of the software. ServerVision can send e-mail or alerts, start and stop services, restart or shut down computers, and perform other actions in response to various serious conditions.
Sunbelt threw in some other functionality. There's a basic backup capability to protect critical data by backing up files or databases and then using FTP to move them around. ServerVision also integrates with Shavlik's HfNetChk to keep an eye on the security profile of servers, sending a notification when a required patch hasn't been applied.
I like having an MMC snap-in for monitoring; this makes it easy to integrate ServerVision with other management tools. On the downside, this can make for a lot of clicking around to find out what's going on (though the Key Indicators and other summary nodes mitigate this problem somewhat). There is a free evaluation version on the Sunbelt Web site.
Which Product is for You?
For a relatively small number of servers (up to several dozen), I like the simple agentless products from Alchemy Lab and GFI. Web-based monitoring, such as that offered by Proteligent, is a nice touch when you have many administrators or managers needing to keep up with the network. Sunbelt's product is incredibly easy to install and well-configured by default, though the pricing might be hard to swallow once you go beyond a few servers, even with the volume discounts the company offers. Finally, the Heroix product is in a class of its own, with a complex and comprehensive set of monitoring rules that can be extended almost indefinitely; it's the best pick if you need to set up your own rules and responses for difficult situations.
Whatever your monitoring needs, there is a product out there to meet them. Consider these five as a starting point and think about how much effort such a tool could save you in the network center.