Redmond Roundup

Manage Your Network Right

We focus on specialized tools targeting specific areas of network management.

As current IT trends push us toward the lofty goal of cloud computing, and Software as a Service (SaaS) is promoted by all the biggest software vendors, now is the time to make sure your network-management capabilities are as good as money can buy. For many of us this is familiar news, and though these trends are still nascent, they underline how much importance companies need to place on the communication lifelines that make business function. It's our responsibility as network administrators to be proactive in how we manage those networks. A large part of being proactive is having the formal frameworks, processes and procedures in place that let us define key network elements and ensure we have the right tool for the right job.

A common framework used to identify the core elements of network management is the International Organization for Standardization (ISO) Network Management Model (also known as the Open Systems Interconnection [OSI] Network Management Model). In this framework the network is broken down into five core areas:

  • Fault Management: Identify, isolate, log and correct all network faults
  • Configuration Management: Manage and maintain network device configurations
  • Accounting Management: Track network utilization for chargeback or billing of departments or business units
  • Performance Management: Gather, analyze and report on network performance
  • Security Management: Control access to network resources and protect them from unauthorized use

Using the OSI Network Management Model as our framework, we reviewed currently available products to determine which tools were the best at what they did. You should note that our goal was to focus on specialized tools that targeted one specific area of network management. We understand that there are some very powerful management frameworks available that encompass the entire range of network-management functionality. There are also some less-monolithic products that attempt to cover multiple network-management functions, but we felt that best-of-breed often means specialized.

Fault and Performance Management
This review aimed to identify products that targeted a specific core element within a network-management framework, but we quickly came to realize that, as it relates to the key areas of fault management and performance management, this wasn't going to be possible. The two areas are so closely tied together that some of the best products are designed to handle your needs in both areas. In fact, the best and brightest in this category, eHealth Performance Manager (eHPM) from CA Inc., has been doing this for a long time.

There are some core considerations when looking for network-management tools, including scalability and ease of use, that are common needs across a lot of enterprise products. In addition, the unique characteristics of network management mean that you should also look for how many types of network devices are supported; which protocols are supported; which topologies are supported; and the level of automation that's built into a product.

eHPM excels in all of these categories. A distributed data collection, analysis and storage architecture allows this product to scale to the largest networks. In fact, the product's modular architecture includes support for integration with and management of carrier-class devices. In addition, a thorough and extensive certification process means there are currently more than 500 products certified. In other words, CA has identified the key monitoring metrics for network devices that you would normally need to spend a great deal of time configuring. This is a big plus when considering ease of deployment.

Products rated: CA eHealth Performance Manager; BMC Configuration Automation for Networks; IBM Tivoli Usage and Accounting Manager; Prism Microsystems EventTracker

Rating criteria (each weighted 20% of the overall rating): Installation, Features, Ease of use, Administration, Documentation

Key: 1: Virtually inoperable or nonexistent; 5: Average, performs adequately; 10: Exceptional

However, despite all of its strengths, in the past eHPM had one glaring weakness -- the interface -- that often made the product frustrating to work with. Fortunately, CA recently improved the interface, and the latest release actually has some innovative elements that now make working with the product a pleasure. Add in the enhancements available with the latest version, and eHPM is clearly the leader in fault and performance management.

CA has long focused on the real-time aspects of eHPM, and these continue to be one of the product's strengths. The latest release enhances Live Reporting by providing a Web-based, real-time view along with historical reporting on the same screen. Another great feature is the ability to use the concept of Time over Threshold (ToT) to reduce false positives introduced by transient conditions -- which are almost a fact of life in a wide-area network environment. This technology also makes device maintenance much more straightforward, as you can adjust the ToT to match your maintenance window and avoid all those alarms that typically occur when a network device goes down.
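The Time over Threshold idea can be sketched in a few lines of Python. This is a minimal illustration of the general technique, not CA's implementation: the function name, the sample format and the 300-second window are all assumptions made for the example. An alarm fires only when a metric stays above its threshold for the whole window, so a short spike is ignored.

```python
def time_over_threshold(samples, threshold, min_seconds):
    """Return the timestamps at which an alarm would fire.

    samples: list of (timestamp_seconds, value) pairs in time order.
    An alarm fires once per breach, as soon as the value has stayed
    above `threshold` for at least `min_seconds` without a break.
    """
    alarms = []
    breach_start = None  # when the current run above the threshold began
    fired = False        # whether this run has already raised an alarm
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts
                fired = False
            if not fired and ts - breach_start >= min_seconds:
                alarms.append(ts)
                fired = True
        else:
            breach_start = None  # the condition cleared, so start over
    return alarms


# Hypothetical link-utilization samples (percent), one per minute.
samples = [
    (0, 50), (60, 95), (120, 40),               # transient spike
    (180, 92), (240, 93), (300, 96), (360, 97),
    (420, 94), (480, 91), (540, 60),            # sustained breach
]
alarms = time_over_threshold(samples, threshold=90, min_seconds=300)
print(alarms)
```

With these made-up samples, the one-minute spike at 60 seconds raises nothing, while the sustained breach that starts at 180 seconds raises a single alarm once it has lasted five minutes. Lengthening `min_seconds` to cover a maintenance window would suppress alarms from planned downtime in the same way.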

Another algorithm that helps introduce a level of intelligence to the performance monitoring offered by eHPM is Deviation from Normal. This functionality uses the historical-trending ability of eHPM to establish baselines over time, which, in turn, make monitoring and reporting more meaningful and enhance the out-of-box capabilities of this product. You really can be up and running quickly and efficiently.
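A baseline check of this kind can be approximated with simple statistics. The sketch below is an assumption-laden stand-in, not the algorithm eHPM uses: it treats the mean of past readings as "normal" and flags a new reading that falls more than a chosen number of standard deviations away. The function name, the tolerance of 3 and the sample data are hypothetical.

```python
from statistics import mean, stdev


def deviates_from_normal(history, current, tolerance=3.0):
    """Return True if `current` is far from the historical baseline.

    history: past readings for the same metric and time slot.
    tolerance: how many standard deviations count as abnormal.
    """
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: any change at all is a deviation.
        return current != baseline
    return abs(current - baseline) > tolerance * spread


# Hypothetical utilization (percent) at 9 a.m. on previous weekdays.
history = [40, 42, 38, 41, 39, 40, 43, 37]
normal = deviates_from_normal(history, 44)
unusual = deviates_from_normal(history, 70)
print(normal, unusual)
```

The point of the design is that the threshold is learned from the device's own history rather than set by hand, which is why a longer collection period makes the alerts more meaningful.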

But this is just the beginning. Support for Internet Protocol version 6 (IPv6) -- as well as support for devices running both IPv4 and IPv6 -- is timely, given the increasing movement toward IPv6. Policy-based discoveries allow granular control of the discovery process and provide a high degree of automation. Support for embedded or Cisco NetFlow collectors as well as standard RMON2 probes allows for comprehensive traffic analysis. A strong reporting engine and capacity-planning capabilities are two more features that round out an all-around great tool for fault and performance management.

Configuration Management
The heart of a network is its configuration. Make the wrong kind of mistake here, and you can bring a network to its knees. By some counts, the most significant cause of network outages is incorrect configuration changes. With large networks home to a myriad of devices -- often from different vendors and serving different functions -- the potential for chaos is enormous.

Keeping this potential under control is the primary goal of a configuration management (CM) application. In effect, the function of a CM app is to simplify the process of change management, and Configuration Automation for Networks (CAM) from BMC Software Inc. is a star in this regard.

By allowing you to track configuration state, apply new configurations uniformly and roll back changes in the event of problems, CAM is a tremendous toolset that supports the most heterogeneous of network infrastructures. Policy-based automation and audit compliance are a must for large networks, and CAM has excellent capabilities in this regard. With built-in support for more than 1,000 devices from 30 different vendors, CAM focuses heavily on automation as a means of avoiding configuration errors by eliminating manual steps. By allowing you to stage, assess, review, approve and implement changes on a schedule, CAM provides a consistent and repeatable management methodology. This capability is further underlined by the product's support for best-practice processes set forth by the Information Technology Infrastructure Library (ITIL).

Add in a strong reporting engine that tracks changes over time while providing real-time views of device-configuration changes and drift-from-baseline configuration, and you have a comprehensive toolset that will save you a great deal of time and effort.
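To make the idea of drift from a baseline concrete, here is a minimal sketch using Python's standard difflib module. It is not how CAM works internally; the function name and the sample router configuration are invented for illustration. It compares an approved baseline with the configuration currently running on a device and lists only the lines that were added or removed.

```python
import difflib


def config_drift(baseline, running):
    """Return the lines that differ between two configuration texts.

    Lines removed from the baseline are prefixed with '-', and lines
    added in the running configuration are prefixed with '+'.
    """
    diff = list(difflib.unified_diff(
        baseline.splitlines(), running.splitlines(),
        fromfile="baseline", tofile="running", lineterm="", n=0))
    # The first two lines are the file headers; '@@' lines mark hunks.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Hypothetical device configurations.
baseline = "\n".join([
    "hostname edge-rtr-01",
    "interface Gi0/1",
    " description uplink",
    " no shutdown",
    "snmp-server community s3cret RO",
])
running = "\n".join([
    "hostname edge-rtr-01",
    "interface Gi0/1",
    " description uplink",
    " shutdown",
    "snmp-server community s3cret RO",
    "ip http server",
])
drift = config_drift(baseline, running)
for line in drift:
    print(line)
```

An empty result means the device still matches its baseline. A non-empty one is the kind of change a CM tool would report, and it is also what a rollback would need to reverse.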

Accounting Management
In most organizations the IT department is viewed as a cost center, and in the present economic climate, the pressure to cut costs is enormous. It's at times like these that some type of chargeback system can prove invaluable. However, the effectiveness of such a system often depends on the granularity and accuracy of the numbers that are produced for the business units. Aside from assessing these requirements, we were also interested in the flexibility of financial reporting, because you can gather all the data in the world, but if you can't segment it correctly, it won't be very helpful.

Fortunately, IBM's Tivoli Usage and Accounting Manager (UAM) has the power and flexibility we were looking for and is capable of meeting all of these needs. In order to offer users the granularity they need, UAM supports many classifications of usage including business unit, users, application and device. In addition, there's support for multiple types of costing including flat-rate, service-based pricing, measured usage and direct cost. If you're a service provider or need to generate actual invoicing for an internal accounting system, there's support for numerous methods of billing. There's even a self-service Web-reporting portal that allows users to gain immediate access to their own usage and accounting information without requiring the IT staff's time or support. The product scales well with distributed data collectors and a multi-tier architecture.
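The arithmetic behind two of these costing methods, flat-rate and measured usage, is simple to sketch. The example below is purely illustrative and does not use any UAM interface: the function name, the business units, the rate and the fee are all made-up assumptions.

```python
def chargeback(usage_gb, rate_per_gb, flat_fee=0.0):
    """Return {business_unit: charge} for one billing period.

    usage_gb: measured traffic per business unit, in gigabytes.
    rate_per_gb: price per gigabyte (the measured-usage component).
    flat_fee: fixed charge per unit (the flat-rate component).
    """
    return {
        unit: round(flat_fee + gb * rate_per_gb, 2)
        for unit, gb in usage_gb.items()
    }


# Hypothetical monthly traffic per business unit.
usage = {"Finance": 120.0, "Engineering": 850.0, "Sales": 30.0}
invoice = chargeback(usage, rate_per_gb=0.5, flat_fee=100.0)
for unit, charge in invoice.items():
    print(f"{unit}: ${charge:.2f}")
```

The calculation itself is trivial. The hard part, as the numbers above suggest, is attributing usage to the right business unit in the first place, which is why the granularity of the collected data matters so much.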

Support for NetFlow provides the core traffic-capture capability. NetFlow is already widely supported, and the version currently before the Internet Engineering Task Force (IETF) for standardization should only strengthen this trend.

However, UAM takes a great deal to get up and running. Determining who, what and how to bill is a complex process. Add on the need to gather usage data from disparate devices -- then analyze and report on that data -- and you begin to get the picture. In fact, many chargeback projects have failed for this very reason, so be prepared if you're thinking about implementing chargeback. Recent releases have benefited from IBM's autonomic initiative, which automates many processes, but there's still a lot to do. You should note that this shortcoming isn't limited to UAM, but is rather part and parcel of the entire class of products.

Security Management
Of all the categories, Security Management is the broadest. Therefore, selecting a best-of-breed product seems almost arbitrary. However, we eventually made a decision based on our belief that many network administrators are inundated by mountains of event data that often go unreviewed. This can have critical implications for the ability to proactively identify and remediate potential network issues before they cause problems. As such, it's critical to have a toolset that can not only collect this data, but help you make sense of what's happening on your network.

EventTracker from Prism Microsystems Inc. is an excellent choice for doing just that. EventTracker is a security-information and event-management system that supports agent-based and agent-less operations against network devices, such as routers and switches, as well as server-based event logs. With a distributed architecture, this application can support the largest organizations while remaining easy to deploy and manage. In addition, a well-established and conservative return-on-investment model should help if you're called on to justify the budget requirements of EventTracker.

EventTracker supports a large number of different data sources including Syslog, syslog-ng, Windows Event Logs, SNMP, Solaris BSM, z/OS and any other source that writes to flat files. The breadth of this coverage should take care of most, if not all, of your monitoring needs. Of course, collecting all this data -- and Prism claims to be able to process more than 300,000 events per minute in real time -- is just the beginning. (Unfortunately, we weren't able to verify this statistic, but we'll take the company at its word.) The real key to a successful event-management system is the ability to make sense of all that data. With support for multiple custom views; rule-based filtering with 800-plus pre-defined rules; and sophisticated correlation capabilities using regular expressions, EventTracker really simplifies the monitoring process. Add in an intuitive interface and incredibly easy agent deployment, and you'll be up and running in no time.
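Rule-based filtering with regular expressions can be illustrated with Python's standard re module. This sketch does not reproduce EventTracker's rule format or any of its pre-defined rules: the two rules, their names and severities, and the sample log lines are hypothetical.

```python
import re

# Hypothetical rules: (name, severity, pattern with one capture group).
RULES = [
    ("failed-login", "high",
     re.compile(r"Failed password for (?:invalid user )?(\S+)")),
    ("link-down", "medium",
     re.compile(r"Interface (\S+), changed state to down")),
]


def classify(event):
    """Return (rule_name, severity, captured_value) for the first
    rule that matches the event text, or None if no rule matches."""
    for name, severity, pattern in RULES:
        match = pattern.search(event)
        if match:
            return name, severity, match.group(1)
    return None


# Made-up log lines in common syslog-style formats.
events = [
    "sshd[411]: Failed password for invalid user admin from 10.0.0.7 port 52144 ssh2",
    "%LINK-3-UPDOWN: Interface GigabitEthernet0/1, changed state to down",
    "cron[88]: (root) CMD (run-parts /etc/cron.hourly)",
]
results = [classify(event) for event in events]
for result in results:
    print(result)
```

Events that match no rule come back as `None` and can be dropped or archived, which is the basic mechanism for separating the events worth reviewing from routine noise.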

This is not to say that there isn't a great deal of work to be done. With large amounts of data from diverse sources, the key is to be able to tone down the event "noise" to a dull roar so you can identify issues and events that need to be addressed. As part of this process, EventTracker offers an extensive knowledge base that automatically adds descriptions and related resource information to known events. This can really be a time saver and should reduce your reliance on Google somewhat. In addition, a sophisticated reporting engine lets you identify trends over time and offers a thorough forensic capability that allows you to identify root causes and event signatures during any incident reviews you need to complete. All in all, EventTracker is an excellent tool for your event-collection and -reporting needs.

CA eHealth Performance Manager

Pricing starts at $205,000
CA Inc.

BMC Configuration Automation for Networks

Pricing based on configuration; vendor does not reveal specific pricing
BMC Software Inc.

IBM Tivoli Usage and Accounting Manager

Pricing based on configuration; vendor does not reveal specific pricing
IBM Corp.

Prism Microsystems EventTracker

Priced per device; typical price for 50 servers is $20,000
Prism Microsystems Inc.

