Beta Man

Windows Goes High Performance

Microsoft joins the supercomputing race with Windows Compute Cluster Server 2003.

What was once old is new again. High-performance computing (HPC) has returned as one of the biggest trends in computing -- with a big difference. Back in the day (the early 1990s) you could drop $40 million on a Cray Y-MP supercomputer.

Now, thanks to cheap, off-the-shelf components (COTS), new Intel- and AMD-based HPC servers make sense from both a financial and technological perspective. For example, you can pick up a four-way, 2.2GHz AMD Athlon64 server with 4GB of RAM for about $4,000. As far as the technology goes, the point of HPC these days is to rely less on a single massive machine and more on compute clusters -- groups of interconnected machines that divide the workload among themselves.

In fact, universities and research institutions have been using Linux-based supercomputing clusters for years. The Beowulf Project can give you some guidance on building clusters of Linux-based servers.

It's little wonder that Microsoft is looking for a piece of the HPC action. I got a good look at Windows Compute Cluster Server 2003 (CCS2003) at a recent Microsoft briefing. Remember that the "C" in COTS stands for cheap. CCS2003 (which is based on Windows Server 2003, hence the name) will actually cost less per socket than other editions of Windows. This won't be a bargain-basement version of Windows, however. It's being put together specifically to address HPC concerns.

As a result, you won't be able to install this special version of Windows on any computer that isn't part of a dedicated computational cluster. It's also only available in an x64 edition -- the theory being that nobody would want to build a computational cluster out of legacy 32-bit hardware.

Windows Compute Cluster Server 2003
Version Reviewed: Beta 2
Current Status: Beta
Expected Release: 2006

What Is a Compute Cluster?
A compute cluster consists of a single head node that accepts computing jobs and distributes the workload across at least two attached compute nodes. CCS2003 won't support high availability for the head node, so make sure it's running on highly available hardware. The head node is the brains of your HPC operation, so it has to stay up.
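To make the head-node idea concrete, here is a minimal Python sketch of the pattern, not CCS2003's actual scheduler. The names `partition` and `run_on_cluster` are hypothetical, and ordinary function calls stand in for networked compute nodes.

```python
from typing import Callable, List, Sequence


def partition(items: Sequence[int], node_count: int) -> List[List[int]]:
    """Deal work items out to the compute nodes round-robin."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one compute node")
    return [list(items[i::node_count]) for i in range(node_count)]


def run_on_cluster(items: Sequence[int], node_count: int,
                   work: Callable[[int], int]) -> int:
    """Play the head node: split the job, collect partial results, combine them."""
    chunks = partition(items, node_count)
    # Each list entry stands in for the result one compute node would send back.
    partials = [sum(work(x) for x in chunk) for chunk in chunks]
    return sum(partials)


if __name__ == "__main__":
    # Sum of squares of 1..100, spread across four pretend nodes.
    print(run_on_cluster(range(1, 101), node_count=4, work=lambda x: x * x))
```

The combined answer does not depend on how many nodes share the work, which is the property that lets you add compute nodes without rewriting the job.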

You can have as many attached compute nodes as you can afford. As we've learned from distributed computing projects like SETI@home (which is an excellent real-world example of how you would use a compute cluster), the more compute nodes, the merrier.

To avoid bottlenecks that can limit the number of nodes in your compute cluster, you'll want to use switched gigabit Ethernet as a minimum -- a 10 gigabit Ethernet or Myrinet network is even better. CCS2003 includes Windows Sockets Direct Interface, which is specifically designed to take advantage of these types of high-speed connections.

You'll have to tune your applications to run on a cluster. To give you an idea of the old-school, hardcore nature of this type of computing, look at the programming languages that CCS2003's components support out of the box: Fortran77, Fortran90 and C. Yikes. You'll also need to set up your applications so they can be submitted to the cluster's scheduler on the head node and run completely unattended, taking input only from data files (not keyboard commands or mouse clicks).
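As a rough illustration of what "completely unattended" means in practice, the Python sketch below reads all of its input from a data file and writes its result to another file, with no prompts along the way. The script name `job.py`, the function `run_job` and the file layout (whitespace-separated numbers) are assumptions for the example, not anything CCS2003 prescribes.

```python
import sys
from pathlib import Path


def run_job(input_path: str, output_path: str) -> float:
    """Read whitespace-separated numbers, write their mean, never prompt the user."""
    values = [float(token) for token in Path(input_path).read_text().split()]
    if not values:
        raise ValueError("input file contains no numbers")
    mean = sum(values) / len(values)
    Path(output_path).write_text(f"{mean}\n")
    return mean


if __name__ == "__main__":
    # Hypothetical invocation: python job.py input.dat output.dat
    if len(sys.argv) != 3:
        sys.exit("usage: job.py INPUT_FILE OUTPUT_FILE")
    run_job(sys.argv[1], sys.argv[2])
```

Because everything the program needs arrives through its arguments and files, a scheduler can launch it on any node without anyone sitting at a console.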

You'll also have to be fluent in several new acronyms if you're going to set up a compute cluster. MPI (Message Passing Interface) is an industry-standard application programming interface designed for rapid data exchange between compute nodes in HPC environments. Microsoft's MPI (MSMPI) is based on Argonne National Laboratory's open source MPICH2 implementation and supports more than 160 function calls. Applications submitted to CCS2003's job scheduler need to support MSMPI.
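The message-passing style that MPI standardizes can be sketched without MPI itself. The Python example below is only an analogy: threads stand in for compute nodes, standard-library queues stand in for the network, and the names `worker` and `scatter_and_reduce` are hypothetical. It does not use MSMPI or any real MPI call.

```python
import queue
import threading
from typing import List


def worker(rank: int, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Receive one chunk of numbers, then send back (rank, partial sum)."""
    chunk = inbox.get()              # blocking receive
    outbox.put((rank, sum(chunk)))   # send the partial result to the coordinator


def scatter_and_reduce(data: List[int], worker_count: int) -> int:
    """Send a slice of the data to each worker and add up the replies."""
    inboxes = [queue.Queue() for _ in range(worker_count)]
    results: queue.Queue = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for rank, inbox in enumerate(inboxes):
        inbox.put(data[rank::worker_count])   # "scatter" step
    partials = dict(results.get() for _ in range(worker_count))
    for thread in threads:
        thread.join()
    return sum(partials.values())             # "reduce" step


if __name__ == "__main__":
    print(scatter_and_reduce(list(range(10)), worker_count=3))
```

The workers share nothing and cooperate only by exchanging messages, which is why this style of program can spread across separate machines.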

As you might expect, CCS2003 makes heavy use of Microsoft's infrastructure components. For example, all nodes have to belong to the same Active Directory domain so you can manage them as a unit and share security information.

What It Isn't
CCS2003 is not the same kind of clustering as Windows Cluster Service. While CCS2003 is designed to have several computers interconnected, those computers work together to solve computationally intensive problems, rather than provide failover or fault tolerance. You won't run Exchange Server on CCS2003. In fact, unless you have some heavy-duty number crunching to do, CCS2003 probably isn't for you.

Beta Man's Routine Disclaimer:
The software described here is incomplete and still under development; expect it to change before its final release -- and hope it changes for the better.

The thought of deploying and managing a dozen or so compute nodes sends a chill down my spine, and not just because the data center housing them is going to need heavy-duty air conditioning to avoid a meltdown. In an era when everyone's downsizing the data center, CCS2003 heads in the opposite direction.

Microsoft feels your pain. CCS2003 includes a command-line interface to help you create and submit jobs. It also includes Remote Installation Services (RIS), which makes deploying compute nodes to bare-metal machines easier. Standard backup and restore techniques apply, so whatever you're already using should work fine. Of course, the usual MMC snap-ins will let you control the entire cluster. The setup process for Compute Cluster is also straightforward, using a standard wizard-based interface.

CCS2003 loves networks and wants to connect to as many as possible. It can use three: a private network for administrative traffic, the MSMPI network for exchanging cluster communications and data, and a public network like your corporate intranet. This last conduit also lets applications like Systems Management Server (SMS) and Microsoft Operations Manager (MOM) get into the compute cluster's head node for management purposes. So you could have each CCS2003 machine connected to as many as three networks at once.

Too Much Horsepower?
Unless you have to do some serious number crunching, such as simulating nuclear explosions, modeling fluid dynamics or assessing potential oil deposits, CCS2003 may not be for you. Still, CCS2003 makes HPC accessible to organizations that never would have considered it before.

About the Author

Don Jones is a multiple-year recipient of Microsoft’s MVP Award, and is Curriculum Director for IT Pro Content for video training company Pluralsight. Don is also a co-founder and President of PowerShell.org, a community dedicated to Microsoft’s Windows PowerShell technology. Don has more than two decades of experience in the IT industry, and specializes in the Microsoft business technology platform. He’s the author of more than 50 technology books, an accomplished IT journalist, and a sought-after speaker and instructor at conferences worldwide. Reach Don on Twitter at @concentratedDon, or on Facebook at Facebook.com/ConcentratedDon.
