The Multi-Core Muddle

PCs are moving to multi-core as operating systems, applications and languages struggle to keep up.

Intel-compatible dual- and multi-core processors have been around since 2005, and are commonplace in the game console market. In fact, the Sony PlayStation 3 has eight cores, seven of which are fully exploited by state-of-the-art games.

That's why it's a bit surprising that Microsoft didn't make its boldest multi-core client proclamation until November of last year. Even that announcement was a relatively subdued affair in which a Windows exec promised that Windows 7, the follow-on to Windows Vista, would be re-architected in order to exploit multi-core.

Fast-forward to March of this year when Craig Mundie, Microsoft's chief research and strategy officer, followed up that announcement by prophesying that parallel computing will represent as big a change as the Internet or the invention of the PC itself. That same month, Microsoft and Intel Corp. pledged $20 million for parallel-computing research at the University of Illinois and the University of California, Berkeley.

So three years into the era of multi-core chip technology, critics argue that on the desktop side, Microsoft is just getting started. This is a particularly frustrating issue for some PC users who dash out to buy the latest multi-core technology only to sit and wonder why screens take so long to refresh or tasks take so long to complete.

Multi-Core Macs
The multi-core issue is not limited to Windows desktops. Mac users, especially in the early days of dual-core Macs, have complained that dual-, quad- and now eight-core machines do not show the expected performance gains. This caused a stir as Apple Inc. promoted multi-threaded benchmarks showing a linear increase in performance when the reality was far different.

Apple, though, takes pains to explain how its newest OS, Leopard, takes advantage of the new chips. It points to the new Leopard scheduler that ensures tasks are spread across cores, as well as a new multi-core-ready network stack. For developers, Apple is promoting NSOperation, an API that queues up operations to be processed by multi-core systems. The company claims this gets rid of the need to hard-code apps.
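
To make the queued-operation idea concrete, here is a minimal sketch in Python. It is not Apple's NSOperation API; the function name run_operations and the worker count are hypothetical, chosen only to show the general pattern of handing a queue of independent operations to a pool of workers instead of hard-coding which core does what.

```python
import queue
import threading


def run_operations(operations, workers=4):
    """Run zero-argument callables on a small pool of worker threads.

    Results come back in the same order as `operations`.
    Hypothetical helper for illustration; not Apple's API.
    """
    pending = queue.Queue()
    results = [None] * len(operations)

    # Queue every operation together with the slot its result belongs in.
    for index, operation in enumerate(operations):
        pending.put((index, operation))

    def worker():
        # Each worker keeps pulling operations until the queue is empty.
        while True:
            try:
                index, operation = pending.get_nowait()
            except queue.Empty:
                return
            results[index] = operation()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Five independent operations; the caller never says which worker runs which.
operations = [lambda n=n: n * n for n in range(5)]
print(run_operations(operations))  # prints [0, 1, 4, 9, 16]
```

The caller only describes the work; how many workers exist, and which one picks up which operation, is decided by the queue. In standard CPython, threads share one interpreter lock, so this sketch shows the structure rather than a real multi-core speedup.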

Here's the problem in a nutshell: Windows XP and Vista are fine multitasking operating systems, but neither one was ever designed specifically for the rigors of high-end multiprocessing, experts contend. Compounding the problem on the software side is that most programming languages and environments aren't either.

A multi-core processor puts more than one processor onto a single chip. One reason Advanced Micro Devices (AMD) Inc. and Intel moved away from the single-core model is that increasing the raw gigahertz creates too much heat, and ultimately wastes electricity as cooling needs rise. Multi-core promises to end this escalating gigahertz race, replacing it with a race for more cores. In fact, the cores in today's multi-cores generally run more slowly gigahertz-wise than advanced single-core processors. Unfortunately, many Windows-based applications are built serially with a single processor in mind. When run on multi-core systems, often the extra cores are doing little to nothing.

Theoretically, performance should scale in step with the number of processors, so that a quad-core machine does four times the work of a single core. But even Vista, in development for a full six years, was built largely with a single, central processor in mind. That means that some programs actually run slower on a multi-core XP or Vista machine than on a single core, argues Mike McCool, chief scientist for RapidMind Inc., a firm that sells a development system for parallel programming. "Microsoft hasn't really spent time looking at what to do about this major disruption," adds Ray DePaul, CEO of RapidMind.

Fun Fact:
Last year Microsoft hired Dan Reed, a supercomputing guru from the University of Illinois, to be director of scalable and multi-core computing. Reporting to Microsoft Research chief Rick Rashid, Reed will work on pure research and with product groups.

User Results Vary
Adding to the mystery is the fact that end users report very different results with their new machines. Some users, especially software developers, report nice gains running XP or Vista on a dual- or multi-core system. Others see little benefit, as the extra cores are generally idle.

Redmond reader Dean Slindee moved from a single processor to a quad-core system he built himself for his software development system. "[It's the] best bang for the buck in a long time. I can run many Visual Studio instances, compile them and open others where I used to wait to 'manually' single-thread them in the past," he says.

Slindee has been carefully tracking performance. "I have two 'meters.' One is a quad-core Vista gadget bar graph. It shows me that the cores are all being used, approximately equally, all usually at 10 percent or less most of the time. When they exceed that percentage, it's usually in tandem," Slindee says. "The other meter is the one that got me to build the new PC. I call it my 'frustration' meter. It resides in my head, and I was constantly pegging the meter. I'm a VB Windows developer, and at home I can be programming and compiling several distinct project sessions in Visual Studio concurrently -- loading one project, changing a second project, compiling a third project concurrently. When I have to wait on the PC because it's trying to catch up with my requests, the frustration meter goes off. Frustration leads to attention deficit, which leads to dropping the ball when making code changes. The frustration meter has been quiet since quad-core. It may not be rigorously scientific, but it's certainly working for me. Less frustration equals less bugs."

Others aren't so pleased.

What Is Many-Core?
According to Microsoft, a multi-core system has eight or fewer cores, whereas a many-core system has more than eight. Microsoft expects many-core systems to be common in a few years' time. Because it will take a while for Microsoft's parallel plans to fully materialize, the company is aiming for the high, many-core ground.

"The newer desktop machines that we've been deploying the last couple of years have been dual-core machines [running XP]. It almost doesn't seem worth it, as it doesn't appear that Windows fully utilizes multi-threading technology. That could be because of how our image is written, or it could be the OS -- I'm really not sure," says Redmond reader Steve Durham.

Reader Michael Mayer agrees. "I've largely been disappointed with the multi-core processors. The concept is great but it doesn't appear that Windows, or applications in general, are exploiting the architecture. Not even Wintel can coordinate their efforts so that the end user can reap the benefits of the latest hardware and software simultaneously," Mayer says.

And finally there's this from reader Vicke Denniston: "I love my 'new' dual-core computer -- it kicks butt and takes names. I can move between applications easily and quickly, which is something I need to be able to do frequently. It starts up faster, reboots faster and is generally a happy computer. Of course it runs Leopard," Denniston remarks.

GigaSpaces Technologies Inc.
Microsoft is working with GigaSpaces, an Israeli company with an HPC development platform that runs on .NET, Java and C++. Microsoft is adding the GigaSpaces eXtreme Application Platform (XAP) to the Microsoft Compute Cluster.

The Microsoft View
Microsoft's Web site has very little specific information about XP and Vista's multi-core support, and what roles the OS plays versus applications. Meanwhile, a Microsoft spokesperson declined to set up interviews on this subject, arguing that hardware technologies like multi-core and 64-bit "are still more [of a] server play than desktop right now." The spokesperson added: "We don't have information to share at this time regarding the role of the desktop operating system in exploiting multi-core processors. However, we can say that we keep looking and listening to customers to see what interests them in this area as we think about future technologies we develop."

Future Windows
This disconnect is a big deal in the world of dual-core processors where one whole core may be underutilized. But it will become a crisis as AMD and Intel up the ante, pushing out quad-core PCs and then moving on to eight-cores, 16-cores and so on. Given Microsoft's track record, it takes about five years to build a new desktop OS. That puts the successor to Vista at around 2012, with some believing the OS could be out around 2010. Now compare this to where Intel intends to bring hardware. By 2015, Intel predicts that it will produce processors with hundreds of cores.

Fun Fact:
Like the PlayStation 3, which uses an eight-core IBM Cell Processor, the Xbox 360 is a multi-core machine that can handle many tasks in the background. The Xbox 360 uses a triple-core PowerPC processor from IBM Corp.

What's surprising is that in 2004, Microsoft threw out all the old Vista code and started fresh. By then, the multi-core handwriting was on the wall, yet Vista largely ignored it. If the next version of Windows is to exploit multiprocessing the same way it supports multitasking today, Redmond coders had better be working on it right now.

Microsoft has let out precious few details about Windows 7. However, Microsoft Director of Technical Strategy Ty Carlson told attendees at the Future in Review 2007 conference that Windows would be re-architected with multi-core in mind. In that speech, Carlson said that Vista was only designed with "one, two, maybe four processors" in mind.

The Great Multi-Core Debate
Multi-core is a complex issue, and even experts are debating how effectively today's operating systems -- such as Vista -- and today's apps exploit it. Some argue that Microsoft has had decent multi-core/multiprocessing support ever since Windows NT, which could assign tasks to up to 64 processors in the 64-bit version, and up to 32 in the 32-bit version.

Web poster Andy Cadley explains that "Vista will assign application threads to multiple cores dynamically, dependent upon the load, just as it would do under XP. Applications still need to be written to separate multiple tasks into separate threads to take advantage of more than one processor at any one time. This is no different under Vista than it was under previous versions of Windows."
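
Cadley's point, that the application itself has to split its work into threads before the OS can spread it across cores, can be sketched as follows. This is an illustrative Python example, not Windows code; the function name sum_of_squares and the thread count are hypothetical.

```python
import threading


def sum_of_squares(numbers, thread_count=4):
    """Sum the squares of `numbers` by splitting the work across threads.

    The application, not the OS, decides how the job is divided. The OS
    scheduler only decides where each resulting thread runs.
    """
    # Deal the input out into one slice per thread.
    chunks = [numbers[i::thread_count] for i in range(thread_count)]
    partial_sums = [0] * thread_count

    def work(slot, chunk):
        # Each thread writes only to its own slot, so no lock is needed.
        partial_sums[slot] = sum(n * n for n in chunk)

    threads = [
        threading.Thread(target=work, args=(slot, chunk))
        for slot, chunk in enumerate(chunks)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Combine the per-thread results into the final answer.
    return sum(partial_sums)


print(sum_of_squares(list(range(10))))  # prints 285
```

A program written as a single loop gives the scheduler only one thread to place, which is why extra cores sit idle. Note that standard CPython serializes CPU-bound threads behind a global lock, so a real speedup for this kind of arithmetic would need processes or a runtime without that restriction; the decomposition itself is the same.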

RapidMind Inc.
RapidMind is on the third version of its RapidMind Multi-Core Development Platform, which lets programmers exploit both GPUs and multi-core processors. Developers can port existing apps or write new software from scratch for parallel environments.

AMD published a white paper with a fairly detailed explanation of Vista and multi-core chips. According to AMD's Larry O'Brien, Vista supports multi-core and, just as importantly, the latest graphics processing units (GPUs). For multi-core, O'Brien calls Vista support "incremental," arguing that "if you take an even longer view, more substantive changes in the OS architecture will undoubtedly prove necessary. As the number of cores goes up, every sub-system will eventually have to be designed to truly take advantage of an asynchronous world. In particular, the operating system must be increasingly modularized if it's to spread its burden among many cores."

O'Brien indicates that much of the work falls on developers to exploit functions such as I/O prioritization and threading.

An article by Shelley Gretlein of National Instruments Corp. reaches a far different conclusion: "From an overall performance perspective, it's a little sad that Vista doesn't take more advantage of multi-core processors. On both Windows XP and Windows Vista, multi-core processors are certainly more efficient than their single-core predecessors. But they both handle multi-core processors identically. It will take a future re-engineered Windows version to truly take better advantage of the unique capabilities offered by that kind of hardware," Gretlein writes.

Redmond reader Russ Ramirez, an operating system architecture veteran with Digineer Inc., walks us through his view: "For the most part they're independent of each other in the sense that both the OS and the applications have the resources available to them to leverage parallelism to the degree it's possible in each case. OSes will generally leverage multiple execution units for the performance-sensitive issues like how much processor resources to give the Windows manager -- to improve the apparent performance/response to the user -- versus the IP stack, for example. The applications are dependent on the OS for scheduling and resource allocation, which ultimately affects how well they run, but in most OSes the user/sysop has the ability to grant greater priority to processes running in the user space, versus within the OS itself -- like kernel mode processes, for example."

Finally, there's this explanation from an anonymous, though well-informed Web poster: "Contemporary versions of Windows offer SMP that are a little different. Rather than the kernel scheduling the threads, the threads invoke the scheduler in a round-robin fashion themselves. It doesn't use multiple queues or tiered scheduling, but it still works well for low numbers of processors."

France-based GPU-Tech has another approach to exploiting GPUs. The company's Ecolib is a set of libraries that tap into graphics processors from NVIDIA Corp., ATI Technologies Inc. and Advanced Micro Devices Inc.

The real answers come in the form of benchmarks and actual user experiences. Testing has shown that dual- and multi-core systems do usually increase desktop performance, but there are diminishing returns. In general, a single app takes the least advantage of the extra processing -- especially applications with little multithreading. When you run multiple applications, some of the applications can take some advantage of the extra cores, especially if these apps include multi-threaded elements.
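
One standard way to reason about these diminishing returns is Amdahl's law, which the benchmarks above do not cite but which fits the pattern they show: the serial part of a program caps the speedup no matter how many cores are added. The sketch below assumes, purely for illustration, an application in which half the work can run in parallel.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup under Amdahl's law.

    parallel_fraction: share of the run time that can be spread across
    cores (0.0 to 1.0). cores: number of cores available (1 or more).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical app: 50 percent of its work is parallelizable.
for cores in (1, 2, 4, 8):
    print(cores, round(amdahl_speedup(0.5, cores), 2))
# prints:
# 1 1.0
# 2 1.33
# 4 1.6
# 8 1.78
```

Under that assumption, going from two cores to four adds far less than going from one to two, and no number of cores can push the speedup past 2x. An application with little multithreading has a small parallel fraction and gains almost nothing.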

Jeff Atwood of the blog Coding Horror examined published benchmarks that compared dual- and quad-core AMD Athlon-based systems. For Photoshop and Windows Media encoder, the increase in performance between two and four cores was almost negligible. The biggest increase he found was with Cinebench 2003 Rendering, which showed a 1.8 percent time boost. Servers -- though far from doing a perfect job -- generally do exploit multi-core. That's because server apps are designed to be multi-user and tend to be more highly threaded.

It All Starts with Apps?
In its white paper, "The Manycore Shift," Microsoft argues that all aspects, from development tools to systems and platforms, must be adapted to parallel computing. Microsoft promises to adapt its operating systems and other aspects of software infrastructure.

While Microsoft professes that its platforms, including operating systems, must do more to support multi-core, its current mission is to convince application developers to press the parallel envelope. Unfortunately, all of the Microsoft parallel-processing development tools are in their infancy; in fact, most are still in gestation and will take "several years" to reach fruition. In an interview with the computer trade publication EE Times last year, Burton Smith of Microsoft's research division said that building a proper parallel computing model and set of development tools could take from five to 10 years.

Meanwhile, third parties such as PeakStream (now owned by Google Inc.), RapidMind and GPU-Tech have shipped middleware for months or even years. While each offering differs, they all present developers with middleware they can write to that automatically exploits multi-processing and, in some cases, GPU back-ends. In many cases, these systems offer fairly straightforward porting of existing single-core apps to exploit multi-core.

One approach is to write highly threaded software. While threading is an answer, it brings along its own set of problems and isn't even necessary, say RapidMind execs. Multithreading is labor intensive, often has to be processor specific, doesn't automatically use extra cores and can make software error prone and unstable. A layer of software -- middleware if you will -- that distributes processing is a better answer, the company argues.

Most of those using such middleware are building scientific, technical, engineering and graphics-intensive apps such as 3-D rendering, animation and ray tracing. On the business side, business intelligence, data mining and monster database applications top the list.

Business machines need the same help. Vista, Office and other apps are taxing today's snappiest machines. If we want to move to the next level, all these cores must be put to use. And when they are, Microsoft believes, it will open up new opportunities including natural language interfaces, more realistic video and graphics, and applications developers haven't even thought of.

Microsoft Developer Tools
In Microsoft's view, the onus is largely on developers. They are the ones who need to use new tools to code applications for parallelization.

In its many-core white paper, Microsoft argues that "software developers need new programming models, tools and abstraction by the operating system to handle concurrency and complexity of numerous processors."

Microsoft, as expressed through the goals of its Parallel Computing Initiative, believes that exploiting multi-core involves all aspects of computing, including:

  • Applications: Apps must be built with multi-core in mind
  • Domain Libraries: Platform vendors must offer developers libraries to handle functions such as image-processing and math
  • Languages and programming models: Developers need environments and languages built for parallel programming

Finally, there's the OS itself. Here, Microsoft promises to "evolve the operating system to more effectively budget and arbitrate among competing requests for available resources in the face of parallelism and quality-of-service demands."

The first step, unless you're already a dedicated HPC programmer, is to pick up .NET 3.5 and download the Parallel Extensions (ParallelFX). Still in beta, ParallelFX works not only with the latest rev of .NET, but also with the most current languages, including C# 3.0 and Visual Basic 9.

The Concur Project
Microsoft Research is working on multi-core and many-core issues. One approach is the Concur Project.

Here's what else Microsoft has accomplished so far: Sequential apps can be redone to work in parallel by using the Task Parallel Library (TPL). Written with the help of Microsoft Research, TPL is part of the ParallelFX Library. Then there's Parallel Language Integrated Query (PLINQ). LINQ itself is a new query tool for developers. Parallel LINQ, also part of the ParallelFX library, allows queries to run across cores, dramatically speeding the manipulation of large data sets. While this seems intended for data-intensive business applications, the same technology can be applied to other work, such as speeding up the processing of audio files.
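
The idea behind a parallel query can be sketched without the .NET tooling. The Python example below is an analog only, not the ParallelFX or Parallel LINQ API; the function name parallel_query and its parameters are hypothetical. The developer states what to filter and how to transform each item, and a worker pool decides how the items are spread out.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_query(items, predicate, transform, workers=4):
    """Filter `items` with `predicate`, apply `transform`, keep input order.

    Hypothetical helper for illustration: each item is evaluated
    independently, so the pool is free to process items on any worker.
    """
    def evaluate(item):
        # Return a keep/drop flag with the value so order can be preserved.
        if predicate(item):
            return True, transform(item)
        return False, None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the same order as the input.
        return [value for keep, value in pool.map(evaluate, items) if keep]


# Keep the even readings and scale them by ten.
readings = range(10)
print(parallel_query(readings, lambda x: x % 2 == 0, lambda x: x * 10))
# prints [0, 20, 40, 60, 80]
```

The calling code reads like an ordinary query and contains no thread management, which is the appeal of this style for developers who are wary of hand-written threading. As with any thread-based Python sketch, standard CPython will not run CPU-bound work on several cores at once; the example shows the programming model, not the performance.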

Microsoft has a tool called Accelerator that parallelizes single-threaded x86-based code. This isn't aimed at multi-core CPUs, but at multi-core GPUs. Also on the dev side is a new functional language, F#, that's multi-core-aware. Unfortunately, many developers are wary of leaping to new languages and frameworks until they're thoroughly debugged.

The GPU Angle
Multi-core is one way to boost potential horsepower. State-of-the-art GPUs represent another. Clearly GPUs are not as general-purpose as Intel and AMD CPUs, but for graphical applications the gains can be enormous. Companies such as RapidMind and GPU-Tech have middleware that can automatically exploit GPUs, and companies such as NVIDIA Corp. have tools to exploit specific GPUs. In fact, NVIDIA promotes its Tesla line of GPUs, boards and subsystems as desktop supercomputers, offering up to four teraflops in performance.

Virtualization to the Rescue?
If you have a multi-core desktop and want to utilize all the cores, you might want to start virtualizing. While not perfect, virtualization can often exploit those cores. This way, processor-intensive tasks can run in virtual machines which run against specific cores. Most of this action, however, is happening on servers today.

In an interview with Redmond Editor Ed Scannell, Citrix Systems Inc.'s Peter Levine, senior VP and general manager for the Virtualization and Management Division, had this to say about virtualization: "[Virtualization] will unlock the development of larger and larger systems, [where] organizations want to buy one server and carve it up into much smaller pieces. Our product line, the Xen Server product line, is well suited toward very large multi-core systems and we can carve them up and do all sorts of arbitrary things."

