Nicholas Carr has a post on a Financial Times report.
The coming of the megacomputer
March 06, 2009
Here's an incredible, and telling, data point. In a talk yesterday, reports the Financial Times' Richard Waters, the head of Microsoft Research, Rick Rashid, said that about 20 percent of all the server computers being sold in the world "are now being bought by a small handful of internet companies," including Microsoft, Google, Yahoo and Amazon.
Recently, total worldwide server sales have been running at around 8 million units a year. That means that the cloud giants are gobbling up more than a million and a half servers annually. (What's not clear is how Google fits into these numbers, since last I heard it was assembling its own servers rather than buying finished units.)
Waters says this about Rashid's figure: "That is an amazing statistic, and certainly not one I’d heard before. And this is before cloud computing has really caught on in a big way." What we're seeing is the first stage of a rapid centralization of data-processing power - on a scale unimaginable before. At the same time, of course, the computing power at the edges, ie, in the devices that we all use, is also growing rapidly. An iPhone would have qualified as a supercomputer a few decades ago. But because the user devices draw much of their functionality (and data) from the Net, it's the centralization trend that's the key one in reshaping computing today.
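The "more than a million and a half" figure follows directly from the two numbers quoted above. A minimal sketch of that arithmetic, assuming the rounded inputs from the post (about 8 million servers a year, about 20 percent bought by the big internet companies); the function name is only illustrative:

```python
# Figures quoted in the post above; both are rounded estimates.
WORLDWIDE_SERVERS_PER_YEAR = 8_000_000  # "around 8 million units a year"
CLOUD_SHARE = 0.20                      # "about 20 percent"


def cloud_server_purchases(total_units: int, share: float) -> int:
    """Return the estimated number of servers bought by the cloud giants."""
    return round(total_units * share)


units = cloud_server_purchases(WORLDWIDE_SERVERS_PER_YEAR, CLOUD_SHARE)
print(f"{units:,} servers per year")
```

With these inputs the estimate comes to 1.6 million servers a year, which is where "more than a million and a half" comes from.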
I’ve taken it for granted that Microsoft, Google, Amazon, and Yahoo are the big server buyers, and it was already known that these guys are the ones setting the server standards now.
Rick Rashid mentions a shift.
Rashid also pointed out, according to Waters, that "every time there’s a transition to a new computer architecture, there’s a tendency simply to assume that existing applications will be carried over (ie, word processors in the cloud)."
But there are other shifts in what the megacomputer is. This week I got a chance to see Jason Banfelder and Vanessa Borcherding from the Cornell Medical Biomedicine department, and we talked about what they were planning to purchase next for their HPC lab.
Jason had tried the Nvidia Tesla and found a 40x performance gain over the x86-based solution.
The world’s first teraflop many-core processor
NVIDIA® Tesla™ GPU computing solutions enable the necessary transition to energy efficient parallel computing power. With 240 cores per processor and based on the revolutionary NVIDIA® CUDA™ parallel computing architecture, Tesla scales to solve the world’s most important computing challenges—more quickly and accurately.
Another interesting choice is the Intel Atom-based SGI concept server.
SGI was showing off a new supercomputer project at SC08 last week. It's codenamed Project Molecule, and the special thing about it is that it uses the Intel Atom N330 processor. The aim of Project Molecule is to create a supercomputer with ultra high-density, low power consumption and low cost using the ultimate commodity processor that can be easily programmed. SGI says a 3U-high rack can house more than 90 blocks with two dual-core Intel Atom 1.6GHz processors in each block, good for a total of 360 cores. The total power consumption of this system is below 2kW. More info at Tech-On!
The Register has some more info about the performance of Project Molecule:
The concept machine at the SC08 show was a 3U rack that contained 180 of the Atom boards, for a total of 360 cores. These boards would present 720 virtual threads to a clustered application, and have 360 GB of main memory (using 2 GB DDR2 DIMMs mounted on the board) and a total of 720 GB/sec of memory bandwidth. The important thing to realize, explained Brown, is that if the interconnect was architected correctly, the entire memory inside the chassis could be searched in one second. That memory bandwidth, Brown explained, was up to 15 TB/sec per rack, or about 20 times that of a single-rack cluster these days. This setup would be good for applications where cache memory or out-of-order execution don't help, but massive amounts of threads do help. (Search, computational fluid dynamics, seismic processing, stochastic modeling, and others were mentioned).
The other advantages that the Molecule system might have are low energy use and low cost. The aggregate memory bandwidth in a rack of these machines (that's 10,080 cores with 9.8 TB of memory) would deliver about 7 times the GB per second per watt of a rack of x64 servers in a cluster today, according to Brown. On applications where threads rule, the Molecule would do about 7 times the performance per watt of x64 servers, and on SPEC-style floating point tests, it might even deliver twice the performance per watt. On average, SGI is saying performance per watt should be around 3.5 times that of a rack of x64 servers.
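The rack-level numbers in these excerpts can be cross-checked against the per-chassis ones. The sketch below is only back-of-the-envelope arithmetic on the quoted figures. It assumes a full rack is made up of identical 360-core units and that 1 TB is 1024 GB; neither assumption comes from SGI, and the function name is hypothetical.

```python
# Figures quoted from The Register's description of Project Molecule.
CORES_PER_CHASSIS = 360      # 3U concept machine: 180 boards, 360 cores
THREADS_PER_CHASSIS = 720    # virtual threads presented by one chassis
CORES_PER_RACK = 10_080      # full rack, per the second excerpt
MEMORY_PER_RACK_TB = 9.8     # full rack, per the second excerpt
GB_PER_TB = 1024             # assumption: binary terabytes


def implied_chassis_per_rack(cores_per_rack: int, cores_per_chassis: int) -> int:
    """Number of 360-core units implied by the rack core count."""
    chassis, remainder = divmod(cores_per_rack, cores_per_chassis)
    if remainder:
        raise ValueError("rack core count is not a whole number of chassis")
    return chassis


chassis = implied_chassis_per_rack(CORES_PER_RACK, CORES_PER_CHASSIS)
threads_per_rack = chassis * THREADS_PER_CHASSIS
memory_per_core_gb = MEMORY_PER_RACK_TB * GB_PER_TB / CORES_PER_RACK

print(f"implied chassis per rack: {chassis}")
print(f"threads per rack:         {threads_per_rack:,}")
print(f"memory per core:          {memory_per_core_gb:.2f} GB")
```

On these assumptions the rack figures work out to roughly 1 GB of memory per core, which is a useful sanity check when comparing the Atom design against a conventional x64 cluster node.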
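And GigaOm today covers another processor design problem: keeping the many cores of a multicore chip supplied with memory and able to communicate with one another.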
Daniel Reed, Microsoft’s scalable and multicore computing strategist, calls this a hidden problem that’s just as big as the challenges of developing code that optimizes multicore chips.
Chip firms are aware of the issue. Intel’s Nehalem processor for servers adds more memory on chip for the multiple cores and tries to improve communications on the chip. Firms such as Texas Instruments have tweaked the designs of their ARM-based chips for cell phones to address the issue as well. Freescale has created a “fabric” inside some of its multicore embedded chips so they can share information more efficiently across a variety of cores.
But it’s possible that a straight redesign on the processor side is what’s needed. SiCortex, which makes a specially designed chip for the high-performance computing market, questions whether merely adding more memory, as Intel is doing, is the way to solve the issue. Its solution is closer to creating a communication fabric inside the device that scales with the number of cores that are added.
Combine all these ideas, and it is possible for Google, Amazon, Yahoo, or Microsoft to completely change computing by making the transition to a new platform.
Is the megacomputer going to be driven by one of these companies?