Beyond FLOPS: The co-evolving world of computer benchmarking

Benchmarks have been evolving along with the hardware they measure, and both are getting more complex.

It used to be simple: Multiply the microprocessor's clock rate by four, and you could express a computer's computational power in megaFLOPS (millions of floating-point operations per second) or gigaFLOPS (billions of FLOPS).

No more. Today the talk is of teraFLOPS (trillions) and petaFLOPS (quadrillions) -- which brings up an important question: How do you benchmark these much-more-powerful systems?
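
For reference, the old rule of thumb and the unit prefixes mentioned above can be sketched in a few lines of Python. This is only an illustration: the multiplier of four comes from the rule as stated here, reading it as "floating-point operations per clock cycle" is an assumption, and the function names and sample clock rates are hypothetical.

```python
# Rule of thumb from the text: peak FLOPS ~= clock rate x 4.
# Interpreting the 4 as "floating-point operations per cycle" is an
# assumption made for this sketch, not something the article states.
FLOPS_PER_CYCLE = 4

# Unit prefixes named in the article, largest first.
PREFIXES = [
    (1e15, "petaFLOPS"),  # quadrillions
    (1e12, "teraFLOPS"),  # trillions
    (1e9, "gigaFLOPS"),   # billions
    (1e6, "megaFLOPS"),   # millions
]


def peak_flops(clock_hz, flops_per_cycle=FLOPS_PER_CYCLE):
    """Estimate peak FLOPS from a clock rate given in hertz."""
    return clock_hz * flops_per_cycle


def format_flops(flops):
    """Render a FLOPS figure using the largest prefix that fits."""
    for scale, name in PREFIXES:
        if flops >= scale:
            return f"{flops / scale:g} {name}"
    return f"{flops:g} FLOPS"


# Hypothetical clock rates, chosen only to exercise the arithmetic.
print(format_flops(peak_flops(100e6)))  # 100 MHz -> 400 megaFLOPS
print(format_flops(peak_flops(3e9)))    # 3 GHz   -> 12 gigaFLOPS
```

A single multiplier like this ignores core counts, co-processors and power limits, which is why the one-line estimate no longer holds.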

"The majority of modern processors are systems on a chip and that has completely muddied the water," says Gabe Gravning, director of product marketing at AMD. An x86 microprocessor may actually include multiple processor cores, multiple graphics co-processors, a video encoder and decoder, an audio co-processor and an ARM-based security co-processor, he explains.

"For the longest time we built single-core processors and pushed the frequency as hard as possible, as frequency was the clearest correlation to performance," agrees Rory McInerney, vice president of Intel's Platform Engineering Group and director of its Server Development Group. "Then came dual cores, and multiple cores, and suddenly 18 cores, and power consumption became more of a problem, and benchmarks had to catch up."

But at the same time, benchmarks are integral to the systems-design processes, McInerney explains. When a new chip is considered, a buyer will "provide snippets of applications that best model performance in their environment -- they may have a certain transaction or algorithm they want optimized," he says.

"From there we need a predictive way to say that if we take option A we will improve B by X percent," McInerney says. "For that we develop synthetic or internal benchmarks, 30 to 50 of them. These benchmarks tend to stay with the same CPU over the life of the product. Then we see how the [internal] benchmarks correlate to standard [third-party] benchmarks that we can quote."
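
The two calculations McInerney describes, a percentage improvement and a check of how internal scores track third-party ones, can be sketched as follows. This is not Intel's method: the Pearson correlation is one common choice swapped in for illustration, the function names are hypothetical, and the scores are placeholders rather than real measurements.

```python
from math import sqrt


def percent_improvement(baseline, candidate):
    """Percent change of a candidate score relative to a baseline score."""
    return (candidate - baseline) / baseline * 100.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Placeholder scores for four hypothetical design options.
# These are illustrative values only, not benchmark results.
internal_scores = [100.0, 110.0, 125.0, 140.0]
third_party_scores = [50.0, 54.0, 63.0, 69.0]

print(f"improvement: {percent_improvement(100.0, 110.0):.1f}%")
print(f"correlation: {pearson(internal_scores, third_party_scores):.3f}")
```

A coefficient near 1 would suggest the internal benchmark is a reasonable stand-in for the quotable third-party one; a low value would suggest the two are measuring different things.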

Gravning adds, "There is no perfect benchmark that will measure everything, so we rely on a suite of benchmarks," including both internal and third-party benchmarks; this part of the process hasn't really changed over the years.

As for the nature of those benchmarks, "The internal ones are proprietary, and we don't let them out," McInerney notes. "But for marketing we also need ones that can be replicated by a third party. If you look bad on an external benchmark all the internal ones in the world won't make you look good. Third-party benchmarks are vital to the industry, and are vital to us."

As third-party benchmarks for desktop and consumer devices, sources regularly mention PCMark and 3DMark, both from Futuremark Corp. in Finland. The first is touted for assessing Windows-based desktops, and the second for benchmarking game performance on Windows, Android, iOS and Windows RT devices.
