Nvidia Trounces Google And Huawei In AI Benchmarks, Startups Nowhere To Be Found – Forbes

Nvidia uses the new Ampere A100 GPU and the Selene Supercomputer to break MLPerf performance records

Artificial Intelligence (AI) training performance numbers based on the new MLPerf 0.7 benchmark suite were released today (7/29/20) and, once again, Nvidia takes the performance crown. Eight companies submitted numbers for systems based on both AMD and Intel CPUs and using a variety of AI accelerators from Google, Huawei, and Nvidia. The increase in peak performance for each MLPerf benchmark by the leading platform was 2.5x or more. The new version also added tests for emerging AI workloads.

As a brief background, MLPerf is an organization established to develop benchmarks for effectively and consistently testing systems running a wide range of AI workloads, including training and inference processing. The organization has gained wide industry support from semiconductor and IP companies, tools vendors, systems vendors, and the research and academic communities. MLPerf first launched in 2018, and updates and new training benchmark results have been announced about once a year, even though the goal is once a quarter.

The benefit of the MLPerf benchmark is not only seeing the advancements by each vendor, but the overall advancements of the industry, especially as new workloads are added. For the latest training version 0.7, new workloads were added for Natural Language Processing (NLP) using Bidirectional Encoder Representations from Transformers (BERT), recommendation systems using the Deep Learning Recommendation Model (DLRM), and reinforcement learning using Minigo. Note that using Minigo for reinforcement learning may also serve as a baseline for AI gaming applications. The benchmark results are reported as either commercially available (on-premise or in the cloud), preview (products coming to market in the next six months), or research and development (systems still in earlier development). The most important near-term results are those that are commercially available or in preview. There is also an open division, but it had no material impact on the overall results.

The companies and institutions that submitted results included Alibaba, Dell EMC, Fujitsu, Google, Inspur, Intel, Nvidia, and the Shenzhen Institute of Advanced Technology. The largest number of submissions came from Nvidia, which is not surprising given that the company recently built its own supercomputer (ranked #7 on the TOP500 supercomputer list and #2 on the Green500 list), based on its latest Ampere A100 GPUs. This system, called Selene, gives the company considerable flexibility in testing different workloads and system configurations. In the MLPerf test results, the number of GPU accelerators ranged from two to 2,048 in the commercially available category and up to 4,096 in the research and development category.

All of the systems were based on AMD and Intel CPUs paired with one of the following accelerators: the Google TPU v3, the Google TPU v4, the Huawei Ascend 910, the Nvidia Tesla V100 (in various configurations), or the Nvidia Ampere A100. Noticeably absent were chip startups like Cerebras, Esperanto, Groq, Graphcore, Habana (an Intel company), and SambaNova. This is especially surprising because all of these companies are listed as contributors or supporters of MLPerf. There is a long list of other AI chip startups that are also not represented. Intel submitted performance numbers, but only in the preview category for its upcoming Xeon Platinum processors, not for its recently acquired Habana AI accelerators. With only Intel submitting processor-only numbers, there is nothing to compare them to, and the performance is well below that of the systems using accelerators. It is also worth noting that Google and Nvidia were the only companies that submitted performance numbers for all the different benchmark categories, but Google only submitted complete benchmark numbers for the TPU v4, which is in the preview category.

Each benchmark is ranked in terms of execution time. Because of the high number of system configurations, the best way to compare the results is to normalize the execution time to each AI accelerator by dividing the execution time by the number of accelerators. This is not perfect because per-accelerator performance does not scale uniformly with the number of accelerators, and some workloads appear to have performance optimized around certain system configurations, but the results appear relatively consistent even when comparing systems with similar numbers of accelerators. The clear winner was Nvidia. Nvidia-based systems dominated all eight benchmarks for commercially available solutions. If considering all categories, including preview, the Google TPU v4 had the fastest per-accelerator execution time for recommendation.
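To make that normalization concrete, here is a minimal Python sketch of the calculation described above. The system names and figures are hypothetical placeholders, not actual MLPerf 0.7 submissions.

```python
# Minimal sketch of the per-accelerator normalization described above.
# The submissions below are hypothetical placeholders, not actual MLPerf 0.7 results.

# (system name, benchmark execution time in minutes, number of accelerators)
submissions = [
    ("System A", 1.2, 1792),   # hypothetical large-scale entry
    ("System B", 30.0, 64),    # hypothetical mid-size entry
    ("System C", 190.0, 8),    # hypothetical single-node entry
]

def normalized_time(minutes: float, accelerators: int) -> float:
    """Divide the execution time by the accelerator count to get a rough
    per-accelerator figure for comparing differently sized systems."""
    return minutes / accelerators

# Rank the hypothetical systems by normalized (per-accelerator) time, lowest first.
for name, minutes, accels in sorted(submissions, key=lambda s: normalized_time(s[1], s[2])):
    print(f"{name}: {normalized_time(minutes, accels):.4f} minutes per accelerator")
```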

[Chart: The platforms with the top performance results for each MLPerf 0.7 benchmark]

Overall, the top benchmark results improved by 2.5x to 3.3x over the version 0.6 benchmark categories, which include image classification, object detection, and translation. Interestingly, Nvidia's previous-generation GPU, the Tesla V100, scored best in three categories: non-recurrent translation, recommendation, and reinforcement learning, the latter two being new MLPerf categories. This is not completely surprising because Ampere had significant architectural changes that will also improve performance in inference processing. It will be interesting to see how the Ampere A100 systems score in the next generation of inference benchmarks that should be released later this year. Another development to note is the emergence of AMD Epyc processors in the top performance benchmarks because of their presence in the new Nvidia DGX A100 systems and DGX SuperPODs with Nvidia's new Ampere A100 accelerators.

[Table: Summary of the top MLPerf benchmark results and the performance improvements from version 0.6 to version 0.7]

Nvidia continues to lead the pack, not just because of its lead in GPUs, but also because of its leadership in complete systems, software, libraries, trained models, and other tools for AI developers. Yet every other company offering AI chips and solutions offers comparisons to Nvidia without the supporting benchmark numbers. MLPerf is not perfect. The results should be published more than once a year and should include an efficiency ranking (performance/watt) for the system configurations, two points the organization is working to address. However, MLPerf was developed as an industry collaboration and represents the best method of evaluating AI platforms. It is time for everyone else to submit MLPerf numbers to support their claims.
