Editor's Note: Users often ask, "What separates HPC from AI? They both do a lot of number crunching." While this statement is true, one big difference is the precision required for a valid answer. HPC often requires the highest possible precision (i.e., 64-bit double precision floating point), while many AI applications work well with 8-bit integers or floating point numbers. Using less precision often allows faster CPU/GPU arithmetic and a good enough result for many AI applications. The following article explains the trend toward lower precision computing in AI.
A grand competition of numerical representation is shaping up as some companies promote floating point data types in deep learning, while others champion integer data types.
Artificial intelligence (AI) is proliferating into every corner of our lives. The demand for products and services powered by AI algorithms has skyrocketed alongside the popularity of large language models (LLMs) like ChatGPT, and image generation models like Stable Diffusion. With this increase in popularity, however, comes an increase in scrutiny over the computational and environmental costs of AI, and particularly the subfield of deep learning.
The primary factors influencing the costs of deep learning are the size and structure of the deep learning model, the processor it is running on, and the numerical representation of the data. State-of-the-art models have been growing in size for years now, with compute requirements doubling every 6-10 months [1] for the last decade. Processor compute power has increased as well, but not nearly fast enough to keep up with the growing costs of the latest AI models. This has led researchers to delve deeper into numerical representation in attempts to reduce the cost of AI. Choosing the right numerical representation, or data type, has significant implications for the power consumption, accuracy, and throughput of a given model. There is, however, no single answer to which data type is best for AI. Data type requirements vary between the two distinct phases of deep learning: the initial training phase and the subsequent inference phase.
When it comes to increasing AI efficiency, the method of first resort is quantization of the data type. Quantization reduces the number of bits used to represent the weights, and often the activations, of a network. Reducing the number of bits not only makes the model smaller, but also reduces the total computation time, and thus the power required to do the computations. This is an essential technique for those pursuing efficient AI.
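To make the size reduction concrete, the following minimal sketch computes the storage needed for a model's weights at several bit widths. The parameter count (7 billion) is a hypothetical figure chosen for illustration, not one taken from this article, and the calculation ignores activations, scale factors, and other overheads.

```python
# Storage needed for model weights at different bit widths.
# NUM_PARAMS is a hypothetical model size chosen for illustration.
NUM_PARAMS = 7_000_000_000

BITS_PER_WEIGHT = {"FP32": 32, "FP16": 16, "BF16": 16, "FP8": 8, "INT8": 8}


def weight_storage_gb(num_params: int, bits: int) -> float:
    """Return the weight storage in gigabytes (10**9 bytes)."""
    return num_params * bits / 8 / 1e9


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: {weight_storage_gb(NUM_PARAMS, bits):.1f} GB")
```

Under these assumptions, moving from FP32 to an 8-bit type cuts weight storage from 28 GB to 7 GB.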
AI models are typically trained using single precision 32-bit floating point (FP32) data types. It was found, however, that all 32 bits aren't always needed to maintain accuracy. Attempts at training models using half precision 16-bit floating point (FP16) data types showed early success, and the race to find the minimum number of bits that maintains accuracy was on. Google introduced its 16-bit brain floating point format (BF16), and models being primed for inference were often quantized to 8-bit floating point (FP8) and integer (INT8) data types.
There are two primary approaches to quantizing a neural network: Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT). Both methods aim to reduce the numerical precision of the model to improve computational efficiency, memory footprint, and energy consumption, but they differ in how and when the quantization is applied, and in the resulting accuracy.
Post-Training Quantization (PTQ) occurs after training a model with higher-precision representations (e.g., FP32 or FP16). It converts the model's weights and activations to lower-precision formats (e.g., FP8 or INT8). Although simple to implement, PTQ can result in significant accuracy loss, particularly in low-precision formats, as the model isn't trained to handle quantization errors.
Quantization-Aware Training (QAT) incorporates quantization during training, allowing the model to adapt to reduced numerical precision. Forward and backward passes simulate quantized operations, computing gradients with respect to quantized weights and activations. Although QAT generally yields better model accuracy than PTQ, it requires modifications to the training process and can be more complex to implement.
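As a rough illustration of the PTQ step described above, the sketch below applies symmetric per-tensor INT8 quantization to a small list of weights. This is one common scheme, not necessarily the one used by any vendor named in this article; the function names and the sample weights are hypothetical. The quantize-then-dequantize round trip is also the kind of operation that QAT simulates in the forward pass so the model can learn to tolerate the rounding error.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map [-max|w|, +max|w|] onto [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the INT8 range used here.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q: List[int], scale: float) -> List[float]:
    """Map INT8 codes back to approximate real values."""
    return [v * scale for v in q]


# Hypothetical FP32 weights from an already-trained layer.
weights = [0.02, -0.75, 0.31, 1.20, -0.004]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

for w, c, r in zip(weights, codes, restored):
    print(f"{w:+.4f} -> code {c:+4d} -> {r:+.4f} (error {abs(w - r):.4f})")
```

Note that the smallest weight, -0.004, rounds to code 0: a single large value sets the scale for the whole tensor, which is one way PTQ can lose accuracy.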
The AI industry has begun coalescing around two preferred candidates for quantized data types: INT8 and FP8. Every hardware vendor seems to have taken a side. In mid-2022, a paper by Graphcore and AMD [2] floated the idea of an IEEE standard FP8 data type. A joint paper with a similar proposal from Intel, Nvidia, and Arm [3] followed shortly after. Other AI hardware vendors like Qualcomm [4, 5] and Untether AI [6] also wrote papers promoting FP8 and reviewing its merits versus INT8. But the debate is far from settled. While there is no single answer for which data type is best for AI in general, there are superior and inferior data types when it comes to various AI processors and model architectures with specific performance and accuracy requirements.
Floating point and integer data types are two ways to represent and store numerical values in computer memory. There are a few key differences between the two formats that translate to advantages and disadvantages for various neural networks in training and inference.
The differences all stem from their representation. Floating point data types are used to represent real numbers, which include both integers and fractions. These numbers are stored in a form of scientific notation, with a significand (mantissa) and an exponent.
On the other hand, integer data types are used to represent whole numbers (without fractions). The two representations result in a very large difference in precision and dynamic range. Floating point numbers have a wider dynamic range than their integer counterparts. Integer numbers have a smaller range and can only represent whole numbers with a fixed level of precision.
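The trade-off between range and precision can be observed directly with Python's standard struct module, which can round a value to IEEE 754 half precision (format code 'e'). The sketch below compares the gap between neighboring FP16 values at different magnitudes with the constant gap of 1 between integers. The helper name to_fp16 is an illustrative choice.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Near 1.0, FP16 values are 2**-10 apart, so small fractions survive.
print(to_fp16(1.0 + 2**-10))   # representable exactly
print(to_fp16(1.0 + 2**-12))   # too fine: rounds back to 1.0

# Near 4096, FP16 values are 4 apart, so even some whole numbers are lost.
print(to_fp16(4097.0))         # rounds to 4096.0

# A 16-bit signed integer keeps a constant step of 1 but stops at 32767,
# while FP16 reaches 65504.
INT16_MAX = 2**15 - 1
FP16_MAX = to_fp16(65504.0)
print(INT16_MAX, FP16_MAX)
```

With the same 16 bits, the floating point format covers a wider range, but its step size grows with magnitude, whereas the integer format keeps a fixed step over a narrower range.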
In deep learning, the numerical representation requirements differ between the training and inference phases due to the unique computational demands and priorities of each stage. During the training phase, the primary focus is on updating the model's parameters through iterative optimization, which typically necessitates higher dynamic range to ensure the accurate propagation of gradients and the convergence of the learning process. Consequently, floating-point representations, such as FP32, FP16, and lately even FP8, are typically employed during training to maintain sufficient dynamic range. On the other hand, the inference phase is concerned with the efficient evaluation of the trained model on new input data, where the priority shifts towards minimizing computational complexity, memory footprint, and energy consumption. In this context, lower-precision numerical representations, such as 8-bit integer (INT8), become an option in addition to FP8. The ultimate decision depends on the specific model and underlying hardware.
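One reason training favors dynamic range is that gradients can be very small. The sketch below, which assumes a hypothetical gradient value of 1e-8, rounds that value to FP32 and to FP16 with the standard struct module. It survives in FP32 but underflows to zero in FP16, whose smallest positive (subnormal) value is 2**-24, roughly 6e-8. The helper name round_trip is an illustrative choice.

```python
import struct


def round_trip(x: float, fmt: str) -> float:
    """Round x to the format given by a struct code: 'f' = FP32, 'e' = FP16."""
    return struct.unpack("<" + fmt, struct.pack("<" + fmt, x))[0]


tiny_gradient = 1e-8  # hypothetical gradient magnitude

as_fp32 = round_trip(tiny_gradient, "f")
as_fp16 = round_trip(tiny_gradient, "e")

print(f"FP32: {as_fp32!r}")  # still non-zero
print(f"FP16: {as_fp16!r}")  # underflows to 0.0

# Smallest positive FP16 value (subnormal): 2**-24.
print(round_trip(2**-24, "e") == 2**-24)
```

A gradient that rounds to zero contributes nothing to the weight update, which is why a format with too little dynamic range can stall training.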
The best data type for inference will vary depending on the application and the target hardware. Real-time and mobile inference services tend to use the smaller 8-bit data types to reduce memory footprint, compute time, and energy consumption while maintaining enough accuracy.
FP8 is growing increasingly popular, as every major hardware vendor and cloud service provider has addressed its use in deep learning. There are three primary flavors of FP8, defined by the ratio of exponent bits to mantissa bits. Having more exponent bits increases the dynamic range of a data type, so FP8 E3M4, consisting of 1 sign bit, 3 exponent bits, and 4 mantissa bits, has the smallest dynamic range of the bunch. This FP8 representation sacrifices range for precision by reserving more bits for the mantissa, which increases accuracy. FP8 E4M3 has an extra exponent bit, and thus a greater range. FP8 E5M2 has the highest dynamic range of the trio, making it the preferred target for training, which requires greater dynamic range. Having a collection of FP8 representations allows for a tradeoff between dynamic range and precision, as some inference applications would benefit from the increased accuracy offered by an extra mantissa bit.
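The range and precision of the three FP8 variants follow from simple arithmetic on the bit counts. The sketch below assumes an IEEE 754-style layout (a bias of 2**(E-1) - 1, the all-ones exponent code reserved for infinity and NaN, and subnormals supported). This is an assumption made for illustration: published FP8 proposals differ in these details, and the E4M3 format in [3], for example, reclaims most of the top exponent code to extend its maximum value, so the exact limits of a given hardware format may not match the numbers computed here. The class and property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatFormat:
    name: str
    exp_bits: int
    man_bits: int

    @property
    def bias(self) -> int:
        # IEEE 754-style bias (an assumption; vendor formats may differ).
        return 2 ** (self.exp_bits - 1) - 1

    @property
    def max_normal(self) -> float:
        # Largest exponent code is reserved, so the top usable code is 2**E - 2.
        max_exp = (2 ** self.exp_bits - 2) - self.bias
        return (2 - 2.0 ** -self.man_bits) * 2.0 ** max_exp

    @property
    def min_subnormal(self) -> float:
        # Smallest positive value the format can hold.
        return 2.0 ** (1 - self.bias - self.man_bits)

    @property
    def epsilon(self) -> float:
        # Gap between 1.0 and the next larger value: a measure of precision.
        return 2.0 ** -self.man_bits


FORMATS = [
    FloatFormat("E3M4", 3, 4),
    FloatFormat("E4M3", 4, 3),
    FloatFormat("E5M2", 5, 2),
]

for f in FORMATS:
    print(f"{f.name}: max={f.max_normal:g}  min={f.min_subnormal:g}  "
          f"step at 1.0={f.epsilon:g}")
```

Under these assumptions, E3M4 tops out at 15.5 with a step of 1/16 at 1.0, while E5M2 reaches 57344 with a coarser step of 1/4.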
INT8, on the other hand, has 1 sign bit and 7 magnitude bits, with no exponent bits at all. This sacrifices much of its dynamic range for precision. Whether or not this translates into better accuracy compared to FP8 depends on the AI model in question. And whether or not it translates into better power efficiency will depend on the underlying hardware. Research from Untether AI [6] reports that FP8 outperforms INT8 in terms of accuracy and, on their hardware, in performance and efficiency as well. In contrast, Qualcomm research [5] found that the accuracy gains of FP8 are not worth the loss of efficiency compared to INT8 on their hardware. Ultimately, the choice of data type when quantizing for inference will often come down to what is best supported in hardware, as well as the model itself.
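Whether FP8 or INT8 gives lower error depends on how values are distributed. The sketch below is a toy comparison, not a reproduction of the experiments in [5] or [6]. It assumes an IEEE 754-style E4M3 grid (maximum 240, which differs from some published E4M3 variants), symmetric per-tensor scaling for both formats, and a hypothetical five-element tensor with one large outlier. All function names are illustrative.

```python
from typing import List


def e4m3_values() -> List[float]:
    """All non-negative finite values of an IEEE 754-style E4M3 format."""
    bias, man_bits = 7, 3
    values = []
    for exp_code in range(0, 15):          # code 15 reserved for inf/NaN here
        for man in range(2 ** man_bits):
            if exp_code == 0:              # subnormals (and zero)
                values.append(man / 8 * 2.0 ** (1 - bias))
            else:
                values.append((1 + man / 8) * 2.0 ** (exp_code - bias))
    return values


FP8_GRID = e4m3_values()
FP8_MAX = max(FP8_GRID)                    # 240 under the stated assumption


def fake_quant_fp8(x: float, max_abs: float) -> float:
    """Scale so max_abs maps to FP8_MAX, snap to the nearest grid value, scale back."""
    scale = max_abs / FP8_MAX
    nearest = min(FP8_GRID, key=lambda v: abs(v - abs(x) / scale))
    return (nearest if x >= 0 else -nearest) * scale


def fake_quant_int8(x: float, max_abs: float) -> float:
    """Symmetric INT8: uniform grid of 127 steps on each side of zero."""
    scale = max_abs / 127
    return max(-127, min(127, round(x / scale))) * scale


# Hypothetical tensor: mostly small values plus one large outlier.
tensor = [0.01, 0.02, 0.05, 7.3, 8.0]
max_abs = max(abs(v) for v in tensor)

for v in tensor:
    err_int8 = abs(v - fake_quant_int8(v, max_abs))
    err_fp8 = abs(v - fake_quant_fp8(v, max_abs))
    print(f"{v:6.2f}  INT8 error {err_int8:.4f}   FP8 error {err_fp8:.4f}")
```

With this tensor, the small values are reproduced more accurately by FP8, while the value close to the maximum (7.3) is reproduced more accurately by INT8, because the uniform INT8 grid is finer than the FP8 grid near the top of the range.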
References
[1] Compute Trends Across Three Eras of Machine Learning, https://arxiv.org/pdf/2202.05924.pdf
[2] 8-bit Numerical Formats for Deep Neural Networks, https://arxiv.org/abs/2206.02915
[3] FP8 Formats for Deep Learning, https://arxiv.org/abs/2209.05433
[4] FP8 Quantization: The Power of the Exponent, https://arxiv.org/pdf/2208.09225.pdf
[5] FP8 versus INT8 for Efficient Deep Learning Inference, https://arxiv.org/abs/2303.17951
[6] FP8: Efficient AI Inference Using Custom 8-bit Floating Point Data Types, https://www.untether.ai/content-request-form-fp8-whitepaper
About the Author
Waleed Atallah is a Product Manager responsible for silicon, boards, and systems at Untether AI. Currently, he is rolling out Untether AI's second generation silicon product, the speedAI family of devices. He was previously a Product Manager at Intel, where he was responsible for high-end FPGAs with high bandwidth memory. His interests span all things compute efficiency, particularly the mapping of software to new hardware architectures. He received a B.S. degree in Electrical Engineering from UCLA.
Read more:
The Great 8-bit Debate of Artificial Intelligence - HPCwire