Recently, Wenxin Yige, Baidu's first “AI painting” product, was officially launched, setting off a wave of “you describe it, I paint it”: users need only enter a piece of text to generate all kinds of magnificent, gorgeous pictures with one click.
Beyond “one-click poetry” and “one-click painting,” the AI craze is sweeping everything from AlphaGo to L4-level autonomous-driving training, and even molecular dynamics simulations for COVID-19 drugs and analyses of viral mechanisms; none of it can do without AI technology.
However, U.S. chip giant Nvidia announced on August 31 that the U.S. government had required it to restrict exports to China of its latest two generations of flagship GPU compute chips, the A100 and H100, which are used to accelerate AI training tasks. The data-center GPUs of another chip giant, AMD, the MI100 and MI200, are likewise restricted.
So, what exactly are AI acceleration chips and GPUs, and why are they so important?
Nvidia drives GPU development
GPU stands for graphics processing unit, a term NVIDIA coined when it released the GeForce 256 in 1999 and one that remains in use today. As a counterpart to the CPU, the “central processing unit,” the GPU's forerunner, the “graphics accelerator card,” can be traced back to game consoles such as the Atari 2600, created to make up for the performance bottleneck the CPU hit in graphics processing.
The CPU's design follows the von Neumann architecture: data is processed through the steps of fetching from memory, decoding, executing, and writing back. It is oriented toward low latency and optimized for serial processing. As a result, a CPU has relatively few cores, with most of its transistors spent on control logic and cache and only a small share doing the actual arithmetic, which limits its performance in massively parallel computing.
In 3D graphics, the same coordinate transformation often has to be applied to every vertex of a model, or each vertex's color computed under the same lighting model. Each such operation is simple, but the number of them is enormous, leaving early single-core CPUs groaning under the load. Thus was born the GPU, a design concept optimized for graphics computation.
Unlike the CPU, which strives to minimize latency, the GPU is oriented toward data throughput: it consists of thousands of smaller, simpler cores built to handle parallel tasks. Figuratively speaking, a CPU core is like a restaurant's head chef, responsible for handling and coordinating all kinds of complex tasks, while GPU cores are like an army of kitchen hands, grinding through masses of simple tasks by sheer numbers. Each plays its role, and together they form the foundation of today's high-performance computers.
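To make the “army of small cores” idea concrete, here is a minimal CUDA kernel sketch (an illustrative fragment of my own, not NVIDIA code; names such as transformVertices are hypothetical): each GPU thread applies the same 4×4 transform to one vertex, so thousands of vertices are processed at once rather than one after another in a CPU loop.

```cuda
// Sketch only: the per-vertex transform described above, one GPU thread per vertex.
// "m" is assumed to be a 4x4 row-major matrix stored as 16 floats.
__global__ void transformVertices(const float4 *in, float4 *out,
                                  const float *m, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i >= n) return;                             // guard threads past the end

    float4 v = in[i];
    out[i] = make_float4(
        m[0]  * v.x + m[1]  * v.y + m[2]  * v.z + m[3]  * v.w,
        m[4]  * v.x + m[5]  * v.y + m[6]  * v.z + m[7]  * v.w,
        m[8]  * v.x + m[9]  * v.y + m[10] * v.z + m[11] * v.w,
        m[12] * v.x + m[13] * v.y + m[14] * v.z + m[15] * v.w);
}
// A CPU would walk the same vertices in one serial loop; the GPU instead launches
// thousands of threads, e.g. transformVertices<<<(n + 255) / 256, 256>>>(...).
```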
The history of GPU development is basically the history of Nvidia.
In 1993, Huang Renxun (Jensen Huang), whose ancestral home is in Zhejiang, China, and who was born in Taiwan, had just entered his thirties. While studying at Stanford University, he had promised the girl he was pursuing that he would start his own company by the age of 30. The girlfriend later became his wife, and in August 1993 Huang co-founded Nvidia with two partners and took the post of CEO.
AMD's booth at ChinaJoy 2021
What he could not have known then was that this company's market value would surpass TSMC and Samsung in early 2022 to become the world's most valuable semiconductor company, worth roughly four Intels or Qualcomms. As one of the few chip companies to start from nothing, Nvidia's beginning was anything but easy. Huang recalled in a speech: “When I founded the company, I clearly remember that I had only $200 in my pocket and there were already 250 competitors in the market.”
Hitting a wall, then embracing the market mainstream
At the time, ATI (later acquired by AMD), the big brother of the graphics field, had already been established for years, and rising stars such as 3dfx were emerging one after another. Many venture capital firms judged that the graphics market was essentially saturated and that the prospects for founding yet another similar company were dim.
Nevertheless, the fledgling NVIDIA toiled for two years and launched its first display chip, the NV1, aimed at game consoles. The NV1 packed in the largest and most complete gaming solution of its day, supporting 2D and 3D processing at the same time and even integrating audio processing, bringing a one-stop, “nanny-level” solution to a game console market that was heading into fierce competition. In theory, it should have become the heart of some legendary console and shone.
World Artificial Intelligence Conference 2022
However, in order to achieve smoother 3D effects with less computation, the NV1 adopted a quadrilateral (quad-based) rendering architecture. Unfortunately, in the very year the NV1 was released, Microsoft launched Direct3D, the predecessor of the DirectX graphics API standard still in use today, which, like the earlier OpenGL (an application programming interface for rendering 2D and 3D vector graphics), was built around triangle rendering. That meant the NV1 was completely incompatible with the industry's common standards, and sales were lukewarm.
The NV1, “applauded but not bought,” got Nvidia's debut off to a poor start, and the cash on its books was only enough to keep the company running for 30 days. “Remember, the company is only 30 days away from going out of business” became Huang Renxun's mantra for keeping employees from slacking off.
Just as the company was at its most desperate, the Japanese game company Sega fortunately took a liking to NVIDIA's technical strength: it purchased the NV1 chip for its Saturn console and then commissioned NVIDIA to develop the display chip for its next-generation console, the Dreamcast (DC).
Although this collaboration, too, broke down over Nvidia's insistence on quad-based rendering, ultimately killing the NV2 chip, the deep-pocketed Sega did not claw back the US$7 million in development funds, which gave Nvidia a crucial lifeline. The first two failures also made NVIDIA realize that a successful hardware product must first conform to the mainstream technical standards and specifications of the market.
Next, in 1997 NVIDIA launched the RIVA 128 graphics card based on the NV3 chip, which used triangle rendering and supported mainstream application programming interfaces (APIs) such as DirectX and OpenGL, winning the market's favor with its outstanding value for money. The subsequent TNT and TNT2, with low prices and up-to-date API support, beat the Voodoo series from 3dfx, the display giant of the day.
Finally, in 1999, NVIDIA released the world's first GPU under the name GeForce 256, shifting the 3D computation that had previously depended on the CPU entirely onto the graphics card. In the years that followed it raced neck and neck with ATI, until ATI was acquired by AMD in 2006.
Although AMD has continued ATI's Radeon graphics card line ever since, Nvidia had by then already secured the top spot in the graphics card field.
“Leather Swordsman” Huang Renxun
In 2006, NVIDIA released its general-purpose parallel computing architecture, CUDA, which lets the GPU be programmed for general computation in addition to rendering 3D models.
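To illustrate what general computing on the GPU looks like under CUDA, here is a minimal, self-contained sketch (the classic SAXPY teaching example, not NVIDIA sample code): the host allocates memory visible to both CPU and GPU, launches a kernel across roughly a million threads, and reads the result back.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// y = a*x + y, computed by one GPU thread per element.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main()
{
    const int n = 1 << 20;                       // ~1 million elements
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));    // unified memory: visible to CPU and GPU
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, x, y);  // launch n threads in blocks of 256
    cudaDeviceSynchronize();                          // wait for the GPU to finish

    printf("y[0] = %f (expected 5.0)\n", y[0]);
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```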
The GeForce 8800 GTX, released the same year, is one of NVIDIA's most classic graphics cards. It not only introduced the concept of the stream processor, still in use today, but also adopted the Tesla architecture, whose name became the prefix of the first-generation AI accelerator card, the C870, released that same year, and stayed in use for many years afterward.
Since then, NVIDIA has gone further and further with general-purpose computing and the CUDA software platform. It stood out in the AI boom that began around 2013: with model training several times faster than similarly priced CPUs and excellent software support, a chip that had originally served only gaming and modeling became unrivaled on the artificial-intelligence track.
Looking across NVIDIA's product lines, a clear business logic emerges: the GeForce series focuses on gaming and personal consumption, the Quadro series on 3D modeling and rendering, and the protagonist of this export restriction, the Tesla series, on AI acceleration.
Interestingly, because the name “collided” with the famous electric-vehicle brand Tesla and easily caused confusion, Nvidia dropped the Tesla prefix from its accelerator cards in 2020; subsequent products are named simply with an abbreviation of the architecture name plus a number, such as the A100 on the Ampere architecture and the H100 on the Hopper architecture.
The biggest difference between the GeForce and Quadro series lies in the drivers they ship with: GeForce drivers focus on optimizing game performance, while Quadro drivers focus on optimizing professional graphics-design and rendering software. Their hardware specifications do not differ much; the gap is mostly at the software level. The Tesla series is another matter.
First of all, the floating-point calculations a GPU performs are handled by different types of cores, chiefly FP32 single-precision and FP64 double-precision compute units. There are also FP16 “half precision” and the recently introduced FP8 format, which further relax the precision required for AI computation in order to improve efficiency and reduce energy consumption.
The world's first GPU, the GeForce 256
Even so, high-precision FP64 remains a computational requirement that cannot be ignored in much scientific research. In fields such as the defense industry, atmospheric modeling, and analysis of virus structures, where results must be highly accurate, a discrepancy of one or two significant digits can send a calculation wildly off course. For these professional computing needs, NVIDIA equips its Tesla series chips with large numbers of FP64 units.
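A tiny host-side example (ordinary C++, compilable with nvcc or any C++ compiler, independent of any particular GPU) makes the significant-digit point concrete: FP32 carries roughly 7 decimal digits, so adding 1 to 100,000,000 is simply lost, while FP64, with about 15-16 digits, keeps it.

```cuda
#include <cstdio>

int main()
{
    float  f = 100000000.0f + 1.0f;   // 1e8 + 1: the "+1" falls below FP32's resolution
    double d = 100000000.0  + 1.0;    // FP64 represents the result exactly

    printf("FP32: %.1f\n", f);        // prints 100000000.0 -- the 1 is lost
    printf("FP64: %.1f\n", d);        // prints 100000001.0
    return 0;
}
```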
In the A100's GA100 core, the ratio of FP64 to FP32 units is 1:2, while on the GA102 core of today's consumer flagship, the 3090 Ti, that figure is just 1:64; the difference between a chip built for game rendering and one built for compute tasks is clear at a glance.
This is why the 3090 Ti reaches about 45 TFLOPS of FP32 compute but less than 0.7 TFLOPS of FP64, while the A100's FP32 performance is only 19.5 TFLOPS yet its FP64 performance reaches a formidable 10 TFLOPS or so, roughly 14.3 times that of the 3090 Ti (the figures follow from the unit ratios above: 45 ÷ 64 ≈ 0.7, and 19.5 ÷ 2 ≈ 10). The FP64 performance of the H100, NVIDIA's recently released next-generation product, can even reach as much as 30 TFLOPS.
Such precise “knife work” in segmenting products is not uncommon across NVIDIA's lineup, and because Huang Renxun wears a black leather jacket at every launch event, gamers have also crowned him the “Leather Swordsman.”
Market value down by nearly 60%
“Any future NVIDIA product with peak performance and chip I/O performance equal to or greater than the A100, and any system containing these circuits, will be subject to the new licensing requirements,” Nvidia said in its Aug. 26 SEC filing.
Huawei MDC810 autonomous driving chip
NVIDIA Tesla series chips
Although Nvidia issued a statement two days later saying it could continue to fulfill A100 and H100 orders before September of next year, the news still sent its share price down 22% over five trading days. Combined with earlier declines, its market value has now fallen nearly 60% from last year's peak of about US$830 billion.
As the saying goes, even a starved camel is bigger than a horse. In recent years, the development of domestic Chinese GPUs has repeatedly been put on the agenda; the “Fenghua” series GPUs released by Innosilicon, for example, caused quite a stir in the market, but a considerable gap remains between them and the world-leading level that NVIDIA represents.
In the field of autonomous-driving chips the gap is smaller: Huawei's MDC810 and the upcoming Journey 6 chip from Horizon Robotics are closer to Nvidia's offerings. Even so, Atlan, the successor to NVIDIA's Orin intelligent-driving chip, is bearing down on the market with 1,000 TOPS of INT8 compute (1 TOPS means the processor can perform one trillion operations per second).
Although Nvidia's market value has fallen to around US$350 billion after several rounds of declines this year, its price-to-earnings ratio still stands as high as 46, a sign of the company's potential and of the market's strong confidence in the future of the AI industry. Where will Huang Renxun, who turns 60 next year, steer this giant ship? Let us wait and see.