
How can various types of computing chips thrive and grow?

Author: Semiconductor Industry Profile | Published: 2024-05-06

With the rise of ChatGPT, the attention society now pays to generative artificial intelligence is plain to see. As large models multiply and their parameter counts grow, so does the demand for computing power.

According to the definition in the "China Computing Power Development Index White Paper," computing power is a device's ability to process data and, through computation, produce a specific output.

Computing power is realized chiefly in computing chips such as CPUs and GPUs, and is carried by computers, servers, and intelligent terminals of all kinds. Massive data processing and digital applications of every sort depend on it.

So which application scenarios suit which computing chips, and how do those chips differ?

01 Different scenarios require different computing chips

From small devices like earphones, mobile phones, and PCs to larger ones like cars, the internet, artificial intelligence, data centers, supercomputers, and space rockets, "computing power" plays a core role in all of them, and different computing scenarios have different requirements for chips.

Data centers, as the core infrastructure of the digital era, carry enormous volumes of data processing, storage, and transmission, and therefore need powerful computing capability to meet complex computing demands. Data centers and supercomputers require high-performance chips exceeding 1000 TOPS. Supercomputing centers have already entered the exascale era (10^18, or a quintillion, operations per second) and are moving toward zettascale (1,000 exaFLOPS). Data centers place extremely high demands on chips for low power consumption, low cost, reliability, and versatility.
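These scale prefixes differ by factors of 1,000. A quick sketch of the arithmetic (the names `PREFIXES` and `convert` are illustrative, not from any standard library):

```python
# Scale prefixes for computing power, expressed as powers of ten.
PREFIXES = {
    "T": 10**12,  # tera  (TOPS / TFLOPS)
    "P": 10**15,  # peta
    "E": 10**18,  # exa   (exascale: a quintillion operations per second)
    "Z": 10**21,  # zetta (1,000x exa)
}

def convert(value, src, dst):
    """Convert a figure between prefixes, e.g. EFLOPS -> TFLOPS."""
    return value * PREFIXES[src] / PREFIXES[dst]

# One exaFLOPS machine equals a million teraFLOPS of throughput.
print(convert(1, "E", "T"))  # 1000000.0
```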

Intelligent autonomous driving involves human-computer interaction, visual processing, intelligent decision-making, and more. The growing number of onboard sensors (LiDAR, cameras, millimeter-wave radar) and the rising real-time, complexity, and accuracy demands of data processing all raise the bar for onboard computing power. The industry rule of thumb is that L2 automated driving needs under 10 TOPS, L3 needs 30-60 TOPS, L4 needs over 300 TOPS, and L5 needs over 1000 TOPS, even 4000+ TOPS. Onboard computing power for autonomous driving therefore far exceeds that of everyday devices such as smartphones and computers: the NIO ET5's platform reaches 1016 TOPS, and the XPeng P7's reaches 508 TOPS. Because safety is paramount in intelligent driving, this scenario places extremely high demands on chip reliability and fairly high demands on versatility, while the requirements on power consumption and cost are less stringent.
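The rule-of-thumb thresholds quoted above can be captured in a small lookup (the floors and the helper function are illustrative only; real requirements vary by vehicle and system design):

```python
# Rough industry compute floors per autonomy level, in TOPS,
# taken from the estimates quoted in the text.
LEVEL_MIN_TOPS = {
    "L2": 0,     # below 10 TOPS in practice
    "L3": 30,    # 30-60 TOPS
    "L4": 300,   # over 300 TOPS
    "L5": 1000,  # over 1000 TOPS, even 4000+
}

def max_supported_level(tops):
    """Highest autonomy level whose (rough) compute floor the chip meets."""
    best = None
    for level, floor in LEVEL_MIN_TOPS.items():  # insertion order: L2..L5
        if tops >= floor:
            best = level
    return best

# A 508 TOPS platform (the XPeng P7 figure above) clears the L4 floor.
print(max_supported_level(508))  # L4
```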

Intelligent security systems need roughly 4-20 TOPS of computing power to handle complex tasks such as video processing, facial recognition, and anomaly detection, while reserving headroom for future upgrades and expansion. Though far smaller than a data center's budget, this is enough for efficient, stable operation. As AI-powered security enters its next phase, computing power grows more important and this figure keeps rising. Intelligent security places relatively high demands on low cost and reliability, and moderate demands on power consumption and versatility.

Among intelligent mobile terminals, small products such as wearables need relatively little computing power, but demand in smartphones, laptops, and similar products is climbing fast. The A14 chip in the iPhone 12 of a few years ago offered about 11 TOPS, and the Snapdragon 865 in the Xiaomi 10 offered 15 TOPS. As AI features have spread through smartphones, the Snapdragon 888 reached 26 TOPS, and successors such as the Snapdragon 8 Gen 1 and 8 Gen 2 have raised computing power further. Intelligent mobile terminals demand low power consumption and low cost, place fairly high demands on reliability, and impose few constraints on versatility.

02 Mainstream Computing Power Chips and Their Characteristics

The current basic computing power is mainly provided by servers based on CPU chips, targeting basic general-purpose computing. Intelligent computing power is mainly provided by accelerated computing platforms based on GPU, FPGA, ASIC, and other chips, targeting artificial intelligence computing. High-performance computing power is mainly provided by computing clusters built on a combination of CPU chips and GPU chips, mainly targeting scientific and engineering computing and other application scenarios.

The CPU is the king of traditional general-purpose computing, comprising an arithmetic logic unit (ALU), a control unit, and memory. Data sits in memory; the control unit fetches it and hands it to the ALU for computation, and the result is written back to memory. The CPU's strength is versatility: it can handle computing tasks of every kind, though its efficiency trails that of chips designed for one particular task.
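That fetch, compute, write-back loop can be sketched in a few lines (a toy model with illustrative names, not real microarchitecture):

```python
# Toy sketch of the flow described above: the control unit pulls
# operands from memory, hands them to the ALU, and writes the
# result back. All names here are illustrative.
memory = {"a": 6, "b": 7, "result": None}

def alu(op, x, y):
    """Arithmetic logic unit: performs the actual computation."""
    return {"add": x + y, "mul": x * y}[op]

def control_unit(op, src1, src2, dst):
    """Fetch operands from memory, dispatch to the ALU, write back."""
    x, y = memory[src1], memory[src2]   # fetch
    memory[dst] = alu(op, x, y)         # execute + write back

control_unit("mul", "a", "b", "result")
print(memory["result"])  # 42
```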

The GPU was originally built to accelerate graphics rendering. In recent years it has excelled in fields such as deep learning and is now widely used for artificial intelligence computing. Its defining trait is a large number of parallel computing units that process great volumes of data simultaneously, making it highly efficient at parallel workloads. The GPU is less versatile than the CPU, however, and suits only certain types of computing tasks.
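Those parallel units all apply the same operation to different data elements. As a rough illustration (pure Python standing in for hardware lanes; the function is a classic data-parallel kernel, not GPU code), every element below is an independent task that a GPU would compute simultaneously rather than in a serial loop:

```python
# SAXPY (y = a*x + y for every element) is a classic data-parallel
# kernel: each list entry is an independent task. On a GPU, thousands
# of lanes execute these at once; this serial loop only shows the shape.
def saxpy(a, xs, ys):
    return [a * x + y for x, y in zip(xs, ys)]  # each lane is independent

xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 20.0, 30.0, 40.0]
print(saxpy(2.0, xs, ys))  # [12.0, 24.0, 36.0, 48.0]
```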

ASIC is a chip designed specifically for particular tasks. It implements algorithms through hardware, achieving extremely high computational efficiency and energy efficiency in specific tasks. ASIC is highly targeted and only suitable for specific tasks, but its computational efficiency and energy efficiency far exceed those of the CPU and GPU, making it suitable for large-scale or mature products.

The FPGA computes directly in hardware gate circuits, which makes it fast. Compared with a GPU it offers higher processing speed and lower energy consumption, though under the same process node it still trails an ASIC. An FPGA can be reprogrammed, however, making it far more flexible than an ASIC; it suits rapidly iterating or small-batch products. In AI, FPGA chips can serve as accelerator cards that speed up the computation of AI algorithms.

GPGPU, or General-Purpose Graphics Processing Unit, combines "GP" for general purpose with "GPU" for graphics processing unit: the aim is to harness the GPU's parallel computing capability for general computing tasks. Colloquially, a GPGPU is a tool for non-graphics computation, serving as an auxiliary to the CPU. It suits large-scale parallel scenarios such as scientific computing, data analysis, and machine learning.

03 GPUs are the optimal solution for AI, but not necessarily the only solution

In the AI frenzy sparked by ChatGPT, the most popular choice is undoubtedly the GPU. To develop AI, leading global tech giants are competing to hoard NVIDIA's GPUs. Why is the GPU favored by many manufacturers in the AI era?

The reason is simple: AI computing is similar to graphics computing, involving a large number of highly intensive parallel computing tasks.

To be specific, training and inference are the cornerstones of large AI models. During training, a complex neural network model is fitted to a large volume of data; during inference, the trained model is applied to new data to draw conclusions.

The training and inference processes of neural networks involve a series of specific algorithms, such as matrix multiplication, convolution, recurrent layer processing, and gradient computation. These algorithms can often be highly parallelized, meaning they can be decomposed into a large number of small tasks that can be executed simultaneously.
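As a concrete sketch of that decomposition (plain Python with illustrative names), matrix multiplication splits into per-row tasks with no dependencies between them, exactly the kind of work a GPU distributes across its cores:

```python
# Matrix multiplication, decomposed as the text describes: each output
# row depends only on one row of A and all of B, so every row is an
# independent task that could run concurrently on a GPU.
def matmul_row(row, B):
    """One independent task: a single row of the product A @ B."""
    cols = len(B[0])
    return [sum(row[k] * B[k][j] for k in range(len(B))) for j in range(cols)]

def matmul(A, B):
    # Serial stand-in for the parallel dispatch a GPU performs.
    return [matmul_row(row, B) for row in A]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```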

The GPU has a large number of parallel processing units, allowing it to quickly perform the matrix operations required in deep learning, thereby accelerating the training and inference of models.

Currently, most enterprises' AI training uses NVIDIA's GPU clusters. With proper optimization, a single GPU card can provide computing power equivalent to dozens or even hundreds of CPU servers. Companies like AMD and Intel are also actively enhancing their technical capabilities to compete for market share. Leading Chinese manufacturers include Jingjia Micro, Loongson, Hygon, Cambricon, and Horizon Robotics.

It can be seen that in the field of AI, GPUs are far ahead, just as NVIDIA defines itself as a leader in artificial intelligence. Almost all applications related to artificial intelligence in the industry currently rely on the presence of GPUs.

At this point, some may ask, in the current prevalence of AI, is it sufficient to rely solely on GPUs? Will GPUs dominate the future AI market and become the undisputed favorite?

The author believes not. While GPUs are currently the optimal solution, they are not necessarily the only solution.

CPUs can play a greater role

Although GPUs currently dominate the AI field, they also face some challenges and limitations. For example, supply chain issues with GPUs have led to price increases and supply shortages, which are burdensome for AI developers and users. CPUs, on the other hand, have more competitors and partners, which can promote technological progress and cost reduction. Moreover, CPUs have more optimization techniques and innovative directions, allowing them to play a greater role in the field of AI.

Some leaner, more compact models run remarkably well on traditional CPUs, often at better cost and energy efficiency. This shows that hardware choices must weigh the strengths of different processors against the specific application scenario and model complexity. For example, Julien Simon, Chief AI Evangelist at HuggingFace, demonstrated Q8-Chat, a language model running on Intel Xeon processors: a model with 7 billion parameters that runs on a 32-core CPU, offers a chat interface similar to OpenAI's ChatGPT, and responds much faster than ChatGPT.

Beyond serving very large language models, CPUs can also run smaller, more efficient ones. Through innovative techniques, these models sharply cut computational complexity and memory usage to fit the CPU's characteristics. In other words, CPUs have not been marginalized in AI; they retain undeniable advantages and potential.
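One family of such techniques (offered here as an illustration, not as the specific method behind any product above) is low-bit quantization, the idea the "Q8" in Q8-Chat alludes to: storing weights as 8-bit integers instead of 32-bit floats cuts memory by 4x and lets CPUs use fast integer arithmetic. A minimal symmetric-quantization sketch:

```python
# Minimal symmetric int8 quantization sketch (illustrative, not a
# production scheme): floats are mapped to [-127, 127] integers
# sharing one scale factor, shrinking storage from 32 to 8 bits.
def quantize(weights):
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    return [x * scale for x in q]

w = [0.5, -1.27, 0.02, 1.0]
q, s = quantize(w)
approx = dequantize(q, s)
# Each recovered value sits within one quantization step of the original.
assert all(abs(a - b) <= s for a, b in zip(w, approx))
print(q)  # [50, -127, 2, 100]
```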

The global CPU market is dominated by the duopoly of Intel and AMD, with a combined market share of over 95%. Currently, six major domestic CPU manufacturers, including Loongson, Shenwei, Hygon, Zhaoxin, Kunpeng, and Feiteng, are rapidly rising, accelerating the development of domestic CPUs.

CPU + FPGA, CPU + ASIC also have potential

Moreover, because AI-accelerated servers are heterogeneous, the market offers architectures beyond CPU+GPU, such as CPU+FPGA, CPU+ASIC, and CPU paired with various accelerator cards.

The technological revolution is rapid, and it is indeed possible to see more efficient and suitable new technologies for AI computing in the future. CPU+FPGA and CPU+ASIC are among the possibilities for the future.

CPUs are good at logic control and serial processing, while FPGAs have parallel processing capabilities and hardware acceleration features. By combining the two, the overall performance of the system can be significantly improved, especially in handling complex tasks and large-scale data. The programmability of FPGAs allows for flexible configuration and customization based on specific application scenarios. This means that the CPU+FPGA architecture can adapt to various different needs, from general computing to accelerated specific applications, all of which can be achieved by adjusting the configuration of the FPGA.

ASICs are integrated circuits designed for one particular application, so they are usually highly optimized for performance and power consumption. Combined with a CPU, they deliver outstanding performance and efficiency on specific tasks. An ASIC's design is also fixed: once manufactured, its function does not change, which makes ASICs well suited to scenarios demanding long-term stable operation and high reliability.

The global FPGA chip market is mainly dominated by Xilinx and Intel, with a combined market share of up to 87%. Major domestic manufacturers include Fudan Microelectronics, SMIC, and Alchip. Foreign giants such as Google, Intel, and NVIDIA have successively released ASIC chips. Domestic companies such as Cambricon, Huawei HiSilicon, and Horizon Robotics have also launched ASIC chips for deep neural network acceleration.

GPGPU supports higher-level programming languages and, balancing performance with versatility, is currently one of the mainstream choices for AI-accelerated servers. Core GPGPU manufacturers include NVIDIA, AMD, Bitmain, Cambricon, and Horizon Robotics.

04 China's computing power, what is the scale?

According to IDC's forecast, more new data will be generated worldwide in the next three years than in the past thirty combined, with global data volume growing at a 26% compound annual rate to reach 142.6 ZB by 2024. This will drive exponential growth in demand for data storage, transmission, and processing, steadily increasing the need for computing resources. Scenarios such as artificial intelligence add to this: training and inference for large models also demand powerful high-performance computing resources.

In recent years, China's computing power infrastructure construction has achieved significant results.

By the end of 2023, data center racks in use nationwide exceeded 8.1 million standard racks, and total computing power reached 230 EFLOPS, with computing power penetrating ever faster into government affairs, industry, transportation, healthcare, and other fields. Meanwhile, under the layout of the "Eastern Data, Western Computing" project and the national integrated computing power network, the first phase of China's computing power network, the Intelligent Computing Network, has gone live, and a national computing power "grid" has taken shape.

On the policy front, China has issued a series of documents to promote computing power infrastructure, such as the "National Integrated Big Data Center Collaborative Innovation System Computing Power Hub Implementation Plan," the "High-Quality Development Action Plan for Computing Power Infrastructure," and the "14th Five-Year Plan for Digital Economy Development."

The country is also promoting intelligent computing centers across multiple regions, expanding gradually from east to west; more than 30 Chinese cities are building or planning such centers. Policy requirements issued by the Ministry of Science and Technology state that "in public computing power platforms with mixed deployment, self-developed chips shall account for no less than 60% of nominal computing power, and domestically developed frameworks shall be prioritized, with a usage rate of no less than 60%." The penetration rate of domestic AI chips is thus expected to rise rapidly. According to IDC, China's intelligent computing power will grow at a 52.3% compound annual rate from 2021 to 2026.


Copyright © 2024 newsaboutchina.com