Because processors and memory differ in process technology, packaging, and demand, the performance gap between the two has been widening steadily since 1980. Data show that the speed mismatch between processors and memory has been growing at roughly 50% per year.
Memory access speed cannot keep up with the processor's data processing speed, so data transfer resembles a huge funnel: no matter how much the processor pours in, the memory can only let it "trickle" out. The narrow data exchange channel between the two, and the high energy consumption it causes, have erected a "memory wall" between storage and computation.
With the explosive growth of data, the impact of the memory wall on computing speed is becoming ever more apparent. To mitigate it, improving memory bandwidth has long been a central focus for memory-chip makers.
For a long time, the memory industry's value proposition has been oriented largely toward system-level needs, and memory has repeatedly pushed against the limits of system performance. An inflection point in memory performance improvement is clearly approaching, as more and more people question whether system performance can keep improving through trade-offs made at the memory level (power consumption, heat dissipation, board space, and so on).
Building on research into advanced technologies and solutions, the memory industry has explored new fields in greater depth. As an important part of the memory market, DRAM technology keeps upgrading and evolving: DRAM has developed from 2D to 3D technology, with HBM as its main representative product.
HBM (High Bandwidth Memory) is a new type of memory chip for CPUs/GPUs. It is essentially a stack of many DRAM dies packaged together with the GPU, forming a large-capacity, high-bandwidth DRAM array.
By increasing bandwidth and expanding memory capacity, larger models and more parameters can be kept closer to the core computation, thereby reducing the latency brought by memory and storage solutions.
From a technical perspective, HBM moves DRAM from traditional 2D to 3D, making full use of space and reducing footprint, in line with the semiconductor industry's trend toward miniaturization and integration. HBM breaks through the bottlenecks of memory capacity and bandwidth and is regarded as a new-generation DRAM solution. The industry sees it as a new path opened up for DRAM through diversification of the memory hierarchy, one that has revolutionized DRAM performance.
In the memory field, a race around HBM has quietly begun.
Giants lead the way: the era of HBM3 is coming

HBM achieves chip stacking mainly through through-silicon via (TSV) technology, stacking several DRAM dies vertically like the floors of a building to increase throughput and overcome the bandwidth limitations of a single package.
SK Hynix describes TSV as a technology that drills thousands of fine holes in DRAM dies and connects the stacked dies with electrodes passing vertically through them. The technology stacks several DRAM dies on a buffer die and carries signals, commands, and current through columnar channels that penetrate all the die layers. Compared with traditional packaging methods, it can cut volume by 30% and energy consumption by 50%.
Thanks to TSV, HBM greatly improves capacity and data transfer rates. Compared with traditional memory technology, HBM offers higher bandwidth, a higher I/O count, lower power consumption, and a smaller footprint. As the amount of stored data grows, market demand for HBM is expected to increase significantly.
The high bandwidth of HBM is inseparable from the support of various basic technologies and advanced design processes. Since HBM stacks a logic die with 4-16 DRAM dies in a 3D structure, the development process is extremely complex. Given the technical complexity, HBM is recognized as the flagship product that best demonstrates the technical strength of manufacturers.
In 2013, SK Hynix applied TSV technology to DRAM and successfully developed HBM for the first time in the industry.
HBM1 runs at a per-pin data rate of about 1 Gbps, with a 1.2V operating voltage and a die density of 2Gb in a 4-high stack. Its bandwidth is higher than that of DDR4 and GDDR5 products, while it consumes less power in a smaller form factor, better meeting the needs of bandwidth-hungry processors such as GPUs.
Subsequently, storage giants such as SK Hynix, Samsung, and Micron have launched an upgrade competition in the HBM field.
In January 2016, Samsung announced the mass production of 4GB HBM2 DRAM and produced 8GB HBM2 DRAM in the same year; in the second half of 2017, SK Hynix, which was overtaken by Samsung, began mass production of HBM2; in January 2018, Samsung announced the mass production of the second-generation 8GB HBM2 "Aquabolt."
At the end of 2018, JEDEC launched the HBM2E specification to support increased bandwidth and capacity. With the transfer rate raised to 3.6Gbps per pin, HBM2E can achieve 461GB/s of memory bandwidth per stack. In addition, HBM2E supports stacks of up to 12 DRAM dies, for a capacity of up to 24GB per stack. Compared with HBM2, HBM2E is technologically more advanced, more widely applicable, faster, and larger in capacity.

In August 2019, SK Hynix announced the successful development of its new-generation HBM2E; in February 2020, Samsung officially launched its 16GB HBM2E product "Flashbolt", which entered mass production in the first half of 2020.
According to Samsung, its 16GB HBM2E Flashbolt vertically stacks eight 16Gb DRAM dies built on 10nm-class technology, delivering up to 410GB/s of memory bandwidth at a per-pin data transfer rate of 3.2 Gbps.
SK Hynix's HBM2E runs at 3.6Gbps per pin and can move more than 460GB of data per second across 1,024 data I/Os. By using TSV technology to vertically stack eight 16Gb dies, a single HBM2E stack offers 16GB of capacity.
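The "over 460GB per second" figure follows directly from the interface width and the per-pin rate. A minimal arithmetic sketch in Python, using only the figures quoted above:

```python
# Per-stack bandwidth = I/O width (bits) x per-pin rate (Gbps) / 8 bits per byte
io_width = 1024        # data I/Os per HBM2E stack
pin_rate_gbps = 3.6    # per-pin data rate, Gbps

bandwidth_gbs = io_width * pin_rate_gbps / 8
print(f"HBM2E per-stack bandwidth: {bandwidth_gbs:.1f} GB/s")  # 460.8 GB/s
```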
In 2020, another memory giant, Micron, announced its entry into this competition.
Micron said on its earnings call at the time that it would begin supplying HBM2 memory for high-performance graphics cards and server processors, with its next-generation "HBMNext" expected at the end of 2022. To date, however, Micron has released no further news of these products.
In January 2022, the JEDEC organization officially released the standard specifications for the new generation of high-bandwidth memory HBM3, continuing to expand and upgrade in terms of storage density, bandwidth, channels, reliability, and energy efficiency, including:
The main interface uses 0.4V low-swing modulation, and the operating voltage is reduced to 1.1V, further improving energy efficiency.
The per-pin data transfer rate doubles again relative to HBM2, reaching 6.4Gbps; with a 1024-bit interface, peak bandwidth reaches 819GB/s per stack. Four stacks give a total bandwidth of 3.2TB/s, and six reach 4.8TB/s (see the arithmetic sketch after this list).
The number of independent channels doubles from 8 to 16; with virtual channels, a single stack supports 32 channels.
Supports 4-layer, 8-layer, and 12-layer TSV stacks, with provision for future expansion to 16-layer TSV stacks.
Per-layer die density is 8/16/32Gb, giving a minimum stack capacity of 4GB (8Gb, 4-high) and a maximum of 64GB (32Gb, 16-high).
Supports platform-level RAS reliability, integrated ECC (Error-Correcting Code) error detection and correction, and supports real-time error reporting and transparency.
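A small Python sketch that reproduces the HBM3 figures above from the spec parameters (the helper names are ours for illustration, not JEDEC's):

```python
def stack_bandwidth_gbs(pin_rate_gbps, io_width_bits=1024):
    """Peak bandwidth of one HBM stack in GB/s."""
    return pin_rate_gbps * io_width_bits / 8

def stack_capacity_gb(die_density_gbit, layers):
    """Stack capacity in GB: per-die density (Gbit) x number of layers."""
    return die_density_gbit * layers / 8

hbm3 = stack_bandwidth_gbs(6.4)
print(f"Per-stack bandwidth: {hbm3:.1f} GB/s")   # 819.2 GB/s
print(f"4 stacks: {4 * hbm3 / 1024:.1f} TB/s")   # 3.2 TB/s
print(f"6 stacks: {6 * hbm3 / 1024:.1f} TB/s")   # 4.8 TB/s

print(f"Min capacity: {stack_capacity_gb(8, 4):.0f} GB")    # 4 GB (8Gb, 4-high)
print(f"Max capacity: {stack_capacity_gb(32, 16):.0f} GB")  # 64 GB (32Gb, 16-high)
```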
JEDEC states that HBM3 is an innovative approach, offering higher bandwidth, lower power consumption, and greater capacity per unit area, which is crucial for applications requiring high data-processing rates, such as graphics processing and high-performance computing servers.
HBM Performance Evolution (Source: Rambus)
SK Hynix developed the world's first HBM3 as early as October 2021, began mass-producing HBM3 DRAM in June 2022, and is supplying it to NVIDIA, further consolidating its market-leading position. With NVIDIA adopting HBM3 DRAM, data centers may see a new round of performance gains.
According to earlier disclosures, SK Hynix offers two capacities: a 24GB (192Gb) part that vertically stacks 12 dies with through-silicon via technology, and a 16GB (128Gb) part with an 8-high stack, both providing 819GB/s of bandwidth; in the 12-high part, each die is thinned to just 30 micrometers. Compared with the previous generation HBM2E's 460GB/s, HBM3's bandwidth is roughly 78% higher. HBM3 memory also has built-in error-correction technology, improving product reliability.
SK Hynix has always been very active in the development of HBM. To meet the increasing expectations of customers, it is imperative to break through the existing framework and develop new technologies. SK Hynix is also working closely with participants in the HBM ecosystem (customers, foundries, and IP companies, etc.) to enhance the ecosystem level. The transformation of the business model is also a general trend. As a leader in HBM, SK Hynix is committed to continuous progress in the field of computing technology and strives to achieve the long-term development of HBM.
Samsung is also actively following up. At its 2022 technology conference, Samsung released a memory-technology roadmap covering the evolution of memory interfaces across different fields. First, in high-performance cloud servers, HBM has become standard for high-end GPUs, and this is one of the areas in which Samsung is investing heavily. HBM's defining feature is its use of advanced packaging and multi-layer stacking to achieve an ultra-wide I/O interface, paired with a higher interface transfer rate, thereby delivering high energy efficiency and ultra-high bandwidth.
On Samsung's roadmap, HBM3 entered mass production in 2022, with a per-stack interface width of up to 1024 bits and an interface transfer rate of up to 6.4Gbps, 1.8 times that of the previous generation, yielding a per-stack interface bandwidth of 819GB/s. With a 6-stack configuration, a total bandwidth of 4.8TB/s can be achieved.

In 2024, HBM3p is expected to raise the interface speed to 7.2Gbps, increasing the data transfer rate by over 10% compared with the current generation and pushing total stacked bandwidth above 5TB/s. These figures do not yet account for the additional stacking layers and interface width that advanced packaging could bring, and both single stacks and multi-stack configurations of HBM3p are expected to deliver further bandwidth gains in 2024. This will be an important driver for artificial-intelligence applications, and HBM3p is expected to appear in next-generation flagship cloud GPUs after 2025, further strengthening cloud AI computing power.
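Taking the roadmap numbers at face value, the projection works out as follows. A back-of-the-envelope sketch in Python (the 7.2Gbps HBM3p rate is a roadmap target rather than a released spec, and 1TB is taken as 1024GB to match the article's totals):

```python
IO_WIDTH = 1024  # bits per stack interface

for name, rate_gbps in [("HBM3", 6.4), ("HBM3p (roadmap)", 7.2)]:
    per_stack = rate_gbps * IO_WIDTH / 8   # GB/s per stack
    six_stack = 6 * per_stack / 1024       # TB/s for a 6-stack configuration
    print(f"{name}: {per_stack:.1f} GB/s per stack, {six_stack:.2f} TB/s with 6 stacks")
# HBM3: 819.2 GB/s and 4.80 TB/s; HBM3p: 921.6 GB/s and 5.40 TB/s (over 5 TB/s)
```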
From HBM1 to HBM3, SK Hynix and Samsung have always been the leading enterprises in the HBM industry.
Future potential and evolution direction of HBM
For the next planning strategy and technological progress, the industry aims to break through the current limits of HBM in terms of speed, density, power consumption, and board space occupation.
Factors affecting HBM performance
Firstly, to break through the speed limit, SK Hynix is weighing the pros and cons of the traditional approach of raising the per-pin data rate against widening the I/O bus beyond 1,024 data lines, in pursuit of better data parallelism and backward design compatibility. In simple terms, the goal is to obtain higher bandwidth with the fewest trade-offs.
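From a raw-bandwidth standpoint the two levers are interchangeable, which is why both are on the table. A toy comparison with hypothetical numbers (neither configuration is an announced product):

```python
def bandwidth_gbs(io_width_bits, pin_rate_gbps):
    # Per-stack bandwidth in GB/s
    return io_width_bits * pin_rate_gbps / 8

# Two hypothetical routes to the same per-stack bandwidth:
faster_pins = bandwidth_gbs(1024, 8.0)   # keep the 1024-bit bus, push pin speed
wider_bus   = bandwidth_gbs(1280, 6.4)   # keep 6.4Gbps pins, widen the bus
print(f"faster pins: {faster_pins:.0f} GB/s, wider bus: {wider_bus:.0f} GB/s")  # both 1024 GB/s
```

Faster pins stress signal integrity and power; a wider bus costs interposer routing and package area. That is precisely the trade-off being weighed.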
In response to the higher memory densities required by larger data sets and training workloads, memory manufacturers have begun studying more die-stacking layers and greater physical stack heights, as well as higher core-die densities, to optimize stack capacity.
On the other hand, efforts are also under way to improve power efficiency by evaluating memory structures and operating schemes from the lowest microstructural level up to the overall die-stacking concept, so as to minimize the absolute power consumed per unit of added bandwidth. Because of the physical limits of current interposer reticle (photomask) sizes and the related technologies supporting the processing unit and the HBM cube, minimizing total memory die size is particularly important. Manufacturers therefore need to add storage cells and functions without enlarging the existing physical footprint, achieving a leap in overall performance.
However, from the perspective of the industrial development history, the premise of completing the above tasks is: storage manufacturers need to work together and cooperate with upstream and downstream ecosystem partners to expand the scope of HBM use from existing systems to potential next-generation applications.
In addition, the new HBM-PIM (processing-in-memory) chip places an AI engine inside each memory bank, moving processing operations into the HBM itself.

Under traditional architectures, moving data from memory to the compute units consumes about 200 times as much power as the computation itself. Because far more energy goes into data movement than into computation, the share of energy and time actually spent computing is very low, and the constant shuttling of data between memory and processor creates a severe transfer-power problem known as the "power wall". New types of memory are designed to ease the burden of moving data between memory and processors.
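The "200 times" figure makes the imbalance easy to quantify. A minimal sketch, assuming a normalized compute energy of 1 unit per operation:

```python
compute_energy = 1.0                   # normalized energy per operation
move_energy = 200.0 * compute_energy   # energy to fetch the operand from memory (per the 200x figure)

total = compute_energy + move_energy
print(f"Fraction of energy spent on actual computation: {compute_energy / total:.2%}")
# ~0.50% -- data movement dominates, which is exactly what PIM tries to eliminate
```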
In conclusion
Over the past few years, the bandwidth of High Bandwidth Memory (HBM) products has increased severalfold and is now approaching, or has reached, the milestone of 1 terabyte per second. Products of the same period managed only a two- to three-fold bandwidth increase; HBM's rapid development owes much to the fierce competition among memory manufacturers.
Memory bandwidth is the amount of data that can be transferred per unit of time. The simplest way to raise bandwidth is to add data transmission lines: each HBM stack in fact carries up to 1,024 data pins, and HBM's internal data transmission paths have grown significantly with each product generation.
Data transmission path configurations of each generation of HBM products
Looking back at HBM's evolution, the first generation had a data transfer rate of about 1Gbps; the second-generation product, HBM2, launched in 2016, reached a maximum of 2Gbps; in 2018, the third-generation HBM2E reached 3.6Gbps. Now SK Hynix and Samsung have developed the fourth-generation product, HBM3, which is expected to take another significant step in data transfer rate.
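Holding the 1,024-bit interface constant, each generation's per-stack bandwidth follows directly from the per-pin rates listed above. A sketch using the article's round figures (HBM3's 6.4Gbps is the JEDEC rate quoted earlier):

```python
# Per-pin data rate in Gbps for each generation
generations = {"HBM": 1.0, "HBM2": 2.0, "HBM2E": 3.6, "HBM3": 6.4}

for name, rate in generations.items():
    print(f"{name}: {rate * 1024 / 8:.1f} GB/s per stack")
# HBM 128.0, HBM2 256.0, HBM2E 460.8, HBM3 819.2 -- approaching the 1 TB/s milestone
```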
In terms of performance, HBM is undoubtedly outstanding, with significant advantages in data transmission rate, bandwidth, and density. However, HBM is still mainly used in applications such as servers and data centers, and its biggest limiting factor is cost. For consumer fields that are more sensitive to cost, the threshold for using HBM is still relatively high.
Although HBM has evolved to the fourth generation, it is still in a relatively early stage and has a long way to go in the future.
What is foreseeable is that with the rise of application markets such as artificial intelligence, machine learning, high-performance computing, and data centers, the complexity of memory product design is rapidly increasing and putting forward higher requirements for bandwidth. The continuously rising bandwidth demand continues to drive the development of HBM. Market research firm Omdia predicts that the total revenue of the HBM market will reach 2.5 billion US dollars by 2025.
In this process, as the memory giants keep pushing ahead and upstream and downstream manufacturers join the game, HBM will attract ever more attention and favor.