<div style="width:100%;text-align:center;">
![](././../../images_dir/1676718480/1.png)
</div>

Until a few years ago, RISC-V processors were regarded as auxiliary processors for specific functions. Now they appear to be taking on an entirely different role: high-performance computing.

Whether they are ready is still under debate. Questions remain about the software ecosystem, and about whether the chips, boards, and systems are reliable enough. There are both business and technical problems, and the business problems are the harder ones. Still, the debate itself shows the momentum behind the RISC-V architecture. Because the ISA is open, adoption of and experimentation with RISC-V have increased significantly, which in turn gives the industry freedom to innovate.

"The ISA (instruction set architecture) itself is not that interesting," said Rupert Baines, chief marketing officer of Codasip. "The key is what you do around it. That is why the working groups on security, and the publication of best practices, guidelines, and reference architectures, are so important. OpenTitan is an open-source root of trust, and it matters because it is a very good reference architecture. People can refer to OpenTitan rather than going off in their own direction and possibly making mistakes."

The big question now is how far the architecture can be pushed in this new direction. The chip industry has firmly entered the era of domain-specific computing, in which processors can be heavily customized for specific tasks and then outperform fixed architectures on those tasks. However, if software has to be optimized for these customized cores, software migration becomes more difficult.

For RISC-V, high-performance computing and supercomputing would represent a huge leap. A supercomputer is defined as a computer with high performance compared with a general-purpose computer, and is usually a floating-point machine with vector extensions. The current leader, Frontier, runs at about 1.1 exaflops (roughly 1.1 × 10^18 floating-point operations per second) on the LINPACK benchmark. It has 8,730,112 processing cores based on the x86 ISA.

However, as alternatives become widely available, demand for such behemoths is changing. A high-performance computer used to be a custom-designed, general-purpose machine. Today, anyone who deploys a cluster of fast servers, whether on-premises or in the cloud, has access to very similar capabilities.

Whether RISC-V is likely to play a role here needs to be examined from several angles. Who might need a supercomputer based on the RISC-V architecture, and who is willing to pay for it? Do the RISC-V ISA and its extensions have all the features needed to build a supercomputer? Has anyone built a core with adequate performance? Is all the necessary software in place?

**Following in Arm's footsteps**

Until recently, most supercomputers were based on Intel's x86 architecture. Arm set out to gain a foothold in high-performance computing, and had the basic hardware ready around 2016.

"When the first Arm supercomputer projects started, Arm was not ready, in the sense that the ecosystem was not fully in place and not every problem had been solved," said Rob Aitken, a researcher at Synopsys. "What mattered more was that some people said, 'It is close enough, I am willing to take the risk, I am willing to try.' My point is that RISC-V has reached, or is close to, that tipping point, where someone is willing to gamble and develop something for supercomputers."

On June 22, 2020, Japan's Fugaku supercomputer, powered by Fujitsu's 48-core A64FX SoC, became the first Arm-based supercomputer to rank as the fastest computer in the world, at least for a time. The most powerful high-performance computers are tracked in the TOP500 list.

Performance is not the only consideration. "To be a successful high-performance computing processor, you need to support the application ecosystem and the important leading-edge server standards while delivering performance, efficiency, and security," said David Lecomber, senior director of high-performance computing and tools for Arm's infrastructure business. "As for design flexibility, it is important to offer it where developers need it most. For example, a stable and consistent ISA is crucial for commercial high-performance computing developers, while there is a great deal of flexibility to design in your own memory subsystem (DDR5, HBM, CXL-attached) or accelerators (on-die or PCIe/CXL-attached)."

**What does "fastest" mean?**

Over the past few years, the industry's performance metrics have been shifting. Absolute performance still matters most, but systems are often power-limited, which leads to architectures optimized for specific tasks. That in turn raises the question of how to measure performance, because no machine can be the fastest at every task.

For many years the industry has relied on the LINPACK benchmark, but it is increasingly controversial and no longer gives a simple answer. One approach has been to extend it into what is called the HPC Challenge benchmark suite. One of its originators, Jack Dongarra, a professor of computer science at the University of Tennessee, was funded by the U.S. government to study the issue. But solving one problem creates another: the benchmark no longer produces a single number, which makes comparison difficult.

Performance is hard to measure for other reasons, too. Throughput and latency usually pull against each other, and not only in supercomputers. One system may produce a single answer sooner, while another produces a whole series of answers in less total time, even though you wait a little longer for the first one.
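The tension between latency and throughput can be made concrete with a small model. The sketch below is only an illustration: the `System` class, the two machines, and all of their numbers are hypothetical and do not describe any real computer. It assumes a simple pipeline in which the first answer costs the full latency and later answers arrive at a steady rate.

```python
from dataclasses import dataclass


@dataclass
class System:
    """Hypothetical machine described by two illustrative numbers."""
    name: str
    latency_s: float          # seconds from submitting work to the first answer
    throughput_per_s: float   # steady-state answers per second once running

    def time_to_answers(self, n: int) -> float:
        """Seconds until the n-th answer arrives (simple pipeline model)."""
        if n < 1:
            raise ValueError("n must be at least 1")
        # The first answer costs the full latency; each later answer
        # arrives at the steady-state rate.
        return self.latency_s + (n - 1) / self.throughput_per_s


# Assumed numbers, chosen only to show the crossover.
low_latency = System("low-latency", latency_s=1.0, throughput_per_s=1.0)
high_throughput = System("high-throughput", latency_s=4.0, throughput_per_s=4.0)

for n in (1, 4, 100):
    a = low_latency.time_to_answers(n)
    b = high_throughput.time_to_answers(n)
    print(f"{n:>3} answers: {low_latency.name} {a:.2f}s, "
          f"{high_throughput.name} {b:.2f}s")
```

With these assumed numbers the low-latency machine delivers one answer first, but the high-throughput machine finishes a batch of 100 answers in well under a third of the time. Which one counts as "faster" depends entirely on the workload.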

With applications now able to scale to more than 1 million cores on commercial clouds, building a high-performance computer of the right size is no longer the problem. What matters now is time to results, especially for tasks that need results as close to real time as possible. This means dedicated high-performance computing is likely to remain in use for tasks such as financial trading. There, beating your opponent by even the narrowest margin means you win and they lose, and sometimes a huge amount of money is at stake.

**Balancing the system**

Building any computer means keeping many factors in balance. "When you look at high-performance computing, the focus is usually on things like clock frequency, the number of cores, the scalability of the cores, and the associated interconnect," said Frank Schirrmeister, vice president of IP solutions and business development at Arteris. "But memory bandwidth, power efficiency, and the ability to add your own vector instructions are also important."

It has to be treated as a data-flow problem. "Data must be loaded from memory into the processor, processed by the processor or an accelerator, and then written back to memory somewhere," said Synopsys' Aitken. "That is the whole path, and somewhere along it there is a bottleneck. The uncore is a key part of it, and so is the memory system. When solving a specific task, you have to determine where the bottleneck in the architecture is, and that is independent of the CPU. In the enterprise space, the world is still working out where the bottlenecks are for RISC-V, and that has not been settled yet."
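One common way to reason about where such a bottleneck sits is the roofline model, which the article does not mention and which is brought in here purely as an illustration. It bounds attainable performance by the lower of peak compute and memory bandwidth multiplied by the work done per byte moved. The function names and every number in this sketch are hypothetical.

```python
def attainable_gflops(peak_gflops: float, mem_bw_gbs: float,
                      flops_per_byte: float) -> float:
    """Roofline bound: capped by compute or by memory traffic, whichever is lower."""
    if min(peak_gflops, mem_bw_gbs, flops_per_byte) <= 0:
        raise ValueError("all inputs must be positive")
    return min(peak_gflops, mem_bw_gbs * flops_per_byte)


def bottleneck(peak_gflops: float, mem_bw_gbs: float,
               flops_per_byte: float) -> str:
    """Name the limiting resource under the same simple model."""
    memory_limit = mem_bw_gbs * flops_per_byte
    return "memory" if memory_limit < peak_gflops else "compute"


# Assumed machine: 1000 GFLOP/s peak compute, 100 GB/s memory bandwidth.
PEAK, BANDWIDTH = 1000.0, 100.0

# Two assumed workloads: a streaming one that does little work per byte,
# and a dense one that reuses each byte many times.
for label, intensity in (("streaming", 0.25), ("dense", 20.0)):
    perf = attainable_gflops(PEAK, BANDWIDTH, intensity)
    limit = bottleneck(PEAK, BANDWIDTH, intensity)
    print(f"{label}: {perf:.0f} GFLOP/s attainable, {limit}-bound")
```

Under these assumptions the streaming workload reaches only 25 GFLOP/s no matter how fast the cores are, which is the sense in which the bottleneck can lie outside the CPU.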

In many cases, the real innovation happens in the uncore. "When you look at a cluster, there are many processors connected to each other," said Arteris' Schirrmeister. "That requires thinking about the scalability of the cores, which means optimizing the cores and the interconnect together. RISC-V gives you the freedom to innovate at this level, perhaps somewhat more than some standard licenses do. But it takes a lot of work, and of course it is not easy. It is part of the secret sauce for making things work once the clusters are integrated."

Today, many workloads, such as AI and machine learning, are driven by custom accelerators, while the general-purpose cores may only schedule and coordinate the work. "You will have to do domain-specific acceleration, or use a variety of accelerators, to handle the growing compute demands of these data centers," said Travis Lanier, vice president of Ventana. "General-purpose CPUs cannot do it."

Others agree. "Core performance is table stakes," said Arm's Lecomber. "A CPU for high-performance computing needs good vector performance, and good memory bandwidth per core. Finally, and just as important, it needs to deliver efficiency. Developers need programming efficiency in order to get the most performance out of the available cores and accelerators, and power efficiency at the rack and data center level is becoming a limiting factor in design and operation."

Chip performance is not determined by the ISA or the RTL alone. "If you look at any IP, its success often rests on the link to the physical tools, on being physically aware," said Schirrmeister. "Even for us, where the interconnect is one part of the system, getting the right performance and power requires co-optimizing the IP with the implementation flow. The same holds for RISC-V to become capable of high-performance computing. It is not easy, but there have been announcements of processors that appear to compete head-on with some of the other cores in the data center."

Performance does not depend on hardware alone. Porting and optimizing software for specific hardware can take a long time, and it requires a suitable ecosystem. "Arm has been very smart about how it built its ecosystem," Schirrmeister added. "Ecosystems form around different architectures, such as x86, Armv9, and RISC-V, and they always take time to get ready and to gather full support. It takes time for an ecosystem to develop and stabilize. I would say it may still be early for RISC-V. Yes, the momentum is strong, and we may see RISC-V get there faster than was possible in the past. RISC-V benefits from what Arm went through, because you can learn from how Arm established its foothold."

**Industry support**

Clearly, a lot of work remains before RISC-V is capable of high-performance computing. To help drive the discussion and the necessary work, RISC-V International created a Special Interest Group on High Performance Computing (SIG-HPC). The group's goal is to address the needs of the high-performance computing community and to align the RISC-V ISA with them. According to its website, the group starts by defining its scope, and its interests are ordered as discovery, gap analysis, and implementation in order to deliver high-impact results. To get there, the group says two things are needed: charting a path to competitiveness, and then going beyond that path to lead the community with new features and capabilities.

There has also been plenty of industry activity that hints at the direction several companies are taking. Intel has invested heavily with the Barcelona Supercomputing Center: it announced a 400 million euro investment to establish a new laboratory dedicated to developing RISC-V processors and supercomputers. However, Jeff McVeigh, vice president and general manager of Intel's Super Compute Group, said in the accompanying press release, "RISC-V for high-performance computing still has many years to go."

Their goal is to build a zettascale system within five years, which would be several orders of magnitude faster than today's supercomputers.

MIPS, another developer of high-performance processors, announced last year that it was switching to RISC-V for its processors. It has since launched its first core based on the RISC-V ISA, currently licensed for applications such as automotive driver assistance and autonomous driving. But MIPS says the core can also be used in data centers, storage, and high-performance computing.

As in software development, being 90% done means the work is only half finished. Tom Cargill of Bell Labs is credited with the well-known observation: "The first 90% of the code accounts for the first 90% of the development time. The remaining 10% of the code accounts for the other 90% of the development time."

Source: 爱集微, 2023-02-17. For academic sharing only; please credit the source when reprinting. In case of infringement, please contact lvzhiqiang@perfxlab.com to have the content deleted or modified.