High-performance computing (HPC) is the use of large-scale, parallel computer systems (often Supercomputers and clusters) to perform simulations, analytics, and AI workloads that exceed the capacity of standard machines, typically by distributing work across many processors and accelerators with high-speed interconnects and parallel software. Authoritative definitions emphasize parallel processing across many servers, enabled by clustered infrastructure and, increasingly, cloud-based HPC options (AWS). Britannica characterizes supercomputers as the fastest systems available at a given time, a concept closely intertwined with HPC (Encyclopaedia Britannica).
History and milestones
The U.S. High-Performance Computing Act of 1991 created a coordinated federal program to sustain leadership in HPC, catalyzing national networks and computing facilities (High-Performance Computing Act of 1991 via Wikipedia overview). A major performance milestone was the first verified exascale system, Frontier at Oak Ridge National Laboratory, announced in May 2022 with an HPL result exceeding 1 exaflop (OLCF/ORNL; HPE).
By June 2025, multiple exascale-class systems were officially ranked on the TOP500, led by El Capitan (HPE Cray EX255a, AMD EPYC + Instinct MI300A) at Lawrence Livermore National Laboratory with 1.742 exaflops (HPL), followed by Frontier at 1.353 exaflops and Aurora at 1.012 exaflops (TOP500 – June 2025).
Architectures and system components
HPC systems aggregate thousands to millions of CPU cores and increasingly rely on accelerators such as GPUs to achieve energy-efficient floating-point throughput; Frontier, for example, couples AMD EPYC CPUs with Instinct GPUs, while other leading systems use NVIDIA or Intel accelerators (OLCF/ORNL; TOP500 – June 2025). Interconnects provide low-latency, high-bandwidth communication among nodes; widely used options include NVIDIA's InfiniBand with in-network computing features and HPE's Slingshot Ethernet-based fabric, both designed for large-scale Parallel computing and AI clusters (NVIDIA Networking; HPE Slingshot).
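As a rough illustration of what the interconnect provides, the following minimal MPI ping-pong sketch estimates point-to-point latency and effective bandwidth between two ranks; the message size, iteration count, and build commands are illustrative choices rather than a standard benchmark configuration.

/* Minimal MPI ping-pong sketch: estimates point-to-point round-trip time
 * and bandwidth between two ranks. When the ranks are placed on different
 * nodes, the timing reflects the cluster interconnect.
 * Typical build/run: mpicc pingpong.c -o pingpong && mpirun -n 2 ./pingpong */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {
        if (rank == 0) fprintf(stderr, "needs at least 2 ranks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    const int nbytes = 1 << 20;   /* 1 MiB message (illustrative size) */
    const int iters  = 100;
    char *buf = malloc(nbytes);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0) {
        double rtt = (t1 - t0) / iters;          /* seconds per round trip */
        double bw  = 2.0 * nbytes / rtt / 1e9;   /* GB/s over both directions */
        printf("avg round trip: %.3f us, effective bandwidth: %.2f GB/s\n",
               rtt * 1e6, bw);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}

Run with the two ranks pinned to different nodes, the measurement characterizes the fabric rather than shared memory within a node; dedicated micro-benchmark suites measure the same quantities more carefully.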
Parallel file systems sustain aggregate I/O bandwidth at scale. Lustre is widely deployed for simulation and data assimilation (e.g., ECMWF), while IBM Spectrum Scale (formerly GPFS) offers a high-performance, shared-disk clustered file system used in many scientific and enterprise HPC sites (ECMWF supercomputer facility; IBM Docs – Spectrum Scale overview).
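A minimal sketch of the access pattern these file systems are built to serve: each MPI rank writes its block of a distributed array into one shared file through collective MPI-IO. The output path and array size below are hypothetical.

/* Sketch: ranks write disjoint blocks of a distributed array to a single
 * shared file with collective MPI-IO, the kind of aggregate-bandwidth
 * workload parallel file systems such as Lustre or Spectrum Scale target.
 * The path below is purely illustrative. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;                        /* doubles per rank (illustrative) */
    double *block = malloc(n * sizeof(double));
    for (int i = 0; i < n; i++) block[i] = rank;  /* dummy data */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "/scratch/example/output.dat",  /* hypothetical path */
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank writes at an offset determined by its rank; the collective
     * call lets the MPI-IO layer aggregate requests into large, aligned
     * writes that parallel file systems handle efficiently. */
    MPI_Offset offset = (MPI_Offset)rank * n * sizeof(double);
    MPI_File_write_at_all(fh, offset, block, n, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(block);
    MPI_Finalize();
    return 0;
}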
Cooling and power delivery are critical design constraints; modern systems increasingly employ direct liquid cooling and heat-recovery strategies to improve energy efficiency, as documented for operational European weather systems and U.S. exascale machines (ECMWF facility; OLCF/ORNL Frontier).
Software ecosystem and programming models
HPC application parallelism is typically expressed with the Message Passing Interface (MPI) for distributed-memory scaling and OpenMP for node-level shared-memory parallelism. The MPI Forum released the MPI 5.0 standard in 2025, and the OpenMP Architecture Review Board released OpenMP 6.0 in 2024 (MPI Forum; OpenMP.org). GPU programming models include CUDA for NVIDIA GPUs, AMD's HIP within the ROCm stack, and the vendor-neutral SYCL standard from the Khronos Group (NVIDIA CUDA; AMD ROCm docs; Khronos SYCL). Numerics libraries provide high-level building blocks; BLAS and LAPACK from Netlib remain foundational for dense linear algebra and are tuned or wrapped by vendor libraries (Netlib BLAS; Netlib LAPACK). Batch scheduling and resource management are commonly handled by Slurm on Linux-based clusters (SchedMD – Slurm overview). Complementary references include standard texts on computer architecture and HPC software engineering (book://John L. Hennessy|Computer Architecture: A Quantitative Approach|Morgan Kaufmann|2017; book://Georg Hager|Introduction to High Performance Computing for Scientists and Engineers|CRC Press|2010).
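A minimal sketch of how the two core models are commonly combined, assuming an MPI library with thread support and an OpenMP-capable compiler: OpenMP threads parallelize work inside each node while MPI combines per-rank results across nodes. Array sizes and launch commands are illustrative.

/* Hybrid MPI + OpenMP sketch: threads handle node-level parallelism,
 * MPI handles the cluster-level reduction.
 * Typical build: mpicc -fopenmp hybrid.c -o hybrid
 * Typical launch under a scheduler such as Slurm: srun -n 4 ./hybrid */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    /* Request funneled thread support since only the main thread calls MPI. */
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    const long n = 10 * 1000 * 1000;   /* elements per rank (illustrative) */
    double *x = malloc(n * sizeof(double));
    for (long i = 0; i < n; i++) x[i] = 1.0 / nranks;

    /* Node-level parallelism: OpenMP reduction over the local array. */
    double local = 0.0;
    #pragma omp parallel for reduction(+:local)
    for (long i = 0; i < n; i++) local += x[i];

    /* Cluster-level parallelism: combine per-rank partial sums with MPI. */
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("ranks=%d threads/rank=%d global sum=%.1f (expected %ld)\n",
               nranks, omp_get_max_threads(), global, n);

    free(x);
    MPI_Finalize();
    return 0;
}

The same split (coarse-grained MPI across nodes, fine-grained threading or GPU offload within a node) is the pattern most production codes follow; the GPU variants substitute CUDA, HIP, or SYCL kernels for the OpenMP loop.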
Benchmarks, rankings, and efficiency
The community's most visible performance metric is the High-Performance Linpack (HPL) benchmark, used to produce the twice-yearly TOP500 list of the fastest systems. HPL solves a large dense linear system using distributed LU factorization (Netlib's HPL implementation) and reports sustained FLOPS (Netlib HPL; TOP500 Overview). To better reflect the memory access patterns and communication bottlenecks seen in real applications, the High-Performance Conjugate Gradient (HPCG) benchmark complements HPL; in June 2025, El Capitan also led HPCG with 17.4 petaflops (HPCG technical report; HPCG list – June 2025). Energy efficiency is tracked by the Green500; as of June 2025, the JEDI system led with 72.73 GFLOPS/W using NVIDIA Grace Hopper superchips, while large leadership systems such as El Capitan and Frontier ranked well for efficiency at unprecedented absolute performance (Green500 – June 2025).
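The arithmetic behind any FLOPS figure is simply a known operation count divided by wall time. The sketch below times a naive matrix multiply (2n^3 floating-point operations) and reports GFLOP/s; it illustrates the accounting only and is not the distributed LU factorization that HPL actually runs.

/* Back-of-the-envelope FLOP-rate measurement: time a kernel with a known
 * operation count and divide. A naive n x n matrix multiply performs
 * 2*n^3 floating-point operations; HPL applies the same idea to a
 * distributed LU factorization (roughly (2/3)*n^3 operations).
 * Typical build: cc -O2 flops.c -o flops */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    const int n = 512;                 /* small, illustrative matrix size */
    double *a = malloc((size_t)n * n * sizeof(double));
    double *b = malloc((size_t)n * n * sizeof(double));
    double *c = calloc((size_t)n * n, sizeof(double));
    for (long i = 0; i < (long)n * n; i++) { a[i] = 1.0; b[i] = 2.0; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs  = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
    double flops = 2.0 * n * n * (double)n;   /* one multiply + one add per inner step */
    printf("n=%d  time=%.3f s  rate=%.2f GFLOP/s\n", n, secs, flops / secs / 1e9);

    free(a); free(b); free(c);
    return 0;
}

Dividing such a measured rate by the power drawn during the run gives the GFLOPS/W figure that the Green500 ranks.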
Deployment models and operations
HPC is delivered via national and regional leadership facilities, university and enterprise clusters, and, increasingly, cloud platforms that offer on-demand clusters with fast interconnects, tuned VM images, and managed schedulers. Public examples include Microsoft's Azure HPC offerings, Google Cloud's HPC Toolkit for turnkey cluster deployment, and AWS guidance on HPC concepts and architectures (Azure HPC; Google Cloud HPC Toolkit; AWS – What is HPC?).
Applications
HPC underpins numerical weather prediction and climate modeling. The European Centre for Medium-Range Weather Forecasts operates multi-petascale clusters for global forecasts, using direct liquid cooling, HDR InfiniBand, and Lustre storage; the U.S. National Weather Service similarly relies on supercomputers to assimilate observational data and run atmospheric, ocean, and space-weather models (ECMWF – Supercomputer facility; NWS – About Supercomputers). Across domains, national exascale initiatives report advances in materials, energy, fusion, and biomedical applications enabled by co-designed software stacks and accelerators (Exascale Computing Project).
Related concepts and standards
Key internal topics include Parallel computing models and programming standards such as Message Passing Interface and OpenMP, heterogeneous acceleration through the Graphics processing unit, performance leadership at Exascale computing levels, community rankings via TOP500, and consumption models through Cloud computing.
