High performance cloud computing

High Performance Cloud Computing (HPC2) is a term coined by Robert L. Clay of Sandia National Laboratories for a body of work focused on providing a scalable application runtime environment that applies core notions from cloud computing (specifically, extreme hardware fault tolerance achieved through software) to high performance machine architectures (those with high cross-section bandwidth).

Work on HPC2 emerged in response to the perceived breakdown of several core assumptions of traditional high performance computing (HPC) at extreme scale (exascale and beyond). These assumptions include:

1. That compute nodes persist for the duration of a job.
2. That the MPI programming model will scale to arbitrary size.
3. That sufficiently reliable hardware can be built (so that fault-oblivious software is not required).
4. That capability machines are fundamentally different from capacity machines.

These assertions were intended as much for rhetorical purposes as for strictly technical observation.

An alternative set of assumptions was offered to replace these, based on the perception that at least the first three were in effect failing as machines and applications scaled.

This alternative set of assumptions includes:

1. That compute nodes, along with other hardware components, will fail during the execution of a job.
2. That the MPI cooperative computing model will not scale far enough.
3. That sufficiently reliable hardware is too expensive and impractical at scale.
4. That the capability machines of the future may be similar to capacity machines.

This fourth assertion was posed as a question.
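The first of these alternative assumptions, that nodes will fail mid-job, motivates the idea of fault tolerance through software rather than hardware. A minimal sketch of that idea is a runtime that reassigns a task whenever the worker executing it fails, so one lost node does not kill the whole job. The function and worker names below are hypothetical illustrations, not part of any actual HPC2 implementation.

```python
import random

def run_with_retries(tasks, workers, max_attempts=10, fail_prob=0.3):
    """Hypothetical sketch: execute each task, retrying on a simulated
    node failure, so a single failed worker never aborts the whole job."""
    results = {}
    for task_id, work in tasks.items():
        for attempt in range(max_attempts):
            worker = random.choice(workers)   # assign the task to some node
            if random.random() < fail_prob:   # simulate that node failing
                continue                      # reassign on the next attempt
            results[task_id] = work()         # task completed successfully
            break
        else:
            raise RuntimeError(f"task {task_id} failed on all attempts")
    return results

random.seed(0)  # deterministic simulation for this illustration
tasks = {i: (lambda i=i: i * i) for i in range(5)}
print(run_with_retries(tasks, workers=["node0", "node1", "node2"]))
```

The essential design point is that failure handling lives in the runtime's task loop, not in the hardware or in each task's own code; production systems add checkpointing and failure detection on top of this basic reassignment loop.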

The core notions driving HPC2 center on building an application runtime system that can scale to arbitrary size and that is not specific to any one hardware system design or configuration.