NVIDIA Platforms

CUDA · Tensor Cores · NVLink

Blackwell

Blackwell Architecture (2024–2025)

sm_100 (B100/B200/GB200) / sm_100a (architecture-specific features)

📦 Products / SKUs

  • NVIDIA B100 · SXM5 · 192 GB HBM3e
  • NVIDIA B200 · SXM5 · 192 GB HBM3e
  • NVIDIA B200 NVL · NVLink · 192 GB HBM3e
  • NVIDIA GB200 Grace Blackwell NVL72 · NVL72 rack · 384 GB HBM3e per GB200 superchip (2 GPUs)
  • NVIDIA GB200 NVL2 · NVL2 node · 192 GB HBM3e (per GPU)

🖥️ Software Requirements

  • CUDA: 12.4+
  • Driver (Linux): 550.54.14+
  • Linux Kernel: 5.15+
  • OFED: 23.10-0.5.5.0+
  • Open MPI: 4.1.6+

🔗 Interconnect

  • InfiniBand NDR400 (400 Gb/s)
  • PCIe Gen 6
  • NVLink-C2C 900 GB/s (Grace-Blackwell only)

📋 Notes

GB200 Grace Blackwell pairs an Arm Neoverse V2-based Grace CPU with two Blackwell GPUs via NVLink-C2C. Requires CUDA 12.4+ and driver 550.54.14 or newer. DOCA 2.6+ is required for full NIC/DPU feature support with ConnectX-7.
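
A quick way to confirm a node meets the driver floor above is to ask nvidia-smi for the installed driver version. A minimal sketch, assuming nvidia-smi is on PATH and using the 550.54.14 minimum from the table above:

    # Check the installed NVIDIA driver against the Blackwell minimum (550.54.14).
    import subprocess

    MIN_DRIVER = (550, 54, 14)  # minimum Linux driver for Blackwell, from the table above

    def driver_version() -> tuple:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            text=True,
        )
        # e.g. "550.90.07" -> (550, 90, 7); pad short versions with zeros
        parts = [int(p) for p in out.strip().splitlines()[0].split(".")]
        return tuple(parts + [0] * (3 - len(parts)))

    if __name__ == "__main__":
        found = driver_version()
        verdict = "OK" if found >= MIN_DRIVER else "older than 550.54.14"
        print(".".join(map(str, found)), "->", verdict)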

Hopper

Hopper Architecture (2022–2023)

sm_90 (H100/H200/GH200) / sm_90a (architecture-specific features)

📦 Products / SKUs

  • NVIDIA H100 · SXM5 · 80 GB HBM3
  • NVIDIA H100 · PCIe · 80 GB HBM2e
  • NVIDIA H200 · SXM5 · 141 GB HBM3e
  • NVIDIA H200 NVL · NVLink · 141 GB HBM3e
  • NVIDIA GH200 Grace Hopper Superchip · SXM5 / NVL2 · 96 GB HBM3 (GPU) + 480 GB LPDDR5X (CPU)

🖥️ Software Requirements

  • CUDA: 11.8+
  • Driver (Linux): 520.61.05+
  • Linux Kernel: 5.4+
  • OFED: 5.8-3.0.7.0+
  • Open MPI: 4.1.4+

🔗 Interconnect

  • InfiniBand NDR200 / NDR400 (200–400 Gb/s)
  • PCIe Gen 5
  • NVLink-C2C 900 GB/s (GH200 only)

📋 Notes

H100/H200 require CUDA 11.8+ for basic support; CUDA 12.0+ is recommended. GH200 Grace Hopper requires an Arm-native (aarch64) toolchain. Transformer Engine (FP8) is available from CUDA 11.8+ with cuDNN 8.7+.
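
Where a CUDA-enabled PyTorch build is available, the compute capability reported by the runtime is an easy way to confirm Hopper-class (sm_90) devices before enabling FP8 paths. A minimal sketch, assuming PyTorch with CUDA support is installed:

    # List GPUs that report compute capability 9.0 or newer (sm_90+, i.e. Hopper-class).
    import torch

    def hopper_devices() -> list:
        devs = []
        for i in range(torch.cuda.device_count()):
            if torch.cuda.get_device_capability(i) >= (9, 0):
                devs.append(i)
        return devs

    if __name__ == "__main__":
        ids = hopper_devices()
        print("sm_90+ devices:", ids if ids else "none found")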

Ada Lovelace (Enterprise/HPC)

Ada Lovelace Architecture (2022–2023)

sm_89

📦 Products / SKUs

  • NVIDIA L40S · PCIe · 48 GB GDDR6
  • NVIDIA L40 · PCIe · 48 GB GDDR6
  • NVIDIA RTX 6000 Ada · PCIe · 48 GB GDDR6

🖥️ Software Requirements

  • CUDA: 11.8+
  • Driver (Linux): 520.61.05+
  • Linux Kernel: 5.4+
  • OFED: 5.7-1.0.2.0+
  • Open MPI: 4.1.4+

🔗 Interconnect

  • PCIe Gen 4
  • InfiniBand HDR/NDR (host NIC)

📋 Notes

L40S targets AI inference and training; it uses GDDR6 rather than HBM. NVLink is not available on these PCIe boards.

Ampere

Ampere Architecture (2020–2021)

sm_80 (A100/A800/A30)

📦 Products / SKUs

  • NVIDIA A100 80GB · SXM4 · 80 GB HBM2e
  • NVIDIA A100 40GB · SXM4 · 40 GB HBM2
  • NVIDIA A100 80GB · PCIe · 80 GB HBM2e
  • NVIDIA A800 80GB · SXM4 · 80 GB HBM2e
  • NVIDIA A30 · PCIe · 24 GB HBM2

🖥️ Software Requirements

  • CUDA: 11.0+
  • Driver (Linux): 450.36.06+
  • Linux Kernel: 4.18+
  • OFED: 5.2-2.2.3.0+
  • Open MPI: 4.0.7+

🔗 Interconnect

  • InfiniBand HDR (200 Gb/s)
  • PCIe Gen 4

📋 Notes

A100 is widely deployed in HPC clusters. Ampere introduced the TF32 and BF16 Tensor Core formats. A800 is the export-controlled variant of the A100 with reduced NVLink bandwidth. CUDA 11.0+ is required; 11.4+ is recommended for full BF16 support.
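
As a practical check of the BF16 note above, the following sketch (assuming a CUDA-enabled PyTorch build) tests both the sm_80 hardware requirement and the CUDA 11.4+ toolkit recommendation:

    # Check BF16 eligibility: sm_80 hardware plus a CUDA 11.4+ toolkit, per the note above.
    import torch

    def bf16_ready(device: int = 0) -> bool:
        cap = torch.cuda.get_device_capability(device)
        cuda_ver = tuple(int(x) for x in torch.version.cuda.split("."))
        return cap >= (8, 0) and cuda_ver >= (11, 4)

    if __name__ == "__main__":
        print("BF16 ready:", bf16_ready(), "| torch reports:", torch.cuda.is_bf16_supported())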

Volta

Volta Architecture (2017–2018)

sm_70

📦 Products / SKUs

  • NVIDIA V100 32GB · SXM2 · 32 GB HBM2
  • NVIDIA V100 16GB · SXM2 · 16 GB HBM2
  • NVIDIA V100 32GB · PCIe · 32 GB HBM2
  • NVIDIA V100S · PCIe · 32 GB HBM2

🖥️ Software Requirements

  • CUDA: 9.0+
  • Driver (Linux): 384.81+
  • Linux Kernel: 3.10+
  • OFED: 4.7-1.0.0.1+
  • Open MPI: 3.1.6+

🔗 Interconnect

  • InfiniBand EDR / HDR100 (100 Gb/s)
  • PCIe Gen 3

📋 Notes

Tensor Cores were first introduced with Volta, and V100 systems remain widely deployed in production HPC clusters. CUDA 9.0+ is required; CUDA 11.x is highly recommended. V100 PCIe lacks NVLink; only the SXM2 form factor supports NVLink 2.0.
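
To tell SXM2 (NVLink 2.0) parts apart from PCIe V100s on a running node, the driver can be asked for NVLink status. A rough sketch, assuming nvidia-smi is available; the exact output format varies by driver version, so this only checks whether any link lines are reported:

    # Heuristic NVLink check: PCIe V100s report no links, SXM2 boards list one line per link.
    import subprocess

    def has_nvlink() -> bool:
        out = subprocess.run(
            ["nvidia-smi", "nvlink", "--status"],
            capture_output=True, text=True,
        ).stdout
        return "Link" in out

    if __name__ == "__main__":
        print("NVLink detected:", has_nvlink())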

Next Generation (Rubin)

🔭 Next Gen / Roadmap

Rubin Architecture (2026+ expected)

sm_110+ (estimated)

📦 Products / SKUs

  • NVIDIA R100 / GR200 (Rubin / Rubin Ultra) · SXM6 / NVL (expected) · HBM4 (capacity TBD)

🖥️ Software Requirements

  • CUDA: 13.0+ (estimated)
  • Driver (Linux): TBD
  • Linux Kernel: 6.0+ (estimated)
  • OFED: 24.01+ (estimated)
  • Open MPI: 5.0+ (estimated)

🔗 Interconnect

  • InfiniBand XDR (800+ Gb/s, estimated)
  • PCIe Gen 7 (estimated)

📋 Notes

Rubin follows Blackwell on NVIDIA's two-year cadence. Details are based on public roadmap disclosures (GTC 2024). Software requirements are estimates and will be updated when GA toolchain versions are released.

AMD Platforms

ROCm · CDNA · Infinity Fabric

MI300 Series (CDNA3)

CDNA3 Architecture (2023–2024)

gfx940 (MI300A) / gfx941 / gfx942 (MI300X)

📦 Products / SKUs

  • AMD Instinct MI300X · OAM · 192 GB HBM3
  • AMD Instinct MI300A (APU) · OAM · 128 GB HBM3 (unified CPU+GPU)
  • AMD Instinct MI308X · OAM · 128 GB HBM3 (export-controlled variant)

🖥️ Software Requirements

  • ROCm: 6.0+
  • Driver (Linux): 6.3.0 (amdgpu-dkms)
  • Linux Kernel: 5.15+
  • OFED: 23.10-0.5.5.0+
  • Open MPI: 4.1.6+

🔗 Interconnect

  • InfiniBand NDR200 / NDR400
  • PCIe Gen 5
  • Infinity Fabric (MI300A CPU-GPU)

📋 Notes

MI300X, at 192 GB, had the largest HBM capacity of any GPU at its late-2023 launch. MI300A is an APU combining 24 Zen 4 CPU cores with CDNA3 GPU dies on one package via Infinity Fabric. ROCm 6.0+ is required for full MI300 support; ROCm 6.1+ is recommended for MI300X production workloads.
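
To confirm that a node's ROCm stack actually sees an MI300-class device, rocminfo can be scanned for the gfx94x targets listed above. A minimal sketch, assuming rocminfo (shipped with ROCm) is installed:

    # Look for MI300-class gfx targets in rocminfo output.
    import re
    import subprocess

    MI300_TARGETS = {"gfx940", "gfx941", "gfx942"}  # from the architecture line above

    def detected_targets() -> set:
        out = subprocess.check_output(["rocminfo"], text=True)
        return set(re.findall(r"gfx9\d+[a-z]*", out))

    if __name__ == "__main__":
        found = detected_targets() & MI300_TARGETS
        print("MI300-class targets:", found if found else "none detected")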

MI250 Series (CDNA2)

CDNA2 Architecture (2021–2022)

gfx90a

📦 Products / SKUs

  • AMD Instinct MI250X · OAM · 128 GB HBM2e (2× 64 GB GCDs)
  • AMD Instinct MI250 · OAM · 128 GB HBM2e (lower peak compute than MI250X)
  • AMD Instinct MI210 · PCIe · 64 GB HBM2e

🖥️ Software Requirements

  • ROCm: 5.0+
  • Driver (Linux): 5.13.0 (amdgpu-dkms)
  • Linux Kernel: 5.15+
  • OFED: 5.4-3.5.8.0+
  • Open MPI: 4.1.3+

🔗 Interconnect

  • InfiniBand HDR (200 Gb/s)
  • PCIe Gen 4
  • AMD Infinity Fabric (inter-GCD)

📋 Notes

MI250X consists of two Graphics Compute Dies (GCDs) per OAM module. The Frontier supercomputer, the first exascale system, uses MI250X. ROCm 5.0+ is required; 5.4+ is recommended for stable production use. GPU-aware MPI requires Open MPI 4.1.3+ built against a ROCm-aware UCX.
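
Whether the MPI stack is actually GPU-aware depends on how Open MPI and UCX were built. The heuristic sketch below (assuming ucx_info and ompi_info are on PATH; exact output strings vary by build) checks for ROCm transports in UCX and for the UCX PML in Open MPI:

    # Rough check that the UCX/Open MPI stack advertises ROCm support.
    import subprocess

    def _contains(cmd: list, needle: str) -> bool:
        try:
            out = subprocess.check_output(cmd, text=True, stderr=subprocess.STDOUT)
        except (OSError, subprocess.CalledProcessError):
            return False
        return needle in out.lower()

    if __name__ == "__main__":
        print("UCX lists ROCm transports:", _contains(["ucx_info", "-d"], "rocm"))
        print("Open MPI exposes a UCX PML:", _contains(["ompi_info"], "pml: ucx"))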

MI100 (CDNA1)

CDNA1 Architecture (2020)

gfx908

📦 Products / SKUs

  • AMD Instinct MI100 · PCIe · 32 GB HBM2

🖥️ Software Requirements

  • ROCm: 4.0+
  • Driver (Linux): 5.10.0 (amdgpu-dkms)
  • Linux Kernel: 5.4+
  • OFED: 5.2-2.2.3.0+
  • Open MPI: 4.0.7+

🔗 Interconnect

  • InfiniBand HDR (200 Gb/s)
  • PCIe Gen 4

📋 Notes

First CDNA-architecture GPU and the first AMD Instinct part with Matrix Cores. CDNA is a compute-focused die architecture separate from the RDNA consumer GPU line. ROCm 4.0+ is required; ROCm 5.x provides improved HIP compatibility.

Next Generation (MI350 / CDNA4)

🔭 Next Gen / Roadmap

CDNA4 Architecture (2025+ expected)

gfx950+ (estimated)

📦 Products / SKUs

  • AMD Instinct MI350X (expected) · OAM (expected) · 288 GB+ HBM3e (estimated)
  • AMD Instinct MI355X (expected) · OAM (expected) · HBM3e (estimated)

🖥️ Software Requirements

  • ROCm: 6.2+ (estimated)
  • Driver (Linux): TBD
  • Linux Kernel: 6.1+ (estimated)
  • OFED: 24.01+ (estimated)
  • Open MPI: 5.0+ (estimated)

🔗 Interconnect

  • InfiniBand XDR (800 Gb/s, estimated)
  • PCIe Gen 5/6 (estimated)

📋 Notes

MI350 series follows MI300 on AMD's annual cadence. Based on public roadmap disclosures (AMD Financial Analyst Day 2023, CES 2024). All software requirements are estimates pending GA toolchain release.

🔌 API Access

Hardware platform data is also available via the REST API (requires API key):

GET /api/v1/hardware
GET /api/v1/hardware/NVIDIA
GET /api/v1/hardware/AMD
GET /api/v1/hardware/NVIDIA/Blackwell

Send an Accept: application/yaml header to receive YAML output.
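
A minimal client sketch for the endpoints above. The base URL and the X-API-Key header name are placeholders; substitute whatever host and authentication scheme your deployment uses:

    # Query the hardware endpoints shown above and request YAML output.
    import requests

    BASE_URL = "https://example.com/api/v1"   # placeholder host
    API_KEY = "YOUR_API_KEY"                  # placeholder credential

    headers = {
        "X-API-Key": API_KEY,           # assumed auth header name
        "Accept": "application/yaml",   # omit to use the API's default format
    }

    resp = requests.get(f"{BASE_URL}/hardware/NVIDIA/Blackwell", headers=headers, timeout=10)
    resp.raise_for_status()
    print(resp.text)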