| Specification | Value |
| --- | --- |
| GPU Architecture | Turing |
| CUDA Parallel Processing Cores | 4608 |
| NVIDIA Tensor Cores | 576 |
| NVIDIA RT Cores | 72 |
| Frame Buffer Memory | 48 GB GDDR6 |
| RTX-OPS | 80T |
| Rays Cast | 10 Giga Rays/Sec |
| Peak Single Precision (FP32) Performance | 14.9 TFLOPS |
| Peak Half Precision (FP16) Performance | 29.9 TFLOPS |
| Peak Integer Operation (INT8) Performance | 238.9 TOPS |
| Deep Learning TeraFLOPS¹ | 119.4 Tensor TFLOPS |
| Memory Interface | 384-bit |
| Memory Bandwidth | 624 GB/s |
| Max Power Consumption | 250 W |
| Graphics Bus | PCI Express 3.0 x16 |
| Form Factor | 4.4" H x 10.5" L, Dual Slot |
| Product Weight | 1200 g |
| Thermal Solution | Passive |
| NVLink Interconnect | 100 GB/s |

¹ FP16 matrix multiply with FP16 or FP32 accumulate
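The headline figures in the specification table are related to each other by simple arithmetic. The sketch below is a sanity check rather than an official derivation: it assumes one RT Core per SM (as stated in the CUDA feature list) and that a CUDA core retires one fused multiply-add, counted as 2 FLOPs, per clock. The variable names and the derived clock and per-pin data rate are illustrative values computed here, not published specifications.

```python
# Figures copied from the specification table.
cuda_cores = 4608
tensor_cores = 576
rt_cores = 72            # one RT Core per SM, so this doubles as the SM count
fp32_tflops = 14.9
tensor_tflops = 119.4
bus_width_bits = 384
bandwidth_gb_s = 624

# Assumption (not in the table): one fused multiply-add per CUDA core
# per clock, counted as 2 floating-point operations.
FLOPS_PER_CORE_PER_CLOCK = 2

# Per-SM resources implied by the totals.
sm_count = rt_cores
cores_per_sm = cuda_cores // sm_count
tensor_cores_per_sm = tensor_cores // sm_count

# Clock implied by the FP32 peak: TFLOPS = cores * 2 * clock.
implied_clock_ghz = (
    fp32_tflops * 1e12 / (cuda_cores * FLOPS_PER_CORE_PER_CLOCK) / 1e9
)

# Per-pin data rate implied by the bandwidth:
# GB/s = (bus width in bits / 8) * Gbit/s per pin.
per_pin_gbps = bandwidth_gb_s * 8 / bus_width_bits

# FLOPs each Tensor Core would need per clock to reach the tensor peak.
flops_per_tensor_core_clock = (
    tensor_tflops * 1e12 / (tensor_cores * implied_clock_ghz * 1e9)
)

print(f"SMs: {sm_count}, CUDA cores/SM: {cores_per_sm}, "
      f"Tensor Cores/SM: {tensor_cores_per_sm}")
print(f"Implied clock: {implied_clock_ghz:.3f} GHz")
print(f"Implied per-pin data rate: {per_pin_gbps:.1f} Gbit/s")
print(f"Implied FLOPs per Tensor Core per clock: "
      f"{flops_per_tensor_core_clock:.1f}")
```

Under these assumptions the totals divide evenly into 64 CUDA cores and 8 Tensor Cores per SM, and the 624 GB/s bandwidth over a 384-bit interface corresponds to 13 Gbit/s per pin.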
Supported Platforms
- Microsoft Windows Server 2019
- Microsoft Windows Server 2016
- Microsoft Windows Server 2012 R2 (64-bit)
- Microsoft Windows Server 2008 R2 (64-bit)
- Microsoft Windows 10 (64-bit)
- Microsoft Windows 8 and 8.1 (64-bit)
- Microsoft Windows 7 (64-bit)
- Linux® - Full OpenGL implementation, complete with NVIDIA and ARB extensions (64-bit)
3D Graphics Architecture
- Scalable geometry architecture
- Hardware tessellation engine
- NVIDIA® GigaThread™ engine with 7 async copy engines
- Shader Model 5.1 (OpenGL 4.5 and DirectX 12)
- Up to 32K x 32K texture and render processing
- Transparent multisampling and super sampling
- 16x angle independent anisotropic filtering
- 32-bit per-component floating point texture filtering and blending
- 64x full scene antialiasing (FSAA)/128x FSAA in SLI Mode
- Decode acceleration for MPEG-2, MPEG-4 Part 2 Advanced Simple Profile, H.264, HEVC, MVC, VC1, DivX (version 3.11 and later), and Flash (10.1 and later)
- Dedicated H.264 & HEVC Encoder
- Blu-ray dual-stream hardware acceleration (supporting HD picture-in-picture playback)
- NVIDIA GPU Boost (automatically improves GPU engine throughput to maximize application performance)
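To put the 32K x 32K texture limit and 32-bit per-component floating point support in context, the following sketch estimates the memory footprint of a single maximum-size texture. It assumes that "32K" means 32768 texels per side, that the texture has four components (RGBA), that no mipmap chain is stored, and that the 48 GB frame buffer is counted in binary gigabytes. The function name `texture_bytes` is hypothetical.

```python
def texture_bytes(width: int, height: int,
                  components: int, bytes_per_component: int) -> int:
    """Size in bytes of one uncompressed 2D texture level (no mipmaps)."""
    return width * height * components * bytes_per_component


SIDE = 32 * 1024                    # assumption: "32K" = 32768 texels
FRAME_BUFFER_BYTES = 48 * 1024**3   # assumption: 48 GB counted as 48 GiB

# RGBA with 32-bit floating point components (4 bytes each).
rgba32f = texture_bytes(SIDE, SIDE, components=4, bytes_per_component=4)

# RGBA with 8-bit components, for comparison.
rgba8 = texture_bytes(SIDE, SIDE, components=4, bytes_per_component=1)

for name, size in (("RGBA32F", rgba32f), ("RGBA8", rgba8)):
    gib = size / 1024**3
    share = size / FRAME_BUFFER_BYTES
    print(f"{name}: {gib:.0f} GiB ({share:.0%} of the frame buffer)")
```

A single full-size RGBA32F texture comes to 16 GiB under these assumptions, which is one reason a large frame buffer matters for workloads that use textures near the maximum size.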
NVIDIA CUDA Parallel Processing Architecture
- New RT (Ray Tracing) Core per SM
- Turing SM Architecture (streaming multiprocessor design that delivers greater processing efficiency)
- Dynamic Parallelism (GPU dynamically spawns new threads without going back to the CPU)
- Mixed-precision (1-, 4-, 8-, 16-, 32- and 64-bit) computing
- API support includes CUDA C, CUDA C++, DirectCompute 5.0, OpenCL, Java, Python and Fortran
- Error correction codes (ECC) on graphics memory
- Configurable up to 96 KB of RAM (dedicated shared memory size per SM)
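The mixed-precision bullet above, together with the specification footnote about FP16 matrix multiply with FP16 or FP32 accumulate, comes down to a choice of accumulator width. The sketch below emulates that choice on the CPU using only the Python standard library. It is not how Tensor Cores are programmed and does not reproduce their exact rounding behaviour; the helper names (`to_fp16`, `to_fp32`, `dot`) are hypothetical and exist only for this illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot(a, b, round_acc):
    """Dot product of FP16 inputs with a caller-chosen accumulator precision.

    Inputs are rounded to FP16 first. Each product is formed in Python's
    double precision, and the running sum is rounded with `round_acc`
    after every addition to emulate a narrower accumulator.
    """
    acc = 0.0
    for x, y in zip(a, b):
        product = to_fp16(x) * to_fp16(y)
        acc = round_acc(acc + product)
    return acc


n = 4096
a = [1.0] * n
b = [1.0] * n

fp16_sum = dot(a, b, to_fp16)   # FP16 multiply, FP16 accumulate
fp32_sum = dot(a, b, to_fp32)   # FP16 multiply, FP32 accumulate

print(f"FP16 accumulate: {fp16_sum}")
print(f"FP32 accumulate: {fp32_sum}")
```

In this emulation the FP32 accumulator reaches the exact answer of 4096, while the FP16 accumulator stops growing at 2048, where adjacent half-precision values are 2 apart and adding 1 no longer changes the sum. This is the usual motivation for choosing FP32 accumulation when summing many FP16 products.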