NVIDIA RTX PRO 6000 instead of H200 for ZeroGPU

Upon restarting today, ZeroGPU Spaces seem to switch to the NVIDIA RTX PRO 6000 instead of running on the H200. The lower VRAM breaks some of the Spaces.

GPU: NVIDIA RTX PRO 6000 Blackwell Server Edition
Capability: (12, 0)
CUDA: 12.8

Yeah. I encountered the same issue in one of my Spaces. Since the NVIDIA RTX PRO 6000 is a Blackwell architecture GPU, I had to update PyTorch to version 2.7.1 or later and use cu128 or later, so I did just that.

This might be difficult to resolve in Spaces where updates have stopped or that require an older version of PyTorch…

Part of container log:

/usr/local/lib/python3.10/site-packages/torch/cuda/__init__.py:235: UserWarning: NVIDIA RTX PRO 6000 Blackwell Server Edition MIG 2g.48gb with CUDA capability sm_120 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_50 sm_60 sm_70 sm_75 sm_80 sm_86 sm_90. If you want to use the NVIDIA RTX PRO 6000 Blackwell Server Edition MIG 2g.48gb GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
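The warning boils down to a set-membership check: a PyTorch wheel ships kernels for a fixed list of compute capabilities, and the old wheel's list stops at sm_90, while Blackwell reports capability (12, 0), i.e. sm_120. A minimal sketch of that check in plain Python (the `SUPPORTED` set is copied from the warning above; the helper names are illustrative, not PyTorch API):

```python
# Compute-capability check, mirroring the logic behind PyTorch's warning.
# SUPPORTED is the arch list printed in the warning; function names are
# made up for illustration.

SUPPORTED = {"sm_50", "sm_60", "sm_70", "sm_75", "sm_80", "sm_86", "sm_90"}

def sm_code(capability):
    """Turn a (major, minor) capability tuple into an sm_XY string."""
    major, minor = capability
    return f"sm_{major}{minor}"

def wheel_supports(capability, supported=SUPPORTED):
    """True if the installed wheel ships kernels for this capability."""
    return sm_code(capability) in supported

print(sm_code((12, 0)), wheel_supports((12, 0)))  # sm_120 False (Blackwell)
print(sm_code((9, 0)), wheel_supports((9, 0)))    # sm_90 True (Hopper/H200)
```

(The real check in PyTorch is a bit more forgiving via PTX forward compatibility, but for sm_120 on a pre-cu128 wheel the result is the same: no usable kernels.)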

Edit:
Oh… Seems 2.8.0 or later is necessary now…

configuration error
torch version in requirements.txt is not compatible with ZeroGPU. Supported versions: 2.11.0, 2.10.0, 2.9.1, 2.8.0
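Given that error, the fix is to pin one of the listed versions in the Space's requirements.txt (a sketch; whether you also need a cu128 extra-index URL depends on how the Space installs torch):

```
torch==2.8.0
```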

This goes for all my other Spaces as well; it's an issue across the board. I managed to fix some of them by downgrading the Spaces to slower or lower-resolution inference, but it just seems backwards to downgrade all Spaces from the H200 to the PRO 6000. Is there no way to enforce the H200 for ZeroGPU, or is this a permanent downgrade for all Spaces? Also, after upgrading to 2.10, I still get the PRO 6000 as the GPU, not the H200:

2.10.0+cu128
12.8
NVIDIA RTX PRO 6000 Blackwell Server Edition
(12, 0)
3.10.13 (main, Mar 12 2024, 12:16:25) [GCC 12.2.0]
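The version string above can be checked programmatically against the supported list from the configuration error. A small sketch in plain Python (the version set is taken from the error message; note that `torch.__version__` reports base version plus a local build tag, e.g. `2.10.0+cu128`):

```python
# Check a torch version string (as reported by torch.__version__)
# against the ZeroGPU-supported versions from the error message above.

ZEROGPU_SUPPORTED = {"2.11.0", "2.10.0", "2.9.1", "2.8.0"}

def base_version(torch_version):
    """Strip the local build tag, e.g. '2.10.0+cu128' -> '2.10.0'."""
    return torch_version.split("+", 1)[0]

def zerogpu_compatible(torch_version):
    return base_version(torch_version) in ZEROGPU_SUPPORTED

print(zerogpu_compatible("2.10.0+cu128"))  # True
print(zerogpu_compatible("2.6.0+cu124"))   # False
```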

It isn’t mentioned in the documentation, and I don’t know anything for sure… These are just lessons learned from observing how it works…

Does upgrading to Torch 2.8.0+ bring H200 support back?

No. This is merely a workaround to prevent errors from occurring even if you happen to get a Blackwell-generation GPU.

Edit:
For example, many LTX-related Spaces use torch==2.8.0, but they may still be hitting generation errors due to insufficient VRAM: LTX 2.3 Spaces Not Working!

The RTX PRO 6000 has lower VRAM than the H200, especially with the ZeroGPU overhead, so most big models that ran on the H200 before won't run easily unless some offloading gets engineered into the code, which in turn means more inference time against the ZeroGPU quota.
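As a rough back-of-the-envelope illustration of why models stop fitting (all figures are assumptions: the `2g.48gb` MIG profile in the earlier log suggests roughly 48 GB visible, the H200 exposes about 141 GB, and the overhead constant is a placeholder):

```python
# Crude VRAM fit check. Numbers are illustrative assumptions:
# ~48 GB from the "2g.48gb" MIG slice in the log, ~141 GB for an H200.

H200_GB = 141.0
MIG_SLICE_GB = 48.0

def fits(model_gb, vram_gb, overhead_gb=4.0):
    """Very rough estimate: weights plus a fixed runtime overhead must fit."""
    return model_gb + overhead_gb <= vram_gb

# A ~60 GB model that ran on the H200 no longer fits the 48 GB slice,
# so it needs offloading (and therefore slower inference on the quota).
print(fits(60, H200_GB), fits(60, MIG_SLICE_GB))  # True False
```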

Just a heads-up. The ZeroGPU docs are being updated to reflect this hardware change.

Thank you!

When are the updates happening? I'm itching to get generating again.

For now, I’ve created ZeroGPU Blackwell Runtime Drift Recovery Guide v1 based on the publicly available information.