Thanks @muellerzr, could you also take a look at a related problem I have: "ZeRO uses more RAM than DDP"?