I think that method is correct, but there seem to be reports of tensor mismatch issues when training with FSDP.
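If you want to check whether you are hitting one of those mismatches, one rough way is to compare the parameter names and shapes in the checkpoint against the ones the model expects. The helper below is only a sketch: `find_shape_mismatches` is a hypothetical name, not part of PyTorch or Accelerate, and it assumes both arguments are plain state-dict-like mappings whose values expose a `.shape` attribute.

```python
from typing import Any, Dict, Mapping


def find_shape_mismatches(
    model_state: Mapping[str, Any],
    ckpt_state: Mapping[str, Any],
) -> Dict[str, list]:
    """Compare two state-dict-like mappings by key and by tensor shape.

    Hypothetical diagnostic helper: values only need a `.shape` attribute,
    so it works on torch tensors without importing torch here.
    """
    model_keys = set(model_state)
    ckpt_keys = set(ckpt_state)

    report: Dict[str, list] = {
        # Parameters the model expects but the checkpoint does not contain.
        "missing_in_checkpoint": sorted(model_keys - ckpt_keys),
        # Entries in the checkpoint that the model does not know about.
        "unexpected_in_checkpoint": sorted(ckpt_keys - model_keys),
        # Shared names whose shapes differ: (name, model_shape, ckpt_shape).
        "shape_mismatch": [],
    }

    for name in sorted(model_keys & ckpt_keys):
        model_shape = tuple(model_state[name].shape)
        ckpt_shape = tuple(ckpt_state[name].shape)
        if model_shape != ckpt_shape:
            report["shape_mismatch"].append((name, model_shape, ckpt_shape))

    return report
```

You would call it with something like `model.state_dict()` and the dictionary you loaded from the checkpoint file. If the same names show up with a flattened or much smaller shape on the checkpoint side, that may mean the checkpoint was saved in sharded or flattened form rather than as a full state dict, but treat that as a hint to investigate, not a diagnosis.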
John6666