If we are using an IterableDataset, does the HuggingFace Trainer automatically split the dataset by node, or is that something one has to do manually? See: How to handle IterableDataset with HuggingFace trainer and num_workers in DDP setup
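To make the "manually" option concrete, here is a minimal sketch of per-rank sharding of a stream in plain Python. The helper name `shard_by_rank` and the toy `range(10)` stream are hypothetical and only for illustration. In a real setup one would more likely call `split_dataset_by_node(dataset, rank=..., world_size=...)` from `datasets.distributed` before handing the dataset to the Trainer. This sketch says nothing about whether the Trainer already does this internally, which is exactly the open question.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def shard_by_rank(examples: Iterable[T], rank: int, world_size: int) -> Iterator[T]:
    """Hypothetical helper: keep every `world_size`-th example, starting at `rank`.

    Each process iterates the same stream but only yields its own slice,
    so no example is seen by two ranks.
    """
    if not 0 <= rank < world_size:
        raise ValueError(f"rank must be in [0, {world_size}), got {rank}")
    return islice(examples, rank, None, world_size)


# Toy stream standing in for an iterable dataset (assumption for the demo).
stream = range(10)

# Simulate two processes (world_size=2), each taking its own shard.
shards = [list(shard_by_rank(stream, rank, world_size=2)) for rank in range(2)]
print(shards)  # [[0, 2, 4, 6, 8], [1, 3, 5, 7, 9]]
```

In an actual DDP run, `rank` and `world_size` would come from the launcher (for example the `RANK` and `WORLD_SIZE` environment variables set by `torchrun`) rather than a loop.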