Expert guidance for Fully Sharded Data Parallel (FSDP) training in PyTorch: parameter sharding, mixed precision, CPU offloading, and FSDP2.
Rating: 7.5
Installs: 0
Category: Machine Learning
This skill provides PyTorch FSDP guidance extracted from the official documentation. The description clearly identifies when to use the skill (FSDP-related tasks), and the content covers detailed API documentation, initialization patterns, distributed communication primitives, and debugging guidance. The structure is reasonable and includes quick-reference patterns, but SKILL.md is cluttered with large documentation blocks that would be better organized into separate reference files. Novelty is moderate: the skill consolidates scattered FSDP documentation, and much of that information could be retrieved through standard documentation queries, but having it pre-organized saves tokens and time for complex FSDP setups involving parameter sharding, mixed precision, and CPU offloading.