TacoSkill LAB

training-llms-megatron

7.6

by zechenzhangAGI

94 Favorites
261 Upvotes
0 Downvotes

Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with advanced parallelism strategies. Use it when training models with more than 1B parameters, when you need maximum GPU efficiency (47% MFU on H100), or when you require tensor, pipeline, sequence, context, or expert parallelism. A production-ready framework used for Nemotron, LLaMA, and DeepSeek.
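
As a rough illustration of how the parallelism degrees named above combine, the sketch below assumes the common convention that the total GPU count factors into tensor (TP), pipeline (PP), context (CP), and data-parallel (DP) degrees. The function name and the example sizes are hypothetical; they are not taken from the skill or from Megatron-Core's API.

```python
def data_parallel_size(world_size: int, tp: int = 1, pp: int = 1, cp: int = 1) -> int:
    """Return the data-parallel degree left after model parallelism.

    Assumes world_size = TP * PP * CP * DP. Sequence parallelism is not a
    separate factor here because Megatron-style setups run it inside the
    tensor-parallel group. Expert parallelism is left out of this sketch.
    """
    model_parallel = tp * pp * cp
    if min(world_size, tp, pp, cp) < 1:
        raise ValueError("all sizes must be positive integers")
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by TP*PP*CP={model_parallel}"
        )
    return world_size // model_parallel


# Hypothetical example: 64 GPUs split with TP=8 and PP=4.
print(data_parallel_size(64, tp=8, pp=4))
```

A check like this is useful before launching a job, since a GPU count that does not divide evenly by the model-parallel degrees cannot be mapped onto a valid layout.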

llm-training

Rating: 7.6

Installs: 0

Category: AI & LLM

Quick Review

Exceptional skill for training large-scale LLMs using Megatron-Core. The description is comprehensive and clearly states when to use this skill (models over 1B parameters, a need for GPU efficiency, specific parallelism strategies). Task knowledge is outstanding: three complete, actionable workflows cover standard training, MoE models, and performance optimization, backed by extensive troubleshooting. The structure is clean, with a well-organized SKILL.md that provides practical examples and checklists and defers deep-dive details to reference files. Novelty is extremely high: training 70B-462B parameter models at 47% MFU with complex multi-dimensional parallelism (TP/PP/DP/CP/EP) would require extensive research and many tokens for a CLI agent to work out independently. The skill provides production-ready configurations, hardware requirements, and real-world optimization strategies that significantly reduce implementation complexity and cost. One minor improvement: add a decision tree for selecting parallelism based on model size and GPU count.

LLM Signals

Description coverage: 9
Task knowledge: 10
Structure: 9
Novelty: 10

GitHub Signals

891
74
19
2
Last commit 0 days ago

Publisher

zechenzhangAGI

Skill Author

Related Skills

rag-architect by Jeffallan (7.0)

prompt-engineer by Jeffallan (7.0)

fine-tuning-expert by Jeffallan (6.4)

mcp-developer by Jeffallan (6.4)