TacoSkill LAB
© 2026 TacoSkill LAB

training-llms-megatron

8.7 by davila7

80 Favorites · 445 Upvotes · 0 Downvotes

Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with advanced parallelism strategies. Use when training models >1B parameters, need maximum GPU efficiency (47% MFU on H100), or require tensor/pipeline/sequence/context/expert parallelism. Production-ready framework used for Nemotron, LLaMA, DeepSeek.
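The parallelism degrees named above compose multiplicatively. As a minimal sketch (the helper name and the example sizes are illustrative assumptions, not taken from the skill), the tensor/pipeline/context/data-parallel sizes determine the GPU world size, while expert parallelism for MoE layers is typically carved out of the data-parallel ranks in Megatron-Core rather than adding GPUs:

```python
def required_gpus(tp: int, pp: int, cp: int, dp: int) -> int:
    """World size implied by a dense 4D parallel layout.

    tp: tensor-parallel size (splits each layer's matmuls across GPUs)
    pp: pipeline-parallel size (splits layers into stages)
    cp: context-parallel size (splits the sequence dimension)
    dp: data-parallel size (replicates the resulting model shards)

    Expert parallelism (EP) for MoE models is usually mapped onto the
    data-parallel dimension, so it does not add extra GPUs here.
    """
    return tp * pp * cp * dp


# Illustrative layout for a large dense run (numbers are hypothetical):
print(required_gpus(tp=8, pp=4, cp=2, dp=4))  # 256 GPUs
```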

llm-training

Rating: 8.7
Installs: 0
Category: AI & LLM

Quick Review

Exceptional skill for large-scale LLM training. The description clearly articulates when to use this skill (>1B parameters, GPU efficiency needs, advanced parallelism). SKILL.md provides comprehensive, production-ready workflows with concrete examples for LLaMA training, MoE models, and performance optimization. Task knowledge is outstanding with detailed code snippets, parallelism configuration tables, troubleshooting guides, and clear step-by-step checklists. Structure is clean with a well-organized overview and appropriate delegation to reference files for deep dives. Novelty is extremely high—training 70B-405B parameter models with advanced 5D parallelism (TP/PP/DP/CP/EP) and achieving 47% MFU would require hundreds of thousands of tokens for a CLI agent to figure out independently, making this skill highly cost-effective. Minor improvement possible: could slightly expand the 'When to use vs alternatives' section, but overall this is a production-grade, highly valuable skill.
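The 47% MFU figure cited above can be sanity-checked with the common estimate of roughly 6 training FLOPs per parameter per token. A minimal sketch (the throughput number and the ~989 TFLOP/s H100 dense BF16 peak are illustrative assumptions, not measurements from the skill):

```python
def model_flops_utilization(n_params: float, tokens_per_sec: float,
                            n_gpus: int, peak_flops: float = 989e12) -> float:
    """Estimate MFU: achieved training FLOP/s over aggregate peak FLOP/s.

    Uses the common ~6 FLOPs per parameter per token approximation for a
    forward+backward pass. peak_flops defaults to an assumed H100 dense
    BF16 peak of ~989 TFLOP/s.
    """
    achieved = 6 * n_params * tokens_per_sec
    return achieved / (n_gpus * peak_flops)


# Hypothetical 70B-parameter run on 64 H100s at ~70k tokens/s aggregate:
print(f"{model_flops_utilization(70e9, 70_000, 64):.0%}")  # ~46%
```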

LLM Signals

Description coverage: 9
Task knowledge: 10
Structure: 9
Novelty: 10

GitHub Signals

18,073
1,635
132
71
Last commit: today

Publisher

davila7 (Skill Author)



Related Skills

rag-architect by Jeffallan (7.0)
prompt-engineer by Jeffallan (7.0)
fine-tuning-expert by Jeffallan (6.4)
mcp-developer by Jeffallan (6.4)