Optimize Apache Spark jobs with partitioning, caching, shuffle optimization, and memory tuning. Use when improving Spark performance, debugging slow jobs, or scaling data processing pipelines.
Rating: 8.1
Installs: 0
Category: Data & Analytics
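As a hedged illustration of the patterns the description names (partitioning, caching, broadcast joins, shuffle tuning), here is a minimal PySpark sketch. The paths, table names, and columns (events, dim_users, user_id, event_date) are hypothetical, and the shuffle-partition setting is an assumed value to be tuned per cluster.

```python
# Minimal sketch of common Spark optimization moves: a broadcast join,
# caching a reused DataFrame, and repartitioning before a wide aggregation.
# All paths, tables, and columns below are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("spark-optimization-sketch")
    # Assumed shuffle partition count; tune to data volume and cluster size.
    .config("spark.sql.shuffle.partitions", "200")
    .getOrCreate()
)

events = spark.read.parquet("s3://bucket/events/")        # hypothetical path
dim_users = spark.read.parquet("s3://bucket/dim_users/")  # hypothetical path

# Broadcast the small dimension table to avoid shuffling the large fact table.
joined = events.join(F.broadcast(dim_users), on="user_id", how="left")

# Cache a DataFrame that multiple downstream actions will reuse.
joined.cache()

# Repartition on the aggregation key before the wide operation.
daily_counts = (
    joined.repartition("event_date")
    .groupBy("event_date")
    .agg(F.count("*").alias("events"))
)

daily_counts.write.mode("overwrite").parquet("s3://bucket/daily_counts/")
```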
Excellent Spark optimization skill with comprehensive coverage of production patterns, including partitioning, joins, caching, memory tuning, and shuffle optimization. The description clearly identifies when to use the skill. Task knowledge is strong, with detailed code examples, configuration templates, and practical patterns for common scenarios (skew joins, broadcast joins, bucketing). Structure is good, with clear sections and a helpful quick start; the single-file format is acceptable given Spark's cohesive optimization domain. Novelty is solid: while Spark documentation exists, this skill consolidates production patterns, provides decision matrices, and includes diagnostic utilities that an agent would otherwise need substantial tokens to synthesize from scratch. Minor improvements could include more cross-references between patterns and additional troubleshooting workflows.
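For context on the skew-join pattern the review mentions, a hedged sketch of key salting follows: the hot key is spread across several partitions by adding a random salt before the join. The DataFrame names, join key, and salt factor are illustrative assumptions, not taken from the skill itself.

```python
# Hedged sketch of a salted join for skewed keys. Table paths, the
# "join_key" column, and SALT_BUCKETS are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("skew-join-sketch").getOrCreate()

SALT_BUCKETS = 16  # assumed; tune to the observed skew

facts = spark.read.parquet("s3://bucket/facts/")  # hypothetical path
dims = spark.read.parquet("s3://bucket/dims/")    # hypothetical path

# Add a random salt to the skewed (large) side so hot keys spread out.
facts_salted = facts.withColumn(
    "salt", (F.rand() * SALT_BUCKETS).cast("int")
)

# Replicate the small side once per salt value so every salt can match.
dims_salted = dims.crossJoin(
    spark.range(SALT_BUCKETS).select(F.col("id").cast("int").alias("salt"))
)

# Join on the original key plus the salt, then drop the helper column.
joined = facts_salted.join(dims_salted, on=["join_key", "salt"]).drop("salt")
```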