Use this skill for processing and analyzing large tabular datasets (billions of rows) that exceed available RAM. Vaex excels at out-of-core DataFrame operations, lazy evaluation, fast aggregations, efficient visualization of big data, and machine learning on large datasets. Apply when users need to work with large CSV/HDF5/Arrow/Parquet files, perform fast statistics on massive datasets, create visualizations of big data, or build ML pipelines on data that doesn't fit in memory.
Rating: 8.7
Installs: 0
Category: Data & Analytics
Excellent skill for big data processing with Vaex. The description clearly articulates when to use this skill (large tabular datasets exceeding RAM), and SKILL.md provides comprehensive guidance with a well-organized structure. The six capability areas are logically presented with clear pointers to reference files, enabling a CLI agent to easily determine which references to load for specific tasks. The Quick Start Pattern and Common Patterns sections provide actionable code examples. Task knowledge is thorough, covering DataFrames, processing, performance, visualization, ML, and I/O with best practices. Structure is exemplary: a concise overview in SKILL.md, with detailed content delegated to references. Novelty is strong: processing billion-row datasets efficiently requires specialized knowledge that would consume many tokens if a CLI agent attempted it without the skill. The only minor deduction on novelty is that some basic DataFrame operations overlap with pandas, though the out-of-core and performance optimization aspects are clearly differentiated and valuable.
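To make the "out-of-core" idea in the description concrete, here is a minimal sketch of a streaming aggregation that never holds the full dataset in memory. It uses only the Python standard library and is not Vaex's API or implementation (Vaex relies on memory mapping and lazy expressions instead); the helper name `streaming_mean` and the sample data are assumptions made purely for illustration.

```python
import csv
import io


def streaming_mean(rows, column):
    """Mean of `column`, consuming one row at a time so memory use stays constant.

    Hypothetical helper for illustration only; not part of Vaex.
    """
    total = 0.0
    count = 0
    for row in rows:
        total += float(row[column])
        count += 1
    # Return NaN for an empty input rather than dividing by zero.
    return total / count if count else float("nan")


# Tiny in-memory stand-in for a CSV file that would be too large to load at once.
sample = io.StringIO("x,y\n1,10\n2,20\n3,30\n")
mean_y = streaming_mean(csv.DictReader(sample), "y")
print(mean_y)
```

The same pattern (accumulate a small running state while scanning the data once) is what lets aggregations scale past available RAM; a library such as Vaex applies it over columnar, memory-mapped files rather than row-by-row text parsing.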

Skill Author