State-space model with O(n) complexity vs Transformers' O(n²). 5× faster inference, million-token sequences, no KV cache. Selective SSM with hardware-aware design. Mamba-1 (d_state=16) and Mamba-2 (d_state=128, multi-head). Models 130M-2.8B on HuggingFace.
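The skill's own examples are not shown on this page; as a rough illustration of the block-level API the description refers to, the sketch below assumes the mamba_ssm package (≥ 2.0) and a CUDA device, with the d_state values (16 vs 128) taken from the description. Exact defaults may differ by package version.

```python
# Minimal sketch: instantiating Mamba-1 and Mamba-2 blocks.
# Assumes mamba_ssm >= 2.0 is installed and a CUDA device is available.
import torch
from mamba_ssm import Mamba, Mamba2

batch, seqlen, d_model = 2, 1024, 256
x = torch.randn(batch, seqlen, d_model, device="cuda")

# Mamba-1 block: selective SSM with a small per-channel state (d_state=16).
mamba1 = Mamba(d_model=d_model, d_state=16, d_conv=4, expand=2).to("cuda")

# Mamba-2 block: larger multi-head state (d_state=128); note that
# d_model * expand must be divisible by the head dimension (64 by default).
mamba2 = Mamba2(d_model=d_model, d_state=128, d_conv=4, expand=2).to("cuda")

y1 = mamba1(x)  # output shape matches the input: (batch, seqlen, d_model)
y2 = mamba2(x)
assert y1.shape == x.shape and y2.shape == x.shape
```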
Rating: 8.1
Installs: 0
Category: AI & LLM
Excellent skill documentation for the Mamba architecture. The description clearly articulates the key value proposition (O(n) complexity, 5× faster inference, no KV cache), which makes it easy to decide when to invoke the skill from the CLI. Task knowledge is comprehensive, with complete code examples covering installation, basic usage, pretrained models, the Mamba-1 vs Mamba-2 comparison, and benchmarking workflows. Structure is clean, progressing logically from quick start to advanced topics, though all content lives in SKILL.md; reference files are mentioned but would only hold supplementary detail. Novelty is moderate to good: while Mamba models could be used through standard CLI tools, the skill consolidates version-specific differences (Mamba-1 vs Mamba-2), provides ready-to-use configuration templates, and offers comparative benchmarking scripts that would otherwise require extensive documentation review. The troubleshooting section and hardware requirements add practical value. The novelty score could improve with more complex workflows, e.g., fine-tuning pipelines or hybrid architectures.
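The skill's benchmarking scripts are not reproduced on this page; as a hedged illustration of the pretrained-model workflow the review mentions, the sketch below assumes a transformers version with Mamba support (≥ 4.39) and the state-spaces/mamba-130m-hf checkpoint on HuggingFace. It is not the skill's own code.

```python
# Illustrative sketch: load a pretrained 130M Mamba checkpoint and time a
# short generation as a crude throughput check.
# Assumes transformers >= 4.39 and network access to HuggingFace.
import time
import torch
from transformers import AutoTokenizer, MambaForCausalLM

tokenizer = AutoTokenizer.from_pretrained("state-spaces/mamba-130m-hf")
model = MambaForCausalLM.from_pretrained("state-spaces/mamba-130m-hf")
model.eval()

inputs = tokenizer("State-space models scale linearly because", return_tensors="pt")

start = time.perf_counter()
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
elapsed = time.perf_counter() - start

print(tokenizer.decode(out[0], skip_special_tokens=True))
print(f"Generated 64 tokens in {elapsed:.2f}s")
```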
