Implements the NOWAIT technique for efficient reasoning in R1-style LLMs. Use when optimizing inference of reasoning models (QwQ, DeepSeek-R1, Phi4-Reasoning, Qwen3, Kimi-VL, QvQ), reducing chain-of-thought token usage by 27-51% while preserving accuracy. Triggers on "optimize reasoning", "reduce thinking tokens", "efficient inference", "suppress reflection tokens", or when working with verbose CoT outputs.
Rating: 8.1
Installs: 0
Category: AI & LLM
Excellent skill implementing the NOWAIT technique for reasoning optimization in R1-style LLMs. The description clearly communicates when and how to use the skill, with specific trigger phrases and model types. Task knowledge is comprehensive, with concrete code examples, integration patterns for popular frameworks (HuggingFace, vLLM), and empirical results showing 27-51% token reduction. Structure is well organized with clear sections; the main SKILL.md is slightly lengthy but remains readable. Novelty is strong: suppressing reflection tokens during inference is a specialized technique that would require significant token overhead for a CLI agent to implement from scratch, and the skill encapsulates non-trivial research findings (RL vs. distilled model behavior, keyword mapping strategies). Minor deductions for structural density and because some production systems may need additional customization, but overall this is a high-quality, production-ready skill that meaningfully reduces inference costs for reasoning models.
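The core mechanism the listing describes, suppressing reflection tokens during decoding, can be sketched as a simple logit-masking step. The snippet below is a minimal, framework-free illustration, not the skill's actual implementation: the keyword list, vocabulary mapping, and function names are assumptions, and a real integration would wire this into a HuggingFace `LogitsProcessor` or a vLLM logits-processor hook.

```python
import math

# Hypothetical reflection keywords a NOWAIT-style setup might suppress.
# The real list depends on the target model's tokenizer and behavior.
REFLECTION_KEYWORDS = ["Wait", "Hmm", "Alternatively"]

def build_banned_ids(vocab, keywords=REFLECTION_KEYWORDS):
    """Map keywords (plus leading-space and lowercase variants, which
    tokenizers often treat as distinct tokens) to token ids in `vocab`."""
    banned = set()
    for word in keywords:
        for variant in (word, " " + word, word.lower(), " " + word.lower()):
            if variant in vocab:
                banned.add(vocab[variant])
    return banned

def suppress_reflection(logits, banned_ids):
    """Set banned token logits to -inf so sampling can never select them."""
    return [
        -math.inf if i in banned_ids else score
        for i, score in enumerate(logits)
    ]

# Toy usage with a made-up 4-token vocabulary:
vocab = {"Wait": 0, " Wait": 1, "the": 2, "answer": 3}
banned = build_banned_ids(vocab)
masked = suppress_reflection([1.0, 2.0, 3.0, 4.0], banned)
```

In a real pipeline the masking would run once per decoding step inside the generation loop; banning the tokens outright (rather than merely down-weighting them) is what shortens the chain of thought without retraining the model.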