Half-Quadratic Quantization (HQQ) for LLMs without calibration data. Use for 4/3/2-bit quantization when no calibration dataset is available, for fast quantization workflows, or when deploying with vLLM or HuggingFace Transformers.
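To make the calibration-free idea concrete, here is a minimal pure-Python sketch of group-wise affine quantization, the building block that HQQ refines. Real HQQ solves for the zero-point with a half-quadratic optimizer; this sketch uses a plain min/max range per group, so the function names and the choice of example weights are illustrative, not the library's API.

```python
# Hypothetical sketch: group-wise affine quantization with no calibration data.
# HQQ improves on this by optimizing scale/zero-point with a half-quadratic
# solver; here we derive them directly from the group's min/max range.

def quantize_group(weights, nbits=4):
    """Quantize one group of float weights to nbits unsigned integers."""
    qmax = (1 << nbits) - 1           # e.g. 15 for 4-bit
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax or 1.0   # guard against a flat group
    zero = lo
    q = [round((w - zero) / scale) for w in weights]
    return q, scale, zero

def dequantize_group(q, scale, zero):
    """Reconstruct approximate float weights from quantized values."""
    return [v * scale + zero for v in q]

# Toy example: one 8-element weight group, quantized to 4 bits.
weights = [0.12, -0.5, 0.33, 0.07, -0.21, 0.44, -0.02, 0.18]
q, scale, zero = quantize_group(weights, nbits=4)
restored = dequantize_group(q, scale, zero)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

Because the scale and zero-point come from the weights alone, no forward passes over a calibration set are needed, which is what makes this family of methods fast to run.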
Rating: 8.6
Installs: 0
Category: AI & LLM
Excellent skill for HQQ quantization with comprehensive coverage of capabilities, workflows, and integrations. The description clearly conveys when to use HQQ over alternatives. Task knowledge is outstanding, with complete code examples for quantization, HuggingFace/vLLM integration, PEFT fine-tuning, and multiple backends. Structure is clean, with a logical progression from basics to advanced workflows, though the main file is slightly dense. Novelty is strong: HQQ's calibration-free approach and multi-backend support would require significant token usage and research for a CLI agent to replicate, making this skill meaningfully cost-effective. It could be improved by moving some advanced backend details into referenced files.