TacoSkill LAB
© 2026 TacoSkill LAB

awq-quantization
by zechenzhangAGI

Rating: 7.6
155 Favorites · 219 Upvotes · 0 Downvotes

Activation-aware weight quantization for 4-bit LLM compression with 3x speedup and minimal accuracy loss. Use when deploying large models (7B-70B) on limited GPU memory, when you need faster inference than GPTQ with better accuracy preservation, or for instruction-tuned and multimodal models. MLSys 2024 Best Paper Award winner.
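The memory savings the description claims can be sketched numerically. The snippet below is an illustrative estimate, not part of the skill itself: it approximates weight storage for FP16 versus 4-bit AWQ (which keeps one FP16 scale per group of weights as overhead), and the commented loading call assumes the AutoAWQ package with a placeholder repo id.

```python
# Rough weight-memory estimate: FP16 vs. 4-bit group-quantized (AWQ-style).
# AWQ stores 4-bit weights plus one FP16 scale per `group_size` weights.

def weight_gb(n_params: float, bits: int = 16, group_size: int = 128) -> float:
    """Approximate weight storage in GiB for a model with n_params parameters."""
    weight_bytes = n_params * bits / 8
    # Quantized formats carry per-group FP16 scales as overhead.
    overhead = (n_params / group_size) * 2 if bits < 16 else 0
    return (weight_bytes + overhead) / 1024**3

fp16 = weight_gb(7e9, bits=16)   # ~13 GiB for a 7B model
awq4 = weight_gb(7e9, bits=4)    # ~3.4 GiB

if __name__ == "__main__":
    print(f"7B FP16: {fp16:.1f} GiB, 4-bit AWQ: {awq4:.1f} GiB")
    # Loading a pre-quantized checkpoint (requires `pip install autoawq`;
    # the repo id below is a hypothetical placeholder):
    # from awq import AutoAWQForCausalLM
    # model = AutoAWQForCausalLM.from_quantized("org/model-awq", fuse_layers=True)
```

This is why a 70B model that needs ~140 GiB in FP16 can fit on far fewer GPUs once quantized to 4 bits.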

Tag: quantization

Rating: 7.6
Installs: 0
Category: Machine Learning

Quick Review

Excellent skill documentation for AWQ quantization. The description clearly articulates when to use AWQ versus alternatives (GPTQ, bitsandbytes), making it easy for a CLI agent to decide invocation. Task knowledge is comprehensive with complete code examples for loading pre-quantized models, quantizing custom models, multi-GPU deployment, vLLM integration, and various kernel backends. Structure is well-organized with clear sections, comparison tables, and performance benchmarks. The skill addresses a genuinely complex task (4-bit LLM quantization with activation-aware weight protection) that would require significant research and experimentation for a CLI agent to implement from scratch. Minor points: references to advanced-usage.md and troubleshooting.md suggest additional depth exists. The deprecation notice is important context but doesn't diminish the skill's current utility. Overall, this is a high-quality skill that meaningfully reduces the token cost and complexity of deploying quantized LLMs.
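The review cites code examples for quantizing custom models and vLLM integration. As a hedged sketch of what such examples typically look like with the AutoAWQ library (the actual SKILL.md may differ, and all model ids below are placeholders):

```python
# A typical AutoAWQ quantization config: 4-bit weights, group size 128,
# zero-point quantization, GEMM kernel variant.
def make_quant_config(w_bit: int = 4, group_size: int = 128) -> dict:
    return {
        "zero_point": True,
        "q_group_size": group_size,
        "w_bit": w_bit,
        "version": "GEMM",
    }

if __name__ == "__main__":
    cfg = make_quant_config()
    # Quantizing a custom model (requires `pip install autoawq`;
    # repo ids are hypothetical placeholders):
    # from awq import AutoAWQForCausalLM
    # from transformers import AutoTokenizer
    # model = AutoAWQForCausalLM.from_pretrained("org/base-model")
    # tok = AutoTokenizer.from_pretrained("org/base-model")
    # model.quantize(tok, quant_config=cfg)
    # model.save_quantized("org-model-awq")
    #
    # Serving the quantized checkpoint with vLLM:
    # from vllm import LLM
    # llm = LLM(model="org-model-awq", quantization="awq")
```

The config dict is the part worth internalizing: group size 128 with zero-point enabled is the common default trade-off between accuracy and kernel efficiency.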

LLM Signals

Description coverage: 9
Task knowledge: 10
Structure: 9
Novelty: 8

GitHub Signals

891
74
19
2
Last commit: 0 days ago

Publisher

zechenzhangAGI (Skill Author)



Related Skills

ml-pipeline by Jeffallan (rating 6.4)
sparse-autoencoder-training by zechenzhangAGI (rating 7.6)
huggingface-accelerate by zechenzhangAGI (rating 7.6)
moe-training by zechenzhangAGI (rating 7.6)