TacoSkill LAB

gptq

7.6

by zechenzhangAGI

109 Favorites
172 Upvotes
0 Downvotes

Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perplexity degradation, or for faster inference (3-4× speedup) vs FP16. Integrates with transformers and PEFT for QLoRA fine-tuning.
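The memory figure in the description can be checked with back-of-the-envelope arithmetic. The sketch below is only an estimate of weight storage, not a measurement: it assumes a group size of 128 with one 16-bit scale and one 4-bit zero point per group (a common GPTQ setup, but an assumption here), and it ignores activations, the KV cache, and any layers left unquantized. The function names are hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return num_params * bits_per_weight / 8 / 1e9


def effective_bits(weight_bits: int = 4, group_size: int = 128,
                   scale_bits: int = 16, zero_bits: int = 4) -> float:
    """Bits per weight once per-group metadata is amortized (assumed layout)."""
    return weight_bits + (scale_bits + zero_bits) / group_size


if __name__ == "__main__":
    for num_params in (70e9, 405e9):
        fp16 = weight_memory_gb(num_params, 16)
        int4 = weight_memory_gb(num_params, effective_bits())
        print(f"{num_params / 1e9:.0f}B params: "
              f"FP16 {fp16:.1f} GB, 4-bit {int4:.1f} GB, "
              f"ratio {fp16 / int4:.2f}x")
```

Under these assumptions the reduction comes out slightly below 4× because the per-group scales and zero points add a fraction of a bit to every weight.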

quantization

Rating: 7.6

Installs: 0

Category: AI & LLM

Quick Review

Excellent skill for GPTQ quantization with comprehensive coverage. The description clearly articulates when to use GPTQ vs alternatives (AWQ, bitsandbytes) with specific criteria. SKILL.md provides extensive task knowledge, including installation, quantization configs, kernel backends, multi-GPU deployment, and QLoRA integration, with complete working code examples. The structure is well organized, with clear sections and references to separate files for calibration, integration, and troubleshooting. The skill is highly novel: implementing 4-bit quantization manually would require deep expertise in Hessian-based optimization, group-wise quantization math, and CUDA kernel integration, easily consuming thousands of tokens for a CLI agent to discover and implement correctly. The trade-off decision trees could be made more explicit, but overall this is a production-ready, high-value skill that enables deployment of 70B+ models on consumer hardware.
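To give a sense of the "group-wise quantization math" the review mentions, here is a minimal pure-Python sketch of group-wise 4-bit quantization using plain round-to-nearest. This is deliberately not GPTQ: GPTQ additionally uses Hessian-based error compensation when choosing the quantized values, which is omitted here. The function names and the tiny group size are illustrative assumptions and do not come from the skill.

```python
def quantize_group(values, bits=4):
    """Asymmetric round-to-nearest quantization of one group of weights."""
    qmax = (1 << bits) - 1                  # 15 for 4-bit codes
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    # Map each value to an integer code and clamp it to [0, qmax].
    codes = [min(qmax, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Reconstruct approximate weights from codes and group metadata."""
    return [lo + c * scale for c in codes]


def quantize_groupwise(weights, group_size=4, bits=4):
    """Quantize a flat weight list in independent groups of group_size."""
    return [quantize_group(weights[i:i + group_size], bits)
            for i in range(0, len(weights), group_size)]


if __name__ == "__main__":
    weights = [-1.0, -0.5, 0.0, 0.5, 10.0, 20.0, 30.0, 40.0]
    for codes, scale, lo in quantize_groupwise(weights):
        print(codes, dequantize_group(codes, scale, lo))
```

Giving each group its own scale and offset keeps the two very different value ranges in the example from sharing one coarse grid, which is the motivation for grouping in the first place.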

LLM Signals

Description coverage: 9
Task knowledge: 10
Structure: 9
Novelty: 8

GitHub Signals

891
74
19
2
Last commit 0 days ago


Publisher

zechenzhangAGI

Skill Author

Related Skills

rag-architect by Jeffallan (7.0)
prompt-engineer by Jeffallan (7.0)
fine-tuning-expert by Jeffallan (6.4)
mcp-developer by Jeffallan (6.4)