TacoSkill LAB
© 2026 TacoSkill LAB

gptq

8.7

by davila7

188 Favorites
313 Upvotes
0 Downvotes

Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perplexity degradation, or for faster inference (3-4× speedup) vs FP16. Integrates with transformers and PEFT for QLoRA fine-tuning.
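The 4× memory figure follows from storing each weight in 4 bits instead of 16. The sketch below is a rough back-of-the-envelope estimate, not taken from the skill itself: the helper name `weight_memory_gib` is hypothetical, and the calculation covers weights only, ignoring quantization metadata (per-group scales and zero points), the KV cache, and activations, so real usage will be somewhat higher.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GiB.

    Assumption: every parameter is stored at the same bit width and
    no per-group quantization metadata is counted.
    """
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    # Parameter counts taken from the model sizes named in the description.
    for name, n_params in [("70B", 70e9), ("405B", 405e9)]:
        fp16 = weight_memory_gib(n_params, 16)
        int4 = weight_memory_gib(n_params, 4)
        print(f"{name}: FP16 ~{fp16:.0f} GiB, 4-bit ~{int4:.0f} GiB, "
              f"ratio {fp16 / int4:.1f}x")
```

Under these assumptions a 70B model drops from roughly 130 GiB of weights to roughly 33 GiB, which still exceeds a single 24 GB consumer card and is why multi-GPU deployment matters for the largest models.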

quantization

Rating: 8.7
Installs: 0
Category: AI & LLM

Quick Review

Exceptional skill for GPTQ quantization with comprehensive, production-ready guidance. The description clearly specifies use cases (deploying 70B/405B models on consumer GPUs, 4× memory reduction, 3-4× speedup), which enables precise invocation. Task knowledge is outstanding, with complete code examples for loading pre-quantized models, custom quantization, QLoRA fine-tuning, multi-GPU deployment, and batch inference. Structure is excellent: a concise SKILL.md gives a clear overview and decision trees, and detailed topics are delegated to reference files. The skill adds significant value by consolidating complex quantization workflows, kernel selection (ExLlamaV2, Marlin, Triton), configuration trade-offs, and integration patterns that would otherwise require extensive documentation searches and experimentation. Benchmark data and pre-quantized model sources add immediate practical utility. One minor gap: the skill could explicitly state the compute requirements of the quantization step itself.

LLM Signals

Description coverage: 9
Task knowledge: 10
Structure: 9
Novelty: 8

GitHub Signals

18,073
1,635
132
71
Last commit 0 days ago

Publisher

davila7

Skill Author


Related Skills

rag-architect, by Jeffallan, rated 7.0
prompt-engineer, by Jeffallan, rated 7.0
fine-tuning-expert, by Jeffallan, rated 6.4
mcp-developer, by Jeffallan, rated 6.4