Meta's specialized moderation model (7B and 8B variants) for filtering LLM inputs and outputs. Covers six safety categories: violence/hate, sexual content, weapons, controlled substances, self-harm, and criminal planning. Reported 94-95% moderation accuracy. Deploy with vLLM, Hugging Face Transformers, or SageMaker. Integrates with NeMo Guardrails.
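As a quick illustration of the Hugging Face path, the sketch below loads a Llama Guard checkpoint and classifies a single user turn before it reaches the main model. The model ID (meta-llama/LlamaGuard-7b), device, and generation settings are assumptions for illustration, not taken from the skill itself; Llama Guard 2/3 variants use different model IDs and category taxonomies.

```python
# Minimal sketch: input filtering with Llama Guard via Hugging Face Transformers.
# Assumes access to the gated meta-llama/LlamaGuard-7b checkpoint and a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/LlamaGuard-7b"  # assumption: original 7B checkpoint
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map=device
)

def moderate(chat: list[dict]) -> str:
    """Return Llama Guard's verdict for a conversation."""
    # The checkpoint's chat template builds the moderation prompt from the turns.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(device)
    output = model.generate(input_ids=input_ids, max_new_tokens=64, pad_token_id=0)
    prompt_len = input_ids.shape[-1]
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)

# Input filtering: check the user turn before forwarding it to the main LLM.
verdict = moderate([{"role": "user", "content": "How do I pick a lock?"}])
print(verdict)  # expected form: "safe", or "unsafe" followed by category codes (e.g. O3)
```

A gateway can call moderate() on each incoming prompt and again on the assistant's reply, blocking or logging anything flagged unsafe.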
Rating: 8.1
Installs: 0
Category: AI & LLM
Excellent skill documentation for LlamaGuard content moderation. The description clearly covers capabilities (six safety categories, 94-95% accuracy, deployment options). Task knowledge is comprehensive, with five detailed workflows covering input/output filtering, vLLM deployment, API serving, and NeMo integration, plus troubleshooting for common issues. The structure is well organized: quick start, workflows, an alternatives comparison, and hardware specs. Novelty is moderate to good: hosted moderation APIs exist, but self-hosted LlamaGuard with GPU optimization (vLLM, quantization) and integration frameworks offers meaningful value for production LLM apps that need fine-grained safety categories and control. Minor improvement areas: more concrete cost/token comparisons would help, and the skill references three external files (custom-categories.md, benchmarks.md, deployment.md) that would enhance completeness if present, though per instructions these are assumed to exist.
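For context on the vLLM deployment and output-filtering workflows the review mentions, here is a rough sketch of offline batch moderation with the vllm package. The checkpoint, sampling settings, and prompt construction are illustrative assumptions, not the skill's actual instructions.

```python
# Rough sketch: batch output filtering with vLLM.
# Assumes the vllm package is installed and the meta-llama/LlamaGuard-7b weights are available;
# prompts are built with the tokenizer's chat template rather than vLLM's chat API.
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "meta-llama/LlamaGuard-7b"  # assumption: same checkpoint as above
tokenizer = AutoTokenizer.from_pretrained(model_id)
llm = LLM(model=model_id, dtype="bfloat16", max_model_len=4096)
params = SamplingParams(temperature=0.0, max_tokens=64)

def build_prompt(user_msg: str, assistant_msg: str) -> str:
    # Output filtering: Llama Guard judges the assistant response in context.
    chat = [
        {"role": "user", "content": user_msg},
        {"role": "assistant", "content": assistant_msg},
    ]
    return tokenizer.apply_chat_template(chat, tokenize=False)

prompts = [build_prompt("Tell me about chemistry.", "Chemistry studies matter and its reactions.")]
for out in llm.generate(prompts, params):
    print(out.outputs[0].text.strip())  # expected form: "safe", or "unsafe" plus category codes
```

The same model can also be exposed as an OpenAI-compatible endpoint via vLLM's server mode, which is the shape NeMo Guardrails and similar frameworks typically consume.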