unsloth-orpo

1.3

One-step preference alignment using Odds Ratio Preference Optimization (ORPO). Triggers: ORPO, preference optimization, alignment, ORPOTrainer, log_odds_ratio, binary preference.
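As a rough illustration of what the `log_odds_ratio` term refers to, the sketch below computes the ORPO objective for a single chosen/rejected pair in plain Python. It assumes the inputs are length-normalized average log-probabilities of each response under the model, and it uses a weight `lam=0.1` purely as an illustrative value. The function names `log_odds` and `orpo_loss` are hypothetical and are not part of Unsloth or of any trainer API.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) given log p, for p strictly inside (0, 1)."""
    # log1p(-exp(log p)) equals log(1 - p); avg_logp must be < 0.
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(logp_chosen: float, logp_rejected: float, lam: float = 0.1):
    """Sketch of the ORPO loss for one preference pair.

    logp_chosen / logp_rejected: assumed to be average per-token
    log-probabilities of the chosen and rejected responses.
    lam: assumed weight on the odds-ratio penalty (illustrative value).
    """
    # Supervised term: negative log-likelihood of the chosen response.
    nll_chosen = -logp_chosen
    # Log odds ratio between the chosen and rejected responses.
    log_odds_ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    penalty = math.log1p(math.exp(-log_odds_ratio))
    return nll_chosen + lam * penalty, log_odds_ratio


loss, ratio = orpo_loss(math.log(0.6), math.log(0.2))
print(f"loss={loss:.4f}  log_odds_ratio={ratio:.4f}")
```

The penalty shrinks as the chosen response becomes more likely than the rejected one, which is how a single loss can cover both supervised fine-tuning and preference alignment in one step.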

preference-optimization

Machine Learning

GitHub Signals

Last commit 0 days ago

Publisher

majiayu000

Skill Author
