One-step preference alignment using Odds Ratio Preference Optimization (ORPO). Triggers: ORPO, preference optimization, alignment, ORPOTrainer, log_odds_ratio, binary preference.
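As a rough illustration of the `log_odds_ratio` quantity named in the description, here is a minimal sketch of the odds-ratio term that ORPO adds to the ordinary supervised loss. It assumes the inputs are length-normalized (average per-token) log-probabilities of the chosen and rejected responses under the model being trained. The function names `log_odds` and `orpo_odds_ratio_term` are hypothetical and written for this example; they are not the `ORPOTrainer` API.

```python
import math


def log_odds(logp: float) -> float:
    """Return log(p / (1 - p)) given log p, for p strictly between 0 and 1."""
    return logp - math.log1p(-math.exp(logp))


def orpo_odds_ratio_term(chosen_logp: float, rejected_logp: float):
    """Return (log_odds_ratio, loss) for one preference pair.

    Assumption: both inputs are average per-token log-probabilities,
    so each lies in (-inf, 0).
    """
    # Log of odds(chosen) / odds(rejected).
    log_odds_ratio = log_odds(chosen_logp) - log_odds(rejected_logp)
    # loss = -log(sigmoid(z)) = log(1 + exp(-z)), computed in a stable form.
    z = log_odds_ratio
    loss = max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
    return log_odds_ratio, loss


if __name__ == "__main__":
    ratio, loss = orpo_odds_ratio_term(math.log(0.5), math.log(0.2))
    print(ratio, loss)
```

In the ORPO formulation this term is scaled by a weighting coefficient and added to the negative log-likelihood of the chosen response, which is why no separate reference model or separate alignment stage is needed.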
Rating: 1.3
Installs: 0
Category: Machine Learning