New options for preference tuning: rpo alpha, logprobs normalization, reference-free, simpo gamma #327
Merged
Commits
Commits on Jun 12, 2025 (9 commits)
Commits on Jun 13, 2025 (4 commits)