MT-Bench
Canonical
26 papers using it
First seen: 2024
Papers using MT-Bench (26)
- Alignment through Meta-Weighted Online Sampling: Bridging the Gap between Data Generation and Preference Optimization
- Token-weighted Direct Preference Optimization with Attention
- Bridging Draft Policy Misalignment: Group Tree Optimization for Speculative Decoding
- TACOS: Open Tagging and Comparative Scoring for Instruction Fine-Tuning Data Selection
- Strategic Over-Parameterization for Generalizable Low-Rank Adaptation
- MMoA: An AI-Agent framework with recurrence for Memoried Mixure-of-Agent
- SCOPE: Selective Conformal Optimized Pairwise LLM Judging
- DYCP: Dynamic Context Pruning for Long-Form Dialogue with LLMs
- Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings
- Weights-Rotated Preference Optimization for Large Language Models
- Not All Preferences are What You Need for Post-Training: Selective Alignment Strategy for Preference Optimization
- MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge
- Alignment through Meta-Weighted Online Sampling: Bridging the Gap between Data Generation and Preference Optimization
- ComPO: Preference Alignment via Comparison Oracles
- ReviewInstruct: A Review-Driven Multi-Turn Conversations Generation Method for Large Language Models
- MTR-Bench: A Comprehensive Benchmark for Multi-Turn Reasoning Evaluation
- Alignment Data Map for Efficient Preference Data Selection and Diagnosis
- RSPO: Regularized Self-Play Alignment of Large Language Models
- CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom
- DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models
- Leveraging Uncertainty Estimation for Efficient LLM Routing
- RIDE: Enhancing Large Language Model Alignment through Restyled In-Context Learning Demonstration Exemplars
- Capturing Nuanced Preferences: Preference-Aligned Distillation for Small Language Models
- Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning
- Margin Matching Preference Optimization: Enhanced Model Alignment with Granular Feedback
- TODO: Enhancing LLM Alignment with Ternary Preferences