
ChinaTravel: An Open-ended Travel Planning Benchmark With Compositional Constraint Validation For Language Agents

2026

Abstract

Travel planning stands out among real-world applications of *Language Agents* because it couples significant practical demand with a rigorous constraint-satisfaction challenge. However, existing benchmarks primarily operate on a slot-filling paradigm that restricts agents to synthetic queries with pre-defined constraint menus. This fails to capture the open-ended nature of natural language interaction, where user requirements are compositional, diverse, and often implicitly expressed. To address this gap, we introduce *ChinaTravel*, with four key contributions: 1) a practical sandbox aligned with multi-day, multi-POI travel planning, 2) a compositionally generalizable domain-specific language (DSL) for scalable evaluation, covering feasibility, constraint satisfaction, and preference comparison, 3) an open-ended dataset that integrates diverse travel requirements and implicit intent from 1154 human participants, and 4) fine-grained

Related papers

Ranked by semantic similarity — how closely each paper's abstract matches this one (100% = near-identical topic).