Test-time Scaling Strategies For Generative Retrieval In Multimodal Conversational Recommendations
2025 Β· Hung-Chun Hsu, Yuan-Ching Kuo, Chao-Han Huck Yang, et al.
Abstract
The rapid evolution of e-commerce has exposed the limitations of traditional product retrieval systems in managing complex, multi-turn user interactions. Recent advances in multimodal generative retrieval -- particularly those leveraging multimodal large language models (MLLMs) as retrievers -- have shown promise. However, most existing methods are tailored to single-turn scenarios and struggle to model the evolving intent and iterative nature of multi-turn dialogues when applied naively. Concurrently, test-time scaling has emerged as a powerful paradigm for improving large language model (LLM) performance through iterative inference-time refinement. Yet, its effectiveness typically relies on two conditions: (1) a well-defined problem space (e.g., mathematical reasoning), and (2) the model's ability to self-correct -- conditions that are rarely met in conversational product search. In this setting, user queries are often ambiguous and evolving, and MLLMs alone have difficulty grounding
Authors
(none)
Tags
Stats
Related papers
- Indexing Multimodal Language Models For Large-scale Image Retrieval (2026)0.00
- Generative Giants, Retrieval Weaklings: Why Do Multimodal Large Language Models Fail At Multimodal Retrieval? (2025)0.00
- MLLM Is A Strong Reranker: Advancing Multimodal Retrieval-augmented Generation Via Knowledge-enhanced Reranking And Noise-injected Training (2024)9.18
- Scalingnote: Scaling Up Retrievers With Large Language Models For Real-world Dense Retrieval (2024)0.00
- Improving Tool Retrieval By Leveraging Large Language Models For Query Generation (2024)0.00
- SLQ: Bridging Modalities Via Shared Latent Queries For Retrieval With Frozen Mllms (2026)0.00
- Lamra: Large Multimodal Model As Your Advanced Retrieval Assistant (2024)7.50
- An Interactive Multi-modal Query Answering System With Retrieval-augmented Large Language Models (2024)5.84