Abstract
arXiv:2603.24226v3
Scaling studies for industrial search, advertising, and recommendation have largely emphasized enlarging model capacity or refining architectures. Yet in real-world systems, performance is constrained not only by model size but also by the quality and distribution of training data. Our empirical analysis reveals two key bottlenecks: increasing parameters alone yields progressively smaller gains, and the challenges introduced by heterogeneous, large-scale behavior data cannot be fully resolved by architecture tuning in isolation. To address these bottlenecks, we present UniScale, a unified framework that couples data scaling with model design. UniScale consists of two components. First, ES$^3$, an entire-space sample construction system, broadens supervision beyond conventional sampled training data by enriching intra-domain search contexts with globally attributed supervisory signals and by introducing cross-domain examples that reflect user decisions under comparable content exposure conditions. Second, HHSFT, a heterogeneous hierarchical fusion transformer, is tailored to exploit the resulting large-scale heterogeneous data through hierarchical feature interaction and user-interest fusion across the entire behavior space. Together, these components enable more effective scaling than structure-centric optimization alone. Experiments show that UniScale consistently improves offline performance and demonstrates favorable scaling behavior. In online A/B tests on a large e-commerce search platform, it delivers a 1.70% increase in purchases and a 2.04% lift in GMV.