← all papers Β· overview

The Best of the Two Worlds: Harmonizing Semantic and Hash IDs for Sequential Recommendation

Abstract

arXiv:2512.10388v2 Announce Type: replace Abstract: Conventional Sequential Recommender Systems (SRS) typically assign unique hash IDs (HID) to construct item embeddings, which mainly capture collaborative signals from historical user-item interactions. However, such embeddings are vulnerable in long-tail scenarios where most items are rarely consumed. Recent methods that incorporate auxiliary information often face noisy collaborative sharing from co-occurrence signals or semantic homogeneity caused by flat dense embeddings. In contrast, Semantic IDs (SID), with their support for code sharing and multi-granular semantic modeling, offer a promising alternative. Nevertheless, SID-based methods are hindered by a collaborative overwhelming phenomenon: commonly adopted quantization mechanisms compromise the identifier uniqueness needed to model head items, resulting in a performance trade-off between head and tail items. To address this challenge, we propose H2Rec, a novel framework that harmonizes SID and HID. We design a dual-branch modeling architecture that simultaneously captures the multi-granular semantics of SID while preserving the unique collaborative identity provided by HID. Moreover, we introduce a dual-level alignment strategy to bridge the two representations, enabling effective knowledge transfer and robust preference modeling. Extensive offline experiments on three public benchmarks and online experiments on a large-scale commercial platform demonstrate that H2Rec achieves a better balance between head and tail recommendation quality and consistently outperforms existing baselines.

Related papers