Abstract
Real-world datasets often exhibit long-tailed distributions, where a few dominant "Head" classes have abundant samples while most "Tail" classes are severely underrepresented, leading to biased learning and poor generalization for the Tail. We present a theoretical framework that reveals a previously undescribed connection between Long-Tailed Recognition (LTR) and Continual Learning (CL), the process of learning sequential tasks without forgetting prior knowledge. Our analysis demonstrates that, for models trained on imbalanced datasets, the weights converge to a bounded neighborhood of those trained exclusively on the Head, with the bound scaling as the inverse square root of the imbalance factor. Leveraging this insight, we introduce Continual Learning for Long-Tailed Recognition (CLTR), a principled approach that employs standard off-the-shelf CL methods to address LTR problems by sequentially learning Head and Tail classes without forgetting the Head. Our theoretical analysis further suggests that CLTR mitigates gradient saturation and improves Tail learning while maintaining strong Head performance. Extensive experiments on CIFAR100-LT, CIFAR10-LT, ImageNet-LT, and Caltech256 validate our theoretical predictions, achieving strong results across various LTR benchmarks. Our work bridges the gap between LTR and CL, providing a principled way to tackle imbalanced data challenges with standard existing CL strategies.