LVIS
Progress on object detection is enabled by datasets that focus the research community's attention on open challenges. This process led us from simple images to complex scenes and from bounding boxes to segmentation masks. In this work, we introduce LVIS (pronounced 'el-vis'): a new dataset for Large Vocabulary Instance Segmentation. We plan to collect ~2 million high-quality instance segmentation masks for over 1000 entry-level object categories in 164k images. Due to the Zipfian distribution of categories in natural images, LVIS naturally has a long tail of categories with few training samples. Given that state-of-the-art deep learning methods for object detection perform poorly in the low-sample regime, we believe that our dataset poses an important and exciting new scientific challenge.
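To make the long-tail claim concrete, here is a minimal sketch of how a Zipfian frequency law spreads a fixed number of instances across ranked categories. It is an illustration under stated assumptions, not the actual LVIS statistics: the exponent of 1.0, the exact figures of 1000 categories and 2 million instances, and the 1000-instance cutoff for "few samples" are all chosen here for the example, and `zipf_counts` is a hypothetical helper.

```python
def zipf_counts(num_categories, total_instances, exponent=1.0):
    """Expected instances per category when frequency is proportional to 1 / rank**exponent."""
    # Rank 1 is the most frequent category, rank num_categories the rarest.
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_categories + 1)]
    norm = sum(weights)
    return [total_instances * w / norm for w in weights]


# Assumed round numbers loosely based on the description above.
counts = zipf_counts(num_categories=1000, total_instances=2_000_000)

# Assumed cutoff for what counts as a low-sample category.
low_sample_cutoff = 1000
rare = sum(1 for c in counts if c < low_sample_cutoff)

print(f"most frequent category: ~{counts[0]:.0f} instances")
print(f"least frequent category: ~{counts[-1]:.0f} instances")
print(f"categories below {low_sample_cutoff} instances: {rare} of {len(counts)}")
```

Under these assumptions a handful of head categories absorb most of the masks while the majority of categories fall below the cutoff, which is the low-sample regime the description refers to.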
Papers using LVIS (24)
- Grounding DINO: Marrying DINO With Grounded Pre-training For Open-set Object Detection
- Yolo-world: Real-time Open-vocabulary Object Detection
- Equalization Loss For Long-tailed Object Recognition
- Frustratingly Simple Few-shot Object Detection
- Open-vocabulary Object Detection Via Vision And Language Knowledge Distillation
- Mosaicos: A Simple And Effective Use Of Object-centric Images For Long-tailed Object Detection
- X-DETR: A Versatile Architecture For Instance-wise Vision-language Tasks
- Scaling Open-vocabulary Object Detection
- DiffusionInst: Diffusion Model for Instance Segmentation
- Classification Calibration For Long-tail Instance Segmentation
- Bridging Images And Videos: A Simple Learning Framework For Large Vocabulary Video Object Detection
- Detect Everything With Few Examples
- Region-centric Image-language Pretraining For Open-vocabulary Detection
- Language-conditioned Detection Transformer
- Real-time Transformer-based Open-vocabulary Detection With Efficient Fusion Head
- SOS: Segment Object System For Open-world Instance Segmentation With Object Priors
- Enhancing Novel Object Detection Via Cooperative Foundational Models
- Equalization Loss For Large Vocabulary Instance Segmentation
- Unsupervised Discovery of the Long-Tail in Instance Segmentation Using Hierarchical Self-Supervision
- Enhancing Open-Vocabulary Object Detection through Multi-Level Fine-Grained Visual-Language Alignment
- Moondream Segmentation: From Words to Masks
- Search And Detect: Training-free Long Tail Object Detection Via Web-image Retrieval
- Gen2det: Generate To Detect
- Logit Normalization For Long-tail Object Detection