ADE20K
Canonical
51 papers using it
2016 first seen
Papers using ADE20K (43)
- Masked-attention Mask Transformer For Universal Image Segmentation
- Context Encoding For Semantic Segmentation
- Semantic Understanding Of Scenes Through The ADE20K Dataset
- Segnext: Rethinking Convolutional Attention Design For Semantic Segmentation
- Topformer: Token Pyramid Transformer For Mobile Semantic Segmentation
- Multi-scale High-resolution Vision Transformer For Semantic Segmentation
- K-net: Towards Unified Image Segmentation
- Seaformer++: Squeeze-enhanced Axial Transformer For Mobile Visual Recognition
- Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
- You Only Segment Once: Towards Real-time Panoptic Segmentation
- MVP: Multimodality-guided Visual Pre-training
- Dsnet: A Novel Way To Use Atrous Convolutions In Semantic Segmentation
- Semantic Segmentation Via Highly Fused Convolutional Network With Multiple Soft Cost Functions
- Content-aware Token Sharing For Efficient Semantic Segmentation With Vision Transformers
- In Defense Of Lazy Visual Grounding For Open-vocabulary Semantic Segmentation
- A Unified View of Masked Image Modeling
- Rest V2: Simpler, Faster And Stronger
- Full Contextual Attention For Multi-resolution Transformers In Semantic Segmentation
- Decoder Denoising Pretraining for Semantic Segmentation
- Understanding Gaussian Attention Bias of Vision Transformers Using Effective Receptive Fields
- Remax: Relaxing For Better Training On Efficient Panoptic Segmentation
- Feature Selective Transformer for Semantic Image Segmentation
- Skip-attention: Improving Vision Transformers By Paying Less Attention
- Incepformer: Efficient Inception Transformer With Pyramid Pooling For Semantic Segmentation
- HCFormer: Unified Image Segmentation with Hierarchical Clustering
- Diffusion For Out-of-distribution Detection On Road Scenes And Beyond
- Enhancing Transformer-based Vision Models: Addressing Feature Map Anomalies Through Novel Optimization Strategies
- Low-Resolution Self-Attention for Semantic Segmentation
- Token Cropr: Faster Vits For Quite A Few Tasks
- Dmformer: Closing The Gap Between CNN And Vision Transformers
- SOS: Segment Object System For Open-world Instance Segmentation With Object Priors
- A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
- PAUMER: Patch Pausing Transformer For Semantic Segmentation
- Structtoken : Rethinking Semantic Segmentation With Structural Prior
- Transformer Scale Gate for Semantic Segmentation
- Seeing Through Clutter: Structured 3D Scene Reconstruction via Iterative Object Removal
- Locality-Attending Vision Transformer
- Exploring Open-Vocabulary Object Recognition in Images using CLIP
- ARTA: Adaptive Mixed-resolution Token Allocation For Efficient Dense Feature Extraction
- Mambavision: A Hybrid Mamba-transformer Vision Backbone
- Cross-domain Semantic Segmentation With Large Language Model-assisted Descriptor Generation
- Spiralmlp: A Lightweight Vision MLP Architecture
- PNM: Pixel Null Model For General Image Segmentation