CLIP
Emerging
13 papers using it
First seen: 2023
Papers using CLIP (13)
- Clip-handid: Vision-language Model For Hand-based Person Identification
- CAPT: Confusion-Aware Prompt Tuning for Reducing Vision-Language Misalignment
- DeepSight: Bridging Depth Maps and Language with a Depth-Driven Multimodal Model
- RAZOR: Ratio-Aware Layer Editing for Targeted Unlearning in Vision Transformers and Diffusion Models
- Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment
- Importance Sampling for Multi-Negative Multimodal Direct Preference Optimization
- Can Argus Judge Them All? Comparing VLMs Across Domains
- Disentangling 3D From Large Vision-language Models For Controlled Portrait Generation
- Experimental Evaluation Of Static Image Sub-region-based Search Models Using CLIP
- DRIP: Dynamic Patch Reduction Via Interpretable Pooling
- Compositional Semantics for Open Vocabulary Spatio-semantic Representations
- Extending Multi-modal Contrastive Representations
- FM-OV3D: Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection