cs.CV
50 papers tagged cs.CV (ordered by heat_score)
Papers
- Multi-Vector Index Compression in Any Modality (2026) | Hanxiang Qin et al. | 11.70
- Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval (2026) | Xiang Fang et al. | 8.81
- You Can Ground Earlier than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos (2026) | Xiang Fang et al. | 8.24
- GENIUS: A Generative Framework for Universal Multimodal Search (2025) | Sungyeon Kim et al. | 5.24
- Eulerian Gaussian Splatting using Hashed Probability Pyramids (2026) | Mia Gaia Polansky et al. | 3.91
- VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion (2026) | Hidir Yesiltepe et al. | 3.91
- Search is All You Need for Few-shot Anomaly Detection (2025) | Qishan Wang et al. | 3.58
- Rethinking Weakly-supervised Video Temporal Grounding From a Game Perspective (2026) | Xiang Fang et al. | 3.10
- Trajectory Constraints for Imaging Inverse Problems (2026) | Chaoyan Huang et al. | 3.10
- Lightweight Complementary-Cue Fusion for Robust Video Face Forgery Detection (2026) | Sunghwan Baek et al. | 3.10
- SalsaAgent: A multimodal embodied language model for interactive dance generation (2026) | Payam Jome Yazdian et al. | 3.10
- Deep Psychovisual Image Representations (2026) | Wendi Ma et al. | 3.10
- Subcortical Shape Variations and Their Associations with Cognition Across the 8th Decade of Life. A Study in the Lothian Birth Cohort 1936 (2026) | Maria del C. Valdes-Hernandez et al. | 3.10
- Reducing Experimental Testing in Space Propulsion Film Cooling Analyses by Pixelwise Generative Image Interpolation (2026) | Adam T. Müller et al. | 3.10
- VisualThink-VLA: Visual Intermediate Reasoning for Effective and Low-Latency Vision-Language-Action Policies (2026) | Mingjian Gao et al. | 3.10
- Reinforcement Learning with Robust Rubric Rewards (2026) | Ya-Qi Yu et al. | 3.10
- Stable-Layers: Fine-Tuning Image Layer Decomposition Models with VLM-Scored Reinforcement Learning (2026) | Ciara Rowles et al. | 3.10
- MambaBEV: An EV-based 3D detection model with Mamba2 (2026) | Zihan You et al. | 2.94
- DirectorBench: Diagnosing Long-Form Video Generation with Personalized Multi-Agent Evaluation (2026) | Jiamin Chen et al. | 1.96
- Gaga: Group Any Gaussians via 3D-aware Memory Bank (2026) | Weijie Lyu et al. | 0.00
- Residual Connections Harm Generative Representation Learning (2026) | Xiao Zhang et al. | 0.00
- A Greedy Hierarchical Approach to Whole-Network Filter-Pruning in CNNs (2026) | Kiran Purohit et al. | 0.00
- Looking Beyond Text: Reducing Language bias in Large Vision-Language Models via Multimodal Dual-Attention and Soft-Image Guidance (2026) | Haozhe Zhao et al. | 0.00
- Your Data Is Not Perfect: Towards Cross-Domain Out-of-Distribution Detection in Class-Imbalanced Data (2026) | Xiang Fang et al. | 0.00
- Fusion Embedding for Pose-Guided Person Image Synthesis with Diffusion Model (2026) | Donghwna Lee et al. | 0.00
- STEAM: Squeeze and Transform Enhanced Attention Module (2026) | Rishabh Sabharwal et al. | 0.00
- A Flexible and Scalable Framework for Video Moment Search (2025) | Chongzhi Zhang et al. | 0.00
- Soften the Mask: Adaptive Temporal Soft Mask for Efficient Dynamic Facial Expression Recognition (2026) | Meng-zhu Li et al. | 0.00
- Fast 3D point clouds retrieval for Large-scale 3D Place Recognition (2025) | Chahine-Nicolas Zede et al. | 0.00
- Domain-Agnostic Feature Modulation for Semi-Supervised Domain Generalization (2026) | Venuri Amarasinghe (University of Moratuwa) et al. | 0.00
- CamC2V: Context-aware Controllable Video Generation (2026) | Luis Denninger et al. | 0.00
- Privacy Protection Against Personalized Text-to-Image Synthesis via Cross-image Consistency Constraints (2026) | Guanyu Wang et al. | 0.00
- EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance (2026) | Zun Wang et al. | 0.00
- VRAG: Learning World Models for Interactive Video Generation (2026) | Taiye Chen et al. | 0.00
- SAGE: Segment-Aware Gloss-Free Encoding for Token-Efficient Sign Language Translation (2026) | JianHe Low et al. | 0.00
- MENTOR: Efficient Multimodal-Conditioned Tuning for Autoregressive Vision Generation Models (2026) | Haozhe Zhao et al. | 0.00
- Finding DoRI: Discovery of Retained Images in Diffusion Models (2026) | Antoni Kowalczuk et al. | 0.00
- HM-Talker: Hybrid Motion Modeling for High-Fidelity Talking Head Synthesis (2026) | Shiyu Liu et al. | 0.00
- Scalable RF Simulation in Generative 4D Worlds (2026) | Zhiwei Zheng et al. | 0.00
- Resolution as a Direction: Vector-Panning Feature Alignment for Cross-Resolution Re-Identification (2026) | Zanwu Liu et al. | 0.00
- Streaming Drag-Oriented Interactive Video Manipulation: Drag Anything, Anytime! (2026) | Junbao Zhou et al. | 0.00
- LoCoT2V-Bench: Benchmarking Long-Form and Complex Text-to-Video Generation (2026) | Xiangqing Zheng et al. | 0.00
- Modality Alignment across Trees on Heterogeneous Hyperbolic Manifolds (2026) | Wei Wu et al. | 0.00
- Dual-Stream Diffusion for World-Model Augmented Vision-Language-Action Model (2026) | John Won et al. | 0.00
- Analyzing Persona Effects in Generated Explanations from Multimodal LLM Agents in Urban Perception (2026) | Neemias da Silva et al. | 0.00
- On Asymmetric Optimization of Reasoning and Perception in Vision-Language Model Post-Training (2026) | Xueqing Wu et al. | 0.00
- Mask the Target: A Plug-and-Play Regularizer Against LoRA Forgetting (2026) | Runze Xu et al. | 0.00
- An ensemble diversity approach to supervised binary hashing (2016) | Miguel Á. Carreira-Perpiñán and Ramin Raziperchikolaei | β
- Auto-JacoBin: Auto-encoder Jacobian Binary Hashing (2016) | Xiping Fu et al. | β
- Content-based Video Indexing and Retrieval Using Corr-LDA (2019) | Rahul Radhakrishnan Iyer et al. | β