cs.MM
50 papers tagged cs.MM
Papers
- LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV (2026) Tengfei Liu et al. 12.79
- Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval (2026) Xiang Fang et al. 8.81
- You Can Ground Earlier than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos (2026) Xiang Fang et al. 8.24
- Lightweight Complementary-Cue Fusion for Robust Video Face Forgery Detection (2026) Sunghwan Baek et al. 3.10
- Bags of Local Convolutional Features for Scalable Instance Search (2016) Eva Mohedano et al.
- LOH and behold: Web-scale visual search, recommendation and clustering using Locally Optimized Hashing (2016) Yannis Kalantidis et al.
- Large-Scale Query-by-Image Video Retrieval Using Bloom Filters (2016) Andre Araujo et al.
- Bloom Filters and Compact Hash Codes for Efficient and Distributed Image Retrieval (2016) Andrea Salvi et al.
- De-Hashing: Server-Side Context-Aware Feature Reconstruction for Mobile Visual Search (2016) Yin-Hsi Kuo and Winston H. Hsu
- Generalized residual vector quantization for large scale data (2016) Shicong Liu et al.
- Fast Supervised Discrete Hashing and its Analysis (2016) Gou Koutaki et al.
- Binary Subspace Coding for Query-by-Image Video Retrieval (2016) Ruicong Xu et al.
- Accelerated Nearest Neighbor Search with Quick ADC (2017) Fabien André (Technicolor) and Anne-Marie Kermarrec (Inria) and Nicolas Le Scouarnec (Technicolor)
- Region-Based Image Retrieval Revisited (2017) Ryota Hinami et al.
- Exploiting Modern Hardware for High-Dimensional Nearest Neighbor Search (2017) Fabien André
- Efficient Interactive Search for Geo-tagged Multimedia Data (2018) Jun Long et al.
- Hierarchical Information Quadtree: Efficient Spatial Temporal Image Search for Multimedia Stream (2018) Chengyuan Zhang et al.
- A Filter of Minhash for Image Similarity Measures (2018) Jun Long et al.
- Efficient Continuous Top-$k$ Geo-Image Search on Road Network (2018) Chengyuan Zhang et al.
- Reconfigurable Inverted Index (2018) Yusuke Matsui et al.
- Fusion Hashing: A General Framework for Self-improvement of Hashing (2018) Xingbo Liu and Xiushan Nie and Yilong Yin
- Towards an All-Purpose Content-Based Multimedia Information Retrieval System (2019) Ralph Gasser et al.
- Unsupervised Rank-Preserving Hashing for Large-Scale Image Retrieval (2019) Svebor Karaman et al.
- SADIH: Semantic-Aware DIscrete Hashing (2019) Zheng Zhang et al.
- Graph based Nearest Neighbor Search: Promises and Failures (2019) Peng-Cheng Lin and Wan-Lei Zhao
- Effective and Efficient Indexing in Cross-Modal Hashing-Based Datasets (2019) Sarawut Markchit and Chih-Yi Chiu
- Efficient Bitmap-based Indexing and Retrieval of Similarity Search Image Queries (2020) Omid Jafari et al.
- Reinforcing Short-Length Hashing (2020) Xingbo Liu et al.
- Improving Locality Sensitive Hashing by Efficiently Finding Projected Nearest Neighbors (2020) Omid Jafari et al.
- Experimental Analysis of Locality Sensitive Hashing Techniques for High-Dimensional Approximate Nearest Neighbor Searches (2021) Omid Jafari et al.
- The VISIONE Video Search System: Exploiting Off-the-Shelf Text Search Engines for Large-Scale Video Retrieval (2021) Giuseppe Amato et al.
- Visually Aware Skip-Gram for Image Based Recommendations (2020) Parth Tiwari et al.
- Rescuing Deep Hashing from Dead Bits Problem (2021) Shu Zhao et al.
- Rank-Consistency Deep Hashing for Scalable Multi-Label Image Search (2021) Cheng Ma et al.
- Cross-modal Zero-shot Hashing by Label Attributes Embedding (2021) Runmin Wang et al.
- Hierarchical Local-Global Transformer for Temporal Sentence Grounding (2026) Xiang Fang et al.
- Deep Metric Multi-View Hashing for Multimedia Retrieval (2023) Jian Zhu et al.
- ElasticHash: Semantic Image Similarity Search by Deep Hashing with Elasticsearch (2023) Nikolaus Korfhage et al.
- Central Similarity Multi-View Hashing for Multimedia Retrieval (2023) Jian Zhu et al.
- How Far Are We from Generating Missing Modalities with Foundation Models? (2026) Guanzhou Ke et al.
- Experimental Evaluation of Static Image Sub-Region-Based Search Models Using CLIP (2025) Bastian Jäckl and Vojtěch Kloda and Daniel A. Keim and Jakub Lokoč
- Designing Singing Syllabi with Virtual Avatars: AI-Assisted Syllabus Reauthoring (2026) Xinxing Wu
- Compact Hypercube Embeddings for Fast Text-based Wildlife Observation Retrieval (2026) Ilyass Moummad et al.
- AG-REPA: Causal Layer Selection for Representation Alignment in Audio Flow Matching (2026) Pengfei Zhang et al.
- TimeSpot: Benchmarking Geo-Temporal Understanding in Vision-Language Models in Real-World Settings (2026) Azmine Toushik Wasi et al.
- SenBen: Sensitive Scene Graphs for Explainable Content Moderation (2026) Fatih Cagatay Akyon et al.
- CounterFlow: A Two-Phase Inference-Time Sampling for Counterfactual Video Foley Generation (2026) Gyubin Lee et al.
- Decoupling Spatio-Temporal Adapter for Fine-Grained Badminton Action Localization (2026) Tianyu Wang (School of Economics and Management et al.
- FAST-ME: Foundation-aware Adaptive Stopping for Motion Estimation for Efficient IoT Video Analysis (2026) Kakia Panagidi et al.
- DrawVideo: Generating Long Video from Storyboard Keyframe Sketches (2026) Chuanzhi Xu et al.