Awesome Speech Audio


Mohamed Elhoseiny

10 papers · 0 citations
Most-cited papers
  • RelTransformer: A Transformer-based Long-tail Visual Relationship Recognition
    2021 · 19 citations
  • VRSBench: A Versatile Vision-language Benchmark Dataset For Remote Sensing Image Understanding
    2024 · 15 citations
  • Exploring Hierarchical Graph Representation For Large-scale Zero-shot Image Classification
    2022 · 10 citations
  • Goldfish: Vision-language Understanding Of Arbitrarily Long Videos
    2024 · 6 citations
  • ImageCaptioner\(^2\): Image Captioner For Image Captioning Bias Amplification Assessment
    2023 · 6 citations
  • Category-level Text-to-image Retrieval Improved: Bridging The Domain Gap With Diffusion Models And Vision Encoders
    2025
  • MAGNET: A Multi-agent Framework For Finding Audio-visual Needles By Reasoning Over Multi-video Haystacks
    2025
  • A Survey On Long-video Storytelling Generation: Architectures, Consistency, And Cinematic Quality
    2025
  • ReefNet: A Large-scale Dataset And Benchmark For Fine-grained Coral Reef Recognition
    2025
  • FishNet++: Analyzing The Capabilities Of Multimodal Large Language Models In Marine Biology
    2025
Topics
  • Visual Language
  • Benchmarks
  • Video Understanding
  • 3D Vision
  • Image Generation
  • Vision-Language Models
  • Video-Language
  • Uncategorized
  • Image Restoration
  • Image-Text Retrieval
