Serena Yeung-Levy

15 papers · 4 citations

Most-cited papers

Why Are Visually-grounded Language Models Bad At Image Classification?
2024 · 3 citations
DeforHMR: Vision Transformer With Deformable Cross-attention For 3D Human Mesh Recovery
2024 · 1 citation
VisualOverload: Probing Visual Understanding Of VLMs In Really Dense Scenes
2025
TTRV: Test-time Reinforcement Learning For Vision Language Models
2025

Topics

Visual Language · 3D Vision · Video Understanding