10-second V2A benchmark
Emerging
1 paper using it
First seen: 2024
The 10-second V2A benchmark is a dataset for evaluating video-to-audio (V2A) generation models on the task of generating audio for 10-second video segments.