AudioCap
Emerging · 13 papers using it
325 HF downloads
14 HF likes
First seen: 2023
A Hugging Face mirror of the official AudioCaps data repository.
Hugging Face · MIT
Papers using AudioCap (13)
- FoleyGenEx: Unified Video-to-Audio Generation with Multi-Modal Control, Temporal Alignment, and Semantic Precision
- e5-omni: Explicit Cross-modal Alignment for Omni-modal Embeddings
- LAMB: LLM-based Audio Captioning with Modality Gap Bridging via Cauchy-Schwarz Divergence
- AC/DC: LLM-based Audio Comprehension via Dialogue Continuation
- Training-free Multimodal Guidance For Video To Audio Generation
- Mitigating Audiovisual Mismatch In Visual-guide Audio Captioning
- DiffGAP: A Lightweight Diffusion Module in Contrastive Space for Bridging Cross-Model Gap
- ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
- Accommodating Audio Modality in CLIP for Multimodal Processing
- Audio-Visual LLM for Video Understanding
- Zero-Shot Audio Captioning Using Soft and Hard Prompts
- MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation
- Enhancing Audio-Language Models through Self-Supervised Post-Training with Text-Audio Pairs