Awesome Papers

Papers

PICACO: Pluralistic In-Context Value Alignment of LLMs via Total Correlation Optimization (2026)
Han Jiang et al.
0.00
On the Sensitivity of Instruction-tuned LLMs to Harmful Sentences in Long Inputs (2026)
Faeze Ghorbanpour et al.
0.00
Dialogue-Based Simulation For Cultural Awareness Training (2021)
Sodiq Adewole et al.
—
Implementation of Google Assistant & Amazon Alexa on Raspberry Pi (2024)
Shailesh D. Arya et al.
—
Adaptive music: Automated music composition and distribution (2022)
David Daniel Albarracín Molina
—
Bias in Automated Speaker Recognition (2022)
Wiebke Toussaint Hutiri and Aaron Ding
—
'Beach' to 'Bitch': Inadvertent Unsafe Transcription of Kids' Content on YouTube (2022)
Krithika Ramesh et al.
—
Design Guidelines for Inclusive Speaker Verification Evaluation Datasets (2022)
Wiebke Toussaint Hutiri et al.
—
Towards Evaluation of Autonomously Generated Musical Compositions: A Comprehensive Survey (2022)
Daniel Kvak
—
The SPACE THEA Project (2022)
Martin Spathelf and Oliver Bendel
—
State of the Art of Audio- and Video-Based Solutions for AAL (2022)
Slavisa Aleksic et al.
—
Global Performance Disparities Between English-Language Accents in Automatic Speech Recognition (2023)
Alex DiChristofano et al.
—
Large scale analysis of gender bias and sexism in song lyrics (2023)
Lorenzo Betti et al.
—
Voice Spoofing Countermeasures: Taxonomy, State-of-the-art, experimental analysis of generalizability, open challenges, and the way forward (2022)
Awais Khan et al.
—
Checks and Strategies for Enabling Code-Switched Machine Translation (2022)
Thamme Gowda et al.
—
PolyHope: Two-Level Hope Speech Detection from Tweets (2022)
Fazlourrahman Balouchzahi, Grigori Sidorov, and Alexander Gelbukh
—
Hey ASR System! Why Aren't You More Inclusive? Automatic Speech Recognition Systems' Bias and Proposed Bias Mitigation Techniques. A Literature Review (2022)
Mikel K. Ngueajio and Gloria Washington
—
The Casual Conversations v2 Dataset (2023)
Bilal Porgali et al.
—
Right the docs: Characterising voice dataset documentation practices used in machine learning (2023)
Kathy Reid and Elizabeth T. Williams
—
Can Voice Assistants Sound Cute? Towards a Model of Kawaii Vocalics (2023)
Katie Seaborn et al.
—
Considerations for Ethical Speech Recognition Datasets (2023)
Orestis Papakyriakopoulos et al.
—
LoopBoxes – Evaluation of a Collaborative Accessible Digital Musical Instrument (2023)
Andreas Förster, Alarith Uhde, Mathias Komesker, Christina Komesker, and Irina Schmidt
—
AfriNames: Most ASR models "butcher" African Names (2023)
Tobi Olatunji et al.
—
Challenges and Opportunities for the Design of Smart Speakers (2023)
Tao Long et al.
—
Beyond Neural-on-Neural Approaches to Speaker Gender Protection (2023)
Loes van Bemmel et al.
—
Transcribing Educational Videos Using Whisper: A preliminary study on using AI for transcribing educational videos (2023)
Ashwin Rao
—
The Ethical Implications of Generative Audio Models: A Systematic Literature Review (2023)
Julia Barnett
—
Identifying depression-related topics in smartphone-collected free-response speech recordings using an automatic speech recognition system and a deep learning topic model (2023)
Yuezhou Zhang et al.
—
The Biased Journey of MSD_AUDIO.ZIP (2023)
Haven Kim et al.
—
Frame-to-Utterance Convergence: A Spectra-Temporal Approach for Unified Spoofing Detection (2023)
Awais Khan et al.
—
Beyond Fairness: Age-Harmless Parkinson's Detection via Voice (2023)
Yicheng Wang et al.
—
AI (r)evolution – where are we heading? Thoughts about the future of music and sound technologies in the era of deep learning (2023)
Giovanni Bindi et al.
—
Data Center Audio/Video Intelligence on Device (DAVID) – An Edge-AI Platform for Smart-Toys (2023)
Gabriel Cosache et al.
—
Voice Anonymization for All – Bias Evaluation of the Voice Privacy Challenge Baseline System (2023)
Anna Leschanowsky et al.
—
Detecting anxiety from short clips of free-form speech (2023)
Prabhat Agarwal et al.
—
Not My Voice! A Taxonomy of Ethical and Safety Harms of Speech Generators (2024)
Wiebke Hutiri et al.
—
The Balancing Act: Unmasking and Alleviating ASR Biases in Portuguese (2024)
Ajinkya Kulkarni et al.
—
Careless Whisper: Speech-to-Text Hallucination Harms (2024)
Allison Koenecke et al.
—
Beyond Voice Assistants: Exploring Advantages and Risks of an In-Car Social Robot in Real Driving Scenarios (2024)
Yuanchao Li et al.
—
Avoiding an AI-imposed Taylor's Version of all music history (2024)
Nick Collins and Mick Grierson
—
ToXCL: A Unified Framework for Toxic Speech Detection and Explanation (2024)
Nhat M. Hoang et al.
—
Coimagining the Future of Voice Assistants with Cultural Sensitivity (2024)
Katie Seaborn et al.
—
Voice EHR: Introducing Multimodal Audio Data for Health (2024)
James Anibal et al.
—
Qualitative Approaches to Voice UX (2024)
Katie Seaborn et al.
—
Lost in Transcription: Identifying and Quantifying the Accuracy Biases of Automatic Speech Recognition Systems Against Disfluent Speech (2024)
Dena Mujtaba et al.
—
FairLENS: Assessing Fairness in Law Enforcement Speech Recognition (2024)
Yicheng Wang et al.
—
Emotion Manipulation Through Music – A Deep Learning Interactive Visual Approach (2024)
Adel N. Abdalla et al.
—
ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs (2024)
Ahmed Heakl et al.
—
Preliminary Study of the Impact of AI-Based Interventions on Health and Behavioral Outcomes in Maternal Health Programs (2024)
Arpan Dasgupta et al.
—
Limits to Predicting Online Speech Using Large Language Models (2026)
Mina Remeli et al.
—