Awesome Papers

Papers

Multi-Agent Reinforcement Learning for Safe Autonomous Driving Under Pedestrian Behavioral Uncertainty (2026)
Prakash Aryan et al.
3.10
From Attribution to Action: A Human-Centered Application of Activation Steering (2026)
Tobias Labarta et al.
0.00
MoDAl: Self-Supervised Neural Modality Discovery via Decorrelation for Speech Neuroprosthesis (2026)
Yuanhao Chen et al.
0.00
I Hear, Therefore I Trust: A Socio-Technical Investigation of Humans as Synthetic Speech Detectors (2026)
Lelia Erscoi et al.
0.00
Ubiquitous Acoustic Sensing on Commodity IoT Devices: A Survey (2021)
Chao Cai et al.
—
Real-time and interactive tools for vocal training based on an analytic signal with a cosine series envelope (2021)
Hideki Kawahara et al.
—
DEPA: Self-Supervised Audio Embedding for Depression Detection (2021)
Pingyue Zhang et al.
—
Decoding Imagined Speech and Computer Control using Brain Waves (2021)
Abhiram Singh et al.
—
On the human evaluation of audio adversarial examples (2021)
Jon Vadillo and Roberto Santana
—
Gesticulator: A framework for semantically-aware speech-driven gesture generation (2021)
Taras Kucherenko et al.
—
ExSampling: a system for the real-time ensemble performance of field-recorded environmental sounds (2026)
Atsuya Kobayashi et al.
—
Moving fast and slow: Analysis of representations and post-processing in speech-driven automatic gesture generation (2021)
Taras Kucherenko et al.
—
Sequence-to-Sequence Predictive Model: From Prosody To Communicative Gestures (2021)
Fajrian Yunus et al.
—
Enhancing Haptic Distinguishability of Surface Materials with Boosting Technique (2022)
Priyadarshini K and Subhasis Chaudhuri
—
Speech-Based Emotion Recognition using Neural Networks and Information Visualization (2021)
Jumana Almahmoud and Kruthika Kikkeri
—
Progressive Voice Trigger Detection: Accuracy vs Latency (2021)
Siddharth Sigtia et al.
—
Spoken Language Interaction with Robots: Research Issues and Recommendations, Report from the NSF Future Directions Workshop (2024)
Matthew Marge et al.
—
NHSS: A Speech and Singing Parallel Database (2021)
Bidisha Sharma et al.
—
AudioViewer: Learning to Visualize Sounds (2023)
Chunjin Song et al.
—
DEVI: Open-source Human-Robot Interface for Interactive Receptionist Systems (2021)
Ramesha Karunasena et al.
—
Mindless Attractor: A False-Positive Resistant Intervention for Drawing Attention Using Auditory Perturbation (2021)
Riku Arakawa and Hiromu Yakura
—
Generación de voces artificiales infantiles en castellano con acento costarricense (2021)
Ana Lilia Alvarez-Blanco et al.
—
What Do We See in Them? Identifying Dimensions of Partner Models for Speech Interfaces Using a Psycholexical Approach (2021)
Philip R Doyle et al.
—
Low-latency auditory spatial attention detection based on spectro-spatial features from EEG (2024)
Siqi Cai et al.
—
Batebit Controller: Popularizing Digital Musical Instruments Development Process (2023)
Filipe Calegario and João Tragtenberg and Giordano Cabral and Geber Ramalho
—
Human Perception of Audio Deepfakes (2024)
Nicolas M. Müller et al.
—
Sequence-to-Sequence Voice Reconstruction for Silent Speech in a Tonal Language (2022)
Huiyan Li et al.
—
Multimodal analysis of the predictability of hand-gesture properties (2022)
Taras Kucherenko et al.
—
Couple Learning for semi-supervised sound event detection (2022)
Rui Tao et al.
—
Synthesizing Speech from Intracranial Depth Electrodes using an Encoder-Decoder Framework (2022)
Jonas Kohler et al.
—
Objective measurement of pitch extractors' responses to frequency modulated sounds and two reference pitch extraction methods for analyzing voice pitch responses to auditory stimulation (2022)
Hideki Kawahara et al.
—
A Case Study on the Independence of Speech Emotion Recognition in Bangla and English Languages using Language-Independent Prosodic Features (2022)
Fardin Saad et al.
—
Embedding-based Music Emotion Recognition Using Composite Loss (2023)
Naoki Takashima et al.
—
Unsupervised Personalization of an Emotion Recognition System: The Unique Properties of the Externalization of Valence in Speech (2023)
Kusha Sridhar and Carlos Busso
—
Towards a Real-time Measure of the Perception of Anthropomorphism in Human-robot Interaction (2022)
Maria Tsfasman et al.
—
Visualizing Automatic Speech Recognition -- Means for a Better Understanding? (2022)
Karla Markert and Romain Parracone and Mykhailo Kulakov and Philip Sperl and Ching-Yu Kao and Konstantin Böttinger
—
QAC: Quantum-computing Aided Composition (2022)
Omar Costa Hamido
—
Wav2Vec2.0 on the Edge: Performance Evaluation (2022)
Santosh Gondi
—
Multi-style Training for South African Call Centre Audio (2022)
Walter Heymans et al.
—
Hidden bawls, whispers, and yelps: can text be made to sound more than just its words? (2024)
Caluã de Lacerda Pataca and Paula Dornhofer Paro Costa
—
Wavebender GAN: An architecture for phonetically meaningful speech manipulation (2022)
Gustavo Teodoro Döhler Beck et al.
—
ProtoSound: A Personalized and Scalable Sound Recognition System for Deaf and Hard-of-Hearing Users (2022)
Dhruv Jain et al.
—
Prediction of Depression Severity Based on the Prosodic and Semantic Features with Bidirectional LSTM and Time Distributed CNN (2022)
Kaining Mao et al.
—
Human Detection of Political Speech Deepfakes across Transcripts, Audio, and Video (2024)
Matthew Groh et al.
—
DareFightingICE Competition: A Fighting Game Sound Design and AI Competition (2022)
Ibrahim Khan et al.
—
Whither the Priors for (Vocal) Interactivity? (2022)
Roger K. Moore
—
Robotic Speech Synthesis: Perspectives on Interactions, Scenarios, and Ethics (2022)
Yuanchao Li et al.
—
An interactive music infilling interface for pop music composition (2022)
Rui Guo
—
STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent (2022)
Yuki Saito et al.
—
A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition (2024)
R. Gnana Praveen et al.
—