Human Voice Pitch Estimation: A Convolutional Network With Auto-labeled And Synthetic Data
2023 · Jeremy Cochoy
Abstract
Pitch extraction plays a pivotal role in music and sound processing. We present a convolutional neural network specialized for pitch extraction from the human singing voice in a cappella performances. The training set combines synthetic data with auto-labeled a cappella recordings, which gives the network a robust training environment. Evaluation on datasets of synthetic sounds, opera recordings, and time-stretched vowels demonstrates the network's efficacy. This work paves the way for improved pitch extraction in both music and voice settings.
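To make the idea of synthetic training data concrete, the following is a minimal sketch of how a labeled example could be generated: a harmonic tone whose fundamental frequency is known by construction and therefore serves as its own ground-truth pitch label. The abstract does not describe the paper's actual synthesis procedure, so the function names (`synth_harmonic_tone`, `hz_to_midi`), the sample rate, the number of harmonics, and the 1/k amplitude decay are all illustrative assumptions rather than the author's method.

```python
import numpy as np


def synth_harmonic_tone(f0_hz, duration_s=0.5, sample_rate=16000, n_harmonics=5):
    """Hypothetical helper: return (waveform, label) for a tone with known pitch.

    The label is simply the fundamental frequency used to build the signal,
    so no manual annotation is needed.
    """
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    wave = np.zeros_like(t)
    # Sum harmonics k * f0 with an assumed 1/k amplitude decay,
    # skipping any harmonic at or above the Nyquist frequency.
    for k in range(1, n_harmonics + 1):
        if k * f0_hz < sample_rate / 2:
            wave += np.sin(2 * np.pi * k * f0_hz * t) / k
    wave /= np.max(np.abs(wave))  # normalize peak amplitude to 1
    return wave.astype(np.float32), float(f0_hz)


def hz_to_midi(f_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number, A4 = 440 Hz = 69."""
    return 69.0 + 12.0 * np.log2(f_hz / 440.0)


# One synthetic training pair: audio samples plus the exact pitch label.
wave, f0 = synth_harmonic_tone(220.0)
```

A real pipeline would presumably vary the pitch over time, the timbre, and the noise level to better resemble a singing voice; this sketch only shows why synthetic signals come with exact labels for free.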
Related papers
- PitchNet: Unsupervised Singing Voice Conversion With Pitch Adversarial Network (2019)
- A Data-driven Approach To Smooth Pitch Correction For Singing Voice In Pop Music (2018)
- Traditional Machine Learning For Pitch Detection (2019)
- A Vocoder Based Method For Singing Voice Extraction (2019)
- Noise-robust DSP-assisted Neural Pitch Estimation With Very Low Complexity (2023)
- Deep-learning Architectures For Multi-pitch Estimation: Towards Reliable Evaluation (2022)
- Unsupervised Singing Voice Conversion (2019)
- An Empirical Study On End-to-end Singing Voice Synthesis With Encoder-decoder Architectures (2021)