Vocal Melody Extraction Using Patch-based CNN
2018 · Li Su
Abstract
This paper presents a patch-based convolutional neural network (CNN) model for vocal melody extraction in polyphonic music, inspired by object detection in image processing. The input to the model is a novel time-frequency representation that enhances the pitch contours and suppresses the harmonic components of a signal. This succinct data representation and the patch-based CNN model enable efficient training with limited labeled data. Experiments on various datasets show excellent speed and competitive accuracy compared with other deep learning approaches.
Related papers
- Multiple F0 Estimation In Vocal Ensembles Using Convolutional Neural Networks (2020)
- A Streamlined Encoder/decoder Architecture For Melody Extraction (2018)
- TONet: Tone-octave Network For Singing Melody Extraction From Polyphonic Music (2022)
- Modeling Music Modality With A Key-class Invariant Pitch Chroma CNN (2019)
- Melody Extraction From Polyphonic Music By Deep Learning Approaches: A Review (2022)
- Towards Improving Harmonic Sensitivity And Prediction Stability For Singing Melody Extraction (2023)
- Human Vocal Sentiment Analysis (2019)
- A Vocoder Based Method For Singing Voice Extraction (2019)