Audio Segmentation Based On Melodic Style With Hand-crafted Features And With Convolutional Neural Networks
2018 · Amruta Vidwans, Nachiket Deo, Preeti Rao
Abstract
We investigate methods for the automatic labeling of the taan section, a prominent structural component of the Hindustani Khayal vocal concert. The taan contains improvised raga-based melody rendered in a highly distinctive style marked by rapid pitch and energy modulations of the voice. We propose computational features that capture these specific high-level characteristics of the singing voice in the polyphonic context. The extracted local features are used to achieve classification at the frame level via a trained multilayer perceptron (MLP) network, followed by grouping and segmentation based on novelty detection. We report high accuracies with reference to musician-annotated taan sections across artists and concerts. We also compare the performance obtained by the compact specialized features with frame-level classification via a convolutional neural network (CNN) operating directly on audio spectrogram patches for the same task. While the relatively simple architecture we experiment w
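The step from frame-level classification to segments can be illustrated with a minimal sketch. It assumes the classifier has already produced one taan posterior per frame, and it uses a checkerboard-kernel novelty curve over a self-similarity matrix, a common novelty-detection technique that may differ from the one used in the paper. All function names, the kernel half-width, and the thresholds are hypothetical choices made for illustration.

```python
import numpy as np


def checkerboard_kernel(half_width):
    """Checkerboard kernel: +1 in the same-side blocks, -1 in the cross blocks."""
    signs = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    return np.outer(signs, signs)


def novelty_curve(posteriors, half_width):
    """Slide the kernel along the diagonal of a self-similarity matrix."""
    p = np.asarray(posteriors, dtype=float)
    # Similarity is 1 for identical posteriors and 0 when they differ by 1.
    ssm = 1.0 - np.abs(p[:, None] - p[None, :])
    kernel = checkerboard_kernel(half_width)
    n = len(p)
    novelty = np.zeros(n)
    for t in range(half_width, n - half_width + 1):
        window = ssm[t - half_width:t + half_width, t - half_width:t + half_width]
        novelty[t] = np.sum(window * kernel)
    return novelty


def pick_boundaries(novelty, threshold):
    """Keep local maxima of the novelty curve that exceed the threshold."""
    return [
        t for t in range(1, len(novelty) - 1)
        if novelty[t] > threshold
        and novelty[t] > novelty[t - 1]
        and novelty[t] >= novelty[t + 1]
    ]


def label_segments(posteriors, boundaries):
    """Label each segment as taan when its mean posterior exceeds 0.5 (assumed rule)."""
    p = np.asarray(posteriors, dtype=float)
    edges = [0] + list(boundaries) + [len(p)]
    return [(s, e, bool(p[s:e].mean() > 0.5)) for s, e in zip(edges[:-1], edges[1:])]


def segment_taan(posteriors, half_width=8, threshold_ratio=0.5):
    """Frame posteriors -> list of (start_frame, end_frame, is_taan)."""
    novelty = novelty_curve(posteriors, half_width)
    # An ideal 0-to-1 step yields a novelty of 2 * half_width**2.
    threshold = threshold_ratio * 2 * half_width ** 2
    return label_segments(posteriors, pick_boundaries(novelty, threshold))


# Synthetic posteriors: non-taan, taan, non-taan (40 frames each).
frames = np.concatenate([np.zeros(40), np.ones(40), np.zeros(40)])
segments = segment_taan(frames)
```

With real classifier output the posteriors are noisy, so the kernel half-width and threshold would need tuning against annotated sections; a larger kernel smooths over short misclassified runs at the cost of boundary precision.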
Related papers
- Spectral And Rhythm Features For Audio Classification With Deep Convolutional Neural Networks (2024)
- Automatic Tagging Using Deep Convolutional Neural Networks (2016)
- Hierarchical Generative Modeling Of Melodic Vocal Contours In Hindustani Classical Music (2024)
- Combining High-level Features Of Raw Audio Waves And Mel-spectrograms For Audio Tagging (2018)
- Wavelet-filtering Of Symbolic Music Representations For Folk Tune Segmentation And Classification (2025)
- Convolution Channel Separation And Frequency Sub-bands Aggregation For Music Genre Classification (2022)
- Vocal Melody Extraction Using Patch-based CNN (2018)
- A Streamlined Encoder/decoder Architecture For Melody Extraction (2018)