Acoustic Scene Classification Using Convolutional Neural Network And Multiple-width Frequency-delta Data Augmentation
2016 · Yoonchang Han, Kyogu Lee
Abstract
In recent years, neural network approaches have shown superior performance to conventional hand-made features in numerous application areas. In particular, convolutional neural networks (ConvNets) exploit spatially local correlations across input data to improve the performance of audio processing tasks, such as speech recognition, musical chord recognition, and onset detection. Here we apply ConvNet to acoustic scene classification, and show that the error rate can be further decreased by using delta features in the frequency domain. We propose a multiple-width frequency-delta (MWFD) data augmentation method that uses static mel-spectrogram and frequency-delta features as individual input examples. In addition, we describe a ConvNet output aggregation method designed for MWFD augmentation, folded mean aggregation, which combines output probabilities of static and MWFD features from the same analysis window using multiplication first, rather than taking an average of all output probabi
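The two ideas named in the abstract, MWFD augmentation and folded mean aggregation, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract gives neither the delta formula nor the widths, so the regression-style delta (borrowed from the usual time-delta computation and applied along the frequency axis), the edge handling, and the default widths `(3, 11)` are all assumptions. The abstract is also cut off before it says what follows the multiplication, so the final average over analysis windows is assumed as well. The function names are hypothetical.

```python
from math import prod


def frequency_delta(spec, width):
    """Regression-style delta along the frequency axis (assumed formula).

    spec  : list of mel bands, each a list of frame values (n_mels x n_frames)
    width : odd window size >= 3 (assumed parameterization)
    """
    if width < 3 or width % 2 == 0:
        raise ValueError("width must be an odd integer >= 3")
    half = width // 2
    n_mels = len(spec)
    norm = 2.0 * sum(n * n for n in range(1, half + 1))

    def band(i):
        # Clamp out-of-range indices, i.e. repeat the edge bands.
        return spec[min(max(i, 0), n_mels - 1)]

    delta = []
    for i in range(n_mels):
        row = [0.0] * len(spec[i])
        for n in range(1, half + 1):
            upper, lower = band(i + n), band(i - n)
            row = [r + n * (u - l) for r, u, l in zip(row, upper, lower)]
        delta.append([r / norm for r in row])
    return delta


def mwfd_augment(spec, widths=(3, 11)):
    """Static spectrogram plus one frequency-delta version per width.

    Each returned version is meant to be used as a separate input example.
    The default widths are placeholders, not values from the paper.
    """
    return [spec] + [frequency_delta(spec, w) for w in widths]


def folded_mean(window_probs):
    """window_probs[w][v][c]: probability of class c from version v in window w."""
    n_classes = len(window_probs[0][0])
    # Fold: multiply the versions' probabilities within each window.
    folded = [
        [prod(version[c] for version in window) for c in range(n_classes)]
        for window in window_probs
    ]
    # Assumed final step: average the folded scores over all windows.
    return [sum(f[c] for f in folded) / len(folded) for c in range(n_classes)]
```

Multiplying within a window rewards classes on which the static and frequency-delta versions agree, whereas a plain average over every output would treat each version as an independent vote. The folded scores are not renormalized here, so they rank the classes but do not sum to one.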
Related papers
- Audio-based Music Classification With Densenet And Data Augmentation (2019) · 10.48
- Spectral And Rhythm Features For Audio Classification With Deep Convolutional Neural Networks (2024) · 0.00
- A Simple Fusion Of Deep And Shallow Learning For Acoustic Scene Classification (2018) · 0.00
- Classifying Variable-length Audio Files With All-convolutional Networks And Masked Global Pooling (2016) · 0.00
- Convolutional Gated Recurrent Neural Network Incorporating Spatial Features For Audio Tagging (2017) · 13.23
- Acoustic Scene Classification Using Bilinear Pooling On Time-liked And Frequency-liked Convolution Neural Network (2020) · 5.84
- Acoustic Scene Classification Using Multi-layer Temporal Pooling Based On Convolutional Neural Network (2019) · 0.00
- Audio Classification Of Low Feature Spectrograms Utilizing Convolutional Neural Networks (2024) · 5.84