A Simple Fusion Of Deep And Shallow Learning For Acoustic Scene Classification
2018 · Eduardo Fonseca, Rong Gong, Xavier Serra
Abstract
Acoustic Scene Classification systems have traditionally relied on hand-crafted audio features that are input to a classifier. The current trend is to adopt data-driven techniques, e.g., deep learning, where audio representations are learned from data. In this paper, we propose a system that consists of a simple fusion of two methods, one of each type: a deep learning approach, where log-scaled mel-spectrograms are input to a convolutional neural network, and a feature engineering approach, where a collection of hand-crafted features is input to a gradient boosting machine. We first show that the two methods provide complementary information to some extent. Then, we use a simple late fusion strategy to combine them. We report the classification accuracy of each method individually and of the combined system on the TUT Acoustic Scenes 2017 dataset. The proposed fused system outperforms each of the individual methods and attains a classification accuracy of 72.8% on t
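
The late fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact fusion rule, so the weighted arithmetic mean of class probabilities used here is an assumption, as are the function name `late_fusion`, the weight `w`, and the toy scores; none of them are taken from the paper.

```python
import numpy as np

def late_fusion(p_cnn, p_gbm, w=0.5):
    """Fuse two (n_clips, n_classes) probability arrays by weighted averaging.

    Assumption: the fusion rule is a weighted arithmetic mean; w is the
    weight given to the CNN scores and (1 - w) to the GBM scores.
    """
    p_cnn = np.asarray(p_cnn, dtype=float)
    p_gbm = np.asarray(p_gbm, dtype=float)
    if p_cnn.shape != p_gbm.shape:
        raise ValueError("both models must score the same clips and classes")
    fused = w * p_cnn + (1.0 - w) * p_gbm
    # Predicted scene per clip is the class with the highest fused score.
    return fused, fused.argmax(axis=1)

# Made-up scores for 3 clips and 3 scene classes (illustrative only).
p_cnn = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.4, 0.35, 0.25]])
p_gbm = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6], [0.2, 0.6, 0.2]])
fused, labels = late_fusion(p_cnn, p_gbm)
print(labels)
```

Because each model only contributes a per-class score, the two can be trained independently and combined afterwards, which is what makes this kind of fusion "late". In practice the weight `w` would typically be chosen on held-out data rather than fixed at 0.5.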
Related papers
- Acoustic Scene Classification Using Convolutional Neural Network And Multiple-width Frequency-delta Data Augmentation (2016)
- Audio Scene Classification With Deep Recurrent Neural Networks (2017)
- Combining High-level Features Of Raw Audio Waves And Mel-spectrograms For Audio Tagging (2018)
- Spectral And Rhythm Features For Audio Classification With Deep Convolutional Neural Networks (2024)
- Classifying Variable-length Audio Files With All-convolutional Networks And Masked Global Pooling (2016)
- Audio Classification Of Low Feature Spectrograms Utilizing Convolutional Neural Networks (2024)
- Acoustic Scene Classification Using Multi-layer Temporal Pooling Based On Convolutional Neural Network (2019)
- A Deep Neural Network For Audio Classification With A Classifier Attention Mechanism (2020)