Convolutional RNN: An Enhanced Model For Extracting Features From Sequential Data
2016 · Gil Keren, Björn Schuller
Abstract
Traditional convolutional layers extract features from patches of data by applying a non-linearity to an affine function of the input. We propose a model that enhances this feature extraction process for sequential data by feeding patches of the data into a recurrent neural network and using the outputs or hidden states of the recurrent units to compute the extracted features. By doing so, we exploit the fact that a window containing a few frames of the sequential data is itself a sequence, and this additional structure might encapsulate valuable information. In addition, we allow for more steps of computation in the feature extraction process, which is potentially beneficial, as an affine function followed by a non-linearity can produce features that are too simple. Using our convolutional recurrent layers, we obtain improved performance on two audio classification tasks compared to traditional convolutional layers. TensorFlow code for the convolutional recurrent layers i
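The feature extraction described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' TensorFlow code: the plain tanh recurrent unit, the choice of the last hidden state as the extracted feature, and all names (`extract_windows`, `conv_recurrent_layer`, `W_in`, `W_rec`, `b`) are assumptions made for the example.

```python
import numpy as np

def extract_windows(x, window, stride):
    """Slice a (time, features) sequence into overlapping windows of `window` frames."""
    starts = range(0, x.shape[0] - window + 1, stride)
    return np.stack([x[s:s + window] for s in starts])  # (num_windows, window, features)

def conv_recurrent_layer(x, W_in, W_rec, b, window, stride):
    """Run the same small RNN over every window and keep its last hidden state.

    Assumption: a plain tanh recurrent unit; other recurrent units could be
    substituted. The weights are shared across windows, as in a convolution.
    """
    windows = extract_windows(x, window, stride)
    h = np.zeros((windows.shape[0], W_rec.shape[0]))
    for t in range(window):
        # One recurrent step over frame t of every window at once.
        h = np.tanh(windows[:, t] @ W_in + h @ W_rec + b)
    return h  # (num_windows, hidden): one feature vector per window

# Hypothetical sizes: 20 frames of 8 features, 16 hidden units.
rng = np.random.default_rng(0)
x = rng.standard_normal((20, 8))
hidden = 16
W_in = rng.standard_normal((8, hidden)) * 0.1
W_rec = rng.standard_normal((hidden, hidden)) * 0.1
b = np.zeros(hidden)

features = conv_recurrent_layer(x, W_in, W_rec, b, window=5, stride=2)
print(features.shape)
```

A traditional convolutional layer would instead flatten each window and apply a single affine map followed by a non-linearity; here each window is processed frame by frame, so the extracted feature can depend on the temporal order within the window.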
Related papers
- A Fully Recurrent Feature Extraction For Single Channel Speech Enhancement (2020) · 0.00
- Convolutional Gated Recurrent Neural Network Incorporating Spatial Features For Audio Tagging (2017) · 13.23
- AMFFCN: Attentional Multi-layer Feature Fusion Convolution Network For Audio-visual Speech Enhancement (2021) · 0.00
- Composing General Audio Representation By Fusing Multilayer Features Of A Pre-trained Model (2022) · 8.09
- Acoustic Scene Classification Using Convolutional Neural Network And Multiple-width Frequency-delta Data Augmentation (2016) · 0.00
- Music Artist Classification With Convolutional Recurrent Neural Networks (2019) · 11.93
- Rethinking Recurrent Latent Variable Model For Music Composition (2018) · 7.50
- WaveCRN: An Efficient Convolutional Recurrent Neural Network For End-to-end Speech Enhancement (2020) · 14.02