A Fully Tensorized Recurrent Neural Network
2020 · Charles C. Onu, Jacob E. Miller, Doina Precup
Abstract
Recurrent neural networks (RNNs) are powerful tools for sequential modeling, but typically require significant overparameterization and regularization to achieve optimal performance. This leads to difficulties in the deployment of large RNNs in resource-limited settings, while also introducing complications in hyperparameter selection and training. To address these issues, we introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell using a lightweight tensor-train (TT) factorization. This approach represents a novel form of weight sharing which reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs. Experiments on image classification and speaker verification tasks demonstrate further benefits for reducing inference times and stabilizing model training and hyperparameter selection.
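To make the idea of jointly encoding the per-gate weight matrices concrete, here is a minimal NumPy sketch of a tensor-train (TT) factorization of one stacked weight matrix. It is an illustration under stated assumptions, not the authors' implementation: the LSTM-style layout (4 gates, input and hidden size 64), the mode shapes, the TT rank of 4, and the helper names `make_tt_cores` and `tt_to_matrix` are all hypothetical choices made for this example.

```python
import numpy as np


def make_tt_cores(out_modes, in_modes, rank, rng):
    """Create random TT cores of shape (r_prev, m_k, n_k, r_next).

    Hypothetical helper: the boundary ranks are 1 and all internal
    ranks are set to the same value `rank`.
    """
    ranks = [1] + [rank] * (len(out_modes) - 1) + [1]
    return [
        0.1 * rng.standard_normal((ranks[k], m, n, ranks[k + 1]))
        for k, (m, n) in enumerate(zip(out_modes, in_modes))
    ]


def tt_to_matrix(cores):
    """Contract the cores left to right into the dense matrix they encode."""
    full = cores[0]  # shape (1, m_1, n_1, r_1)
    for core in cores[1:]:
        # Merge the next core's row and column modes into the running ones.
        full = np.einsum("aijb,bklc->aikjlc", full, core)
        a, i, k, j, l, c = full.shape
        full = full.reshape(a, i * k, j * l, c)
    return full[0, :, :, 0]


rng = np.random.default_rng(0)

# Assumed layout: the first row mode (4) indexes the gate, the first column
# mode (2) selects input-to-hidden vs. hidden-to-hidden weights, and the
# remaining modes factor the 64-dimensional input and hidden vectors as 8 * 8.
out_modes = (4, 8, 8)  # 4 * 8 * 8 = 256 rows (4 gates x 64 hidden units)
in_modes = (2, 8, 8)   # 2 * 8 * 8 = 128 columns (64 inputs + 64 hidden units)
cores = make_tt_cores(out_modes, in_modes, rank=4, rng=rng)

# Materialized here only for clarity; a practical implementation would
# contract the input with the cores directly instead of building W.
W = tt_to_matrix(cores)

n_tt = sum(core.size for core in cores)  # parameters stored in the TT cores
n_dense = W.size                         # parameters in the dense matrix

# One shared matrix produces the pre-activations of all four gates.
x = rng.standard_normal(64)
h = np.zeros(64)
gates = (W @ np.concatenate([x, h])).reshape(4, 64)

print(f"TT parameters: {n_tt}, dense parameters: {n_dense}")
```

Because every gate's weights are slices of the same small set of cores, the gates share parameters. The compression ratio in this toy setting depends entirely on the assumed sizes and rank and should not be read as the paper's reported figure.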
Related papers
- Tensor-train Long Short-term Memory For Monaural Speech Enhancement (2018)
- Learning Compact Recurrent Neural Networks (2016)
- Tensor-to-vector Regression For Multi-channel Speech Enhancement Based On Tensor-train Network (2020)
- Restricted Recurrent Neural Networks (2019)
- Developing RNN-T Models Surpassing High-performance Hybrid Models With Customization Capability (2020)
- Improved Neural Language Model Fusion For Streaming Recurrent Neural Network Transducer (2020)
- Exploiting Low-rank Tensor-train Deep Neural Networks Based On Riemannian Gradient Descent With Illustrations Of Speech Processing (2022)
- Exploring Deep Hybrid Tensor-to-vector Network Architectures For Regression Based Speech Enhancement (2020)