SHARP: An Adaptable, Energy-efficient Accelerator For Recurrent Neural Network
2019 · Reza Yazdani, Olatunji Ruwase, Minjia Zhang, et al.
Abstract
The effectiveness of Recurrent Neural Networks (RNNs) for tasks such as Automatic Speech Recognition has fostered interest in RNN inference acceleration. Because of the recurrent nature and data dependencies of RNN computations, prior work has designed customized architectures tailored to the computation pattern of RNNs, achieving high computational efficiency for certain chosen model sizes. However, since the dimensionality of RNNs varies widely across tasks, it is crucial to generalize this efficiency to diverse configurations. In this work, we identify adaptiveness as a key feature that is missing from today's RNN accelerators. In particular, we first show the problem of low resource utilization and low adaptiveness in state-of-the-art RNN implementations on GPU, FPGA, and ASIC architectures. To solve these issues, we propose an intelligent tile-based dispatching mechanism that increases the adaptiveness of RNN computation, in order to efficiently handle the data
Related papers
- E-RNN: Design Optimization For Efficient Recurrent Neural Networks In FPGAs (2018)
- A Comparison Of Adaptation Techniques And Recurrent Neural Network Architectures (2018)
- Bifocal Neural ASR: Exploiting Keyword Spotting For Inference Optimization (2021)
- Inference Skipping For More Efficient Real-time Speech Enhancement With Parallel RNNs (2022)
- Accelerating Recurrent Neural Network Language Model Based Online Speech Recognition System (2018)
- Fastwave: Accelerating Autoregressive Convolutional Neural Networks On FPGA (2020)
- FPGA-based Low-power Speech Recognition With Recurrent Neural Networks (2016)
- HAINAN: Fast And Accurate Transducer For Hybrid-autoregressive ASR (2024)