Online Model Compression For Federated Learning With Large Models
2022 · Tien-Ju Yang, Yonghui Xiao, Giovanni Motta, et al.
Abstract
This paper addresses two challenges of training large neural network models under federated learning settings: high on-device memory usage and high communication cost. The proposed Online Model Compression (OMC) provides a framework that stores model parameters in a compressed format and decompresses them only when needed. We use quantization as the compression method in this paper and propose three methods to minimize the impact on model accuracy: (1) per-variable transformation, (2) weight-matrices-only quantization, and (3) partial parameter quantization. According to our experiments on two recent neural networks for speech recognition and two different datasets, OMC can reduce memory usage and communication cost of model parameters by up to 59% while attaining accuracy and training speed comparable to full-precision training.
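The store-compressed, decompress-on-read idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `quantize`, `dequantize`, and `CompressedStore` are hypothetical, uniform integer quantization with a min/max affine mapping stands in for whatever quantization format and per-variable transformation the authors use, and a random per-variable choice stands in for partial parameter quantization.

```python
import random


def quantize(values, bits=8):
    """Uniformly quantize a flat list of floats to `bits`-bit integer codes.

    Returns (codes, scale, offset). The scale and offset are computed per
    variable and act as an affine transformation back to float values
    (an assumed stand-in for the paper's per-variable transformation).
    """
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, offset):
    """Map integer codes back to approximate float values."""
    return [c * scale + offset for c in codes]


class CompressedStore:
    """Hypothetical parameter store: variables stay compressed until read."""

    def __init__(self, bits=8, quantize_fraction=0.9, seed=0):
        self.bits = bits
        self.quantize_fraction = quantize_fraction
        self._rng = random.Random(seed)
        self._vars = {}

    def put(self, name, values, is_weight_matrix):
        # Only weight matrices are candidates for quantization, and only an
        # assumed random fraction of them is quantized; all other variables
        # are kept in full precision.
        if is_weight_matrix and self._rng.random() < self.quantize_fraction:
            self._vars[name] = ("q",) + quantize(values, self.bits)
        else:
            self._vars[name] = ("fp", list(values))

    def is_compressed(self, name):
        return self._vars[name][0] == "q"

    def get(self, name):
        # Decompress on demand; the compressed copy remains the stored form.
        entry = self._vars[name]
        if entry[0] == "q":
            return dequantize(*entry[1:])
        return list(entry[1])


store = CompressedStore(bits=8, quantize_fraction=1.0)
store.put("dense/kernel", [-1.0, -0.5, 0.0, 0.25, 1.0], is_weight_matrix=True)
store.put("dense/bias", [0.1, -0.2], is_weight_matrix=False)
restored_kernel = store.get("dense/kernel")
restored_bias = store.get("dense/bias")
```

In this sketch the kernel is read back with a rounding error of at most half a quantization step, while the bias is returned unchanged because it is not a weight matrix. Lowering `quantize_fraction` leaves some weight matrices in full precision as well.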
Related papers
- Federated Pruning: Improving Neural Network Efficiency With Federated Learning (2022) 7.50
- Model Compression For DNN-based Speaker Verification Using Weight Quantization (2022) 3.58
- Optimization Of DNN-based Speaker Verification Model Through Efficient Quantization Technique (2024) 0.00
- Training Speech Recognition Models With Federated Learning: A Quality/cost Framework (2020) 12.93
- Empirical Evaluation Of Deep Learning Model Compression Techniques On The WaveNet Vocoder (2020) 0.00
- One-pass Multiple Conformer And Foundation Speech Systems Compression And Quantization Using An All-in-one Neural Model (2024) 0.00
- Omni-C: Compressing Heterogeneous Modalities Into A Single Dense Encoder (2026) 0.00
- Accelerating Recurrent Neural Network Language Model Based Online Speech Recognition System (2018) 8.60