Code-switched Language Models Using Dual RNNs and Same-source Pretraining

Saurabh Garg, Tanmay Parekh, Preethi Jyothi · Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing · 2018

This work focuses on building language models (LMs) for code-switched text. We propose two techniques that significantly improve these LMs: (1) a novel recurrent neural network unit with dual components that focus separately on each language in the code-switched text, and (2) pretraining the LM on synthetic text sampled from a generative model estimated from the training data. We demonstrate the effectiveness of both techniques on a Mandarin-English code-switching task, reporting significant reductions in perplexity.
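To make the first technique concrete, below is a minimal sketch of a dual-component recurrent unit in PyTorch. It assumes per-token language tags are available and selects between two LSTM cells (one per language) with a hard switch; the class name `DualRNNCell`, the hidden sizes, and the hard switching rule are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn as nn


class DualRNNCell(nn.Module):
    """Sketch of a recurrent unit with dual per-language components.

    Two LSTM cells share a single hidden state; at each timestep the
    component matching the token's language tag updates the state.
    """

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.hidden_size = hidden_size
        self.cells = nn.ModuleList([
            nn.LSTMCell(input_size, hidden_size),  # component for language 0 (e.g. Mandarin)
            nn.LSTMCell(input_size, hidden_size),  # component for language 1 (e.g. English)
        ])

    def forward(self, x, lang_ids, state=None):
        # x: (seq_len, batch, input_size); lang_ids: (seq_len, batch) in {0, 1}
        seq_len, batch, _ = x.shape
        if state is None:
            h = x.new_zeros(batch, self.hidden_size)
            c = x.new_zeros(batch, self.hidden_size)
        else:
            h, c = state
        outputs = []
        for t in range(seq_len):
            # Run both components from the shared state, then pick
            # per example according to the language tag at this step.
            h0, c0 = self.cells[0](x[t], (h, c))
            h1, c1 = self.cells[1](x[t], (h, c))
            mask = lang_ids[t].unsqueeze(1).float()  # 1 selects component 1
            h = (1 - mask) * h0 + mask * h1
            c = (1 - mask) * c0 + mask * c1
            outputs.append(h)
        return torch.stack(outputs), (h, c)


# Usage sketch: embedded tokens plus language tags for a toy batch.
cell = DualRNNCell(input_size=32, hidden_size=64)
x = torch.randn(5, 2, 32)                  # (seq_len=5, batch=2)
lang_ids = torch.randint(0, 2, (5, 2))     # per-token language tags
out, (h, c) = cell(x, lang_ids)
```

A hard switch keeps each component specialized to one language while the shared hidden state carries context across switch points; a soft gating weight could be substituted where language tags are noisy or unavailable.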
