Distilling Word Embeddings: An Encoding Approach
Full Text: cikm16.pdf

Distilling knowledge from a well-trained cumbersome network into a small one has recently become a new research topic, as lightweight neural networks with high performance are particularly needed in various resource-restricted systems. This paper addresses the problem of distilling word embeddings for NLP tasks. We propose an encoding approach to distill task-specific knowledge from a set of high-dimensional embeddings, so that we can reduce model complexity by a large margin while retaining high accuracy, achieving a good compromise between efficiency and performance. Experiments reveal that distilling knowledge from cumbersome embeddings outperforms directly training neural networks with small embeddings.
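A minimal sketch of the general idea, not the paper's exact architecture: a learned encoding layer projects frozen high-dimensional pretrained embeddings down to small task-specific vectors, trained jointly with the task objective. The dimensions, the tanh non-linearity, and the bag-of-words classifier head below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EncodedEmbeddingClassifier(nn.Module):
    def __init__(self, pretrained: torch.Tensor, small_dim: int, num_classes: int):
        super().__init__()
        # Frozen cumbersome embeddings (e.g., 300-d pretrained vectors).
        self.big = nn.Embedding.from_pretrained(pretrained, freeze=True)
        # Encoding layer: projects each big vector down to a small one.
        self.encode = nn.Linear(pretrained.size(1), small_dim)
        self.classify = nn.Linear(small_dim, num_classes)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        small = torch.tanh(self.encode(self.big(token_ids)))  # distilled embeddings
        return self.classify(small.mean(dim=1))               # simple bag-of-words task head

# Toy usage: train the encoder with the task objective, then export
# tanh(W E + b) as a small embedding table for a lightweight model.
vocab, big_dim, small_dim = 1000, 300, 50
model = EncodedEmbeddingClassifier(torch.randn(vocab, big_dim), small_dim, num_classes=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randint(0, vocab, (8, 12))   # batch of token-id sequences
y = torch.randint(0, 2, (8,))          # toy labels
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
opt.step()
small_table = torch.tanh(model.encode(model.big.weight)).detach()  # distilled embedding table
```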
Citation
L. Mou, R. Jia, Y. Xu, G. Li, L. Zhang, Z. Jin. "Distilling Word Embeddings: An Encoding Approach". ACM International Conference on Information and Knowledge Management (CIKM), pp. 1977–1980, October 2016.
Category: In Conference
Web Links: DOI | ACM Digital Library
BibTeX
@incollection{Mou+al:CIKM16,
  author    = {Lili Mou and Ran Jia and Yan Xu and Ge Li and Lu Zhang and Zhi Jin},
  title     = {Distilling Word Embeddings: An Encoding Approach},
  booktitle = {ACM International Conference on Information and Knowledge Management (CIKM)},
  pages     = {1977--1980},
  year      = {2016},
}