Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors
Abstract
Context-predicting models (more commonly known as embeddings or neural language models) are the new kids on the distributional semantics block. Despite the buzz surrounding these models, the literature is still lacking a systematic comparison of the predictive models with classic, count-vector-based distributional semantic approaches. In this paper, we perform such an extensive evaluation, on a wide range of lexical semantics tasks and across many parameter settings. The results, to our own surprise, show that the buzz is fully justified, as the context-predicting models obtain a thorough and resounding victory against their count-based counterparts.
Very fundamental work! It compares which family of models is better: the count-based ones or the prediction-based ones.
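To make the "count-based" side of the comparison concrete, here is a toy sketch (not from the paper) of classic count-vector distributional semantics: each word is represented by the co-occurrence counts of its context words, and similarity is measured by cosine. The corpus, window size, and helper names are all illustrative assumptions; the paper builds such vectors from billions of tokens and also applies weighting and dimensionality reduction.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy corpus; purely illustrative, far smaller than any real training corpus.
corpus = "the cat sat on the mat the dog sat on the rug".split()

def count_vectors(tokens, window=2):
    """Count-based distributional vectors: each word is represented by
    the co-occurrence counts of words within +/- `window` positions."""
    vecs = defaultdict(Counter)
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                vecs[w][tokens[j]] += 1
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

vecs = count_vectors(corpus)
# "cat" and "dog" occur in similar contexts, so their count vectors
# end up closer to each other than to an unrelated function word.
print(cosine(vecs["cat"], vecs["dog"]))
```

The prediction-based models in the paper replace these explicit counts with dense vectors learned by training a network to predict context words, but the evaluation interface is the same: compare word vectors by cosine similarity.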
Powered by ScienceNet.cn