While translating Professor Kenneth Church's "A Pendulum Swung Too Far," I came across Minsky's classic critique of neural networks: XOR is the Achilles' heel of perceptrons. Professor Church points out that this criticism in fact applies to many popular machine learning algorithms, because they rest on the assumption that the training data are linearly separable (the linearly separable assumption), while the XOR gate is not linearly separable. The relevant passage is excerpted at the end of this post.
Professor Church observes that because of this XOR gate, sentiment classification has had to give up on loaded (polarity) words during learning: although such words carry strong sentiment, their polarity flips with context. That flip typically requires an XOR decision, which machine learning built on linear separation simply cannot make.
How should we understand this XOR-gate dilemma?
A friend explained it as follows, which I find quite vivid:
Sentiment words cannot be linearly separated, because their value depends on an XOR decision. That is:
sentiment label = "polarity of the sentiment word" XOR "target"
For example, from the standpoint of "us":
good word xor us = good
good word xor them = bad
bad word xor us = bad
bad word xor them = good
Put the target on the X axis, with us = +1 and them (our competitors) = -1; put the sentiment word on the Y axis, with +1 for a good word and -1 for a bad word.
Then the sentiment label = X*Y: a result of +1 means "good" and -1 means "bad". As one can see, the two "good" points sit in the first and third quadrants, while the "bad" points sit in the second and fourth. No single straight line can separate these two groups of points.
That is why a learner based on linear separation cannot learn sentiment words.
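The failure can be seen directly by running the classic perceptron learning rule on these four points. Below is a minimal sketch in pure Python (function and variable names are my own, not from Church's paper): the perceptron rule converges only if the data are linearly separable, and on the XOR points it never does.

```python
def train_perceptron(points, epochs=100):
    """Classic perceptron rule on features (x1, x2, bias)."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        errors = 0
        for (x1, x2), label in points:
            feats = (x1, x2, 1.0)
            pred = 1 if sum(wi * f for wi, f in zip(w, feats)) > 0 else -1
            if pred != label:
                errors += 1
                w = [wi + label * f for wi, f in zip(w, feats)]
        if errors == 0:
            return w, True       # a full error-free pass: data separated
    return w, False              # never separated all four points

# The four sentiment points above: X = target (us/them), Y = word polarity,
# label = X * Y ("good for us" or "bad for us").
xor_points = [((+1, +1), +1), ((+1, -1), -1), ((-1, +1), -1), ((-1, -1), +1)]
w, separable = train_perceptron(xor_points)
print(separable)  # False: no line separates quadrants 1 and 3 from 2 and 4
```

By contrast, an AND-like labeling of the same four points (positive only at (+1, +1)) is linearly separable, and the same routine converges on it within a few epochs.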
[Appendix: the relevant sections of the XOR critique]
(Excerpted from: K. Church. 2011. A Pendulum Swung Too Far. Linguistics Issues in Language Technology, Volume 6, Issue 5.)
3.2 Minsky's Objections
Minsky and Papert (1969) showed that perceptrons (and more generally, linear separators) cannot learn functions that are not linearly separable, such as XOR and connectedness. In two dimensions, a scatter plot is linearly separable when a line can separate the points with positive labels from the points with negative labels. More generally, in n dimensions, points are linearly separable when there is an (n-1)-dimensional hyperplane that separates the positive labels from the negative labels.
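As an illustration of the definition (a sketch in pure Python; the helper name `separates` is mine, not from the paper), the test for a candidate hyperplane w·x + b = 0 is simply that every labeled point falls on its own side:

```python
def separates(w, b, points):
    """True if the hyperplane w.x + b = 0 splits positives from negatives."""
    for x, label in points:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if label * score <= 0:      # point on the wrong side (or on the line)
            return False
    return True

# AND-like data in 2D: positive only when both coordinates are +1.
and_points = [((+1, +1), +1), ((+1, -1), -1), ((-1, +1), -1), ((-1, -1), -1)]
print(separates((1.0, 1.0), -1.0, and_points))   # True: x1 + x2 - 1 = 0 works

# The four XOR points defeat this particular line (and, as Minsky and
# Papert showed, every other line as well).
xor_points = [((+1, +1), +1), ((+1, -1), -1), ((-1, +1), -1), ((-1, -1), +1)]
print(separates((1.0, 1.0), -1.0, xor_points))   # False
```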
......
3.3 Why Current Technology Ignores Predicates
Weighting systems for Information Retrieval and Sentiment Analysis tend to focus on rigid designators (e.g., nouns) and ignore predicates (verbs, adjectives, and adverbs), intensifiers (e.g., "very"), and loaded terms (e.g., "Mickey Mouse" and "Rinky Dink"). [Translator's note: a rigid designator names an entity whose meaning is fixed and does not shift with context, which is why keyword-matching machine learning models rigid designators easily; "Mickey Mouse" is a loaded term because calling a company Mickey Mouse is as dismissive as calling it Rinky Dink.] The reason might be related to Minsky and Papert's criticism of perceptrons. Years ago, we had access to MIMS, a collection of text comments collected by AT&T operators. Some of the comments were labeled by annotators as positive, negative, or neutral. Rigid designators (typically nouns) tend to be strongly associated with one class or another, but there were quite a few loaded terms that were either positive or negative, but rarely neutral.
How can loaded terms be positive? It turns out that the judges labeled the document as good for us if the loaded term was predicated of the competition, and bad if it was predicated of us. In other words, there is an XOR dependency (loaded term XOR us) that is beyond the capabilities of a linear separator.
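One standard way around this limitation, sketched below with my own naming (it is not Church's proposal), is to hand the learner the XOR dependency as an explicit interaction feature: in the expanded space (loaded_term, target, loaded_term × target), the "good for us" label becomes trivially linear.

```python
def expand(x1, x2):
    """Map a 2D input to 3D by appending the XOR-capturing product feature."""
    return (x1, x2, x1 * x2)

# loaded_term: +1 = complimentary word, -1 = loaded/derogatory word
# target:      +1 = us,                -1 = the competition
# label "good for us" = loaded_term * target (the XOR dependency above)
data = [((+1, +1), +1), ((+1, -1), -1), ((-1, +1), -1), ((-1, -1), +1)]

# In the expanded space the hyperplane with w = (0, 0, 1), b = 0 separates
# the labels perfectly: the score is exactly the product feature.
for (x1, x2), label in data:
    score = expand(x1, x2)[2]
    assert (1 if score > 0 else -1) == label
print("separable after feature expansion")
```

This is the same idea behind polynomial feature maps and kernels: the data stay the same, but the representation carries the predicate-argument pairing explicitly, so a linear separator suffices.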
Current practice in Sentiment Analysis and Information Retrieval does not model modifiers (predicate-argument relationships, intensifiers, and loaded terms), because it is hard to make sense of modifiers unless you know what they are modifying. Ignoring loaded terms and intensifiers seems like a missed opportunity, especially for Sentiment Analysis, since loaded terms are obviously expressing strong opinions. But you can't do much with a feature if you don't know the sign, even if you know the magnitude is large.
When predicate-argument relationships are eventually modeled, it will be necessary to revisit the linearly separable assumption because of the XOR problem mentioned above.
Powered by ScienceNet.cn