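Sharing 《镜子大全》 and 《朝华午拾》 http://blog.sciencenet.cn/u/liwei999 Once a Little Red Guard, then sent down to the countryside to "repair the earth"; left home and country in 1991, destination unknown.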


Coordination in Natural Language: Preference Semantics at Its Worst

2937 reads | 2016-12-10 23:51 | Personal category: 立委科普 | System category: 科普集锦 | Tags: conjoin, coordinate structure

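For years, NLU and NLP have shared one universally acknowledged difficulty: the coordinate structure (conjoined structure). Coordination has no standing in the logic of thought; it is a product of linguistic expression. It is the most unreasonable gate-crasher in linguistics (a 程咬金, the proverbial figure who bursts in uninvited): it always cuts in, it is willful, and it does so at any level. All subcat arg structures and mod-head patterns must make way for it; otherwise traffic jams and the parsing path breaks. Yet without coordination, natural language would be unbearably monotonous and would lose all its conciseness.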

Here is a simple example:

[Image: 1027a]

What does this sentence look like once it is logically expanded?

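颈椎间盘突出症的最常见和最典型表现是一侧颈肩部及上肢的酸痛
(The most common and most typical manifestation of cervical disc herniation is aching pain in the neck-shoulder area and upper limb on one side.)
==>
颈椎间盘突出症的最常见表现是一侧颈肩部的酸痛
(The most common manifestation of cervical disc herniation is aching pain in the neck-shoulder area on one side.)
颈椎间盘突出症的最典型表现是一侧颈肩部的酸痛
(The most typical manifestation of cervical disc herniation is aching pain in the neck-shoulder area on one side.)
颈椎间盘突出症的最常见表现是上肢的酸痛
(The most common manifestation of cervical disc herniation is aching pain in the upper limb.)
颈椎间盘突出症的最典型表现是上肢的酸痛
(The most typical manifestation of cervical disc herniation is aching pain in the upper limb.)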
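The logical expansion of a coordinated sentence is, in effect, a Cartesian product over its coordinated slots. The snippet below is only a minimal sketch of that idea: the slot names `modifier` and `location`, the template string, and both helper functions are made up for illustration, and it assumes the conjuncts have already been identified, which is exactly the hard part in real parsing.

```python
from itertools import product
from math import prod


def expand_coordination(template, slots):
    """Expand a template with coordinated slots into plain sentences.

    template: a str.format-style string with named fields (hypothetical).
    slots: dict mapping each field name to its list of conjuncts.
    """
    names = list(slots)
    return [
        template.format(**dict(zip(names, combo)))
        for combo in product(*(slots[name] for name in names))
    ]


def count_expansions(conjunct_counts):
    """Number of expanded sentences: the product of the conjunct counts."""
    return prod(conjunct_counts)


# The example sentence, with its two coordinations factored out by hand.
template = "颈椎间盘突出症的{modifier}表现是{location}的酸痛"
slots = {
    "modifier": ["最常见", "最典型"],
    "location": ["一侧颈肩部", "上肢"],
}

expanded = expand_coordination(template, slots)
for sentence in expanded:
    print(sentence)

# Five two-way coordinations in one sentence already give 2**5 sentences.
print(count_expansions([2] * 5))
```

Two two-way coordinations yield four sentences; each additional coordination multiplies the count again, which is why the fully expanded form gets so long so quickly.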

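And that involves only two coordinations; a single sentence with five or six, or even ten or more, is not uncommon. Language is not logic. Without coordination, language would suffer a combinatorial explosion of verbosity. It is hard to imagine that a traditional single-layer parsing system, such as the classic Chomsky-style CFG-based chart parsing found in textbooks, could handle every kind of coordination properly.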

Coordination can be so f* hierarchical. Even for a very deep, multilevel parsing system, it remains a challenge unless it is handled very carefully and skillfully by a very experienced linguist, because the boundaries are tough to identify and conjuncts appear at any level at will. The conjoined elements are semantically parallel, but this parallelism, which ideally should serve as a condition for identifying the conjoined structure and its scope, is unfortunately relative and fuzzy in practice and can hardly be enforced. Food can be conjoined with food, of course, but look at this:

我喜欢肥肉和哲学。
(I like fatty meat and philosophy.)

Food and knowledge, totally different semantic monsters, can also be conjoined. It is preference semantics at its worst.
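One way to picture why parallelism works only as a preference and not as a hard condition is a toy scorer. This is a sketch under loud assumptions, not a proposed solution: the `CATEGORY` table is a hand-made stand-in for a real lexicon, the extra word 米饭 (rice) is added purely for contrast, and the score values are arbitrary.

```python
# Hypothetical, hand-made semantic categories; a real system would rely
# on a lexicon or ontology instead of this tiny table.
CATEGORY = {
    "肥肉": "food",       # fatty meat
    "米饭": "food",       # rice (added here only for contrast)
    "哲学": "knowledge",  # philosophy
}


def parallelism_score(left, right):
    """Return a soft preference score for conjoining two words.

    1.0 means the categories match (preferred); 0.5 means they differ or
    are unknown (dispreferred, but still allowed). Values are arbitrary.
    """
    cat_left = CATEGORY.get(left)
    cat_right = CATEGORY.get(right)
    if cat_left is not None and cat_left == cat_right:
        return 1.0
    return 0.5


def rank_conjunct_pairs(pairs):
    """Sort candidate conjunct pairs from most to least preferred."""
    return sorted(pairs, key=lambda pair: parallelism_score(*pair), reverse=True)


candidates = [("肥肉", "哲学"), ("肥肉", "米饭")]
ranked = rank_conjunct_pairs(candidates)
print(ranked)
```

A mismatched pair such as 肥肉 and 哲学 is ranked lower but never ruled out. A hard category filter would wrongly reject the sentence above, while a soft score leaves the parser with only weak evidence about where the conjuncts really are.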

OK, I am not going to elaborate on solutions, which would take a long article of their own. This post serves as an introduction to this linguistic monster, to raise awareness of the linguistic challenges in natural language parsing.

[Related]

Chinese language processing

Parsing

[Pinned: overview of 立委's NLP posts]

《朝华午拾》 table of contents




https://blog.sciencenet.cn/blog-362400-1019949.html
