Suiyuan Houxue Sharing (随园厚学分享) http://blog.sciencenet.cn/u/gothere PhD in computational linguistics, hoping to leave some academic footprints here


SemEval tries something new: detecting and interpreting puns in language

2016-7-2 10:14 | Personal category: Computational Linguistics | System category: Research notes

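The NLP group at Technische Universität Darmstadt in Germany has proposed an interesting evaluation task. Alongside ordinary sentences that contain no pun, the data mixes in puns taken from jokes, sayings, and aphorisms.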

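Interestingly, this evaluation provides no training data, only test data, so each participating team will have to find its own way. With machine learning techniques hard to apply directly, experimental designs will have to lean more heavily on language resources: for example, checking the test contexts against slang dictionaries, idiom dictionaries, and news corpora.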

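By comparison, I think the cognitive property database we built (cognitivebase.com) is a very useful resource here; whichever team makes use of English dictionary data should get somewhat better results. Seeing this evaluation gives me a bit of extra motivation :)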
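As a rough illustration of the resource-based strategy described above, here is a minimal Python sketch of a dictionary-lookup heuristic: a word is flagged as a pun candidate when the surrounding context shares words with the glosses of two or more of its senses. The sense inventory `TOY_SENSES`, the stopword list, and all function names are hypothetical and invented for this example. They are not part of the task data or of any participant's system, and a real attempt would substitute an actual dictionary or WordNet.

```python
# Hypothetical toy sense inventory standing in for a real dictionary resource.
TOY_SENSES = {
    "interest": {
        "interest.finance": "money a banker charges on a loan",
        "interest.curiosity": "attention that is lost when someone is bored",
    },
}

# Small hand-picked stopword list (an assumption for this example only).
STOPWORDS = {"a", "an", "the", "i", "to", "be", "but", "on", "is", "that", "when"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,;:!?\"'").lower() for w in text.split())
    return [t for t in tokens if t]


def supported_senses(word, context_tokens, senses=TOY_SENSES):
    """Return the sense ids of `word` whose gloss shares a content word with the context."""
    context = set(context_tokens) - STOPWORDS - {word}
    hits = []
    for sense_id, gloss in senses.get(word, {}).items():
        gloss_words = set(tokenize(gloss)) - STOPWORDS
        if context & gloss_words:
            hits.append(sense_id)
    return hits


def pun_candidates(sentence):
    """Map each word that has two or more context-supported senses to those senses."""
    tokens = tokenize(sentence)
    candidates = {}
    for word in tokens:
        hits = supported_senses(word, tokens)
        if len(hits) >= 2:
            candidates[word] = sorted(hits)
    return candidates


print(pun_candidates("I used to be a banker but I lost interest."))
print(pun_candidates("The banker charges interest on the loan."))
```

In the first sentence both toy senses of "interest" find support in the context ("banker" and "lost"), so the word is flagged; in the second only the financial sense is supported, so nothing is flagged. Word overlap of this kind is crude and is only meant to show how a lexical resource can stand in for training data.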



SemEval-2017 Shared Task: Detection and Interpretation of English Puns

https://logological.org/puns

Researchers and industry professionals are invited to participate in a shared task on the computational detection and interpretation of English puns. The task will occur as part of the SemEval-2017 workshop, to be held in conjunction with a major NLP conference (TBA) in the summer of 2017. SemEval is an ongoing series of evaluations of computational semantic analysis systems, organized under the aegis of SIGLEX, the Special Interest Group on the Lexicon of the Association for Computational Linguistics.

---- Task description ----

A pun is a form of wordplay in which one signifier (e.g., a word or phrase) suggests two or more meanings by exploiting polysemy, or phonological similarity to another signifier, for an intended humorous or rhetorical effect. Puns where the two meanings share the same spelling are known as homographic, whereas those where the two meanings are spelled (and also usually pronounced) differently are known as heterographic.

Conscious or tacit linguistic knowledge -- particularly of lexical semantics and phonology -- is an essential prerequisite for the production and interpretation of puns. This has long made them an attractive subject of study in theoretical linguistics, and has led to a small but growing body of research into puns in computational linguistics. This SemEval shared task will be the first organized evaluation of automatic pun processing systems.



Participants will be provided with two data sets. The first data set will contain several thousand short contexts (jokes, slogans, aphorisms, etc.). In some of these contexts, a single word will be used as a homographic pun; in the rest, there will be no pun. The second data set will be similar to the first, except that the puns will be heterographic rather than homographic. For one or both data sets, participating systems will compete in any or all of three subtasks:

Subtask 1: Pun detection. For this subtask, participants are given an entire raw data set. For each context, the system must decide whether or not it contains a pun.

Subtask 2: Pun location. For this subtask, the contexts not containing puns are removed from the data set. For each context, the system must identify which word is the pun.

Subtask 3: Pun interpretation. For this subtask, the pun word in each context is marked, and contexts where the pun's two meanings are not found in WordNet are removed from the data set. For each context, the system must annotate the two meanings of the given pun by reference to WordNet sense keys.

For the first two subtasks, system performance will be measured with the usual precision and recall metrics from information retrieval, and for the third subtask, we will use slightly modified versions of the precision and recall metrics used for WSD.
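To make the scoring of Subtask 1 concrete, the following Python sketch computes the standard precision and recall for pun detection. It assumes gold labels and system predictions are given as dictionaries from a context ID to a boolean (True meaning "contains a pun"). The IDs, the data layout, and the function name are hypothetical; the announcement does not specify the official scorer or file format, so this shows only the textbook definitions.

```python
def precision_recall(gold, predicted):
    """Compute precision and recall for the positive class (context contains a pun).

    gold, predicted: dicts mapping a context id to True (pun) or False (no pun).
    A context missing from `predicted` is treated as predicted False.
    """
    tp = sum(1 for cid, g in gold.items() if g and predicted.get(cid, False))
    fp = sum(1 for cid, g in gold.items() if not g and predicted.get(cid, False))
    fn = sum(1 for cid, g in gold.items() if g and not predicted.get(cid, False))
    # Guard against division by zero when nothing is predicted or nothing is gold-positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical example: two contexts with puns, two without.
gold = {"c1": True, "c2": True, "c3": False, "c4": False}
predicted = {"c1": True, "c2": False, "c3": False, "c4": False}

p, r = precision_recall(gold, predicted)
print(p, r)  # prints: 1.0 0.5
```

Here the system flags only one context and is right about it (precision 1.0), but it misses the other pun (recall 0.5). The "slightly modified" WSD-style metrics mentioned for Subtask 3 are not described in the announcement and are not reproduced here.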

---- Practical information ----

The following schedule is adapted from the SemEval-2017 call for task proposals and is subject to change.

* July 1, 2016: Trial data ready

* January 10, 2017: Evaluation start

* January 31, 2017: Evaluation end

* February 28, 2017: Paper submission due

* March 31, 2017: Paper reviews due

* April 30, 2017: Camera-ready submission due

* Summer 2017: SemEval-2017 workshop

---- Organizing committee ----

* Tristan Miller, UKP Lab, Technische Universität Darmstadt

* Christian F. Hempelmann, Ontological Semantic Technology Lab, Texas A&M University-Commerce

* Iryna Gurevych, UKP Lab, Technische Universität Darmstadt

To contact the organizing committee, please e-mail Tristan Miller.

-- Tristan Miller, Research Scientist
Ubiquitous Knowledge Processing Lab (UKP-TUDA)
Department of Computer Science, Technische Universität Darmstadt

Tel: +49 6151 162 5296 | Web: https://www.ukp.tu-darmstadt.de/




https://blog.sciencenet.cn/blog-39714-988162.html
