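Sharing 《镜子大全》 and 《朝华午拾》 http://blog.sciencenet.cn/u/liwei999 Once a Little Red Guard, later sent down to the countryside to "repair the earth"; left the country in 1991, destination unknown.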


Reading notes: the YT superpower comes from TWSS

3313 views 2012-9-28 03:50 | Personal category: 立委科普 (Liwei's popular science) | System category: Research notes | Reading notes


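YT is insider slang; I have discussed it before and will not repeat myself here. There is no need to dig into it: I am only using it as a pretext to introduce a recent reading note.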

Quote

For those who are too polite to know this type of humor, let me explain. When speaking in a non-sexual context, we sometimes say things that are not funny, but which would be funny if the same words were uttered in a sexual context. A listener may detect the double meaning, and respond to your words with, “that’s what she said”, thus putting the remark into a sexual context, and creating a joke. Here’s an example:


Man 1: (looking at deli sandwiches) Wow, they’re much bigger than I expected.
Man 2: That’s what she said!

Excerpted from http://us.textanalyticsnews.com/fc_fcbi1lz/lz.aspx?p1=05555212S3562&CC=&p=1&cID=0&cValue=1

I just finished reading the academic paper on this research, done by researchers at the University of Washington.

It is very research-oriented and academic, and practitioners in industry need not bother with it at all.

It is eye-catching and certainly has academic value, since no one had worked on this so-called TWSS ("That's What She Said") problem before.

It aims to use machine learning to identify/classify a subset of puns which might (or might not) contain sarcasm about a brand. But mainly it covers only a very small subset of data associated with some adult jokes.

First of all, puns are the last thing that should be brought to the table as an object for automatic processing in a real-life system, not only because they are statistically rare but also because they are complex and often depend on cultural context. There are endless tasks that are far more widespread and far more tractable for automatic processing. Spending resources on such a problem in industry is neither wise nor effective.

This is yet another of those cases: technology reporters like to cover stories like this because they capture people's attention and imagination.

Some research gets twisted or exaggerated out of context to sound like the next big thing in real-life technology.

If this were real enough for applications, they should show benchmarks on a large real-life corpus: not the benchmark reported in the paper on a select corpus from one particular source, but one drawn from social media at large. The first question to answer is how much TWSS there is in social media, the second is how relevant it is to brands when it does occur, and the last is how the classification would be used in apps. The publication answers none of these, so it is not worth the time to look into it further.
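To make the base-rate point concrete, here is a minimal sketch. It is not from the paper, and every number in it is made up for illustration: the recall, the false-positive rate, and the prevalence values are all assumptions, and `precision_at_prevalence` is a hypothetical helper. It only shows how the precision of one and the same classifier changes when it is moved from a balanced, hand-picked corpus to a stream where the target is rare.

```python
def precision_at_prevalence(recall: float,
                            false_positive_rate: float,
                            prevalence: float) -> float:
    """Precision = TP / (TP + FP) when a fraction `prevalence` of inputs is positive."""
    true_pos = recall * prevalence                          # positives correctly flagged
    false_pos = false_positive_rate * (1.0 - prevalence)    # negatives wrongly flagged
    if true_pos + false_pos == 0:
        return 0.0
    return true_pos / (true_pos + false_pos)


# Hypothetical classifier quality (assumed, not reported anywhere).
RECALL = 0.8
FALSE_POSITIVE_RATE = 0.01

# Hypothetical shares of TWSS-style sentences: a balanced test set
# versus increasingly realistic, sparse streams.
for prevalence in (0.5, 0.01, 0.0001):
    p = precision_at_prevalence(RECALL, FALSE_POSITIVE_RATE, prevalence)
    print(f"prevalence={prevalence:<7} precision={p:.3f}")
```

Under these assumed numbers the classifier looks nearly perfect on the balanced set, yet most of what it flags in the sparse stream is noise. That is why the prevalence in social media at large has to be measured before the benchmark means anything for an app.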

It is eye-catching. That's all.

RE:  Subject: What can jokes teach us about NLP?
Can your text analytics algorithm tell the difference between a joke and a serious statement?

Reference:
http://www.aclweb.org/anthology-new/P/P11/P11-2016.pdf

[Pinned: Index of NLP posts on Liwei's ScienceNet blog (regularly updated)]



http://blog.sciencenet.cn/blog-362400-617371.html
