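Sharing 《镜子大全》 and 《朝华午拾》 http://blog.sciencenet.cn/u/liwei999 Once a Little Red Guard, then sent down to the countryside to "repair the earth"; left home and country in 1991, destination unknown.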

Domain portability myth in natural language processing (NLP)

4719 views | 2014-8-1 09:50 | Personal category: 立委科普 (Liwei's popular science) | System category: Research notes | domain, portability, 领域移植性 (domain portability)

One widely held myth concerns the domain portability of NLP systems: a hand-crafted rule system supposedly ports poorly because, once the domain changes, it has to be re-crafted from scratch, while a machine learning system can simply be re-trained on domain data with the algorithm unchanged. It all sounds obvious and convincing. Our forthcoming article in CCCF uncovers the real picture behind the myth. The truth might well be just the opposite: a well-architected rule system enjoys much more domain portability than a machine learning system does. Stay tuned.


  • Lei Zhang Interesting. But how can you handle thousands of domains efficiently with a rule system? 8h ago

  • Wei Li That's why we need to write an article to answer such questions. In short, it depends on how the system is structured. If a domain-independent deep parser is used as the logical basis for domain apps, then fast domain porting is possible because only the relatively small domain-specific component needs to be developed for each new domain. But with machine learning based on bag of words or on shallow processing, each new domain involves new data and training/tuning, which can be daunting. 7h ago

  • Lei Zhang So you assume much knowledge is shared across different domains. Looking forward to the article. 7h ago

  • Wei Li Yes, for NLP, linguistic knowledge is largely common across domains. The core English grammar does not change from one domain to another. On another note, in real-life apps it is rare that we need to handle thousands of domains at one time. The typical situation is that the product team decides to enter a new domain for a new offering, and the NLP development team needs to develop and port to that chosen domain for delivery. Then we can measure which approach delivers the product manager's requirements faster. 6h ago
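
To make the architecture in the thread concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration, not the system described in the CCCF article: `toy_parse` is a hypothetical stand-in for a domain-independent deep parser (it only handles plain "Subject verb object" sentences), and `FINANCE_RULES` and `MEDICAL_RULES` are hypothetical examples of the small domain-specific component. The point it tries to show is that porting to a new domain means adding a new rule table while the parsing layer stays untouched.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Triple:
    """Toy logical structure: a stand-in for real deep-parser output."""
    subject: str
    verb: str
    obj: str


def toy_parse(sentence: str) -> Triple:
    """Domain-independent layer (hypothetical placeholder).

    Only handles 'Subject verb object...' sentences; a real deep
    parser would sit here and be shared by every domain.
    """
    words = sentence.rstrip(".").split()
    if len(words) < 3:
        raise ValueError("toy parser needs at least three words")
    return Triple(words[0], words[1].lower(), " ".join(words[2:]))


# Domain-specific layer: small tables mapping verbs to domain events.
# These are the only parts written anew when entering a new domain.
FINANCE_RULES: Dict[str, str] = {"acquired": "ACQUISITION", "sued": "LAWSUIT"}
MEDICAL_RULES: Dict[str, str] = {"treats": "TREATMENT", "causes": "SIDE_EFFECT"}


def extract(sentence: str, rules: Dict[str, str]) -> Optional[Tuple[str, str, str]]:
    """Apply one domain's rules on top of the shared parse."""
    triple = toy_parse(sentence)
    label = rules.get(triple.verb)
    if label is None:
        return None  # the sentence is not relevant to this domain
    return (label, triple.subject, triple.obj)


if __name__ == "__main__":
    print(extract("Google acquired Nest.", FINANCE_RULES))
    print(extract("Aspirin treats headache.", MEDICAL_RULES))
```

In this toy setup the two domains share `toy_parse` and `extract` unchanged; only the rule dictionaries differ. Whether real porting effort scales this way depends on how much of the linguistic analysis is truly domain independent, which is the question the article addresses.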

Domain portability myth in natural language processing

Communications of the China Computer Federation (CCCF), August 2014

规则系统的移植性太差吗? (Is the portability of rule systems really that poor?)
Communications of the CCF (计算机学会通讯), 2014, No. 8 (issue 102 overall)

[Pinned: Overview of NLP posts on Liwei's ScienceNet blog (regularly updated)]



https://blog.sciencenet.cn/blog-362400-816330.html
