

[Repost] [Computer Science] [2019.02] Neural Transfer Learning for Natural Language Processing

Posted 2019-11-27 09:47 | Category: Research Notes | Source: Repost

This post shares a doctoral dissertation from the National University of Ireland (author: Sebastian Ruder), 329 pages in total.

The current generation of neural network-based natural language processing models excels at learning from large amounts of labelled data. Given these capabilities, natural language processing is increasingly applied to new tasks, new domains, and new languages. Current models, however, are sensitive to noise and adversarial examples and prone to overfitting. This brittleness, together with the cost of annotation, challenges the supervised learning paradigm.

Transfer learning allows us to leverage knowledge acquired from related data in order to improve performance on a target task. Implicit transfer learning in the form of pretrained word representations has been a common component in natural language processing. This dissertation argues that more explicit transfer learning is key to dealing with the dearth of training data and to improving the downstream performance of natural language processing models. It presents experimental results, transferring knowledge from related domains, tasks, and languages, that support this hypothesis.

The dissertation makes several contributions to transfer learning for natural language processing. Firstly, it proposes new methods to automatically select relevant data for supervised and unsupervised domain adaptation. Secondly, it proposes two novel architectures that improve sharing in multi-task learning and outperform both single-task learning and the state of the art. Thirdly, it analyzes the limitations of current models for unsupervised cross-lingual transfer and proposes a method to mitigate them, as well as a novel latent-variable cross-lingual word embedding model. Finally, it proposes a framework based on fine-tuning language models for sequential transfer learning and analyzes its adaptation phase.
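To make the first of these contributions concrete: data selection for domain adaptation ranks source-domain examples by how similar they are to the target domain and trains on the best-scoring subset. The following is a minimal sketch of a fixed similarity-based baseline, assuming a TF-IDF representation and toy sentences; the dissertation's own methods select data automatically rather than through a hand-picked heuristic like this one.

```python
# Similarity-based data selection baseline (illustrative, not the thesis's
# method): score source-domain examples by cosine similarity to the centroid
# of the target domain and keep the top k.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def select_relevant(source_texts, target_texts, k):
    """Return the k source examples that look most like the target domain."""
    vectorizer = TfidfVectorizer().fit(source_texts + target_texts)
    source_vecs = vectorizer.transform(source_texts)
    # Mean TF-IDF vector of the target domain, densified for cosine_similarity.
    target_centroid = np.asarray(vectorizer.transform(target_texts).mean(axis=0))
    scores = cosine_similarity(source_vecs, target_centroid).ravel()
    return [source_texts[i] for i in np.argsort(scores)[::-1][:k]]

source = ["the movie was great", "quarterly earnings rose",
          "the film bored me", "tech stocks fell sharply"]
target = ["earnings beat forecasts", "stocks rallied after the report"]
print(select_relevant(source, target, k=2))  # selects the two finance sentences
```

Which similarity measure to use is itself a modelling choice, which is one reason to select data automatically rather than fix a heuristic in advance.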

 

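The final contribution builds on a two-stage recipe: first pretrain a language model on unlabelled text, then fine-tune its parameters on the labelled target task. Below is a minimal PyTorch sketch of that recipe with a toy LSTM; the architecture, dimensions, learning rates, and random stand-in data are illustrative assumptions, not the framework proposed in the dissertation.

```python
# Sequential transfer learning sketch: pretrain a language model (stage 1),
# then reuse its encoder inside a classifier for the target task (stage 2).
import torch
import torch.nn as nn

VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM, NUM_CLASSES = 10_000, 128, 256, 2

class LanguageModel(nn.Module):
    """LSTM language model: the source task, trained on unlabelled text."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.encoder = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        self.decoder = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)  # next-word prediction

    def forward(self, tokens):
        hidden, _ = self.encoder(self.embed(tokens))
        return self.decoder(hidden)

class Classifier(nn.Module):
    """Target-task model that reuses the pretrained embedding and encoder."""
    def __init__(self, pretrained):
        super().__init__()
        self.embed = pretrained.embed      # transferred
        self.encoder = pretrained.encoder  # transferred
        self.head = nn.Linear(HIDDEN_DIM, NUM_CLASSES)  # trained from scratch

    def forward(self, tokens):
        hidden, _ = self.encoder(self.embed(tokens))
        return self.head(hidden[:, -1])    # classify from the final state

# Stage 1: pretrain the language model (a single toy step on random tokens).
lm = LanguageModel()
tokens = torch.randint(0, VOCAB_SIZE, (4, 16))  # stand-in for a real corpus
lm_loss = nn.functional.cross_entropy(
    lm(tokens[:, :-1]).reshape(-1, VOCAB_SIZE), tokens[:, 1:].reshape(-1))
lm_loss.backward()

# Stage 2: fine-tune on the target task, giving the transferred parameters a
# smaller learning rate than the freshly initialized classifier head.
clf = Classifier(lm)
optimizer = torch.optim.Adam([
    {"params": clf.embed.parameters(), "lr": 1e-4},
    {"params": clf.encoder.parameters(), "lr": 1e-4},
    {"params": clf.head.parameters(), "lr": 1e-3},
])
labels = torch.randint(0, NUM_CLASSES, (4,))
clf_loss = nn.functional.cross_entropy(clf(tokens), labels)
clf_loss.backward()
optimizer.step()
```

Treating the transferred parameters more conservatively than the new head is one of the adaptation-phase choices that the dissertation's analysis is concerned with.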
 

 

Introduction

Background

Transfer Learning

Data Selection for Domain Adaptation

Unsupervised and Weakly Supervised Cross-Lingual Learning

Improved Sharing in Multi-Task Learning

Adapting Universal Pretrained Representations

Conclusion






https://blog.sciencenet.cn/blog-69686-1207744.html
