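This is the PhD dissertation of George Edward Dahl (University of Toronto, Canada), 108 pages in total.
The deep learning approach to machine learning emphasizes high-capacity, scalable models that learn distributed representations of their input. The dissertation demonstrates the efficacy and generality of this approach through a series of diverse case studies in speech recognition, computational chemistry, and natural language processing. In each study, the neural network models are extended and modified as needed to be more effective for the task at hand.
In speech recognition, a deep neural network is used to build a more accurate acoustic model. The model uses rectified linear units (ReLU) and dropout, and it lowers the word error rate on a 50-hour broadcast news task. A similar neural network yields a molecular activity prediction model that is substantially more effective than production systems used in the pharmaceutical industry. Although the training assays in drug discovery are typically not very large, very large models can still be trained by combining data from multiple assays in one model and applying effective regularization. In natural language processing, the dissertation first describes a new restricted Boltzmann machine training algorithm suited to text data. It then introduces a new neural network generative model of parsed sentences that can generate reasonable samples, and shows that deeper variants of the model perform better.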
The deep learning approach to machine learning emphasizes high-capacity, scalable models that learn distributed representations of their input. This dissertation demonstrates the efficacy and generality of this approach in a series of diverse case studies in speech recognition, computational chemistry, and natural language processing. Throughout these studies, I extend and modify the neural network models as needed to be more effective for each task. In the area of speech recognition, I develop a more accurate acoustic model using a deep neural network. This model, which uses rectified linear units and dropout, improves word error rates on a 50 hour broadcast news task. A similar neural network results in a model for molecular activity prediction substantially more effective than production systems used in the pharmaceutical industry. Even though training assays in drug discovery are not typically very large, it is still possible to train very large models by leveraging data from multiple assays in the same model and by using effective regularization schemes. In the area of natural language processing, I first describe a new restricted Boltzmann machine training algorithm suitable for text data. Then, I introduce a new neural network generative model of parsed sentences capable of generating reasonable samples and demonstrate a performance advantage for deeper variants of the model.
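For readers unfamiliar with the two ingredients named in the abstract, the following is a minimal sketch of one hidden layer that combines rectified linear units with dropout. It is not taken from the dissertation: the function names, layer sizes, and the 0.5 dropout rate are illustrative assumptions, and the "inverted" dropout scaling used here is one common variant that may differ from the formulation the author used.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: max(0, x), applied elementwise.
    return np.maximum(x, 0.0)

def dropout(h, rate, rng, train=True):
    # Inverted dropout (an assumed variant): zero each unit with
    # probability `rate` during training and rescale the survivors
    # so the expected activation matches evaluation time.
    if not train or rate == 0.0:
        return h
    keep = 1.0 - rate
    mask = rng.random(h.shape) < keep
    return h * mask / keep

def hidden_layer(x, W, b, rate, rng, train=True):
    # Affine transform, then ReLU, then dropout.
    return dropout(relu(x @ W + b), rate, rng, train)

# Hypothetical sizes: a batch of 4 inputs with 8 features, 16 hidden units.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 16)) * 0.1
b = np.zeros(16)

h_train = hidden_layer(x, W, b, rate=0.5, rng=rng, train=True)
h_eval = hidden_layer(x, W, b, rate=0.5, rng=rng, train=False)
```

At evaluation time the layer is deterministic and no units are dropped. During training a random subset of units is zeroed on every call, which is the regularizing effect the abstract refers to.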
Download link for the original English text:
http://page2.dfpan.com/fs/elcj52215291762cc95/