This is a master's thesis from the Karlsruhe Institute of Technology, Germany (author: Martin Thoma), 134 pages in total.
Convolutional Neural Networks (CNNs) have dominated various computer vision tasks since Alex Krizhevsky showed that they can be trained effectively and reduced the top-5 error from 26.2 % to 15.3 % on the ImageNet Large Scale Visual Recognition Challenge. Many aspects of CNNs are examined in various publications, but literature about the analysis and construction of neural network architectures is rare. This work is one step toward closing this gap. A comprehensive overview of existing techniques for CNN analysis and topology construction is provided. A novel way to visualize classification errors with confusion matrices was developed. Based on this method, hierarchical classifiers are described and evaluated. Additionally, some results are confirmed and quantified for CIFAR-100, for example the positive impact of smaller batch sizes, averaging ensembles, data augmentation and test-time transformations on the accuracy. Other results, such as the positive impact of learned color transformation on the test accuracy, could not be confirmed. A model was developed which has only one million learned parameters for an input size of 32 × 32 × 3 and 100 classes and which beats the state of the art on the benchmark datasets Asirra, GTSRB, HASYv2 and STL-10.
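Two of the techniques named in the abstract, averaging ensembles and confusion matrices, can be illustrated with a minimal NumPy sketch. This is not the thesis's code: the function names `average_ensemble` and `confusion_matrix` and all probability values are made up for illustration, and the sketch shows only a plain confusion matrix, not the novel visualization the thesis proposes.

```python
import numpy as np

def average_ensemble(prob_list):
    """Average class-probability arrays, each of shape (n_samples, n_classes)."""
    return np.mean(np.stack(prob_list, axis=0), axis=0)

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] counts the samples of true class i that were predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # unbuffered add, handles repeated index pairs
    return cm

# Toy data (hypothetical): 4 samples, 3 classes, softmax outputs of 2 models.
y_true = np.array([0, 1, 2, 2])
probs_a = np.array([[0.7, 0.2, 0.1],
                    [0.2, 0.7, 0.1],
                    [0.1, 0.6, 0.3],   # model A gets this sample wrong
                    [0.2, 0.2, 0.6]])
probs_b = np.array([[0.6, 0.3, 0.1],
                    [0.5, 0.4, 0.1],   # model B gets this sample wrong
                    [0.1, 0.2, 0.7],
                    [0.1, 0.3, 0.6]])

# Average the predicted probabilities, then pick the most likely class.
avg = average_ensemble([probs_a, probs_b])
y_pred = avg.argmax(axis=1)

cm = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy = np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal
print(cm)
print(accuracy)
```

In this toy setup each model alone misclassifies one sample, while the averaged prediction classifies all four correctly. The numbers were chosen to show the mechanism and say nothing about the size of the effect reported for CIFAR-100.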
Download link for the original English text:
http://page2.dfpan.com/fs/5lcj6221f2912666682/