Blog: 大工至善|大学至真 http://blog.sciencenet.cn/u/lcj2212916


[Repost] [Computer Science] [2017.12] Deep Neural Network Models for Image Classification and Regression

269 views | 2020-1-21 17:58 | Category: Research Notes | Source: Repost

This is a doctoral dissertation from the University of Trento, Italy (author: Salim MALEK), 98 pages in total.

 

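Deep learning, a branch of machine learning, has been widely adopted in many research fields and practical applications. This continuing growth can be traced mainly to the availability and affordability of capable processing hardware, which was not widely accessible even a decade ago. Although deep learning has shown state-of-the-art performance in computer vision, particularly in object recognition and detection, it has yet to find its way into other research areas. Moreover, the performance of a deep learning model depends strongly on how the model is designed or tailored to the problem at hand, which raises concerns about both accuracy and processing overhead. The success and applicability of a deep learning system depend jointly on these two factors.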

 

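In this dissertation, we present novel deep learning schemes and apply them to interesting but less-studied topics. The first topic is coarse scene description for visually impaired people: the idea is to list the objects likely to be present in an image captured by a visually impaired person. To this end, we extract several features from each query image to capture its textural and chromatic cues. To make the extracted features more representative, we then reinforce them with a feature learning stage based on an autoencoder model. The autoencoder is topped with a logistic regression layer that detects which objects, if any, are present.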
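The detection stage described above can be pictured as a forward pass through two layers. The following is a minimal sketch only, assuming the weights have already been trained; the function names, the toy weights, and the labels "door" and "chair" are hypothetical and are not taken from the dissertation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(x, weights, biases):
    """Fully connected layer with a sigmoid activation."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def detect_objects(features, enc_w, enc_b, out_w, out_b, labels, threshold=0.5):
    """Encode hand-crafted features, then score each object independently."""
    hidden = dense(features, enc_w, enc_b)  # encoder half of the autoencoder
    probs = dense(hidden, out_w, out_b)     # logistic regression layer
    # Multi-label decision: every object whose probability passes the threshold.
    return [name for name, p in zip(labels, probs) if p >= threshold]

# Toy example with hand-picked (hypothetical) weights.
features = [1.0, 0.0]
enc_w, enc_b = [[4.0, 0.0], [0.0, 4.0]], [-2.0, -2.0]
out_w, out_b = [[3.0, -3.0], [-3.0, 3.0]], [0.0, 0.0]
print(detect_objects(features, enc_w, enc_b, out_w, out_b, ["door", "chair"]))
# prints ['door']
```

Each output unit is thresholded on its own, so the result may contain several objects or none, which matches the "if any" wording of the abstract.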

 

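In the second topic, we propose using the same model, the autoencoder, for cloud removal in remote sensing images. In brief, the model is learned on a cloud-free image of a given geographical area and then applied to another, cloud-contaminated image of the same area acquired at a different time. Two reconstruction strategies are proposed: pixel-based and patch-based reconstruction. From these first two topics, we demonstrate quantitatively that autoencoders can play a key role in both (i) feature learning and (ii) the reconstruction and mapping of sequential data.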
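The difference between the two reconstruction strategies lies mainly in what the model sees as input. The sketch below is an illustration under assumptions, not the dissertation's implementation: images are nested lists of per-pixel band values, `model` is any callable standing in for the trained autoencoder, and the names `pixel_vector`, `patch_vector`, and `reconstruct` are hypothetical.

```python
def pixel_vector(image, row, col):
    """Pixel-based input: the spectral bands of a single pixel."""
    return list(image[row][col])

def patch_vector(image, row, col, radius=1):
    """Patch-based input: bands of every pixel in a (2r+1) x (2r+1) window.
    Coordinates are clamped at the image border (an assumption of this sketch)."""
    height, width = len(image), len(image[0])
    vec = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            r = min(max(row + dr, 0), height - 1)
            c = min(max(col + dc, 0), width - 1)
            vec.extend(image[r][c])
    return vec

def reconstruct(reference, target, cloud_mask, model, make_input):
    """Fill each cloudy pixel of `target` with the model's prediction,
    computed from the cloud-free `reference` image of the same area."""
    out = [[list(px) for px in row] for row in target]
    for r, mask_row in enumerate(cloud_mask):
        for c, cloudy in enumerate(mask_row):
            if cloudy:
                out[r][c] = model(make_input(reference, r, c))
    return out
```

Passing `pixel_vector` or `patch_vector` as `make_input` switches between the two strategies; the patch variant gives the model spatial context at the cost of a larger input vector.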

 

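The convolutional neural network (CNN) is arguably the most widely used model in computer vision, owing to its remarkable performance in object and scene recognition compared with traditional hand-crafted features. However, CNNs are naturally formulated in two dimensions, which raises the question of their applicability to one-dimensional data. The third part of this dissertation is therefore devoted to designing a one-dimensional CNN architecture and applying it to spectroscopic data. In other words, the CNN is tailored to extract features from one-dimensional chemometric data, and the extracted features are fed into advanced regression methods to estimate the underlying chemical component concentrations. Experimental results suggest that, like 2D CNNs, one-dimensional CNNs tend to outperform traditional methods. The final contribution of this dissertation is a new method for estimating the connection weights of a CNN, based on training a support vector machine (SVM) for each kernel of the CNN. The method is fast and well suited to applications with small datasets.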
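To make the one-dimensional case concrete, here is a minimal sketch of a single convolution and pooling stage applied to a spectrum. It assumes a ReLU activation, non-overlapping max pooling, and hand-picked kernels; the actual architecture, activations, and the regression stage used in the dissertation may differ, and all names are hypothetical.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid 1-D cross-correlation of `signal` with `kernel`, followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(kernel[j] * signal[i + j] for j in range(k)) + bias)
            for i in range(len(signal) - k + 1)]

def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

def extract_features(spectrum, kernels):
    """Concatenate the pooled response of every kernel into one feature vector."""
    features = []
    for kernel in kernels:
        features.extend(max_pool1d(conv1d(spectrum, kernel)))
    return features

spectrum = [0.0, 1.0, 3.0, 2.0, 0.0, 1.0]
kernels = [[1.0, -1.0], [0.5, 0.5]]  # a difference filter and a smoothing filter
print(extract_features(spectrum, kernels))
# prints [0.0, 2.0, 2.0, 2.5]
```

In the pipeline described above, a feature vector like this would be passed to a separate regression method that estimates the chemical concentrations; that stage is omitted here.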

 

Deep learning, a branch of machine learning, has been gaining ground in many research fields as well as practical applications. This ongoing boom can be traced back mainly to the availability and affordability of capable processing facilities, which were not widely accessible just a decade ago. Although it has demonstrated cutting-edge performance in computer vision, particularly in object recognition and detection, deep learning is yet to find its way into other research areas. Furthermore, the performance of deep learning models depends strongly on the way in which they are designed or tailored to the problem at hand. This raises not only precision concerns but also processing overheads. The success and applicability of a deep learning system rely jointly on both components. In this dissertation, we present innovative deep learning schemes, with application to interesting though less-addressed topics. In this respect, the first covered topic is rough scene description for visually impaired individuals, where the idea is to list the objects that likely exist in an image grabbed by a visually impaired person. To this end, we proceed by extracting several features from the respective query image in order to capture the textural as well as the chromatic cues therein. Further, in order to improve the representativeness of the extracted features, we reinforce them with a feature learning stage by means of an autoencoder model. The latter is topped with a logistic regression layer in order to detect the presence of objects, if any. In a second topic, we suggest exploiting the same model, i.e., the autoencoder, in the context of cloud removal in remote sensing images. Briefly, the model is learned on a cloud-free image pertaining to a certain geographical area, and applied afterwards to another cloud-contaminated image of the same area, acquired at a different time instant. Two reconstruction strategies are proposed, namely pixel-based and patch-based reconstruction. From these two topics, we quantitatively demonstrate that autoencoders can play a pivotal role in terms of both (i) feature learning and (ii) reconstruction and mapping of sequential data. The Convolutional Neural Network (CNN) is arguably the most utilized model in the computer vision community, which is reasonable given its remarkable performance in object and scene recognition with respect to traditional hand-crafted features. Nevertheless, the CNN is naturally available in its two-dimensional version. This raises questions about its applicability to unidimensional data. Thus, a third contribution of this thesis is devoted to the design of a unidimensional CNN architecture, which is applied to spectroscopic data. In other terms, the CNN is tailored for feature extraction from one-dimensional chemometric data, whilst the extracted features are fed into advanced regression methods to estimate the underlying chemical component concentrations. Experimental findings suggest that, similarly to 2D CNNs, unidimensional CNNs also tend to prevail over traditional methods. The last contribution of this dissertation is to develop a new method to estimate the connection weights of CNNs. It is based on training an SVM for each kernel of the CNN. This method has the advantage of being fast and adequate for applications characterized by small datasets.

 

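Introduction and Thesis Overview

Real-Time Indoor Scene Description for the Visually Impaired Using Autoencoders

Reconstructing Cloud-Contaminated Multispectral Images with Contextualized Autoencoders

One-Dimensional Convolutional Neural Networks for Spectroscopic Signal Regression

Convolutional SVM

Conclusion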





http://blog.sciencenet.cn/blog-69686-1215137.html
