[Repost] [Computer Science] [2019.01] Bayesian Convolutional Neural Networks

Artificial Neural Networks are connectionist systems that perform a given task by learning from examples, without prior knowledge of the task. This is done by finding an optimal point estimate for the weights in every node. Networks that use point estimates as weights generally perform well on large datasets, but they fail to express uncertainty in regions with little or no data, leading to overconfident decisions. In this thesis, a Bayesian Convolutional Neural Network (BayesCNN) using Variational Inference is proposed, which places a probability distribution over the weights. The proposed BayesCNN architecture is then applied to tasks such as Image Classification, Image Super-Resolution, and Generative Adversarial Networks. BayesCNN is based on Bayes by Backprop, which derives a variational approximation to the true posterior. The proposed method not only achieves performance equivalent to frequentist inference in identical architectures but also incorporates a measure of uncertainty and a form of regularisation, and it eliminates the use of dropout in the model. Moreover, we quantify how certain the model's predictions are, based on the epistemic and aleatoric uncertainties, and finally we propose ways to prune the Bayesian architecture to make it more computationally and time efficient. In the first part of the thesis, the Bayesian Neural Network is explained and applied to an Image Classification task. The results are compared to point-estimate-based architectures on the MNIST, CIFAR-10, and CIFAR-100 datasets. Uncertainties are also calculated, the architecture is pruned, and the results are compared. In the second part of the thesis, the concept is further applied to other computer vision tasks, namely Image Super-Resolution and Generative Adversarial Networks, where BayesCNN is tested and compared against other approaches in a similar domain.
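The core idea of Bayes by Backprop mentioned above can be illustrated with a minimal sketch (not the thesis implementation): each weight is modelled as a Gaussian N(mu, sigma^2) rather than a point estimate, and a fresh weight sample is drawn per forward pass via the reparameterisation trick, w = mu + sigma * eps with eps ~ N(0, 1). The layer sizes and parameter values below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_weights(mu, rho, rng):
    """Draw one weight sample; in training, gradients flow through mu and rho."""
    sigma = np.log1p(np.exp(rho))        # softplus keeps sigma positive
    eps = rng.standard_normal(mu.shape)  # noise is independent of the parameters
    return mu + sigma * eps

# Variational parameters for a tiny 3 -> 2 linear layer (illustrative values).
mu = np.zeros((3, 2))
rho = np.full((3, 2), -3.0)              # softplus(-3) gives a small initial sigma

w = sample_weights(mu, rho, rng)
x = np.ones(3)
y = x @ w                                # one stochastic forward pass

# Predictive uncertainty: run many stochastic forward passes and inspect the
# spread of the outputs, instead of trusting a single deterministic output.
samples = np.stack([x @ sample_weights(mu, rho, rng) for _ in range(100)])
print("mean:", samples.mean(axis=0), "std:", samples.std(axis=0))
```

The spread of `samples` across passes is exactly the kind of uncertainty measurement the abstract refers to; a point-estimate network would return the same output every time and report none.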

1.1 Problem Statement

1.2 Current State of Research

1.3 Our Hypothesis

1.4 Contributions of This Thesis

2.1 Neural Networks

2.2 Probabilistic Machine Learning

2.3 Uncertainty in Bayesian Learning

2.4 Backpropagation

2.5 Model Weight Pruning

4.1 Bayesian Convolutional Neural Networks with Variational Inference

4.2 Uncertainty Estimation in CNNs

4.3 Model Pruning

5.1 Experimental Methodology

5.2 Case 1: Small Datasets (MNIST, CIFAR-10)

5.3 Case 2: Large Dataset (CIFAR-100)

5.4 Uncertainty Estimation

5.5 Model Pruning

5.6 Training Time

6.1 Application of BayesCNN to Image Super-Resolution

6.2 Application of BayesCNN to Generative Adversarial Networks

http://blog.sciencenet.cn/blog-69686-1173928.html
