大工至善|大学至真 http://blog.sciencenet.cn/u/lcj2212916

[Repost] [Reading 1] [2017] MATLAB and Deep Learning: Challenges with Machine Learning (1)

Posted 2018-9-11 09:09 | Category: Research Notes | Source: Repost

 

The vertical flow in the figure indicates the learning process, and the trained model is depicted as the horizontal flow, which is called inference.

 

The data used for modeling in Machine Learning and the data supplied in the field application are distinct.

 

Let's add another block to Figure 1-4, as shown in Figure 1-5, to better illustrate this situation.



Figure 1-5. Training and input data are sometimes very distinct

 

The distinctness of the training data and the input data is the structural challenge that Machine Learning faces.

 

It is no exaggeration to say that every problem of Machine Learning originates from this.

 

For example, what about using training data composed of handwritten notes from a single person?

 

Will the resulting model successfully recognize other people's handwriting?

 

In practice, the likelihood of that is very low.

 

No Machine Learning approach can achieve the desired goal with the wrong training data.

 

The same idea applies to Deep Learning.

 

Therefore, it is critical for Machine Learning approaches to obtain unbiased training data that adequately reflects the characteristics of the field data.
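The single-writer handwriting scenario can be mimicked with a toy simulation. The sketch below is only an illustration under stated assumptions, not the book's example: each character is reduced to one hypothetical numeric feature, each writer multiplies that feature by a personal scale, and the model is a simple nearest-centroid classifier. All function names and numbers are invented for the demonstration.

```python
import random


def make_samples(rng, writer_scales, n_per_writer):
    """Return (feature, label) pairs for the given writers.

    Hypothetical setup: a writer with scale k produces class 0 near 1*k
    and class 1 near 3*k, so the same character looks different per writer.
    """
    samples = []
    for scale in writer_scales:
        for _ in range(n_per_writer):
            label = rng.randint(0, 1)
            base = 1.0 if label == 0 else 3.0
            samples.append((scale * base + rng.gauss(0.0, 0.1), label))
    return samples


def fit_centroids(samples):
    """Learning step: store the mean feature value of each class."""
    centroids = {}
    for cls in (0, 1):
        values = [x for x, y in samples if y == cls]
        centroids[cls] = sum(values) / len(values)
    return centroids


def predict(centroids, x):
    """Inference step: choose the class whose mean is closest to x."""
    return min(centroids, key=lambda cls: abs(centroids[cls] - x))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted class matches the label."""
    return sum(predict(centroids, x) == y for x, y in samples) / len(samples)


def run_demo(seed=0):
    rng = random.Random(seed)
    # Biased training set: 400 samples, all from one writer (scale 0.5).
    single = make_samples(rng, [0.5], 400)
    # Broader training set: 10 samples from each of 40 different writers.
    diverse = make_samples(rng, [rng.uniform(0.5, 2.0) for _ in range(40)], 10)
    # Field data: 200 unseen writers drawn from the same range of scales.
    field = make_samples(rng, [rng.uniform(0.5, 2.0) for _ in range(200)], 10)
    model_single = fit_centroids(single)
    model_diverse = fit_centroids(diverse)
    return {
        "single_on_own_data": accuracy(model_single, single),
        "single_on_field": accuracy(model_single, field),
        "diverse_on_field": accuracy(model_diverse, field),
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the single-writer model should score well on its own writer's data and noticeably worse on the field data, while the model trained on many writers should hold up better in the field. The exact numbers depend on the seed and on the invented scales.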

 

The process used to make the model performance consistent regardless of the training data or the input data is called generalization.

 

The success of Machine Learning relies heavily on how well generalization is accomplished.

 

Overfitting

 

One of the primary causes of corruption of the generalization process is overfitting.

 

Yes, another new term.

 

However, there is no need to be frustrated.

 

Overfitting is not a new concept at all.

 

It will be much easier to understand with a case study than with a plain description.

 

Consider the classification problem shown in Figure 1-6.



Figure 1-6. Determine a curve to divide two groups of data
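The data behind Figure 1-6 is not available here, so the following is a rough sketch on synthetic data rather than a reproduction of the figure. It assumes two overlapping groups of 2-D points and compares two hypothetical ways of dividing them: a boundary that follows every training point (a 1-nearest-neighbor rule, one form of overfitting) and a straight-line boundary (a nearest-centroid rule). Consistency between training accuracy and accuracy on unseen points is used as a simple proxy for generalization.

```python
import random


def make_groups(rng, n):
    """Draw n labeled 2-D points from two overlapping Gaussian groups."""
    points = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 1.5  # assumed group centers
        points.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return points


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def predict_1nn(train, p):
    """Wiggly boundary: copy the label of the single closest training point."""
    return min(train, key=lambda sample: sq_dist(sample[0], p))[1]


def fit_centroids(train):
    """Compute the mean point of each group."""
    centroids = {}
    for cls in (0, 1):
        members = [p for p, y in train if y == cls]
        centroids[cls] = (
            sum(p[0] for p in members) / len(members),
            sum(p[1] for p in members) / len(members),
        )
    return centroids


def predict_centroid(centroids, p):
    """Straight-line boundary: pick the group whose mean point is closer."""
    return min(centroids, key=lambda cls: sq_dist(centroids[cls], p))


def accuracy(predict, data):
    """Fraction of points for which predict(point) equals the label."""
    return sum(predict(p) == y for p, y in data) / len(data)


def compare_boundaries(seed=0):
    rng = random.Random(seed)
    train = make_groups(rng, 200)      # points used to draw the boundary
    unseen = make_groups(rng, 1000)    # new points from the same two groups
    centroids = fit_centroids(train)
    return {
        "1nn_train": accuracy(lambda p: predict_1nn(train, p), train),
        "1nn_unseen": accuracy(lambda p: predict_1nn(train, p), unseen),
        "centroid_train": accuracy(lambda p: predict_centroid(centroids, p), train),
        "centroid_unseen": accuracy(lambda p: predict_centroid(centroids, p), unseen),
    }


if __name__ == "__main__":
    for name, value in compare_boundaries().items():
        print(f"{name}: {value:.2f}")
```

The 1-nearest-neighbor rule classifies its own training points perfectly by construction, yet it would be expected to lose accuracy on unseen points. The straight-line rule should score about the same on both sets, which is the kind of consistency that generalization aims for.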


This post is translated from the book "MATLAB Deep Learning" by Phil Kim.





http://blog.sciencenet.cn/blog-69686-1134099.html
