Lecture 4: Feasibility of Learning

Read 2522 times 2014-1-12 23:50 | Category: Research Notes

1. The "no free lunch" theorem warns us that without any assumptions, machine learning is not feasible.

2. Hoeffding's Inequality

$P[|\hat{X} - \mathbf{E}(\hat{X})| > \epsilon] \leq 2\exp{\left(-\frac{2\epsilon^2 N^2}{\sum_{i=1}^{N}(b_i - a_i)^2}\right)}$

where the $X_i$ are independent, $a_i \leq X_i \leq b_i$, and $\hat{X} = \frac1N \sum_{i=1}^{N} X_i$.
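As a quick numerical check, the following minimal sketch evaluates the right-hand side of Hoeffding's inequality in its standard form, with the squared ranges $(b_i - a_i)^2$ summed in the denominator. The function name `hoeffding_bound` and the example values are illustrative assumptions, not part of the lecture. When every $X_i$ lies in $[0, 1]$, the sum equals $N$ and the bound reduces to $2\exp(-2\epsilon^2 N)$.

```python
import math

def hoeffding_bound(eps, ranges):
    """Right-hand side of Hoeffding's inequality for the mean of N independent
    variables, where ranges[i] = (a_i, b_i) bounds the i-th variable."""
    n = len(ranges)
    sum_sq = sum((b - a) ** 2 for a, b in ranges)
    return 2.0 * math.exp(-2.0 * eps ** 2 * n ** 2 / sum_sq)

# Illustrative values (assumed): N = 100 variables in [0, 1], eps = 0.1.
general = hoeffding_bound(0.1, [(0.0, 1.0)] * 100)
special = 2.0 * math.exp(-2.0 * 0.1 ** 2 * 100)
print(general, special)  # the two values agree up to rounding
```

Wider ranges make the bound looser, which is why the error-rate case (each term in $[0, 1]$) gives such a clean expression.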

3. For any fixed $h$, with "big" data ($N$ large), the in-sample error $E_{in}(h)$ is probably close to the out-of-sample error $E_{out}(h)$ (within $\epsilon$):

$P[|E_{in}(h) - E_{out}(h)| > \epsilon ] \leq 2 \exp{(-2\epsilon^2 N )}$

If "$E_{in}(h)\approx E_{out}(h)$" and "$E_{in}(h)$ is small", then $E_{out}(h)$ is small, so $h \approx f$ with respect to $P$.
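The single-hypothesis guarantee can be checked with a small simulation. This is only a sketch under assumed values: a fixed $h$ whose true error is $E_{out}(h) = 0.3$, samples of size $N = 100$, and $\epsilon = 0.1$. The helper name `deviation_frequency` is hypothetical. It counts how often $E_{in}(h)$ lands more than $\epsilon$ away from $E_{out}(h)$ and compares that frequency with $2\exp(-2\epsilon^2 N)$.

```python
import math
import random

def deviation_frequency(e_out, n, eps, trials, seed=0):
    """Fraction of simulated samples of size n on which |E_in - E_out| > eps,
    where the fixed h misclassifies each point with probability e_out."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        errors = sum(rng.random() < e_out for _ in range(n))
        e_in = errors / n
        if abs(e_in - e_out) > eps:
            bad += 1
    return bad / trials

# Assumed illustrative values: E_out(h) = 0.3, N = 100, eps = 0.1.
freq = deviation_frequency(e_out=0.3, n=100, eps=0.1, trials=2000)
bound = 2.0 * math.exp(-2.0 * 0.1 ** 2 * 100)
print(f"empirical frequency = {freq:.4f}, Hoeffding bound = {bound:.4f}")
```

The bound holds for any value of $E_{out}(h)$, so it is usually loose: the observed frequency should sit well below it.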

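4. BAD data

1) BAD data for one $h$: $E_{out}(h)$ and $E_{in}(h)$ are far apart.

When we sample, Hoeffding's inequality guarantees that the sampled data reflects the original distribution with fairly high probability. It does not rule out samples that misrepresent the original distribution, and such samples are called BAD data.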

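2) BAD data for many $h$ $\Leftrightarrow$ no "freedom of choice" by $A$ $\Leftrightarrow$ there exists some $h$ such that $E_{out}(h)$ and $E_{in}(h)$ are far apart. (In practice, when there are many hypotheses to choose from, the algorithm is likely to pick some $\hat{h}$ on which the data is BAD. Hoeffding's inequality, applied to each hypothesis and combined over all of them, bounds the probability that such an $\hat{h}$ exists.)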
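The many-hypothesis case can be made precise with a union bound. As a short derivation, assuming a finite hypothesis set $h_1, \dots, h_M$ and writing "BAD $D$ for $h_m$" for the event $|E_{in}(h_m) - E_{out}(h_m)| > \epsilon$:

$P[\text{BAD } D] = P[\text{BAD } D \text{ for } h_1 \text{ or } \dots \text{ or BAD } D \text{ for } h_M] \leq \sum_{m=1}^{M} P[\text{BAD } D \text{ for } h_m] \leq 2M\exp{(-2\epsilon^2 N)}$

Each term in the sum is bounded by the single-hypothesis inequality, so the guarantee weakens by a factor of $M$ but still decays exponentially in $N$.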

5. The "Statistical" Learning Flow

1) If $|H| = M$ is finite and $N$ is large enough, then for whatever $g$ is picked by $A$, $E_{out}(g)\approx E_{in}(g)$.

2) If $A$ finds one $g$ with $E_{in}(g)\approx 0$, there is a PAC guarantee that $E_{out}(g) \approx 0$ $\rightarrow$ learning is possible.

In short, learning is possible if $H$ is finite and $E_{in}(g)$ is small for large $N$.
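To get a feel for what "large $N$" means here, the sketch below uses the standard finite-hypothesis bound $P[|E_{in}(g) - E_{out}(g)| > \epsilon] \leq 2M\exp(-2\epsilon^2 N)$ and solves it for $N$. The function names and the example values ($M = 100$, $\epsilon = 0.1$, tolerance $\delta = 0.05$) are assumptions chosen for illustration.

```python
import math

def finite_h_bound(m, n, eps):
    """Union-bound guarantee: P[|E_in(g) - E_out(g)| > eps] <= 2 M exp(-2 eps^2 N)."""
    return 2.0 * m * math.exp(-2.0 * eps ** 2 * n)

def samples_needed(m, eps, delta):
    """Smallest N for which 2 M exp(-2 eps^2 N) <= delta."""
    return math.ceil(math.log(2.0 * m / delta) / (2.0 * eps ** 2))

# Assumed illustrative values: M = 100 hypotheses, eps = 0.1, delta = 0.05.
n = samples_needed(m=100, eps=0.1, delta=0.05)
print(n, finite_h_bound(100, n, 0.1))
```

Because $M$ enters only through $\ln(2M/\delta)$, the required $N$ grows slowly with the number of hypotheses but quadratically as $\epsilon$ shrinks.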

To repost this article, please contact the original author for permission, and state that it comes from 廖红虹's ScienceNet blog.
Link: https://blog.sciencenet.cn/blog-507072-758528.html


Blog: 不动如山分享 http://blog.sciencenet.cn/u/hustliaohh (motto: stay grounded and move forward steadily)
