
Lecture 6: Theory of Generalization


The previous lecture noted that the quantity $m_H(N)$, which can take the place of $M$ in the error bound, is a function of the sample size $N$. What we hope for is that $m_H(N)$ grows with $N$ polynomially rather than exponentially; this is what motivated the concept of the break point.

1. In this lecture we want a bounding function. Bounding function $B(N,k)$: the maximum possible $m_H(N)$ when the break point is $k$. The table below gives some values of the bounding function (filled in here using the boundary cases and the recursion of point 2, since the original image is not preserved):

            k=1  k=2  k=3  k=4  k=5  k=6
    N=1       1    2    2    2    2    2
    N=2       1    3    4    4    4    4
    N=3       1    4    7    8    8    8
    N=4       1    5   11   15   16   16
    N=5       1    6   16   26   31   32
    N=6       1    7   22   42   57   63

(This table looks rather like the Yang Hui triangle, i.e. Pascal's triangle, that we learned in secondary school.)

2. From this we obtain the recursive formula: $B(N,k) \leq B(N-1,k) + B(N-1,k-1)$.

Going further, we obtain:

  • a simple induction using the boundary conditions and the inductive formula;

  • for fixed $k$, $B(N,k)$ is upper bounded by $poly(N)$ $\Rightarrow$ $m_H(N)$ is $poly(N)$ if a break point exists.

Thus, we have

$m_H(N) \leq B(N,k) \leq \sum\limits_{i=0}^{k-1}\binom{N}{i}$.
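As a quick numerical check (my own Python sketch, not part of the original lecture; the helper names B_upper and sauer_sum are mine), the boundary cases and the recursion above generate the Pascal-like table from point 1, and each entry can be compared against the closed-form sum of binomials:

    from math import comb
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def B_upper(N, k):
        # Upper bound on B(N, k) from the boundary cases and the
        # recursion B(N, k) <= B(N-1, k) + B(N-1, k-1).
        if k == 1:
            return 1           # break point 1: a single dichotomy at most
        if N < k:
            return 2 ** N      # below the break point: all dichotomies possible
        if N == k:
            return 2 ** N - 1  # at the break point: one dichotomy forbidden
        return B_upper(N - 1, k) + B_upper(N - 1, k - 1)

    def sauer_sum(N, k):
        # Closed-form bound: sum_{i=0}^{k-1} C(N, i).
        return sum(comb(N, i) for i in range(k))

    for N in range(1, 7):
        print(N, [B_upper(N, k) for k in range(1, 7)])
        assert all(B_upper(N, k) <= sauer_sum(N, k) for k in range(1, 7))

Running this reproduces the table above, and the two bounds agree exactly on every printed entry, so the inequality is tight there.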

3. Finally we arrive at the Vapnik-Chervonenkis (VC) bound:

$P\left[\exists h \in H \text{ s.t. } |E_{in}(h) - E_{out}(h)| > \epsilon\right] \leq 4\, m_H(2N)\, \exp\left(-\frac{1}{8}\epsilon^2 N\right)$.

So $E_{out} \approx E_{in}$ is possible if $m_H(N)$ breaks somewhere and $N$ is large enough.
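To see concretely how the right-hand side of the VC bound behaves, here is a small sketch (again my own illustration; growth_bound and vc_bound are hypothetical names) that plugs the sum-of-binomials bound on $m_H$ into the inequality for a fixed break point $k=3$ and tolerance $\epsilon=0.1$:

    from math import comb, exp

    def growth_bound(N, k):
        # m_H(N) <= sum_{i=0}^{k-1} C(N, i) when the break point is k
        return sum(comb(N, i) for i in range(k))

    def vc_bound(N, k, eps):
        # 4 * m_H(2N) * exp(-eps^2 * N / 8)
        return 4 * growth_bound(2 * N, k) * exp(-(eps ** 2) * N / 8)

    for N in (100, 1000, 10000, 100000):
        print(N, vc_bound(N, k=3, eps=0.1))

The polynomial factor $m_H(2N)$ grows like $N^{k-1}$ while the exponential factor shrinks much faster, so the product eventually goes to zero; this is exactly why a break point plus a large enough $N$ makes generalization possible.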

4. PAC Learnability [1]

Let $A \in A_{C,H}$ be a learning function for $C$ (with respect to $P$) with sample size $m(\epsilon,\delta)$. If $A$ satisfies the condition that, given any $\epsilon, \delta \in (0,1)$, $P(error_P(h) > \epsilon) \leq \delta$ for all $c \in C$, we say that $C$ is uniformly learnable by $H$ under the distribution $P$.

  • Sample size $m(\epsilon,\delta)$ is an integer-valued function of $\epsilon$ and $\delta$;

  • $A$ is a learning function only when it is a learning function for $C$ under all distributions $P$!

  • The smallest $m(\epsilon,\delta)$ is called the sample complexity of $A$.
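The definition of sample complexity can be made concrete with one more sketch (my own illustration; I reuse the VC bound above as the tail probability, and sample_complexity is a hypothetical helper): find the smallest $N$, on a coarse grid, for which the bound drops below $\delta$:

    from math import comb, exp

    def vc_bound(N, k, eps):
        # 4 * m_H(2N) * exp(-eps^2 * N / 8), with m_H bounded by the Sauer sum
        mH = sum(comb(2 * N, i) for i in range(k))
        return 4 * mH * exp(-(eps ** 2) * N / 8)

    def sample_complexity(k, eps, delta, step=1000):
        # Smallest N (stepping by `step`) with vc_bound(N, k, eps) <= delta.
        N = step
        while vc_bound(N, k, eps) > delta:
            N += step
        return N

    print(sample_complexity(k=3, eps=0.1, delta=0.05))  # -> 20000 for these settings

Note the contrast with the definition above: sample complexity is a property of the learning function $A$, whereas this sketch only computes the $N$ at which the distribution-free VC bound itself guarantees $P(error_P(h) > \epsilon) \leq \delta$.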

[1] Liangjie Hong. Learnability and the Vapnik-Chervonenkis Dimension. CSE 498 Machine Learning Seminar, April 10, 2012. http://www.hongliangjie.com/wp-content/uploads/2012/04/source.pdf


