ScienceNet (科学网): blog posts by hustliaohh

Lecture 10: Logistic Regression

2014-1-15 11:33

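The previous lecture covered linear regression and its use for binary classification. There, the output was a hard classification: each point is assigned to one class or the other. If we also want to know how confident the classification is, that is, with what probability it holds, we arrive at the topic of this lecture, Logistic Regression. 1. Logistic Regression Problem Target function $f(x) = P(+1|x)\in [0, 1]$. Same data as hard ...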

Category: Research Path | 4072 views | No comments
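
The difference between a hard label and a soft confidence can be sketched as follows. This is a minimal illustration, assuming the usual logistic function $\theta(s) = 1/(1+e^{-s})$ applied to the linear score $w^T x$; the weight vector and input below are made-up values, not taken from the lecture.

```python
import math

def logistic(s: float) -> float:
    """Logistic function theta(s) = 1 / (1 + exp(-s)); maps any score into (0, 1)."""
    # Split on the sign of s so that exp() never receives a large positive argument.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def score(w, x):
    """Linear score w^T x (x[0] is assumed to be the constant 1 for the bias term)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def hard_predict(w, x) -> int:
    """Hard classification: only the sign of the score is kept."""
    return 1 if score(w, x) >= 0 else -1

def soft_predict(w, x) -> float:
    """Soft classification: an estimate of P(+1 | x)."""
    return logistic(score(w, x))

# Hypothetical weights and input, chosen only for illustration.
w = [-1.0, 2.0, 0.5]
x = [1.0, 0.8, 0.4]

label = hard_predict(w, x)
p_plus = soft_predict(w, x)
print(label, round(p_plus, 3))
```

The hard prediction discards how far the score is from zero, while the soft prediction keeps that information as a probability.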

Lecture 9: Linear Regression

2014-1-15 11:09

In the previous lecture we concluded that learning can happen with target distribution $P(y|x)$ and low $E_{in}$ w.r.t. error measure. 1. Linear Regression Problem Linear regression hypothesis: $h(x) = w^T x$; $h(x)$ is like the perceptron, but without the $sign$. Linear regression: find lines/hyperpla ...

Category: Research Path | 2973 views | No comments
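
The remark that $h(x) = w^T x$ is the perceptron without the $sign$ can be made concrete with a short sketch. The squared-error measure used for $E_{in}$ here is an assumption (it is the usual choice for linear regression, but the excerpt does not state it), and the data and weights are invented for illustration.

```python
def dot(w, x):
    """Inner product w^T x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def h_linear_regression(w, x):
    """Linear regression hypothesis: the raw real-valued score w^T x."""
    return dot(w, x)

def h_perceptron(w, x):
    """Perceptron hypothesis: the same score, passed through sign."""
    return 1 if dot(w, x) > 0 else -1

def squared_error_in(w, xs, ys):
    """In-sample error with squared error (an assumed error measure)."""
    return sum((h_linear_regression(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy data lying exactly on y = 1 + 2 * x1; x[0] = 1 is the bias coordinate.
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
ys = [1.0, 3.0, 5.0]

e_exact = squared_error_in([1.0, 2.0], xs, ys)   # weights that fit the line exactly
e_shifted = squared_error_in([0.0, 2.0], xs, ys) # same slope, wrong intercept
print(e_exact, e_shifted)
```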

Lecture 8: Noise and Error

2014-1-13 22:46

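From the previous lecture we concluded: learning happens if finite $d_{VC}$, large $N$, and low $E_{in}$. This lecture covers learning in the presence of noise and the associated error measures. 1. Noise and Probabilistic Target 2. Error Measure The ultimate goal of machine learning is $g\approx f$, so how do we measure how close $g$ is to $f$? 1) Pointwise Error ...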

Category: Research Path | 2759 views | No comments

Lecture 7: The VC Dimension

2014-1-13 21:40

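In the previous lecture we concluded: $E_{out}\approx E_{in}$ is possible if $m_H(N)$ breaks somewhere and $N$ is large enough. In addition, a bound on $m_H(N)$ yields a bound on the error. We therefore make the following assumptions and hope to obtain the following result: 1. VC Dimension The VC dimension of $H$, denoted $d_{VC}(H)$, is the largest $N$ ...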

Category: Research Path | 3888 views | No comments

Lecture 6: Theory of Generalization

2014-1-13 20:34

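The previous lecture noted that $m_H(N)$, the possible value of $M$ in the error bound, is a function of the sample size $N$. The hope is that $m_H(N)$ grows polynomially in $N$ rather than exponentially, which led to the concept of a break point. 1. In this lecture we want to establish a bounding function. Bounding function $B(N; k)$: maximum possible $m_H(N)$ when break point $= k ...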

Category: Research Path | 2609 views | No comments

Lecture 5: Training versus Testing

2014-1-13 15:39

This lecture analyzes the constant $M$ in the union bound from the previous lecture. There we learned: if $|H|=M$ is finite and $N$ is large enough, then for whatever $g$ picked by $A$, $E_{out}(g)\approx E_{in}(g)$; if $A$ finds one $g$ with $E_{in}(g)\approx 0$, PAC guarantee for $E_{out}(g)\approx 0$. $\Rightarrow $ Lear ...

Category: Research Path | 2337 views | No comments

Lecture 4: Feasibility of Learning

2014-1-12 23:50

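1. The “no free lunch” theorem warns us that without any assumptions, machine learning is not feasible. 2. Hoeffding’s Inequality: for the mean $\bar{X}$ of $N$ independent variables $X_i \in [a_i, b_i]$, $P\left[\left|\bar{X} - E[\bar{X}]\right| \geq \epsilon\right] \leq 2\exp{\left(-\frac{2\epsilon^2 N^2}{\sum_{i=1}^{N}(b_i - a_i)^2}\right)}$ ...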

2695 views | No comments
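
To get a feel for Hoeffding's inequality, the sketch below evaluates the standard form of the bound, $2\exp\left(-2\epsilon^2 N^2 / \sum_i (b_i - a_i)^2\right)$, and compares it with a simulated deviation rate for fair coin flips. The coin-flip setup, the values of $N$ and $\epsilon$, the trial count, and the seed are all assumptions chosen for illustration.

```python
import math
import random

def hoeffding_bound(eps, ranges):
    """Two-sided Hoeffding bound for the mean of independent variables.

    ranges is a list of (a_i, b_i) pairs bounding each variable.
    """
    n = len(ranges)
    denom = sum((b - a) ** 2 for a, b in ranges)
    return 2.0 * math.exp(-2.0 * eps ** 2 * n ** 2 / denom)

def empirical_deviation_rate(n, eps, trials, seed=0):
    """Fraction of trials in which the mean of n fair coin flips is at least eps from 0.5."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(n))
        # Small tolerance so that floating-point rounding does not drop exact ties.
        if abs(heads / n - 0.5) >= eps - 1e-12:
            bad += 1
    return bad / trials

N, EPS = 100, 0.1
bound = hoeffding_bound(EPS, [(0.0, 1.0)] * N)  # reduces to 2 * exp(-2 * eps^2 * N)
rate = empirical_deviation_rate(N, EPS, trials=2000)
print(f"bound = {bound:.4f}, simulated rate = {rate:.4f}")
```

The bound is typically loose: it holds for any bounded distribution, so the simulated rate for a specific distribution is expected to sit well below it.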

Lecture 3: Types of Learning

2014-1-10 17:16

In this lecture, Professor Lin classifies machine learning algorithms from four different perspectives. 1. Learning with Different Output Space $\mathcal{Y}$ binary classification: $\mathcal{Y} = \{+1, -1\}$; multiclass classification: $\mathcal{Y} = \{1, 2,\cdots ,K \}$; regression: $\mathcal{Y} = \mathbb{R}$; structured learning: $\mathcal{Y} = $ structures; ......and ...

Category: Research Path | 2923 views | No comments

Lecture 2: Learning to Answer Yes/No

2014-1-10 15:06

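This lecture introduces the first binary classification algorithm, the perceptron learning algorithm (Perceptron). The perceptron was first proposed by Rosenblatt in 1957 and can be regarded as a pioneering work in machine learning. In 1962, Novikoff proved that the perceptron learning algorithm converges in a finite number of steps on linearly separable data sets. This lecture covers the perceptron learning algorithm and its convergence, and a variant of the algorithm for non-separable data sets ...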

Category: Research Path | 4684 views | No comments
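
A minimal sketch of the perceptron learning algorithm on a linearly separable toy set is given below. It assumes the standard update $w \leftarrow w + y\,x$ on a misclassified point; the data, the zero initialization, and the `max_updates` safeguard are illustrative choices, not details from the post.

```python
def dot(w, x):
    """Inner product w^T x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def sign(s):
    """Sign with the convention sign(0) = -1, so the zero vector misclassifies +1 points."""
    return 1 if s > 0 else -1

def pla(points, labels, max_updates=1000):
    """Perceptron learning algorithm: correct one misclassified point at a time."""
    w = [0.0] * len(points[0])
    for updates in range(max_updates + 1):
        # Find the first point the current weights get wrong, if any.
        mistake = next(
            ((x, y) for x, y in zip(points, labels) if sign(dot(w, x)) != y),
            None,
        )
        if mistake is None:
            return w, updates  # no mistakes left: the data is separated
        x, y = mistake
        w = [wi + y * xi for wi, xi in zip(w, x)]  # update w <- w + y * x
    raise RuntimeError("no separating weights found within max_updates")

# Toy linearly separable data; the first coordinate is the constant 1 (bias).
points = [[1.0, 2.0, 2.0], [1.0, 3.0, 1.0], [1.0, -1.0, -2.0], [1.0, -2.0, -1.0]]
labels = [1, 1, -1, -1]

w, n_updates = pla(points, labels)
print(w, n_updates)
```

On data that is not linearly separable this loop never finds a mistake-free pass, which is why the sketch stops after `max_updates` corrections instead of running forever.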

Shared by 不动如山 http://blog.sciencenet.cn/u/hustliaohh Feet on the ground, moving steadily forward

廖红虹