The previous lecture introduced logistic regression with its cross-entropy error function, as well as the gradient descent algorithm. This lecture focuses on using linear models for classification; the linear models seen so far are linear classification, linear regression, and logistic regression.
1. Linear Models for Binary Classification
linear scoring function: $s = w^T x$.
for binary classification, $y \in \{+1, -1\}$.
$(ys)$: classification correctness score — $ys > 0$ means the prediction is correct, and the larger $ys$ is, the more confidently correct it is.
Visualizing Error Functions:
upper bound: plotting the errors as functions of $ys$ shows that the 0/1 error is upper-bounded by the squared error and by the (scaled) cross-entropy error; such smooth upper bounds are useful for designing an algorithmic error $\hat{err}$ to optimize in place of $err_{0/1}$.
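A small numerical sketch (my own illustration, not from the lecture) checking the upper-bound relation among the errors as functions of the correctness score $ys$, using the scaled cross-entropy error $err_{SCE}(ys) = \log_2(1 + e^{-ys})$:

```python
import numpy as np

# errors as functions of the classification correctness score ys
ys = np.linspace(-3.0, 3.0, 601)

err01 = (ys <= 0).astype(float)          # 0/1 error: wrong iff ys <= 0
err_sqr = (ys - 1.0) ** 2                # squared error (ys - 1)^2
err_sce = np.log2(1.0 + np.exp(-ys))     # scaled cross-entropy error

# the smooth errors upper-bound the 0/1 error everywhere on this grid
upper_bounded = bool(np.all(err01 <= err_sce) and np.all(err01 <= err_sqr))
```

At $ys = 0$ all three errors equal 1, which is why the cross-entropy error must be scaled by $\log_2 e$ to make the bound tight there.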
Theoretical Implication of Upper Bound: if the surrogate $\hat{err}$ is small, the upper bound guarantees that $err_{0/1}$ is small as well, so minimizing a regression-style error also controls classification performance.
Regression for Classification: run linear or logistic regression on binary-labeled data to obtain $w$, then classify with $g(x) = \mathrm{sign}(w^T x)$; linear regression's closed-form solution is also commonly used to initialize $w$ for PLA, pocket, or logistic regression.
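A sketch of the regression-for-classification idea (the toy data are made up for illustration): compute the linear-regression closed-form solution $w_{LIN} = X^{\dagger} y$ and classify by $\mathrm{sign}(w^T x)$:

```python
import numpy as np

# toy separable data with a bias coordinate x0 = 1 (made up for illustration)
X = np.array([[1.0,  2.0,  2.0],
              [1.0,  3.0,  3.0],
              [1.0, -2.0, -2.0],
              [1.0, -3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# linear regression closed form: w = pseudo-inverse(X) @ y
w_lin = np.linalg.pinv(X) @ y

# use the regression weights directly as a linear classifier
pred = np.sign(X @ w_lin)
```

In the lecture's framing, such a $w_{LIN}$ can also serve as a warm start for PLA, pocket, or logistic regression.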
2. Stochastic Gradient Descent
stochastic gradient = true gradient + zero-mean "noise" directions: replacing the true gradient by the gradient on a single randomly chosen example gives, in expectation, the same descent direction at a fraction of the cost.
SGD logistic regression: $w \leftarrow w + \eta\, \theta(-y_n w^T x_n)\, y_n x_n$, which resembles a "soft" PLA — PLA updates only on mistakes, while SGD logistic regression always updates, weighted by how badly $x_n$ is classified.
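A minimal SGD logistic-regression sketch (toy data, step size, and iteration count are my own choices), implementing the update $w \leftarrow w + \eta\,\theta(-y_n w^T x_n)\, y_n x_n$ with $\theta$ the sigmoid:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_logistic(X, y, eta=0.1, n_iters=2000, seed=0):
    """Pick one random example per step and apply the SGD update
    w <- w + eta * sigmoid(-y_n w.x_n) * y_n * x_n."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        n = rng.integers(len(X))
        w += eta * sigmoid(-y[n] * (X[n] @ w)) * y[n] * X[n]
    return w

# toy separable data with a bias coordinate x0 = 1
X = np.array([[1.0,  2.0,  2.0], [1.0,  3.0,  3.0],
              [1.0, -2.0, -2.0], [1.0, -3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = sgd_logistic(X, y)
```

Unlike PLA, every step updates $w$; the sigmoid factor simply makes the update tiny for examples already classified with a large margin.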
3. Multiclass via Logistic Regression
One Class at a Time (One versus All): decompose the multiclass problem into multiple binary problems — for each of the $K$ classes, train a (soft) binary classifier of "class $k$ versus all other classes", and predict the class whose classifier outputs the highest score.
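A One-versus-All sketch (the helper names and toy clusters are mine): train one logistic classifier per class with full-batch gradient descent, then predict by the highest score:

```python
import numpy as np

def logistic_gd(X, y, eta=0.5, n_iters=500):
    # full-batch gradient descent on the cross-entropy error
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        sig = 1.0 / (1.0 + np.exp(y * (X @ w)))   # sigmoid(-y w.x)
        w -= eta * np.mean(sig[:, None] * (-y[:, None] * X), axis=0)
    return w

def one_vs_all_fit(X, labels, classes):
    # relabel: class k -> +1, every other class -> -1
    return {k: logistic_gd(X, np.where(labels == k, 1.0, -1.0)) for k in classes}

def one_vs_all_predict(X, ws, classes):
    scores = np.column_stack([X @ ws[k] for k in classes])  # one score per class
    return np.array(classes)[np.argmax(scores, axis=1)]

# three well-separated toy clusters (made up for illustration)
rng = np.random.default_rng(0)
centers = np.array([[4.0, 0.0], [-4.0, 0.0], [0.0, 4.0]])
labels = np.repeat([0, 1, 2], 20)
pts = centers[labels] + 0.5 * rng.normal(size=(60, 2))
X = np.column_stack([np.ones(60), pts])

ws = one_vs_all_fit(X, labels, [0, 1, 2])
pred = one_vs_all_predict(X, ws, [0, 1, 2])
```

Predicting by the highest soft score (rather than by each classifier's hard $\pm 1$ output) sidesteps ties and regions claimed by no classifier.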
4. Multiclass via Binary Classification
Source of Unbalance: in One versus All, each binary problem pits one class against the other $K-1$ classes combined, so negative examples can heavily outnumber positives. One versus One avoids this: for $K$ classes, learn a binary classifier for every pair of classes ($K(K-1)/2$ classifiers in total) and predict by majority vote.
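A One-versus-One sketch (helper names and data are mine): fit one linear classifier per pair of classes — here via the regression-for-classification shortcut — and predict by voting:

```python
import numpy as np
from itertools import combinations

def one_vs_one_fit(X, labels, classes):
    # one linear classifier per pair (i, j), trained only on those two classes
    ws = {}
    for i, j in combinations(classes, 2):
        mask = (labels == i) | (labels == j)
        y = np.where(labels[mask] == i, 1.0, -1.0)
        ws[(i, j)] = np.linalg.pinv(X[mask]) @ y   # linear regression as classifier
    return ws

def one_vs_one_predict(X, ws, classes):
    idx = {k: c for c, k in enumerate(classes)}
    votes = np.zeros((len(X), len(classes)), dtype=int)
    for (i, j), w in ws.items():
        winners = np.where(X @ w > 0, idx[i], idx[j])  # each pair casts a vote
        votes[np.arange(len(X)), winners] += 1
    return np.array(classes)[np.argmax(votes, axis=1)]

# three well-separated toy clusters (made up for illustration)
rng = np.random.default_rng(1)
centers = np.array([[4.0, 0.0], [-4.0, 0.0], [0.0, 4.0]])
labels = np.repeat([0, 1, 2], 20)
X = np.column_stack([np.ones(60), centers[labels] + 0.5 * rng.normal(size=(60, 2))])

ws = one_vs_one_fit(X, labels, [0, 1, 2])
pred = one_vs_one_predict(X, ws, [0, 1, 2])
```

Each of the $K(K-1)/2$ subproblems is balanced and small, at the cost of training and storing more classifiers.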
5. Remarks
For multiclass classification problems, the two most common decomposition methods are One-versus-All and One-versus-One; for performance comparisons of the two, see [1][2][3][4].
[1] Mohamed Aly. Survey on Multiclass Classification Methods. 2005. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.175.107&rep=rep1&type=pdf
[2] Rifkin R, Klautau A. In defense of one-vs-all classification[J]. The Journal of Machine Learning Research, 2004, 5: 101-141.
[3] Tewari A, Bartlett P L. On the consistency of multiclass classification methods[J]. The Journal of Machine Learning Research, 2007, 8: 1007-1025.
[4] Christopher M. Bishop. 2006. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag New York, Inc., Secaucus, NJ, USA.