
Lecture 11: Linear Models for Classification


The previous lecture introduced logistic regression, its cross-entropy error function, and the gradient descent algorithm. This lecture focuses on using linear models for classification; the linear models at hand are linear classification, linear regression, and logistic regression.

1. Linear Models for Binary Classification

linear scoring function: $s = w^T x$


for binary classification, $y \in \{+1, -1\}$


$ys$: the classification correctness score (we want $ys$ to be positive, and the larger the better).

Visualizing Error Functions:
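Written as functions of the score $ys$, the pointwise errors being compared are (a sketch assuming the standard definitions used in this course):

$$\begin{aligned} \mathrm{err}_{0/1}(s,y) &= [\![\,\mathrm{sign}(s) \neq y\,]\!] = [\![\,ys \le 0\,]\!],\\ \mathrm{err}_{\mathrm{SQR}}(s,y) &= (s-y)^2 = (ys-1)^2,\\ \mathrm{err}_{\mathrm{SCE}}(s,y) &= \log_2\bigl(1+\exp(-ys)\bigr) = \tfrac{1}{\ln 2}\,\mathrm{err}_{\mathrm{CE}}(s,y), \end{aligned}$$

and both $\mathrm{err}_{\mathrm{SQR}}$ and $\mathrm{err}_{\mathrm{SCE}}$ upper-bound $\mathrm{err}_{0/1}$ at every value of $ys$.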



upper bound: the smooth errors upper-bound $\mathrm{err}_{0/1}$, which is useful for designing an algorithmic error $\hat{err}$.

Theoretical Implication of Upper Bound:
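In sketch form, combining the pointwise upper bound with a generalization penalty $\Omega$ gives

$$E_{\mathrm{out}}^{0/1}(w) \;\le\; E_{\mathrm{out}}^{\mathrm{SCE}}(w) \;\le\; E_{\mathrm{in}}^{\mathrm{SCE}}(w) + \Omega^{\mathrm{SCE}},$$

so driving the (scaled) cross-entropy $E_{\mathrm{in}}$ down also keeps the classification $E_{\mathrm{out}}$ small.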



Regression for Classification:
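Concretely, one cheap route is to fit linear regression on the $\pm 1$ labels and classify by the sign of the score; a minimal NumPy sketch (my own illustration, not code from the lecture):

import numpy as np

def linear_regression_for_classification(X, y):
    # X: (N, d) design matrix that already includes the bias feature x0 = 1
    # y: (N,) vector of +/-1 labels
    # closed-form least-squares solution w = pinv(X) y
    return np.linalg.pinv(X) @ y

def predict(w, X):
    # classify by the sign of the linear score s = w^T x
    return np.sign(X @ w)

The resulting $w$ is also a common cheap choice of initial $w_0$ for an iterative method such as PLA or logistic regression.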



2. Stochastic Gradient Descent


stochastic gradient = true gradient + zero-mean "noise" directions


SGD logistic regression
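A minimal sketch of SGD for logistic regression on $\pm 1$ labels, using the per-example update $w \leftarrow w + \eta\,\theta(-y_n w^T x_n)\,y_n x_n$ (the step size and iteration count below are illustrative choices, not values from the post):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_logistic_regression(X, y, eta=0.1, n_iter=10000, seed=0):
    # X: (N, d) design matrix, y: (N,) vector of +/-1 labels
    rng = np.random.default_rng(seed)
    N, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        n = rng.integers(N)                              # pick one example at random
        xn, yn = X[n], y[n]
        w += eta * sigmoid(-yn * (w @ xn)) * yn * xn     # stochastic gradient step
    return w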


3. Multiclass via Logistic Regression

One Class at a Time (One versus All, OVA): decompose the multiclass problem into one binary problem per class (that class versus all the rest), and predict the class whose soft binary classifier gives the largest score.
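A minimal sketch of OVA, where `train_soft_binary` is a hypothetical placeholder for any soft binary learner on $\pm 1$ labels (for example the SGD logistic regression above):

import numpy as np

def train_ova(X, y, classes, train_soft_binary):
    # one soft binary classifier per class: class k versus all the rest
    weights = {}
    for k in classes:
        y_pm = np.where(y == k, 1.0, -1.0)
        weights[k] = train_soft_binary(X, y_pm)
    return weights

def predict_ova(weights, X, classes):
    # predict the class whose classifier gives the largest score w_k^T x
    scores = np.column_stack([X @ weights[k] for k in classes])
    return np.array(classes)[np.argmax(scores, axis=1)]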



4. Multiclass via Binary Classification

Source of unbalance: One versus All. The remedy is One versus One (OVO): for $K$ classes, learn one binary classifier for every pair of classes and combine them by voting.
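A minimal sketch of OVO with voting, where `train_binary` is again a hypothetical placeholder for any binary learner on $\pm 1$ labels:

import numpy as np
from itertools import combinations

def train_ovo(X, y, classes, train_binary):
    # one binary classifier for every pair of classes
    models = {}
    for a, b in combinations(classes, 2):
        mask = (y == a) | (y == b)                 # keep only examples of the two classes
        y_pm = np.where(y[mask] == a, 1.0, -1.0)
        models[(a, b)] = train_binary(X[mask], y_pm)
    return models

def predict_ovo(models, X, classes):
    # each pairwise classifier votes for one of its two classes; predict the majority
    votes = {k: np.zeros(len(X)) for k in classes}
    for (a, b), w in models.items():
        says_a = (X @ w) > 0
        votes[a] += says_a
        votes[b] += ~says_a
    score = np.column_stack([votes[k] for k in classes])
    return np.array(classes)[np.argmax(score, axis=1)]

OVO trains $K(K-1)/2$ classifiers, but each is fit on a much smaller and more balanced subset of the data.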


5. Remarks

For multiclass classification, the two most commonly used decomposition strategies are One-Versus-All and One-Versus-One; for evaluations of their relative performance, see [1][2][3][4].

[1] Aly M. Survey on Multiclass Classification Methods. 2005. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.175.107&rep=rep1&type=pdf

[2] Rifkin R, Klautau A. In defense of one-vs-all classification[J]. The Journal of Machine Learning Research, 2004, 5: 101-141.

[3] Tewari A, Bartlett P L. On the consistency of multiclass classification methods[J]. The Journal of Machine Learning Research, 2007, 8: 1007-1025.

[4] Christopher M. Bishop. 2006. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag New York, Inc., Secaucus, NJ, USA.


