cliffgao's personal blog http://blog.sciencenet.cn/u/cliffgao Interests: bioinformatics, statistics, probability


sklearn: a machine learning package for Python

Posted 2014-1-2 16:02 | Personal category: python | System category: research notes


sklearn (scikit-learn): Machine Learning in Python

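======= Installation =========

0. Before installing scikit-learn, you may need to install Python, NumPy, SciPy, Matplotlib, and so on.

  Installation instructions: http://scikit-learn.org/stable/install.html

1. Download the software from the sklearn home page (http://scikit-learn.org/stable/index.html) and install it.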
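
Before running the script in the next section, it can help to confirm that the packages listed above are visible to your Python interpreter. The following is a minimal sketch using only the standard library; the helper name `find_missing` and the `REQUIRED` list are illustrative assumptions, not part of scikit-learn.

```python
import importlib.util

# Import names of the packages mentioned above.
# "sklearn" is the import name of scikit-learn.
REQUIRED = ["numpy", "scipy", "matplotlib", "sklearn"]


def find_missing(packages):
    """Return the names in `packages` that this interpreter cannot locate."""
    return [name for name in packages
            if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages were found.")
```

This only checks that each package can be located; it does not check version compatibility.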


======== Usage =========


# -*- coding: utf-8 -*-
"""
Created on Thu Jan  2 14:34:20 2014

example of sklearn

@author: cliff
"""
print(__doc__)


# Use an SVM classifier
from sklearn import svm
from sklearn.datasets import load_svmlight_file

# Split the data set into a training set and a test set.
# sklearn.cross_validation was removed in scikit-learn 0.20;
# these functions now live in sklearn.model_selection.
from sklearn.model_selection import cross_val_score, train_test_split

# Evaluation metrics
from sklearn.metrics import confusion_matrix   # confusion matrix
from sklearn.metrics import matthews_corrcoef  # MCC
from sklearn.metrics import roc_auc_score      # AUC (used here for binary classification only)
from sklearn.metrics import accuracy_score     # ACC

# Load the data (a file in svmlight / libsvm format)
fr_n = "/path/your_svm_file"
X, y = load_svmlight_file(fr_n)

# Run classifier
print("===cross validation===")
clf = svm.SVC(kernel='rbf')
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(scores, scores.mean())


print("===performance on TEST===")
# Split the data into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = svm.SVC(kernel='rbf')
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Compute the confusion matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Compute accuracy, MCC, etc.
print("MCC: %f " % matthews_corrcoef(y_test, y_pred))
print("ACC:  %f " % accuracy_score(y_test, y_pred))



print("===compute auc ===")
# Compute the AUC
classifier = svm.SVC(kernel='rbf', probability=True)
classifier.fit(X_train, y_train)
# Probability of the positive class, i.e. classifier.classes_[1]
y_prob = classifier.predict_proba(X_test)[:, 1]
print(y_test)
print(y_prob)
print("AUC:  %f " % roc_auc_score(y_test, y_prob))
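
The script above reads its input with load_svmlight_file, which expects the svmlight / libsvm text format: one sample per line, written as a label followed by sparse `index:value` pairs. To make that layout concrete, here is a rough standard-library sketch of a line parser. The function `parse_svmlight_line` and the two sample lines are hypothetical illustrations. This is not how scikit-learn implements its loader, and it ignores details such as index-base detection.

```python
def parse_svmlight_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value}).

    Returns None for blank or comment-only lines.
    """
    line = line.split("#", 1)[0].strip()  # drop a trailing "# comment"
    if not line:
        return None
    fields = line.split()
    label = float(fields[0])
    features = {}
    for field in fields[1:]:
        index, value = field.split(":", 1)
        if index == "qid":  # optional query id used for ranking data; skipped here
            continue
        features[int(index)] = float(value)
    return label, features


# Two made-up samples: a positive and a negative example.
sample = """\
+1 1:0.5 3:1.2   # hypothetical positive example
-1 2:0.7 3:0.1
"""

rows = [parse_svmlight_line(text) for text in sample.splitlines()]
rows = [row for row in rows if row is not None]
for label, features in rows:
    print(label, features)
```

Features that do not appear on a line are treated as zero, which is why the format suits sparse data.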



========

Another example: http://scikit-learn.org/stable/auto_examples/plot_confusion_matrix.html
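
To show what the confusion matrix, ACC, MCC, and AUC measure in the binary case, the sketch below computes them by hand with the standard library (Python 3). All function names and the toy labels and scores are assumptions made for illustration. In practice the sklearn.metrics functions used in the script are the ones to call.

```python
import math


def binary_counts(y_true, y_pred, positive=1):
    """Return (tn, fp, fn, tp) for binary labels."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tn, fp, fn, tp


def accuracy(tn, fp, fn, tp):
    """Fraction of samples that were classified correctly."""
    return (tp + tn) / (tn + fp + fn + tp)


def mcc(tn, fp, fn, tp):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Needs at least one sample per class.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, made up for this example.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]

tn, fp, fn, tp = binary_counts(y_true, y_pred)
print([[tn, fp], [fn, tp]])  # rows: true class, columns: predicted class
print("ACC: %f" % accuracy(tn, fp, fn, tp))
print("MCC: %f" % mcc(tn, fp, fn, tp))
print("AUC: %f" % auc(y_true, y_score))
```

The pairwise AUC here is quadratic in the number of samples, which is fine for a toy example but slow on large test sets.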




https://blog.sciencenet.cn/blog-468005-755200.html
