Logistic Regression Classification
```python
import numpy as np
from sklearn import linear_model
X = np.array([[4, 7], [3.5, 8], [3.1, 6.2], [0.5, 1], [1, 2], [1.2, 1.9], [6, 2], [5.7, 1.5], [5.4, 2.2]])
y = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
# Logistic regression classifier
# solver: optimization algorithm, one of 'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'
#         ('lbfgs' is the default in recent scikit-learn releases, 'liblinear' in older ones)
# C: inverse of regularization strength; smaller values mean stronger regularization,
#    larger values make overfitting more likely; default is 1.0
classifier = linear_model.LogisticRegression(solver='liblinear', C=100)
classifier.fit(X, y)
```
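Once fitted, the classifier can score new points. The query coordinates below are hypothetical, chosen only to illustrate `predict` and `predict_proba`:

```python
# Hypothetical query points (not from the training data)
new_points = np.array([[3.0, 7.0], [1.0, 1.5], [5.5, 2.0]])
print(classifier.predict(new_points))        # predicted class labels
print(classifier.predict_proba(new_points))  # per-class probabilities
```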
Naive Bayes Classification
- A Naive Bayes classifier is a supervised classifier built on Bayes' theorem.
- Bayes' theorem: P(A∩B) = P(A)*P(B|A) = P(B)*P(A|B), which can be rearranged as P(A|B) = P(B|A)*P(A)/P(B).
- Applied to classification: P(class|features) = P(features|class)*P(class)/P(features), as shown numerically below.
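A tiny numeric sketch with made-up probabilities shows how the rearranged formula yields a posterior; the numbers are illustrative only and are not taken from the dataset used below.

```python
# Hypothetical probabilities illustrating Bayes' theorem
p_class = 0.3                  # prior P(class)
p_feature_given_class = 0.8    # likelihood P(feature|class)
p_feature = 0.4                # evidence P(feature)

# P(class|feature) = P(feature|class) * P(class) / P(feature)
p_class_given_feature = p_feature_given_class * p_class / p_feature
print(p_class_given_feature)   # 0.6
```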
```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
X = np.array([[4, 7], [3.5, 8], [3.1, 6.2], [0.5, 1], [1, 2], [1.2, 1.9], [6, 2], [5.7, 1.5], [5.4, 2.2]])
y = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=5)
# Train the Gaussian Naive Bayes classifier
classifier_gaussiannb = GaussianNB()
classifier_gaussiannb.fit(X_train, y_train)
y_test_pred = classifier_gaussiannb.predict(X_test)
```
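With the predictions in hand, a quick way to gauge the Gaussian model is to score the held-out split; a minimal sketch using `accuracy_score`:

```python
from sklearn.metrics import accuracy_score

# Fraction of test samples whose predicted label matches the true label
accuracy = accuracy_score(y_test, y_test_pred)
print("Naive Bayes test accuracy: {:.2%}".format(accuracy))
```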
Support Vector Machine (SVM) (for classification or regression)
```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Same toy data as in the examples above
X = np.array([[4, 7], [3.5, 8], [3.1, 6.2], [0.5, 1], [1, 2], [1.2, 1.9], [6, 2], [5.7, 1.5], [5.4, 2.2]])
y = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=5)
# kernel: 'linear' (linear kernel), 'rbf' (radial basis / Gaussian kernel),
#         'poly' (polynomial kernel), 'sigmoid' (sigmoid kernel); SVC's default kernel is 'rbf'
# class_weight='balanced' reweights classes inversely to their frequencies
params = {'kernel': 'linear', 'class_weight': 'balanced'}
classifier = SVC(**params)
classifier.fit(X_train, y_train)
target_names = ['Class-' + str(int(i)) for i in set(y)]
print("#"*30)
print("Classifier performance on training dataset")
print(classification_report(y_train, classifier.predict(X_train), target_names=target_names))
print("#"*30)
```