02_Logistic Regression

Logistic Regression

  • 1 Logistic Regression Code Implementation
  • 2 Hyperparameters
  • 3 Polynomial Logistic Regression
  • 4 Multi-class Classification

Linearly separable data → logistic regression
Non-linearly separable data → polynomial logistic regression
Multi-class problems → OvO, OvR
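
Before diving into the code, recall what the model computes: logistic regression passes a linear combination of the features through the sigmoid function and thresholds the resulting probability at 0.5. A minimal NumPy sketch (the weights w and bias b below are made-up values, purely for illustration):

python
import numpy as np

def sigmoid(z):
    # squashes any real number into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# hypothetical weights and bias for a 2-feature problem
w = np.array([1.5, -0.8])
b = 0.2

x = np.array([0.4, 1.1])         # one sample
p = sigmoid(np.dot(w, x) + b)    # predicted probability of class 1
print(p, int(p >= 0.5))          # probability and the corresponding 0/1 label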

1 Logistic Regression Code Implementation

python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# 200 samples, 2 informative features, 2 classes
x, y = make_classification(
    n_samples=200,
    n_features=2,
    n_redundant=0,
    n_classes=2,
    n_clusters_per_class=1,
    random_state=50
)
print(x.shape, y.shape)
# stratified 70/30 split keeps the class ratio the same in both sets
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.7, random_state=0, stratify=y)
plt.scatter(x_train[:, 0], x_train[:, 1], c=y_train)
plt.show()

clf = LogisticRegression()
clf.fit(x_train, y_train)
print(clf.score(x_train, y_train))   # training accuracy
print(clf.score(x_test, y_test))     # test accuracy
y_predict = clf.predict(x_test)
print(y_predict)
print(clf.predict_proba(x_test)[:3])                 # class probabilities for the first 3 test samples
print(np.argmax(clf.predict_proba(x_test), axis=1))  # taking the argmax reproduces predict()

(200, 2) (200,)
0.9571428571428572
0.9666666666666667
[0 1 0 1 0 0 0 1 1 1 0 1 0 0 0 1 1 0 0 0 1 1 0 1 1 0 1 1 0 1 1 0 0 0 1 0 1
 0 0 0 1 1 1 0 1 1 0 1 0 0 1 0 1 0 0 1 1 0 0 0]
[[0.9976049  0.0023951 ]
 [0.00943605 0.99056395]
 [0.99884752 0.00115248]]
[0 1 0 1 0 0 0 1 1 1 0 1 0 0 0 1 1 0 0 0 1 1 0 1 1 0 1 1 0 1 1 0 0 0 1 0 1
 0 0 0 1 1 1 0 1 1 0 1 0 0 1 0 1 0 0 1 1 0 0 0]
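
With two features the fitted model is a straight line in the plane, defined by the LogisticRegression attributes clf.coef_ and clf.intercept_. A small follow-up sketch that overlays this decision boundary on the training scatter, assuming clf, x_train and y_train from the block above are still in scope:

python
import numpy as np
import matplotlib.pyplot as plt

# decision boundary: w0*x0 + w1*x1 + b = 0  ->  x1 = -(w0*x0 + b) / w1
w = clf.coef_[0]
b = clf.intercept_[0]
x0 = np.linspace(x_train[:, 0].min(), x_train[:, 0].max(), 100)
x1 = -(w[0] * x0 + b) / w[1]

plt.scatter(x_train[:, 0], x_train[:, 1], c=y_train)
plt.plot(x0, x1, 'r')
plt.show()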

2 Hyperparameters

python
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

x, y = make_classification(
    n_samples=200,
    n_features=2,
    n_redundant=0,
    n_classes=2,
    n_clusters_per_class=1,
    random_state=50
)
print(x.shape, y.shape)
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.7, random_state=0, stratify=y)

# three sub-grids, because each solver supports a different set of penalties
params = [{
    'penalty': ['l2', 'l1'],
    'C': [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000],
    'solver': ['liblinear']
}, {
    'penalty': ['none'],   # no regularization, so C has no effect here (newer scikit-learn spells this penalty=None)
    'C': [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000],
    'solver': ['lbfgs']
}, {
    'penalty': ['elasticnet'],
    'C': [0.0001, 0.001, 0.01, 0.1, 1, 10, 100, 1000],
    'l1_ratio': [0, 0.25, 0.5, 0.75, 1],   # mix between l1 (1.0) and l2 (0.0)
    'solver': ['saga'],
    'max_iter': [200]
}]
grid = GridSearchCV(
    estimator=LogisticRegression(),
    param_grid=params,
    n_jobs=-1   # use all CPU cores
)
grid.fit(x_train, y_train)
print(grid.best_score_)                            # best cross-validation score
print(grid.best_estimator_.score(x_test, y_test))  # test accuracy of the best model
print(grid.best_params_)

0.9571428571428573
0.9666666666666667
{'C': 1, 'penalty': 'l2', 'solver': 'liblinear'}
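
The winning combination can also be plugged back into LogisticRegression by hand; C is the inverse regularization strength, so smaller values mean stronger regularization. A minimal sketch reusing the x_train/x_test split above (the parameter values are the ones this particular run selected and would differ on other data):

python
best = LogisticRegression(C=1, penalty='l2', solver='liblinear')
best.fit(x_train, y_train)
print(best.score(x_test, y_test))   # should reproduce grid.best_estimator_.score(x_test, y_test)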

3 Polynomial Logistic Regression

python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(0)
X = np.random.normal(0, 1, size=(200, 2))
# label is 1 inside the circle x0^2 + x1^2 < 2, so the true boundary is non-linear
y = np.array((X[:, 0] ** 2) + (X[:, 1] ** 2) < 2, dtype='int')
x_train, x_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=233, stratify=y)
plt.scatter(x_train[:, 0], x_train[:, 1], c=y_train)
plt.show()

# a plain linear model underfits the circular boundary
clf = LogisticRegression()
clf.fit(x_train, y_train)
print(clf.score(x_train, y_train))   # training accuracy of the linear model

# switch to polynomial logistic regression
print('------ polynomial logistic regression --------')
poly = PolynomialFeatures(degree=2)   # add squared and interaction terms
poly.fit(x_train)
x2 = poly.transform(x_train)
x2t = poly.transform(x_test)
clf.fit(x2, y_train)
print(clf.score(x2, y_train))    # training accuracy
print(clf.score(x2t, y_test))    # test accuracy

0.7071428571428572
------ polynomial logistic regression --------
1.0
0.9666666666666667
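
A more idiomatic way to chain the feature expansion and the classifier is a scikit-learn Pipeline, which applies the same PolynomialFeatures transform to the test set automatically. A minimal sketch, assuming the x_train/x_test split from the block above:

python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LogisticRegression

poly_clf = Pipeline([
    ('poly', PolynomialFeatures(degree=2)),  # expand x0, x1 into all degree-2 terms
    ('clf', LogisticRegression())
])
poly_clf.fit(x_train, y_train)
print(poly_clf.score(x_train, y_train))
print(poly_clf.score(x_test, y_test))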

4 Multi-class Classification

python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.multiclass import OneVsOneClassifier

iris = datasets.load_iris()   # 3 classes, 4 features
x = iris.data
y = iris.target
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=30)
clf = LogisticRegression()

# One-vs-Rest: one binary classifier per class
ovr = OneVsRestClassifier(clf)
ovr.fit(x_train, y_train)
print("ovr.score:")
print(ovr.score(x_test, y_test))

# One-vs-One: one binary classifier per pair of classes
ovo = OneVsOneClassifier(clf)
ovo.fit(x_train, y_train)
print("ovo.score:")
print(ovo.score(x_test, y_test))

ovr.score:
0.9473684210526315
ovo.score:
1.0
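
LogisticRegression can also handle the three iris classes on its own (recent scikit-learn versions fit a multinomial/softmax model by default), so the explicit wrappers are optional. A minimal sketch on the same split, with max_iter raised so the default lbfgs solver converges:

python
clf = LogisticRegression(max_iter=1000)
clf.fit(x_train, y_train)
print(clf.score(x_test, y_test))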
