Logistic Regression in Machine Learning: A Detailed Guide

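Abstract

Logistic regression is one of the most fundamental and widely used classification algorithms in machine learning. Despite the word "regression" in its name, it is a classic classification algorithm, used mainly for binary and multiclass problems. This article starts from the principles of logistic regression and covers its core concepts in detail: the Sigmoid function, the decision boundary, the loss function, and gradient descent. It then extends to multiclass logistic regression and regularization. Finally, several complete Python examples help readers pick up practical logistic regression skills quickly.

Keywords: logistic regression, Sigmoid function, cross-entropy loss, gradient descent, regularization, multiclass classification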

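1. Overview of Logistic Regression

Logistic regression is a classification algorithm built on the linear model: it passes the linear model's output through the Sigmoid function, which maps it into the interval $(0, 1)$, and interprets the result as the probability that a sample belongs to the positive class. Because it is simple, efficient, and highly interpretable, logistic regression is widely used in medical diagnosis, financial risk control, spam filtering, and similar fields.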

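2. How Logistic Regression Works

2.1 Sigmoid Function

The Sigmoid function is the core of logistic regression. It is defined as:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

where $z$ is the output of the linear model, i.e. $z = wx + b$.

Properties of the Sigmoid function:

Its output lies in (0, 1), which makes it suitable for representing a probability
Its curve is S-shaped and point-symmetric about (0, 0.5)
As $z \to +\infty$, $\sigma(z) \to 1$
As $z \to -\infty$, $\sigma(z) \to 0$
At $z = 0$, $\sigma(z) = 0.5$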


import numpy as np
import matplotlib.pyplot as plt

def sigmoid(z):
    """
    Sigmoid function.
    Args:
        z: output of the linear model; a scalar, array, or matrix
    Returns:
        s: Sigmoid value(s), in the range (0, 1)
    """
    s = 1 / (1 + np.exp(-z))
    return s

# Visualize the Sigmoid function
z = np.linspace(-10, 10, 100)
s = sigmoid(z)

plt.figure(figsize=(10, 6))
plt.plot(z, s, 'b-', linewidth=2)
plt.axhline(y=0.5, color='r', linestyle='--', label='Decision threshold (p=0.5)')
plt.axvline(x=0, color='g', linestyle='--', alpha=0.5)
plt.xlabel('z = wx + b', fontsize=12)
plt.ylabel('Sigmoid(z)', fontsize=12)
plt.title('Sigmoid function', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.ylim(-0.1, 1.1)
plt.tight_layout()
plt.show()
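For inputs with very large magnitude, `np.exp(-z)` in the direct formula can overflow and emit a runtime warning. One common workaround, sketched below, evaluates a different but algebraically equivalent expression depending on the sign of `z`. The name `stable_sigmoid` is a hypothetical helper introduced for illustration only; it is not part of the code above.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that avoids overflow in exp() for large |z| (hypothetical helper)."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so the direct formula cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)); here exp(z) < 1.
    exp_z = np.exp(z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out

z = np.array([-1000.0, -1.0, 0.0, 1.0, 1000.0])
probs = stable_sigmoid(z)
print(probs)
```

The output also illustrates the symmetry property listed earlier: the values at `z` and `-z` sum to 1.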

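2.2 Decision Boundary

The decision boundary is the boundary a classifier uses to separate classes. Logistic regression predicts the positive class when the probability $p \geq 0.5$ and the negative class when $p < 0.5$.

Setting $p = 0.5$ gives the equation of the decision boundary:

$$\sigma(wx + b) = 0.5 \Rightarrow wx + b = 0$$

So the decision boundary of logistic regression is linear in the input features: a point for one-dimensional data, a straight line for two-dimensional data, and a hyperplane in higher dimensions.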


def plot_decision_boundary(X, y, model):
    """
    Plot the decision boundary for a two-dimensional dataset.

    Args:
        X: feature matrix (n_samples, 2)
        y: label vector (n_samples,)
        model: a fitted logistic regression model with a predict() method
    """
    # Build a grid that covers the data with a small margin
    x1_min, x1_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
    x2_min, x2_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
    xx1, xx2 = np.meshgrid(np.linspace(x1_min, x1_max, 200),
                           np.linspace(x2_min, x2_max, 200))

    # Predict the class of every grid point
    Z = model.predict(np.c_[xx1.ravel(), xx2.ravel()])
    Z = Z.reshape(xx1.shape)

    # Draw the class regions and the boundary between them
    plt.figure(figsize=(10, 8))
    plt.contourf(xx1, xx2, Z, alpha=0.3, cmap=plt.cm.RdYlBu)
    plt.contour(xx1, xx2, Z, colors='k', linewidths=0.5)

    # Draw the data points
    scatter = plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.RdYlBu,
                          edgecolors='black', s=50)
    plt.xlabel('Feature 1', fontsize=12)
    plt.ylabel('Feature 2', fontsize=12)
    plt.title('Logistic regression decision boundary', fontsize=14)
    plt.colorbar(scatter)
    plt.tight_layout()
    plt.show()
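For two features, the boundary equation $wx + b = 0$ can also be solved for the second feature directly, which gives the line without a prediction grid. The sketch below uses hand-picked, hypothetical parameters `w` and `b` (not fitted to any data) and checks that points on either side of the line receive opposite predictions.

```python
import numpy as np

# Hypothetical parameters of a 2-D logistic regression model (not fitted to any data).
w = np.array([2.0, -1.0])
b = 0.5

def boundary_x2(x1):
    """Solve w1*x1 + w2*x2 + b = 0 for x2 (requires w2 != 0)."""
    return -(w[0] * x1 + b) / w[1]

def predict(points):
    """Predict class 1 where w.x + b >= 0, i.e. where sigmoid(w.x + b) >= 0.5."""
    return (points @ w + b >= 0).astype(int)

x1 = np.array([-1.0, 0.0, 1.0])
on_line = np.column_stack([x1, boundary_x2(x1)])
print(on_line @ w + b)   # zero for every point: they lie on the boundary

# Moving a boundary point along +w or -w puts it on opposite sides of the line.
above = predict(on_line + 0.1 * w)
below = predict(on_line - 0.1 * w)
print(above, below)
```

The weight vector `w` is perpendicular to the boundary, which is why stepping along `w` changes the predicted class.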

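2.3 Loss Function

Logistic regression uses the binary cross-entropy loss, also known as log loss.

For a single sample, the loss is defined as:

$$L(y, \hat{y}) = -[y \log(\hat{y}) + (1-y) \log(1-\hat{y})]$$

where $\hat{y} = \sigma(wx + b)$ is the predicted probability.

The loss over the whole dataset of $m$ samples is:

$$J(w, b) = -\frac{1}{m} \sum_{i=1}^{m}[y^{(i)} \log(\hat{y}^{(i)}) + (1-y^{(i)}) \log(1-\hat{y}^{(i)})]$$

Properties of the cross-entropy loss:

When the true label is $y=1$, the loss grows as $\hat{y}$ decreases
When the true label is $y=0$, the loss grows as $\hat{y}$ increases
As $\hat{y}$ approaches $y$, the loss approaches 0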


def binary_cross_entropy(y_true, y_pred):
    """
    Compute the binary cross-entropy loss.

    Args:
        y_true: true labels (n_samples,)
        y_pred: predicted probabilities (n_samples,), in the range (0, 1)
    Returns:
        loss: mean cross-entropy loss
    """
    # Clip the probabilities to avoid log(0)
    epsilon = 1e-15
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)

    loss = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return loss

# Evaluate the loss for a small set of predicted probabilities
y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0.1, 0.3, 0.7, 0.9])

loss = binary_cross_entropy(y_true, y_pred)
print(f"Cross-entropy loss: {loss:.4f}")  # Cross-entropy loss: 0.2310

2.4 Solving with Gradient Descent

The parameters of logistic regression are optimized with gradient descent. The gradients of the loss with respect to the parameters are:

$$\frac{\partial J(w,b)}{\partial w_j} = \frac{1}{m} \sum_{i=1}^{m}(\hat{y}^{(i)} - y^{(i)}) x_j^{(i)}$$

$$\frac{\partial J(w,b)}{\partial b} = \frac{1}{m} \sum_{i=1}^{m}(\hat{y}^{(i)} - y^{(i)})$$
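These gradients follow from the chain rule together with the identity $\sigma'(z) = \sigma(z)(1 - \sigma(z))$. The following is a sketch of the derivation for a single sample, using the per-sample loss $L$ from section 2.3 with $\hat{y} = \sigma(z)$ and $z = wx + b$:

```latex
\begin{aligned}
\frac{\partial L}{\partial \hat{y}} &= -\frac{y}{\hat{y}} + \frac{1-y}{1-\hat{y}}, \qquad
\frac{\partial \hat{y}}{\partial z} = \hat{y}(1-\hat{y}), \qquad
\frac{\partial z}{\partial w_j} = x_j \\
\frac{\partial L}{\partial w_j}
  &= \left(-\frac{y}{\hat{y}} + \frac{1-y}{1-\hat{y}}\right)\hat{y}(1-\hat{y})\,x_j \\
  &= \bigl(-y(1-\hat{y}) + (1-y)\hat{y}\bigr)\,x_j \\
  &= (\hat{y} - y)\,x_j
\end{aligned}
```

Averaging over the $m$ samples gives the expression for $\partial J / \partial w_j$. The bias case is identical except that $\partial z / \partial b = 1$, so the factor $x_j$ drops out.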

The parameter update rules are:

$$w_j := w_j - \alpha \frac{\partial J}{\partial w_j}$$

$$b := b - \alpha \frac{\partial J}{\partial b}$$

where $\alpha$ is the learning rate.


def gradient_descent(X, y, weights, bias, learning_rate=0.01, n_iterations=1000):
    """
    Train logistic regression with batch gradient descent.

    Args:
        X: feature matrix (n_samples, n_features)
        y: label vector (n_samples,)
        weights: initial weights (n_features,)
        bias: initial bias
        learning_rate: learning rate
        n_iterations: number of iterations
    Returns:
        weights: updated weights
        bias: updated bias
        costs: loss recorded at every iteration
    """
    m = len(y)
    costs = []

    for i in range(n_iterations):
        # Forward pass: compute the model output
        z = np.dot(X, weights) + bias
        y_pred = sigmoid(z)

        # Compute the loss (before this iteration's update)
        cost = binary_cross_entropy(y, y_pred)
        costs.append(cost)

        # Compute the gradients
        dz = y_pred - y
        dw = (1/m) * np.dot(X.T, dz)
        db = (1/m) * np.sum(dz)

        # Update the parameters
        weights = weights - learning_rate * dw
        bias = bias - learning_rate * db

        # Print the loss every 100 iterations
        if i % 100 == 0:
            print(f"Iteration {i:4d} | loss: {cost:.6f}")

    return weights, bias, costs

# Visualize the training process
def plot_training_history(costs):
    """Plot the training loss curve."""
    plt.figure(figsize=(10, 6))
    plt.plot(costs, 'b-', linewidth=2)
    plt.xlabel('Iteration', fontsize=12)
    plt.ylabel('Loss', fontsize=12)
    plt.title('Logistic regression training loss', fontsize=14)
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
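To see the update rule at work, the sketch below runs the same kind of loop on a small synthetic dataset. The two Gaussian blobs, the learning rate of 0.1, and the 500 iterations are assumptions chosen for illustration. The block redefines compact versions of the helpers so that it runs on its own.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y, p, eps=1e-15):
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Synthetic data (an assumption for illustration): two Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 2)),
               rng.normal(+2.0, 1.0, size=(50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])

w, b = np.zeros(2), 0.0
alpha, m = 0.1, len(y)
losses = []
for _ in range(500):
    p = sigmoid(X @ w + b)          # forward pass
    losses.append(log_loss(y, p))   # loss before this update
    dz = p - y
    w -= alpha * (X.T @ dz) / m     # w := w - alpha * dJ/dw
    b -= alpha * dz.sum() / m       # b := b - alpha * dJ/db

accuracy = np.mean((sigmoid(X @ w + b) >= 0.5) == y)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}, accuracy: {accuracy:.2f}")
```

With zero initial parameters every prediction starts at 0.5, so the first recorded loss equals $\log 2 \approx 0.6931$. The loss then falls as the parameters move along the negative gradient.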

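3. Multiclass Logistic Regression

3.1 One-vs-Rest (OvR) Strategy

The OvR strategy turns a multiclass problem into several binary problems. For a K-class problem it trains K classifiers, each of which separates one class from all the others.

At prediction time, each of the K classifiers outputs the probability that the sample belongs to its own class, and the class with the highest probability is chosen as the final prediction.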
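A minimal NumPy sketch of OvR follows, under stated assumptions: the three-blob dataset is synthetic, and `fit_binary` is a hypothetical helper that trains one unregularized binary classifier with plain gradient descent. It is meant to illustrate the strategy, not to serve as a production implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_binary(X, y, alpha=0.1, n_iter=2000):
    """Plain gradient descent for one binary logistic regression (no regularization)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(n_iter):
        dz = sigmoid(X @ w + b) - y
        w -= alpha * (X.T @ dz) / len(y)
        b -= alpha * dz.mean()
    return w, b

# Synthetic 3-class data (an assumption for illustration): one blob per class.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
X = np.vstack([rng.normal(c, 0.5, size=(40, 2)) for c in centers])
y = np.repeat(np.arange(3), 40)

# Train K classifiers; classifier k treats class k as positive and the rest as negative.
models = [fit_binary(X, (y == k).astype(float)) for k in range(3)]

# Stack each classifier's P(class k | x) into a column and pick the largest.
scores = np.column_stack([sigmoid(X @ w + b) for w, b in models])
y_pred = scores.argmax(axis=1)
print("training accuracy:", np.mean(y_pred == y))
```

Because the K classifiers are trained independently, the rows of `scores` do not generally sum to 1. Only their ranking is used for the prediction.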

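3.2 Softmax Function

The Softmax function generalizes the Sigmoid function to multiclass problems. It converts the linear outputs of the K classes into a probability distribution:

$$P(y=k|x) = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}$$

where $z_k = w_k \cdot x + b_k$ is the linear output for class $k$.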


def softmax(z):
    """
    Softmax function.

    Args:
        z: linear outputs (n_samples, n_classes)
    Returns:
        p: class probabilities (n_samples, n_classes)
    """
    # Numerical stability: subtract each row's maximum before exponentiating
    z_shifted = z - np.max(z, axis=1, keepdims=True)
    exp_z = np.exp(z_shifted)
    p = exp_z / np.sum(exp_z, axis=1, keepdims=True)
    return p

# Test the Softmax function
z = np.array([[2.0, 1.0, 0.1],
              [1.0, 3.0, 0.2]])
p = softmax(z)
print("Softmax probabilities:")
print(p)
print(f"Row sums: {np.sum(p, axis=1)}")  # each row should sum to 1
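The link between Softmax and Sigmoid can be checked numerically. With K = 2, dividing numerator and denominator by $e^{z_1}$ gives $P(y=1|x) = 1 / (1 + e^{-(z_1 - z_0)})$, which is the Sigmoid of the difference between the two linear outputs. The sketch below verifies this on a few arbitrary example values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - np.max(z, axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Two-class linear outputs: column 0 is class 0, column 1 is class 1 (arbitrary values).
logits = np.array([[0.3, 1.5],
                   [2.0, -1.0],
                   [0.0, 0.0]])

p_softmax = softmax(logits)[:, 1]                 # P(y=1) from Softmax
p_sigmoid = sigmoid(logits[:, 1] - logits[:, 0])  # Sigmoid of z_1 - z_0
print(p_softmax)
print(p_sigmoid)
```

Both printed arrays agree up to floating-point rounding, which is why binary logistic regression needs only one weight vector rather than two.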

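4. Regularization

To keep logistic regression from overfitting, a regularization term can be added to the loss function. In the formulas below, $\lambda$ is the regularization coefficient and $n$ is the number of features.

4.1 L1 Regularization (Lasso)

$$J_{L1}(w, b) = -\frac{1}{m} \sum_{i=1}^{m}[y^{(i)} \log(\hat{y}^{(i)}) + (1-y^{(i)}) \log(1-\hat{y}^{(i)})] + \lambda \sum_{j=1}^{n} |w_j|$$

L1 regularization drives some weights to exactly 0, producing a sparse model, and therefore acts as a form of feature selection.

4.2 L2 Regularization (Ridge)

$$J_{L2}(w, b) = -\frac{1}{m} \sum_{i=1}^{m}[y^{(i)} \log(\hat{y}^{(i)}) + (1-y^{(i)}) \log(1-\hat{y}^{(i)})] + \lambda \sum_{j=1}^{n} w_j^2$$

L2 regularization shrinks the weights toward 0 but typically does not make them exactly 0, so every feature still contributes to the prediction.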


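def regularized_loss(X, y, weights, bias, lambda_reg, penalty='l2'):
    """
    Cross-entropy loss with a regularization term.

    Args:
        X: feature matrix (n_samples, n_features)
        y: label vector (n_samples,)
        weights: weight vector (n_features,)
        bias: bias
        lambda_reg: regularization coefficient
        penalty: 'l1' or 'l2'
    Returns:
        loss: regularized loss value
    """
    # Cross-entropy part
    z = np.dot(X, weights) + bias
    y_pred = sigmoid(z)
    epsilon = 1e-15
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    ce_loss = -np.mean(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred))

    # Regularization part (the bias is not penalized)
    if penalty == 'l1':
        reg_loss = lambda_reg * np.sum(np.abs(weights))
    else:  # l2
        reg_loss = lambda_reg * np.sum(weights ** 2)

    return ce_loss + reg_loss

# Demo inputs: synthetic data and hand-picked parameters, for illustration only
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
weights = np.array([1.0, 1.0])
bias = 0.0

# Compare different regularization coefficients
print("Effect of the regularization coefficient:")
for lambda_reg in [0.001, 0.01, 0.1, 1.0]:
    loss = regularized_loss(X, y, weights, bias, lambda_reg, penalty='l2')
    print(f"lambda = {lambda_reg:5.3f} | regularized loss = {loss:.6f}")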
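The penalty changes the gradient as well as the loss. For the L2 form used here, $\lambda \sum_j w_j^2$, the extra gradient term is $2\lambda w_j$. The sketch below adds that term to plain gradient descent and trains on synthetic data (an assumption for illustration) to show how the weight norm shrinks as $\lambda$ grows. `fit_l2` is a hypothetical helper, not part of the code above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_l2(X, y, lam, alpha=0.1, n_iter=2000):
    """Gradient descent on cross-entropy + lam * sum(w**2); the bias is not penalized."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(n_iter):
        dz = sigmoid(X @ w + b) - y
        dw = (X.T @ dz) / len(y) + 2.0 * lam * w   # d/dw_j of lam * w_j**2 is 2 * lam * w_j
        db = dz.mean()
        w -= alpha * dw
        b -= alpha * db
    return w, b

# Synthetic data (an assumption for illustration): labels depend mostly on the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(float)

norms = []
for lam in [0.0, 0.01, 0.1, 1.0]:
    w, _ = fit_l2(X, y, lam)
    norms.append(np.linalg.norm(w))
    print(f"lambda = {lam:<5} ||w|| = {norms[-1]:.4f}")
```

An L1 penalty would instead contribute $\lambda \cdot \mathrm{sign}(w_j)$ as a subgradient. Plain gradient descent handles that term poorly near zero, which is one reason libraries use specialized solvers for L1.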

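5. Use Cases

5.1 Binary Classification

Typical applications:

Spam detection (spam vs. not spam)
Disease diagnosis (diseased vs. healthy)
Credit risk assessment (default vs. no default)
Churn warning (churn vs. no churn)

5.2 Multiclass Classification

Typical applications:

Handwritten digit recognition (10 classes, 0-9)
Image classification (cat, dog, bird, and so on)
Text classification (news sorted into sports, technology, entertainment, and so on)
Disease subtype classification

5.3 Probability Prediction

Typical applications:

Click-through rate prediction (CTR prediction)
Risk assessment (probability of loan default)
Disease risk prediction
Customer purchase-intent scoring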

6. Hands-on Code

6.1 Binary Classification on the Iris Dataset


from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import numpy as np

# Load the Iris dataset
iris = load_iris()
print(f"Dataset shape: {iris.data.shape}")
print(f"Class labels: {iris.target_names}")

# Binary task: setosa vs. versicolor
# Keep only class 0 (setosa) and class 1 (versicolor)
mask = iris.target < 2
X = iris.data[mask]
y = iris.target[mask]

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

print(f"\nTraining set size: {len(X_train)}")
print(f"Test set size: {len(X_test)}")

# Create and train the logistic regression model
model = LogisticRegression(
    penalty='l2',      # use L2 regularization
    C=1.0,             # inverse of regularization strength; larger C means weaker regularization
    solver='lbfgs',    # optimization algorithm
    max_iter=200,      # maximum number of iterations
    random_state=42
)

model.fit(X_train, y_train)

# Predict on the test set
y_pred = model.predict(X_test)
y_pred_proba = model.predict_proba(X_test)

print(f"\nModel coefficients: {model.coef_}")
print(f"Model intercept: {model.intercept_}")
print(f"\nTest accuracy: {accuracy_score(y_test, y_pred):.4f}")

# Detailed classification report
print("\nClassification report:")
print(classification_report(y_test, y_pred, target_names=['setosa', 'versicolor']))

# Confusion matrix
print("Confusion matrix:")
print(confusion_matrix(y_test, y_pred))

# Predicted probabilities for a few samples
print("\nPredicted probabilities for the first 5 samples:")
print("True label | Predicted | P(setosa) | P(versicolor)")
print("-" * 55)
for i in range(5):
    print(f"   {y_test[i]:^7} |    {y_pred[i]:^5}    |   {y_pred_proba[i, 0]:.4f}   |    {y_pred_proba[i, 1]:.4f}")
6.2 Multiclass Classification on the Iris Dataset


from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
import numpy as np

# Load the full Iris dataset (3 classes)
iris = load_iris()
X = iris.data
y = iris.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

print(f"Number of classes: {len(np.unique(y))}")
print(f"Class distribution: {np.bincount(y)}")

# Create and train the multiclass logistic regression model.
# With solver='lbfgs' and more than two classes, scikit-learn fits a
# multinomial (Softmax) model by default. The explicit multi_class argument
# is deprecated since scikit-learn 1.5, so it is omitted here.
model_multi = LogisticRegression(
    solver='lbfgs',
    C=1.0,
    max_iter=200,
    random_state=42
)

model_multi.fit(X_train, y_train)

# Predict
y_pred_multi = model_multi.predict(X_test)
y_pred_proba_multi = model_multi.predict_proba(X_test)

print(f"\nMulticlass test accuracy: {accuracy_score(y_test, y_pred_multi):.4f}")

# Classification report
print("\nMulticlass classification report:")
print(classification_report(y_test, y_pred_multi, target_names=iris.target_names))

# Per-class probabilities
print("\nPredicted probabilities for the first 3 samples:")
print("True class | Predicted class | P(setosa) | P(versicolor) | P(virginica)")
print("-" * 70)
for i in range(3):
    pred_class = iris.target_names[y_pred_multi[i]]
    true_class = iris.target_names[y_test[i]]
    print(f"  {true_class:^8} |   {pred_class:^8}   |   {y_pred_proba_multi[i, 0]:.4f}   |    {y_pred_proba_multi[i, 1]:.4f}     |    {y_pred_proba_multi[i, 2]:.4f}")
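To connect the fitted model back to the Softmax formula in section 3.2, the sketch below recomputes the class probabilities by hand from the model's linear scores. It assumes the model was fit in multinomial mode, which is scikit-learn's default behavior for `lbfgs` on targets with more than two classes in recent versions; under an OvR fit the two results would differ.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(solver="lbfgs", max_iter=1000).fit(X, y)

# decision_function returns one linear score z_k per class, shape (n_samples, 3).
z = model.decision_function(X)
z = z - z.max(axis=1, keepdims=True)   # shift each row for numerical stability
manual = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)

# Compare the hand-computed Softmax with the library's probabilities.
matches = np.allclose(manual, model.predict_proba(X))
print("manual Softmax matches predict_proba:", matches)
```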

6.3 Visualizing the Decision Boundary


import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from matplotlib.colors import ListedColormap

def visualize_decision_boundary_2d():
    """
    Visualize the logistic regression decision boundary in a 2-D feature space,
    using synthetic data from make_classification.
    """
    # Generate 2-D classification data
    X, y = make_classification(
        n_samples=200,
        n_features=2,
        n_informative=2,
        n_redundant=0,
        n_classes=2,
        class_sep=1.5,
        random_state=42
    )

    # Train the logistic regression model
    model = LogisticRegression(solver='lbfgs', C=1.0, random_state=42)
    model.fit(X, y)

    # Build the grid
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.linspace(x1_min, x1_max, 300),
                           np.linspace(x2_min, x2_max, 300))

    # Predict the class of every grid point
    Z = model.predict(np.c_[xx1.ravel(), xx2.ravel()])
    Z = Z.reshape(xx1.shape)

    # Predicted probabilities, used for the probability contours
    Z_proba = model.predict_proba(np.c_[xx1.ravel(), xx2.ravel()])[:, 1]
    Z_proba = Z_proba.reshape(xx1.shape)

    fig, axes = plt.subplots(1, 2, figsize=(16, 6))

    # Left panel: class regions
    cmap_light = ListedColormap(['#FFAAAA', '#AAAAFF'])
    cmap_bold = ListedColormap(['#FF0000', '#0000FF'])

    axes[0].contourf(xx1, xx2, Z, alpha=0.3, cmap=cmap_light)
    axes[0].contour(xx1, xx2, Z, colors='k', linewidths=1)
    scatter = axes[0].scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold,
                               edgecolors='black', s=60)
    axes[0].set_xlabel('Feature 1', fontsize=12)
    axes[0].set_ylabel('Feature 2', fontsize=12)
    axes[0].set_title('Logistic regression decision boundary', fontsize=14)

    # Right panel: probability heat map with the p = 0.5 contour
    im = axes[1].contourf(xx1, xx2, Z_proba, levels=20, cmap='RdYlBu_r')
    axes[1].contour(xx1, xx2, Z_proba, levels=[0.5], colors='k', linewidths=2)
    plt.colorbar(im, ax=axes[1], label='Predicted probability P(y=1)')
    scatter = axes[1].scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold,
                               edgecolors='black', s=60)
    axes[1].set_xlabel('Feature 1', fontsize=12)
    axes[1].set_ylabel('Feature 2', fontsize=12)
    axes[1].set_title('Predicted probability heat map', fontsize=14)

    plt.tight_layout()
    plt.show()

    # Print the fitted boundary equation
    print(f"Decision boundary: {model.coef_[0, 0]:.4f}*x1 + {model.coef_[0, 1]:.4f}*x2 + {model.intercept_[0]:.4f} = 0")

# Run the visualization
visualize_decision_boundary_2d()

6.4 Logistic Regression vs. Linear Regression


import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import mean_squared_error, r2_score

def compare_linear_logistic():
    """
    Compare linear regression and logistic regression on a binary target.
    """
    # Generate 1-D classification data
    X, y = make_classification(
        n_samples=100,
        n_features=1,
        n_informative=1,
        n_redundant=0,
        n_classes=2,
        n_clusters_per_class=1,
        class_sep=0.8,
        random_state=42
    )

    # Linear regression
    linear_model = LinearRegression()
    linear_model.fit(X, y)
    y_linear_pred = linear_model.predict(X)

    # Logistic regression
    logistic_model = LogisticRegression(solver='lbfgs', random_state=42)
    logistic_model.fit(X, y)
    y_logistic_proba = logistic_model.predict_proba(X)[:, 1]

    # Side-by-side plots
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    X_plot = np.linspace(X.min(), X.max(), 300).reshape(-1, 1)

    # Left panel: linear regression
    axes[0].scatter(X, y, c=y, cmap='RdYlBu', edgecolors='black', s=60, alpha=0.7)
    axes[0].plot(X_plot, linear_model.predict(X_plot), 'r-', linewidth=2, label='Linear regression')
    axes[0].axhline(y=0.5, color='g', linestyle='--', label='Threshold (y=0.5)')
    axes[0].set_xlabel('Feature X', fontsize=12)
    axes[0].set_ylabel('Target y', fontsize=12)
    axes[0].set_title('Linear regression (poorly suited to classification)', fontsize=14)
    axes[0].legend()
    axes[0].set_ylim(-0.5, 1.5)

    # Linear regression metrics
    mse = mean_squared_error(y, y_linear_pred)
    r2 = r2_score(y, y_linear_pred)
    axes[0].text(0.05, 0.95, f'MSE: {mse:.4f}\nR²: {r2:.4f}',
                 transform=axes[0].transAxes, fontsize=10,
                 verticalalignment='top', bbox=dict(boxstyle='round', facecolor='wheat'))

    # Right panel: logistic regression
    axes[1].scatter(X, y, c=y, cmap='RdYlBu', edgecolors='black', s=60, alpha=0.7)
    axes[1].plot(X_plot, logistic_model.predict_proba(X_plot)[:, 1], 'r-', linewidth=2, label='Logistic regression (Sigmoid)')
    axes[1].axhline(y=0.5, color='g', linestyle='--', label='Threshold (p=0.5)')
    # The decision boundary is the x where z = w*x + b = 0
    axes[1].axvline(x=-logistic_model.intercept_[0]/logistic_model.coef_[0, 0],
                    color='orange', linestyle=':', linewidth=2, label='Decision boundary (z=0)')
    axes[1].set_xlabel('Feature X', fontsize=12)
    axes[1].set_ylabel('Probability P(y=1)', fontsize=12)
    axes[1].set_title('Logistic regression (suited to classification)', fontsize=14)
    axes[1].legend()
    axes[1].set_ylim(-0.1, 1.1)

    plt.tight_layout()
    plt.show()

    # Print the fitted parameters of both models
    print("=" * 50)
    print("Model parameters:")
    print(f"Linear regression:   y = {linear_model.coef_[0]:.4f}*x + {linear_model.intercept_:.4f}")
    print(f"Logistic regression: z = {logistic_model.coef_[0, 0]:.4f}*x + {logistic_model.intercept_[0]:.4f}")
    print(f"                     P(y=1) = 1 / (1 + exp(-z))")
    print("=" * 50)

# Run the comparison
compare_linear_logistic()
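One concrete reason a straight-line fit is a poor probability model is that its output is unbounded. The NumPy-only sketch below fits ordinary least squares to a toy binary target (the data is an assumption for illustration) and shows that the fitted values leave the interval [0, 1], whereas a Sigmoid output never can.

```python
import numpy as np

# Toy 1-D data (an assumption for illustration): the label is 1 when x > 0.
x = np.linspace(-5.0, 5.0, 21)
y = (x > 0).astype(float)

# Ordinary least squares fit of y ~ slope * x + intercept.
# np.polyfit returns coefficients from the highest degree down.
slope, intercept = np.polyfit(x, y, deg=1)
linear_pred = slope * x + intercept

print(f"linear prediction range: [{linear_pred.min():.3f}, {linear_pred.max():.3f}]")
```

Values below 0 or above 1 have no meaning as probabilities, so thresholding a linear fit at 0.5 gives a classifier but not a calibrated probability estimate.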

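7. Summary

This article covered the principles and usage of logistic regression in detail, including:

Core principle: the Sigmoid function maps the linear output to a probability, and the decision boundary is given by $wx+b=0$
Loss function: cross-entropy loss, with parameters optimized by gradient descent
Multiclass extension: two approaches, the OvR strategy and the Softmax function
Regularization: both L1 and L2 help prevent overfitting; L1 additionally yields sparse models, while L2 shrinks weights without zeroing them
Use cases: binary classification, multiclass classification, probability prediction, and other tasks

As a foundational machine learning algorithm, logistic regression has the following strengths:

The model is simple, easy to understand, and easy to implement
It is highly interpretable, and its coefficients carry a clear business meaning
It trains quickly and suits large-scale data
It outputs probabilities, which is convenient for risk assessment

Logistic regression also has limitations:

It can only learn a decision boundary that is linear in the input features
It depends heavily on feature engineering
On complex classification tasks it tends to underperform ensemble methods

Readers are encouraged to practice on real problems until they are comfortable with logistic regression, which lays a solid foundation for learning more complex machine learning algorithms.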