Optimization Algorithms: The Ideas Behind Genetic Algorithms, with Worked Examples

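There are many optimization algorithms. Commonly used ones include particle swarm optimization, ant colony optimization, simulated annealing, genetic algorithms, immune algorithms (IA), differential evolution (DE), and gradient descent. These algorithms share some common ideas, roughly the following steps:
1. Abstract the real problem into an objective function, i.e. build a mathematical model. Optimizing then means finding an extremum of this function; the closer a solution is to the extremum, the better its fitness.
2. At the start, randomly initialize a set of solutions within the domain.
3. Within the domain, change the values of the independent variables according to some rule and probability, then recompute the objective function. The change can be made in several ways: perturb the variables by a random step with some probability, alter one element of the variable vector (mutation), or reorder the elements of the vector. These operators go by various names, such as crossover, mutation, vaccination, and learning. In essence they all randomly change the independent variables within the domain with some probability, so that over n iterations the search covers as much of the domain as possible.
4. In each iteration, evaluate the objective function; solutions closer to the extremum (individuals with better fitness) replace the worse ones.
5. Stop iterating once the result is within the expected error tolerance or the extremum has been found.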
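
The five steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the objective function `objective`, the helper name `random_search`, and all parameter values are made up for this example, and it minimizes rather than maximizes.

```python
import random

def objective(x):
    """Hypothetical objective (step 1): a parabola whose minimum is at x = 3."""
    return (x - 3.0) ** 2

def random_search(objective, low, high, pop_size=20, iterations=200, step=0.5, seed=0):
    rng = random.Random(seed)
    # Step 2: random initial solutions inside the domain [low, high]
    population = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(iterations):
        # Step 3: randomly perturb every solution, clamped so it stays in the domain
        candidates = [min(high, max(low, x + rng.uniform(-step, step))) for x in population]
        # Step 4: a candidate replaces its predecessor only if it is better
        population = [c if objective(c) < objective(x) else x
                      for x, c in zip(population, candidates)]
    # Step 5: the stopping rule here is simply a fixed iteration budget
    return min(population, key=objective)

best = random_search(objective, low=-10.0, high=10.0)
print(best, objective(best))
```

A genetic algorithm follows the same outline but replaces the simple perturbation in step 3 with selection, crossover, and mutation.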

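The rest of this article uses the genetic algorithm as the example.

Concept 1: Genes and Chromosomes

Genotype

In a genetic algorithm, we first map the problem to be solved onto a mathematical problem, the so-called "mathematical modeling" step. A feasible solution of that problem is called a "chromosome". A feasible solution usually consists of several variables, and each of these variables is called a "gene" on the chromosome.

Crossover and mutation reorder or locally modify the variables of a feasible solution. The changes must stay within the domain so that the result is a new feasible solution, whose fitness is then computed.

For example, a chromosome can be represented as a binary string in which each bit is one gene.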

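As another example, for the inequality below, [1,2,3], [1,3,2], and [3,2,1] are all feasible solutions (substituting them satisfies the inequality), so in a genetic algorithm each of them is a chromosome.

3x + 4y + 5z < 100

Each of these feasible solutions is made up of three variables, and each variable is one gene of the chromosome.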
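
To make the terminology concrete, here is a small Python sketch that treats each list as a chromosome and each entry as a gene, and checks feasibility against the inequality above. The helper name `is_feasible` and the fourth candidate `[10, 10, 10]` are assumptions added for illustration.

```python
def is_feasible(chromosome):
    """Return True if the chromosome satisfies 3x + 4y + 5z < 100."""
    x, y, z = chromosome  # each variable is one gene
    return 3 * x + 4 * y + 5 * z < 100

# The three chromosomes from the text plus one infeasible candidate
chromosomes = [[1, 2, 3], [1, 3, 2], [3, 2, 1], [10, 10, 10]]
feasible = [c for c in chromosomes if is_feasible(c)]
print(feasible)
```

The last candidate is rejected because 3*10 + 4*10 + 5*10 = 120, which is not below 100.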

Population

A genetic algorithm maintains a large number of individuals: a collection of candidate solutions to the problem at hand.

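Concept 2: Fitness Function

In each iteration of the algorithm, individuals are evaluated with the fitness function (also called the objective function). The objective function is the function being optimized, i.e. the rule whose extremum we are trying to find (the problem to be solved is defined as a function, and we look for that function's extremum).

Individuals with better fitness represent better solutions and are more likely to be selected as parents.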
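
One common way to turn fitness into a selection probability is roulette wheel (fitness-proportionate) selection. The sketch below is a minimal Python version under the assumption that all fitness values are non-negative; the function name `roulette_select` and the toy population are illustrative only.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    point = rng.uniform(0, total)   # where the wheel stops
    cumulative = 0.0
    for individual, fit in zip(population, fitnesses):
        cumulative += fit
        if cumulative >= point:
            return individual
    return population[-1]           # guard against floating-point round-off

rng = random.Random(42)
population = ["A", "B", "C"]
fitnesses = [1.0, 1.0, 8.0]         # C holds 80% of the total fitness
counts = {name: 0 for name in population}
for _ in range(10_000):
    counts[roulette_select(population, fitnesses, rng)] += 1
print(counts)
```

Over many draws, C is chosen in roughly 80% of cases, while A and B still get an occasional chance, which keeps some diversity among the parents.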

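Concept 3: Crossover

Each iteration of a genetic algorithm produces N chromosomes, and each iteration is called one "generation" of evolution. New individuals are produced by crossover, which combines parts of two parent chromosomes into a child, and by mutation (in essence, both change the function's independent variables within the domain).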
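
A minimal Python sketch of single-point crossover on list-encoded chromosomes follows. The function name and the all-zero and all-one parents are assumptions chosen so that the origin of each gene in the children is easy to see.

```python
import random

def single_point_crossover(parent1, parent2, rng):
    """Cut both parents at the same random point and swap the tails."""
    point = rng.randint(1, len(parent1) - 1)  # cut strictly inside the chromosome
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

rng = random.Random(1)
p1 = [0, 0, 0, 0, 0, 0]
p2 = [1, 1, 1, 1, 1, 1]
c1, c2 = single_point_crossover(p1, p2, rng)
print(c1, c2)
```

This variant suits binary or real-valued genes. For permutation encodings such as a city tour, the tail has to be filled in a way that avoids duplicate cities.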

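Concept 4: Mutation

Crossover tends to carry good genes into each new generation, but the gene pool stays the same; only the way the genes are combined changes. To find the global optimum, mutation has to be introduced.

The purpose of mutation is to randomly refresh the population from time to time, introducing new patterns into the chromosomes so that as much of the domain as possible is covered.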
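
For a binary chromosome, the simplest mutation is a bit flip. The Python sketch below flips each gene independently with probability `rate`; the function name and the rate of 0.05 are assumptions for illustration.

```python
import random

def bit_flip_mutation(chromosome, rate, rng):
    """Return a copy of the chromosome with each bit flipped with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

rng = random.Random(7)
original = [0] * 1000
mutated = bit_flip_mutation(original, rate=0.05, rng=rng)
flipped = sum(mutated)  # every 1 in the result is a flipped bit
print(flipped)
```

With a rate of 0.05, about 50 of the 1000 bits are expected to flip. A rate that is too high turns the search into random guessing, while a rate of zero leaves the gene pool unchanged.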

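Concept 5: Selection and Copying (Elitism)

In each generation, to preserve the best chromosomes of the previous generation, the few chromosomes with the highest fitness are copied unchanged into the next generation.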
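
The copying step can be sketched as follows in Python. The function `next_generation`, the toy fitness function, and the toy variation operator `make_child` are all hypothetical; the point is only that the best fitness in the population can never get worse once the elites are carried over.

```python
import random

def next_generation(population, fitness_fn, make_child, elite_count=2):
    """Copy the best `elite_count` individuals unchanged, fill the rest with children."""
    ranked = sorted(population, key=fitness_fn, reverse=True)  # best first
    elites = ranked[:elite_count]
    children = [make_child(ranked) for _ in range(len(population) - elite_count)]
    return elites + children

rng = random.Random(3)
fitness_fn = lambda x: -(x - 7) ** 2  # hypothetical fitness, best at x = 7
make_child = lambda ranked: rng.choice(ranked) + rng.choice([-1, 1])  # toy variation

population = [0, 2, 5, 11, 20, 4]
best_per_generation = []
for _ in range(10):
    best_per_generation.append(max(fitness_fn(x) for x in population))
    population = next_generation(population, fitness_fn, make_child)
print(best_per_generation)
```

The recorded best fitness is non-decreasing from one generation to the next, which is exactly the guarantee elitism provides.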

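Steps of the genetic algorithm:

Take the shortest-route problem as an example. First build the model: define an adjacency matrix. An individual (a feasible solution) is one order in which the cities are visited, and the objective function computes the total length of that route, including the return to the starting city. The optimization idea is to compare routes and prefer the one with the smaller total length (better fitness). The Java code is below, with comments: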


package algorithm;

import java.util.Arrays;
import java.util.Random;

public class GeneticAlgorithm {

    final static int INF = 10000; // large penalty distance for city pairs with no direct link
    static int[][] matrix = {
            {0, 6, 3, INF, INF, INF}, // 0
            {6, 0, 2, 5, INF, INF},   // 1
            {3, 2, 0, 3, 4, INF},     // 2
            {INF, 5, 3, 0, 2, 3},     // 3
            {INF, INF, 4, 2, 0, 5},   // 4
            {INF, INF, INF, 3, 5, 0}  // 5
    };

    // Parameters: population size and number of generations
    static int populationSize = 10;  // population size
    static int generations = 5000;   // number of generations, i.e. iterations
    static int numberOfCities = matrix.length;
    static Random random = new Random();

    public static void main(String[] args) {
        // Initialize the population: 10 random paths
        int[][] population = initializePopulation(populationSize, numberOfCities);

        for (int g = 0; g < generations; g++) {
            // Compute the fitness of every individual (the reciprocal of its path length)
            double[] fitness = calculateFitness(population);

            // Replace the population with a new generation bred by selection, crossover, and mutation
            population = createNewGeneration(population, fitness);
        }

        // Print the shortest path in the final generation
        double[] finalFitness = calculateFitness(population);
        int bestIndex = getBestIndex(finalFitness);
        System.out.println("Best path: " + Arrays.toString(population[bestIndex]));
        System.out.println("Best path length: " + calculatePathLength(population[bestIndex]));
    }

    // Initialize the population with random paths (here 10 individuals)
    static int[][] initializePopulation(int size, int numberOfCities) {
        int[][] population = new int[size][numberOfCities];
        for (int i = 0; i < size; i++) {
            population[i] = generateRandomPath(numberOfCities);
        }
        return population;
    }

    // Generate one random path
    static int[] generateRandomPath(int numberOfCities) {
        int[] path = new int[numberOfCities];
        for (int i = 0; i < numberOfCities; i++) {
            path[i] = i;
        }
        // Fisher-Yates shuffle, so that every permutation is equally likely
        for (int i = numberOfCities - 1; i > 0; i--) {
            int j = random.nextInt(i + 1);
            // Swap
            int temp = path[i];
            path[i] = path[j];
            path[j] = temp;
        }
        return path;
    }

    // Calculate the fitness of the population: the reciprocal of each path length
    static double[] calculateFitness(int[][] population) {
        double[] fitness = new double[population.length];
        for (int i = 0; i < population.length; i++) {
            fitness[i] = 1.0 / calculatePathLength(population[i]);
        }
        return fitness;
    }

    // Total length of one individual's path, treated as a closed tour
    static int calculatePathLength(int[] path) {
        int totalLength = 0;
        for (int i = 0; i < path.length; i++) {
            totalLength += matrix[path[i]][path[(i + 1) % path.length]]; // Loop back to the start
        }
        return totalLength; // Return total path length
    }

    // Create a new generation using selection, crossover, and mutation
    static int[][] createNewGeneration(int[][] population, double[] fitness) {
        int[][] newPopulation = new int[population.length][];
        for (int i = 0; i < population.length; i++) {
            int[] parent1 = selectParent(population, fitness);
            int[] parent2 = selectParent(population, fitness);
            int[] child = crossover(parent1, parent2);
            mutate(child);
            newPopulation[i] = child;
        }
        return newPopulation;
    }

    // Select a parent using roulette wheel selection
    static int[] selectParent(int[][] population, double[] fitness) {
        double totalFitness = Arrays.stream(fitness).sum();
        double selectionPoint = random.nextDouble() * totalFitness;
        double cumulativeFitness = 0.0;

        for (int i = 0; i < population.length; i++) {
            cumulativeFitness += fitness[i];
            if (cumulativeFitness >= selectionPoint) {
                return population[i];
            }
        }
        return population[population.length - 1]; // Default return
    }

    // Crossover operator: produces a new individual
    static int[] crossover(int[] parent1, int[] parent2) {
        int[] child = new int[parent1.length];
        boolean[] visited = new boolean[parent1.length];

        // Randomly select the crossover point at which the parents are split
        int crossoverPoint = random.nextInt(parent1.length);

        // Copy part from the first parent
        for (int i = 0; i < crossoverPoint; i++) {
            child[i] = parent1[i];
            visited[parent1[i]] = true;
        }

        // Fill the remaining from the second parent
        int childIndex = crossoverPoint;
        for (int i = 0; i < parent2.length; i++) {
            if (!visited[parent2[i]]) {
                child[childIndex++] = parent2[i];
            }
        }
        return child;
    }

    // Mutation operator: swap two cities in the individual's path
    static void mutate(int[] path) {
        if (random.nextDouble() < 0.2) { // mutation probability of 20%
            int index1 = random.nextInt(path.length);
            int index2 = random.nextInt(path.length);
            // Swap
            int temp = path[index1];
            path[index1] = path[index2];
            path[index2] = temp;
        }
    }

    // Get the index of the best individual based on fitness
    static int getBestIndex(double[] fitness) {
        int bestIndex = 0;
        for (int i = 1; i < fitness.length; i++) {
            if (fitness[i] > fitness[bestIndex]) {
                bestIndex = i;
            }
        }
        return bestIndex;
    }
}

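Code notes:

Adjacency matrix: defines the distances between cities.

Population initialization: generates random paths to form the initial population.

Fitness calculation: computes the path length of each individual and uses its reciprocal as the fitness.

Selection operator: uses roulette wheel selection, so fitter individuals are more likely to become parents.

Crossover operator: builds a new individual by copying part of one parent's path and filling the rest from the other parent.

Mutation operator: randomly swaps two cities of an individual to increase diversity.

Main loop: runs the specified number of generations, creating a new population each time, and reports the best path in the final generation.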

Here is another example, Python code that searches for an extremum of a function:


import random

import numpy as np
from matplotlib import pyplot as plt

"""
Genetic algorithm for finding the extremum of a function
"""
def fit_function(x):
    """ Objective function """
    return x * x * np.sin(x)


class GA:
    def __init__(self, function, domain=50, pc=0.6, pm=0.01, M=50, popsize=50, length=10):
        self.pc = pc  # crossover probability
        self.pm = pm  # mutation probability
        self.function = function
        self.length = length  # chromosome length, i.e. number of bits (default 10)
        self.popsize = popsize  # population size: number of initial solutions drawn at random from the domain
        # np.random.randint(low, high, size) draws random integers in [low, high).
        # With low=0 and high=2 every element is 0 or 1, so this builds a
        # popsize x length array of random bits; each row encodes one x as a 10-bit binary number.
        self.pop = np.random.randint(0, 2, size=(popsize, length))
        self.M = M  # number of generations (default 50)
        self.domain = domain  # upper bound of the domain, here [0, 50)

    def toXcllection(self, pop):
        """ Decode each binary chromosome into a number inside the domain (binary to decimal) """
        # new starts as an all-zero array with the same shape as the population and holds intermediate results.
        new = np.zeros(shape=(self.popsize, self.length))

        for i in range(self.length):  # weight each bit by its place value
            new[:, i] = 2 ** (self.length - 1 - i) * pop[:, i]

        # Sum each row to get the chromosome's integer value, then scale by
        # self.domain / 2**length so that the result falls in [0, domain).
        new = self.domain * np.sum(new, axis=1) / 2 ** (self.length)

        # Return the decoded value of every chromosome.
        return new

    def selection(self, pop):
        """ Selection operator: roulette wheel weighted by |f(x)| """
        value = self.toXcllection(pop)
        weights = np.abs(self.function(value))
        idx = np.random.choice(np.arange(self.popsize), replace=True, size=self.popsize,
                               p=weights / weights.sum())
        return pop[idx]

    def mutation(self, pop, pm):
        """ Mutation operator: with probability pm, flip one random bit of a chromosome """
        x, y = pop.shape
        newpop = pop.copy()
        for i in range(x):  # x is the population size
            if np.random.rand() < pm:
                mpoint = random.randint(0, y - 1)
                newpop[i, mpoint] = 1 - newpop[i, mpoint]
        return newpop

    def crossover(self, pop, pc):
        """ Crossover operator: single-point crossover on consecutive pairs (assumes an even population size) """
        x, y = pop.shape
        newpop = np.ones((x, y))
        for i in range(0, x, 2):
            if np.random.rand() < pc:  # cross the pair with probability pc
                cn = random.randint(0, y - 1)  # random crossover point
                newpop[i, 0:cn] = pop[i, 0:cn]
                newpop[i, cn:y] = pop[i + 1, cn:y]
                newpop[i + 1, 0:cn] = pop[i + 1, 0:cn]
                newpop[i + 1, cn:y] = pop[i, cn:y]
            else:
                newpop[i, :] = pop[i, :]
                newpop[i + 1, :] = pop[i + 1, :]  # keep the second parent unchanged
        return newpop

    def main(self):
        """ Main loop: apply the three operators in turn; each generation replaces self.pop, a set of x values in the domain """
        for i in range(self.M):  # evolve for M generations
            newpop = self.selection(pop=self.pop)            # selection
            newpop = self.crossover(pop=newpop, pc=self.pc)  # crossover
            newpop = self.mutation(pop=newpop, pm=self.pm)   # mutation
            self.pop = newpop
        newpop = self.selection(pop=self.pop)
        return self.toXcllection(newpop)  # decoded x values of the final population


g = GA(function=fit_function)
x = np.arange(0, 50, 0.1)
y = fit_function(x)
plt.plot(x, y, color="green")
xx = g.main()
yy = fit_function(xx)
plt.scatter(xx, yy, color='red')
plt.show()

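Limitations
1. Problem-specific definitions are required

Applying a genetic algorithm to a given problem requires creating a suitable representation for it: a fitness function and a chromosome structure, plus selection, crossover, and mutation operators that suit the problem.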
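2. Hyperparameter tuning

The behavior of a genetic algorithm is controlled by a set of hyperparameters such as population size and mutation rate. There is no standard rule for setting them when applying a genetic algorithm to a particular problem.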
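3. Computationally intensive

Large populations can require a great deal of computation, and reaching a good result may take a long time.

This can be mitigated by choosing hyperparameters carefully, by parallel processing, and in some cases by caching intermediate results.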
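
As one illustration of caching, identical chromosomes often reappear in a population, so their fitness only needs to be computed once. The Python sketch below uses `functools.lru_cache` from the standard library; the fitness function and the `evaluations` counter are hypothetical and stand in for an expensive evaluation.

```python
from functools import lru_cache

evaluations = 0  # counts how often the expensive function body really runs

@lru_cache(maxsize=None)
def fitness(chromosome):
    """Hypothetical expensive fitness; the chromosome must be hashable (a tuple)."""
    global evaluations
    evaluations += 1
    return -sum((gene - 3) ** 2 for gene in chromosome)

# Duplicates are common once selection starts copying good individuals
population = [(1, 2, 3), (3, 3, 3), (1, 2, 3), (3, 3, 3), (0, 0, 0)]
scores = [fitness(c) for c in population]
print(scores, evaluations)
```

Five individuals are scored, but the function body runs only three times. The chromosomes are tuples rather than lists because cache keys have to be hashable.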
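4. Premature convergence

If one individual is much fitter than the rest of the population, its copies may multiply enough to take over the whole population. The genetic algorithm can then get stuck in a local maximum too early instead of finding the global maximum. Preventing this requires maintaining diversity in the population.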
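
One simple way to monitor diversity, offered here only as an assumption about how it might be done, is the mean pairwise Hamming distance between binary chromosomes. A value near zero signals that the population has collapsed onto a single solution. The function names below are illustrative.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two chromosomes differ."""
    return sum(x != y for x, y in zip(a, b))

def mean_pairwise_hamming(population):
    """Average Hamming distance over all pairs of individuals."""
    pairs = list(combinations(population, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)

diverse = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 1, 0, 1], [1, 0, 1, 0]]
converged = [[1, 1, 0, 0]] * 4  # every individual is the same

print(mean_pairwise_hamming(diverse))
print(mean_pairwise_hamming(converged))
```

When the measure drops too low, typical responses are raising the mutation rate or switching to a selection scheme with weaker selection pressure.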
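5. No guarantee on solution quality

Using a genetic algorithm does not guarantee finding the global maximum of the problem (though almost every search and optimization algorithm has this issue, unless the problem is of a specific type that admits an analytical solution).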

Case 3: evolving a convolutional neural network with a genetic algorithm


import random
import numpy as np
from deap import creator, base, tools, algorithms
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
from tensorflow.keras.utils import to_categorical

# Load the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()
x_train, x_test = x_train.astype('float32') / 255.0, x_test.astype('float32') / 255.0
y_train, y_test = to_categorical(y_train, num_classes=10), to_categorical(y_test, num_classes=10)

# Define the fitness and individual types
creator.create("FitnessMax", base.Fitness, weights=(1.0,))  # maximize the fitness
creator.create("Individual", list, fitness=creator.FitnessMax)

# Generator function for a random CNN architecture
def create_cnn_architecture():
    architecture = []  # named to avoid shadowing the imported keras `layers` module
    num_layers = random.randint(3, 6)  # random number of layers
    for _ in range(num_layers):
        layer_type = random.choice(['conv', 'pool', 'dense'])
        if layer_type == 'conv':
            channels = random.choice([16, 32, 64])  # number of output channels
            kernel_size = random.choice([3, 5])  # kernel size
            stride = random.choice([1, 2])  # stride
            architecture.append(('conv', channels, kernel_size, stride))
        elif layer_type == 'pool':
            pool_size = random.choice([2, 3])  # pooling window size
            architecture.append(('pool', pool_size))
        elif layer_type == 'dense':
            units = random.choice([64, 128, 256])  # number of units in the fully connected layer
            architecture.append(('dense', units))
    return architecture

# Custom crossover operator: swap the tails of two architectures at a random cut point
def crossover(ind1, ind2):
    cx_point = random.randint(1, min(len(ind1), len(ind2)) - 1)
    ind1[cx_point:], ind2[cx_point:] = ind2[cx_point:], ind1[cx_point:]
    return ind1, ind2

# Custom mutation operator: re-draw the parameters of one randomly chosen layer
def mutate(ind):
    if random.random() < 0.5:  # mutate with 50% probability
        index = random.randint(0, len(ind) - 1)
        layer_type = ind[index][0]

        if layer_type == 'conv':
            channels = random.choice([16, 32, 64])
            kernel_size = random.choice([3, 5])
            stride = random.choice([1, 2])
            ind[index] = ('conv', channels, kernel_size, stride)
        elif layer_type == 'pool':
            pool_size = random.choice([2, 3])
            ind[index] = ('pool', pool_size)
        elif layer_type == 'dense':
            units = random.choice([64, 128, 256])
            ind[index] = ('dense', units)
    return ind

# Configure the genetic algorithm toolbox
toolbox = base.Toolbox()
toolbox.register("individual", tools.initIterate, creator.Individual, create_cnn_architecture)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("mates", crossover)  # register the crossover operator
toolbox.register("mutate", mutate)  # register the mutation operator
toolbox.register("select", tools.selTournament, tournsize=3)  # tournament selection

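def build_and_evaluate(individual):
    """Build the CNN described by an individual and return its accuracy as the fitness."""
    model = models.Sequential()
    model.add(layers.Input(shape=(32, 32, 3)))
    flattened = False  # becomes True once a dense layer has flattened the feature maps

    for layer in individual:
        if layer[0] == 'conv' and not flattened:
            # padding='same' keeps the feature map from shrinking below the kernel size
            model.add(layers.Conv2D(layer[1], layer[2], strides=layer[3],
                                    padding='same', activation='relu'))
        elif layer[0] == 'pool' and not flattened:
            model.add(layers.MaxPooling2D(pool_size=(layer[1], layer[1]), padding='same'))
        elif layer[0] == 'dense':
            if not flattened:
                model.add(layers.Flatten())
                flattened = True
            model.add(layers.Dense(layer[1], activation='relu'))
        # conv/pool genes that come after a dense layer are skipped:
        # they cannot be applied to an already flattened tensor

    if not flattened:
        model.add(layers.Flatten())  # the output layer needs a flat input
    model.add(layers.Dense(10, activation='softmax'))  # output layer

    # Compile the model
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    # Train the model
    model.fit(x_train, y_train, epochs=5, batch_size=64, validation_split=0.1, verbose=0)

    # Evaluate the model; note that this uses the test set as the fitness signal,
    # so the final accuracy is an optimistic estimate
    _, accuracy = model.evaluate(x_test, y_test, verbose=0)
    return (accuracy,)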

toolbox.register("evaluate", build_and_evaluate)

# Evolution loop
def main():
    population = toolbox.population(n=10)  # population size
    NGEN = 10  # number of generations
    for gen in range(NGEN):
        # Evaluate the fitness of every individual
        fits = list(map(toolbox.evaluate, population))
        for fit, ind in zip(fits, population):
            ind.fitness.values = fit

        # Selection
        offspring = toolbox.select(population, len(population))
        offspring = list(map(toolbox.clone, offspring))

        # Crossover and mutation
        for child1, child2 in zip(offspring[::2], offspring[1::2]):
            if random.random() < 0.5:  # crossover probability
                toolbox.mates(child1, child2)

        for mutant in offspring:
            if random.random() < 0.2:  # mutation probability
                toolbox.mutate(mutant)

        # Replace the population with the offspring
        population[:] = offspring

    # Evaluate the final population and return the best individual
    fits = list(map(toolbox.evaluate, population))
    for fit, ind in zip(fits, population):
        ind.fitness.values = fit

    best_individuals = tools.selBest(population, k=1)
    return best_individuals

if __name__ == "__main__":
    best_individuals = main()
    print(f"Best individual: {best_individuals}")