MATLAB环境下基于PSO-DBN-ELM方法的图像分类

在纯数据驱动的图像识别方法中，深度信念网络DBN识别模型具备较好的识别性能。对于DBN模型而言，可利用的数据越多，挖掘的信息也越多，建立的模型就越准确。然而DBN本身仍存在一定的不足之处，一方面由于DBN内部包含多层限制玻尔兹曼机RBM，在训练过程中需要针对每一层RBM进行网络参数的选取，又因为其网络训练的黑箱性质，会导致整个调参过程相当繁琐，甚至出现无法收敛的结果，且目前没有DBN网络架构设计相关的标准规则。另一方面DBN本身引入传统的BP算法作为回归层，也继承了其易局部最小化、训练速度慢等问题。

目前对于DBN的改进方法一般分为两种，一是通过预处理(特征融合等)提高输入数据质量，二是通过优化策略(Dropout等)改进模型参数或者选取不同的分类/回归器替换BP层。极限学习机ELM拥有学习速度快、泛化能力强等特点，可以很大程度上改善DBN本身收敛速度过慢、容易陷入局部最优、非线性性能不足等问题，从而提高模型的准确率。

粒子群优化算法PSO是一种群体智能的模型优化算法，主要通过群体信息共享来获取最优解。PSO受启发于鸟类寻找食物时的群体协作能力，它将目标函数类比为鸟群中的鸟，即粒子。函数的在固定范围内寻求最优解类比为鸟在相应范围内的位置上寻找食物。粒子的移动方向依靠每个位置的适应度来决定，适应度越高代表越靠近最优解。每个粒子都拥有独立的位置和移动速度参数，在粒子群体更新过程中，粒子会向适应度最高的方向移动，也有可能向随机方向移动，但不会超出设定范围。

鉴于PSO、DBN和ELM的优势，提出一种基于PSO-DBN-ELM方法的图像分类模型，通过3层DBN提取嵌入在原始图像数据中的特征，然后将特征输入极限学习机(ELM)进行分类。针对隐节点选择困难的问题，采用粒子群算法对隐节点进行自动选择，并以最小的适应度函数(即DBN-ELM交叉验证分类的准确率)为目标。

部分代码如下：

复制代码

clear all
close all
format compact 
format long
%% 1.数据加载
fprintf(1,'加载数据 \n');
load('drivFace600');%其中1-173为1类，174-343为2类 344-510为3类 511-600为4类，各选择20%作为测试集
%第一类173组
[i1 i2]=sort(rand(173,1)); 
train(1:139,:)=input(i2(1:139),:);     train_label(1:139,1)=output(i2(1:139),1);
test(1:34,:)=input(i2(140:173),:);     test_label(1:34,1)=output(i2(140:173),1);
%第二类有170组
[i1 i2]=sort(rand(170,1));
train(140:275,:)=input(173+i2(1:136),:);    train_label(140:275,1)=output(173+i2(1:136),1);
test(35:68,:)=input(173+i2(137:170),:);     test_label(35:68,1)=output(173+i2(137:170),1);
%第三类有167
[i1 i2]=sort(rand(167,1));
train(276:408,:)=input(343+i2(1:133),:);    train_label(276:408,1)=output(343+i2(1:133),1);
test(69:102,:)=input(343+i2(134:167),:);     test_label(69:102,1)=output(343+i2(134:167),1);
%第4类有90
[i1 i2]=sort(rand(90,1));
train(409:480,:)=input(510+i2(1:72),:);    train_label(409:480,1)=output(510+i2(1:72),1);
test(103:120,:)=input(510+i2(73:90),:);     test_label(103:120,1)=output(510+i2(73:90),1); 
clear i1 i2 input output
%%打乱顺序
k=rand(480,1);[m n]=sort(k);
train=train(n(1:480),:);train_label=train_label(n(1:480),:);
k=rand(120,1);[m n]=sort(k);
test=test(n(1:120),:);test_label=test_label(n(1:120),:);
clear k m n
%no_dims = round(intrinsic_dim(train, 'MLE')); %round四舍五入
%disp(['MLE estimate of intrinsic dimensionality: ' num2str(no_dims)]);
numbatches=10;%数据分块数
numcases=48;%每块数据集的样本个数（不能太小）块数不能超过样本数
numdims=size(train,2);%单个样本的维数
% 训练数据
x=train;%将数据转换成DBN的数据格式
for i=1:numbatches
    train1=x((i-1)*numcases+1:i*numcases,:);
    batchdata(:,:,i)=train1;
end%将分好的10组数据都放在batchdata中
%% rbm参数
maxepoch=20;%训练rbm的次数
hid=4; %隐含层数
hmax=500;hmin=100; %各隐含层节点数取值区间
tic;
%%
h=PSO_dbnelm_cross(hid,hmax,hmin,batchdata,train,train_label); %PSO优化隐含层节点数
%%
t1=toc
tic;
numpen0=h(1,1); numpen1=h(1,2); numpen2=h(1,3);numpen3=h(1,4); %dbn最终隐含层的节点数
disp('构建一个num2str(H)层的置信网络');
clear i 
%% 训练第1层RBM
fprintf(1,'Pretraining Layer 1 with RBM: %d-%d \n',numdims,numpen0);%6400-500
numhid=numpen0;
restart=1;
rbm1;%使用cd-k训练rbm，注意此rbm的可视层不是二值的，而隐含层是二值的
vishid1=vishid;hidrecbiases=hidbiases;
%% 训练第2层RBM
fprintf(1,'\nPretraining Layer 2 with RBM: %d-%d \n',numpen0,numpen1);%500-200
batchdata=batchposhidprobs;%将第一个RBM的隐含层的输出作为第二个RBM 的输入
numhid=numpen1;%将numpen的值赋给numhid，作为第二个rbm隐含层的节点数
restart=1;
rbm1;
hidpen=vishid; penrecbiases=hidbiases; hidgenbiases=visbiases;
%% 训练第3层RBM
fprintf(1,'\nPretraining Layer 3 with RBM: %d-%d \n',numpen1,numpen2);%200-100
batchdata=batchposhidprobs;%显然，将第二哥RBM的输出作为第三个RBM的输入
numhid=numpen2;%第三个隐含层的节点数
restart=1;
rbm1;
hidpen2=vishid; penrecbiases2=hidbiases; hidgenbiases2=visbiases;
%% 训练第4层RBM
fprintf(1,'\nPretraining Layer 4 with RBM: %d-%d \n',numpen2,numpen3);%200-100
batchdata=batchposhidprobs;%显然，将第二哥RBM的输出作为第三个RBM的输入
numhid=numpen3;%第三个隐含层的节点数
restart=1;
rbm1;
hidpen3=vishid; penrecbiases3=hidbiases; hidgenbiases3=visbiases;

 %% 训练极限学习机
 % 训练集特征输出
w1=[vishid1; hidrecbiases]; 
w2=[hidpen; penrecbiases]; 
w3=[hidpen2; penrecbiases2];
w4=[hidpen3; penrecbiases3];
digitdata = [x ones(size(x,1),1)];%x表示train数据集
w1probs = 1./(1 + exp(-digitdata*w1));%
  w1probs = [w1probs  ones(size(x,1),1)];%
w2probs = 1./(1 + exp(-w1probs*w2));%
  w2probs = [w2probs ones(size(x,1),1)];%
w3probs = 1./(1 + exp(-w2probs*w3)); %
  w3probs = [w3probs ones(size(x,1),1)];%
w4probs = 1./(1 + exp(-w3probs*w4)); % 
H = w4probs';  %%第三个rbm的实际输出值，也是elm的输入值H
lamda=0.001;  %% 正则化系数在0.0007-0.00037之间时  测试精度最大81.667%
H=H+1/lamda;  %加入regularization factor
T =train_label';            %训练集标签
T1=ind2vec(T);              %做分类需要先将T转换成向量索引
OutputWeight=pinv(H') *T1'; 
Y=(H' * OutputWeight)';
temp_Y=zeros(1,size(Y,2));
for n=1:size(Y,2)
    [max_Y,index]=max(Y(:,n));
    temp_Y(n)=index;
end
Y_train=temp_Y;
%Y_train=vec2ind(temp_Y1);
% 训练集准确率
train_accuracy=sum(Y_train==T)/length(T)
% 训练集实际分类与预测分类对比
figure(1)
plot(Y_train,'bo');hold on 
plot(T,'r*');

%% 测试极限学习机
N2 = size(test,1);
% 测试集特征输出
w1=[vishid1; hidrecbiases]; %(784+1*500)
w2=[hidpen; penrecbiases]; %(500+1*500)
w3=[hidpen2; penrecbiases2];%(500+1*2000)
w4=[hidpen3; penrecbiases3];
test1 = [test ones(N2,1)];
w1probs = 1./(1 + exp(-test1*w1));
  w1probs = [w1probs  ones(N2,1)];
w2probs = 1./(1 + exp(-w1probs*w2)); 
  w2probs = [w2probs ones(N2,1)];
w3probs = 1./(1 + exp(-w2probs*w3)); 
  w3probs = [w3probs ones(N2,1)];
w4probs = 1./(1 + exp(-w3probs*w4));  
H1=w4probs';
%加入正则化系数
H1=H1+1/lamda;
TY=(H1' * OutputWeight)';     %   TY: the actual output of the testing data
temp_Y=zeros(1,size(TY,2));
for n=1:size(TY,2)
    [max_Y,index]=max(TY(:,n));
    temp_Y(n)=index;
end
TY1=temp_Y;
% 加载输出
TV=test_label';
% 测试集分类准确率
test_accuracy = sum(TV==TY1) / length(TV)
% 测试集实际分类与预测分类对比
figure(2)
plot(TV,'r*');
hold on
plot(TY1,'bo');
xlabel('测试集样本数')
ylabel('标签种类')
title('测试阶段：实际输出与理想输出的差');
legend('真实值','预测值')
% 程序运行时间
t2=toc

工学博士，担任《Mechanical System and Signal Processing》审稿专家，担任
《中国电机工程学报》优秀审稿专家，《控制与决策》，《系统工程与电子技术》，《电力系统保护与控制》，《宇航学报》等EI期刊审稿专家。

擅长领域：现代信号处理，机器学习，深度学习，数字孪生，时间序列分析，设备缺陷检测、设备异常检测、设备智能故障诊断与健康管理PHM等。