1.2 Bearing Fault Diagnosis Based on Convolutional Neural Networks and SE Attention

This post originates from 机器鱼 on CSDN. No one has been given permission to repost it.


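Contents

[0 Introduction](#0 Introduction)

[1 CNN Fault Diagnosis Model](#1 CNN Fault Diagnosis Model)

[2 VGG13 CNN Fault Diagnosis Model](#2 VGG13 CNN Fault Diagnosis Model)

[3 VGG13 + SE Attention Fault Diagnosis Model](#3 VGG13 + SE Attention Fault Diagnosis Model)

[4 Model Training](#4 Model Training)

[5 Using the Model](#5 Using the Model)

[6 Conclusion](#6 Conclusion)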


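0 Introduction

This post uses TensorFlow 2.6.1 and [the time-frequency image data extracted in the previous post] to build 2D convolutional neural networks for fault diagnosis (10 classes: 9 fault types plus 1 normal). The network is improved step by step, from a plain 2D CNN to a VGG network to VGG with SE attention, to show everyone how to pad out a paper.

1 CNN Fault Diagnosis Model

Bearing fault diagnosis is just multi-class classification, so we can build a 10-class model from 2D convolutions. A LeNet-style structure is one option: input layer → 2 convolutions and 2 pooling layers → 1 fully connected layer → classification layer. The data-loading and network-building code is as follows: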

# -*- coding: utf-8 -*-
from tensorflow.keras.layers import Conv1D,Conv2D, Dense, \
        MaxPool1D,MaxPool2D, concatenate, Flatten,Reshape,Layer,\
        GlobalAveragePooling2D,Activation,multiply,BatchNormalization,Dropout
from tensorflow.keras import Input, Model
import tensorflow.keras.backend as K
import tensorflow as tf
import numpy as np    
import os
from PIL import Image

# In[1] Data-loading function
def read_directory(directory_name,height,width,normal):
    '''
    data: n*height*width*3
    label:n*1
    '''
    file_list=os.listdir(directory_name)
    file_list.sort(key=lambda x: int(x.split('_')[0]))
    img = []
    label0=[]
    
    for each_file in file_list:
        img0 = Image.open(directory_name + '/'+each_file)
        resized = img0.resize((width,height))  # PIL expects (width, height)
        img.append(np.array(resized).astype(np.float64))  # np.float was removed in NumPy 1.24
        label0.append(float(each_file.split('.')[0].split('_')[-1]))
    if normal:
        data = np.array(img)/255.0  # normalize to [0, 1]
    else:
        data = np.array(img) 
    
    label=np.array(label0)
    return data,label 
# In[2] LeNet-style 10-class network
def LeNet(k1_num,k1_size,k2_num,k2_size,fc1):
    input1_= Input(shape=(64, 64,3), name='input1')
    # Structure: input - 2D conv - pooling - 2D conv - pooling - fully connected - output
    x1 = Conv2D(k1_num, kernel_size=k1_size, strides=1, activation='relu', padding='same')(input1_)#b*64*64*k1_num
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#b*32*32*k1_num
    x1 = Conv2D(k2_num, kernel_size=k2_size, strides=1, activation='relu', padding='same')(x1)#b*32*32*k2_num
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#b*16*16*k2_num
    x = Flatten()(x1)#b* (16*16*k2_num)
    # x = Dropout(0.5)(x)
    x = Dense(fc1, activation='sigmoid', name='fc1')(x) #b*fc1
    # x = Dropout(0.5)(x)
    output_ = Dense(10, activation='softmax', name='output')(x)#b*10
    model = Model(inputs=input1_, outputs=output_)
    
    return model

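Note that each time-frequency image is first resized to [64,64,3] and divided by 255 to normalize it to [0,1] before it enters the network. Every convolution uses padding='same' with a stride of 1, so whatever the number of kernels k_num and the kernel size k_size, the output feature map has the same height and width as the input. Each convolution is followed by a max-pooling layer that downsamples by a factor of 2. Finally the pooled output is flattened into a vector and passed to the fully connected layer, followed by one more Dense layer for classification. That completes a classification model with a 64*64*3 input and 10 outputs.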
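
The shape bookkeeping described above can be checked by hand. The following is a minimal sketch in plain Python, not part of the original code: the helper names and the default values k1_num=8 and k2_num=16 are made up for illustration, and it assumes stride-1 'same' convolutions and 2x2 max pooling with stride 2, as in the listing.

```python
import math

def conv_same_out(size, stride=1):
    # With padding='same' the output size is ceil(input / stride);
    # the kernel size does not affect it.
    return math.ceil(size / stride)

def pool_out(size, pool=2, stride=2):
    # MaxPool2D with the default padding='valid'.
    return (size - pool) // stride + 1

def lenet_shapes(size=64, k1_num=8, k2_num=16):
    # k1_num and k2_num are hypothetical values chosen for this example.
    shapes = []
    for filters in (k1_num, k2_num):
        size = conv_same_out(size)
        shapes.append((size, size, filters))  # after the convolution
        size = pool_out(size)
        shapes.append((size, size, filters))  # after the pooling
    flat = size * size * k2_num               # length of the Flatten output
    return shapes, flat

shapes, flat = lenet_shapes()
print(shapes)  # [(64, 64, 8), (32, 32, 8), (32, 32, 16), (16, 16, 16)]
print(flat)    # 4096
```

The flattened length depends only on k2_num and the input size, which is why the kernel sizes can be changed freely without touching the Dense layers.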

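With this model, we first train it on the training set and save it. When the system detects a mechanical fault, a sensor collects the sample to be diagnosed, the time-frequency image of that sample is extracted and fed to the trained model, and the model outputs the corresponding class. That completes the fault diagnosis.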

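2 VGG13 CNN Fault Diagnosis Model

LeNet is too simple to get away with in a paper, so we build the fault diagnosis model from a slightly more complex network. GoogLeNet, MobileNet, ResNet, Vision Transformer and similar structures would all do; we use VGG13 as the example. Its structure is as follows:

Two convolutions plus one max pooling, repeated 5 times. The output of the fifth block is flattened into a vector, reduced in dimension by 2 Dense layers, and finally passed to a 10-node Dense layer for classification. The code is as follows: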

# -*- coding: utf-8 -*-
from tensorflow.keras.layers import Conv1D,Conv2D, Dense, \
        MaxPool1D,MaxPool2D, concatenate, Flatten,Reshape,Layer,\
        GlobalAveragePooling2D,Activation,multiply,BatchNormalization,Dropout
from tensorflow.keras import Input, Model
import tensorflow.keras.backend as K
import tensorflow as tf
import numpy as np    
import os
from PIL import Image

# In[1] Data-loading function
def read_directory(directory_name,height,width,normal):
    '''
    data: n*height*width*3
    label:n*1
    '''
    file_list=os.listdir(directory_name)
    file_list.sort(key=lambda x: int(x.split('_')[0]))
    img = []
    label0=[]
    
    for each_file in file_list:
        img0 = Image.open(directory_name + '/'+each_file)
        resized = img0.resize((width,height))  # PIL expects (width, height)
        img.append(np.array(resized).astype(np.float64))  # np.float was removed in NumPy 1.24
        label0.append(float(each_file.split('.')[0].split('_')[-1]))
    if normal:
        data = np.array(img)/255.0  # normalize to [0, 1]
    else:
        data = np.array(img) 
    
    label=np.array(label0)
    return data,label 
def VGG13(k1_num,k1_size,k2_num,k2_size,k3_num,k3_size,k4_num,k4_size,k5_num,k5_size,fc1,fc2):
    input1_= Input(shape=(64, 64,3), name='input1')
    # VGG13 structure: input - (2D conv*2 - pooling) repeated 5 times - fully connected 1 - fully connected 2 - output
    x1 = Conv2D(k1_num, kernel_size=k1_size, strides=1, activation='relu', padding='same')(input1_)#B*64*64*k1_num
    x1 = Conv2D(k1_num, kernel_size=k1_size, strides=1, activation='relu', padding='same')(x1)#B*64*64*k1_num  (the two consecutive convolutions share the same kernel count and size)
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#B*32*32*k1_num
    x1 = Conv2D(k2_num, kernel_size=k2_size, strides=1, activation='relu', padding='same')(x1)#B*32*32*k2_num
    x1 = Conv2D(k2_num, kernel_size=k2_size, strides=1, activation='relu', padding='same')(x1)#B*32*32*k2_num
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#B*16*16*k2_num
    x1 = Conv2D(k3_num, kernel_size=k3_size, strides=1, activation='relu', padding='same')(x1)#B*16*16*k3_num
    x1 = Conv2D(k3_num, kernel_size=k3_size, strides=1, activation='relu', padding='same')(x1)#B*16*16*k3_num
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#B*8*8*k3_num
    x1 = Conv2D(k4_num, kernel_size=k4_size, strides=1, activation='relu', padding='same')(x1)#B*8*8*k4_num
    x1 = Conv2D(k4_num, kernel_size=k4_size, strides=1, activation='relu', padding='same')(x1)#B*8*8*k4_num
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#B*4*4*k4_num
    x1 = Conv2D(k5_num, kernel_size=k5_size, strides=1, activation='relu', padding='same')(x1)#B*4*4*k5_num
    x1 = Conv2D(k5_num, kernel_size=k5_size, strides=1, activation='relu', padding='same')(x1)#B*4*4*k5_num
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#B*2*2*k5_num
    x = Flatten()(x1)#B* (2*2*k5_num)
    # x = Dropout(0.5)(x)
    x = Dense(fc1, activation='sigmoid', name='fc1')(x)#B*fc1
    # x = Dropout(0.5)(x)
    x = Dense(fc2, activation='sigmoid', name='fc2')(x)#B*fc2
    output_ = Dense(10, activation='softmax', name='output')(x)#B*10
    model = Model(inputs=input1_, outputs=output_)
    return model
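
To get a feel for how small or large this network is for a given choice of hyperparameters, the trainable parameters can be counted by hand. The sketch below is not from the original post: the helper names are made up, and it covers the plain VGG13 above only (no SE layers). The numbers plugged in are the kernel counts, kernel sizes, and Dense widths used by the training script in Section 4.

```python
def conv_params(k, c_in, c_out):
    # k*k*c_in weights per output channel, plus one bias per output channel.
    return k * k * c_in * c_out + c_out

def dense_params(n_in, n_out):
    return n_in * n_out + n_out

def vgg13_params(k_nums, k_sizes, fc1, fc2, in_size=64, in_ch=3, n_classes=10):
    total, c_in, size = 0, in_ch, in_size
    for c_out, k in zip(k_nums, k_sizes):
        total += conv_params(k, c_in, c_out)   # first convolution of the block
        total += conv_params(k, c_out, c_out)  # second convolution of the block
        c_in = c_out
        size //= 2                             # 2x2 max pooling halves H and W
    flat = size * size * c_in                  # length of the Flatten output
    total += dense_params(flat, fc1)
    total += dense_params(fc1, fc2)
    total += dense_params(fc2, n_classes)
    return total, flat

total, flat = vgg13_params((4, 8, 16, 32, 64), (3, 3, 3, 3, 3), fc1=40, fc2=20)
print(total, flat)  # 85250 256
```

The figure should match the "Total params" line that model.summary() prints for VGG13 with the same arguments.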

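3 VGG13 + SE Attention Fault Diagnosis Model

If we use VGG13 as is, the supervisor may say there is no novelty, and we also need to hit the word count, so more padding is required. The usual trick is to change modules inside the network. Swap the convolutions for deformable convolutions and claim that the model extracts fault features better. Add an attention module such as SE or CBAM and claim that the model focuses on the important features in the data. Combine networks, for example CNN+LSTM, and claim that the model also fully exploits the time-domain features in the data. Finally give it a new name, say Improved VGG Fault Diagnosis Model, IVFD-Net for short. In this post we add SE attention modules to VGG13, one after each of the five pooling layers. The code is as follows: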

# -*- coding: utf-8 -*-
from tensorflow.keras.layers import Conv1D,Conv2D, Dense, \
        MaxPool1D,MaxPool2D, concatenate, Flatten,Reshape,Layer,\
        GlobalAveragePooling2D,Activation,multiply,BatchNormalization,Dropout
from tensorflow.keras import Input, Model
import tensorflow.keras.backend as K
import tensorflow as tf
import numpy as np    
import os
from PIL import Image

# In[1] Data-loading function
def read_directory(directory_name,height,width,normal):
    '''
    data: n*height*width*3
    label:n*1
    '''
    file_list=os.listdir(directory_name)
    file_list.sort(key=lambda x: int(x.split('_')[0]))
    img = []
    label0=[]
    
    for each_file in file_list:
        img0 = Image.open(directory_name + '/'+each_file)
        resized = img0.resize((width,height))  # PIL expects (width, height)
        img.append(np.array(resized).astype(np.float64))  # np.float was removed in NumPy 1.24
        label0.append(float(each_file.split('.')[0].split('_')[-1]))
    if normal:
        data = np.array(img)/255.0  # normalize to [0, 1]
    else:
        data = np.array(img) 
    
    label=np.array(label0)
    return data,label 

class SE_AttentionLayer(Layer):
    '''Squeeze-and-Excitation channel attention with a reduction ratio of 4.'''
    def __init__(self, **kwargs):
        super(SE_AttentionLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        assert len(input_shape) == 4
        channel = int(input_shape[-1])
        # The two weights get distinct names; channel must be >= 4 so channel//4 >= 1
        self.W1 = self.add_weight(name='att_weight1',
                                 shape=(channel, channel//4),
                                 initializer='uniform',
                                 trainable=True)
        self.W2 = self.add_weight(name='att_weight2',
                                 shape=(channel//4, channel),
                                 initializer='uniform',
                                 trainable=True)
        super(SE_AttentionLayer, self).build(input_shape)

    def call(self, inputs):
        # Squeeze: global average pooling over height and width -> B*1*1*C
        se_feature = tf.reduce_mean(inputs, axis=[1, 2], keepdims=True)
        # Excitation: C -> C//4 -> C. The original SE block uses ReLU after the
        # first projection; this post uses sigmoid after both.
        se_feature = tf.sigmoid(K.dot(se_feature, self.W1))
        se_feature = tf.sigmoid(K.dot(se_feature, self.W2))
        # Scale: reweight each input channel (broadcast over height and width)
        outputs = inputs * se_feature
        return outputs

def ProposedNet2D(k1_num,k1_size,k2_num,k2_size,k3_num,k3_size,k4_num,k4_size,k5_num,k5_size,fc1,fc2):
    input1_= Input(shape=(64, 64,3), name='input1')
    # VGG13 structure: input - (2D conv*2 - pooling) repeated 5 times - fully connected 1 - fully connected 2 - output
    # VGG13+SE structure: input - (2D conv*2 - pooling - SE attention) repeated 5 times - fully connected 1 - fully connected 2 - output
    x1 = Conv2D(k1_num, kernel_size=k1_size, strides=1, activation='relu', padding='same')(input1_)#B*64*64*k1_num
    x1 = Conv2D(k1_num, kernel_size=k1_size, strides=1, activation='relu', padding='same')(x1)#B*64*64*k1_num  (the two consecutive convolutions share the same kernel count and size)
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#B*32*32*k1_num
    x1 = SE_AttentionLayer()(x1) #B*32*32*k1_num
    x1 = Conv2D(k2_num, kernel_size=k2_size, strides=1, activation='relu', padding='same')(x1)#B*32*32*k2_num
    x1 = Conv2D(k2_num, kernel_size=k2_size, strides=1, activation='relu', padding='same')(x1)#B*32*32*k2_num
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#B*16*16*k2_num
    x1 = SE_AttentionLayer()(x1) #B*16*16*k2_num
    x1 = Conv2D(k3_num, kernel_size=k3_size, strides=1, activation='relu', padding='same')(x1)#B*16*16*k3_num
    x1 = Conv2D(k3_num, kernel_size=k3_size, strides=1, activation='relu', padding='same')(x1)#B*16*16*k3_num
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#B*8*8*k3_num
    x1 = SE_AttentionLayer()(x1) #B*8*8*k3_num
    x1 = Conv2D(k4_num, kernel_size=k4_size, strides=1, activation='relu', padding='same')(x1)#B*8*8*k4_num
    x1 = Conv2D(k4_num, kernel_size=k4_size, strides=1, activation='relu', padding='same')(x1)#B*8*8*k4_num
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#B*4*4*k4_num
    x1 = SE_AttentionLayer()(x1) #B*4*4*k4_num
    x1 = Conv2D(k5_num, kernel_size=k5_size, strides=1, activation='relu', padding='same')(x1)#B*4*4*k5_num
    x1 = Conv2D(k5_num, kernel_size=k5_size, strides=1, activation='relu', padding='same')(x1)#B*4*4*k5_num
    x1 = MaxPool2D(pool_size=2, strides=2)(x1)#B*2*2*k5_num
    x1 = SE_AttentionLayer()(x1) #B*2*2*k5_num

    x = Flatten()(x1)#B* (2*2*k5_num)
    
    # x = Dropout(0.5)(x)
    x = Dense(fc1, activation='sigmoid', name='fc1')(x)#B*fc1
    # x = Dropout(0.5)(x)
    x = Dense(fc2, activation='sigmoid', name='fc2')(x)#B*fc2
    output_ = Dense(10, activation='softmax', name='output')(x)#B*10
    model = Model(inputs=input1_, outputs=output_)
    
    return model

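Note that the SE attention layer integer-divides the number of feature-map channels by 4 (channels//4), so every ki_num must be at least 4. With fewer channels the bottleneck would have 0 units.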
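
To make the squeeze, excitation, and scale steps concrete, here is a minimal sketch in plain Python on a tiny 2x2 feature map with 4 channels. It is an illustration only, not the Keras layer: the function names and the weight values are made up, and it follows this post's variant with a sigmoid after both projections.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(vec, weight):
    # vec has length n, weight is an n x m nested list; the result has length m.
    return [sum(vec[i] * weight[i][j] for i in range(len(vec)))
            for j in range(len(weight[0]))]

def se_scales(feature_map, w1, w2):
    """feature_map is an H x W x C nested list; returns one scale per channel."""
    h, w, c = len(feature_map), len(feature_map[0]), len(feature_map[0][0])
    # Squeeze: global average pooling over the spatial axes.
    squeezed = [sum(feature_map[i][j][k] for i in range(h) for j in range(w)) / (h * w)
                for k in range(c)]
    # Excitation: C -> C//4 -> C, sigmoid after both projections.
    hidden = [sigmoid(v) for v in matvec(squeezed, w1)]
    return [sigmoid(v) for v in matvec(hidden, w2)]

def se_apply(feature_map, scales):
    # Scale: multiply every pixel's channel vector by the per-channel scales.
    return [[[v * s for v, s in zip(pixel, scales)] for pixel in row]
            for row in feature_map]

channels = 4
reduced = channels // 4                  # 1 here; it would be 0 for fewer than 4 channels
fmap = [[[1.0, 2.0, 3.0, 4.0] for _ in range(2)] for _ in range(2)]
w1 = [[0.1] for _ in range(channels)]    # hypothetical weights, shape (4, 1)
w2 = [[0.5, -0.5, 0.0, 1.0]]             # hypothetical weights, shape (1, 4)
scales = se_scales(fmap, w1, w2)
out = se_apply(fmap, scales)
print(reduced, scales)
```

Every scale lies strictly between 0 and 1, so the layer can only attenuate channels relative to one another; the output keeps the shape of the input.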

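4 Model Training

With the data from the previous post and the model from this one, we can train. Before training, create a model folder for the saved model and a result folder for the loss-curve and confusion-matrix figures. The complete code is as follows: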

# coding: utf-8
# In[1]: Import the required libraries
import os
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix 
import seaborn as sns
from model import ProposedNet2D,SE_AttentionLayer,read_directory
import tensorflow as tf
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow.keras.optimizers import Adam,SGD
import random
tf.random.set_seed(0)
np.random.seed(0)
random.seed(0)
# In[] Load the data
height=64
width=64

# Time-frequency images: input to the 2D CNN
x_train1,y_train=read_directory('image/train',height,width,normal=1)
x_valid1,y_valid=read_directory('image/test',height,width,normal=1)

# In[] Build the model
# Hyperparameters (there are quite a few):
# conv block 1: number of kernels k1_num, kernel size k1_size*k1_size
# conv block 2: number of kernels k2_num, kernel size k2_size*k2_size
# conv block 3: number of kernels k3_num, kernel size k3_size*k3_size
# conv block 4: number of kernels k4_num, kernel size k4_size*k4_size
# conv block 5: number of kernels k5_num, kernel size k5_size*k5_size
# fully connected layers: number of nodes fc1, fc2
# training: epochs, batch_size, learning rate learningrate
# learningrate,epochs,batch_size,k1_num,k1_size,k2_num,k2_size,k3_num,k3_size,k4_num,k4_size,k5_num,k5_size,fc1,fc2=\
    # [0.0001, 1, 35, 4, 1, 19, 2, 21, 2, 16, 2, 16, 2, 149, 50]
k1_num,k1_size,k2_num,k2_size,k3_num,k3_size,k4_num,k4_size,k5_num,k5_size,fc1,fc2=4,3,8,3,16,3,32,3,64,3,40,20
epochs=100
batch_size = 64
learningrate=0.001

model= ProposedNet2D(k1_num,k1_size,k2_num,k2_size,k3_num,k3_size,k4_num,k4_size,k5_num,k5_size,fc1,fc2)
model.summary()
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath='model/best', monitor='val_accuracy', verbose=0, save_best_only=True, mode = 'max')

model.compile(optimizer=Adam(learning_rate=learningrate),
              loss=SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
# exit()
# In[] Training
train_again = True  # False: load the saved model and only test; True: retrain
# Train the model
if train_again:
    history = model.fit(x_train1,y_train, epochs=epochs,
                        validation_data=(x_valid1, y_valid),
                        batch_size=batch_size, verbose=1,callbacks= [checkpoint])
    model=tf.keras.models.load_model('model/best',custom_objects = {'SE_AttentionLayer': SE_AttentionLayer})

    # Plot the loss curve (the loss is cross-entropy, not MSE)
    plt.figure()
    plt.ylabel('Cross-entropy loss')
    plt.xlabel('Epoch')
    plt.plot(history.history['loss'], label='training')
    # plt.plot(history.history['val_loss'], label='validation')
    plt.title('loss curve')
    plt.legend()
    plt.savefig('result/model2d_loss_curve.jpg')
else:  # Load the saved model
    model=tf.keras.models.load_model('model/best',custom_objects = {'SE_AttentionLayer': SE_AttentionLayer})


test_pred = model.predict(x_valid1)
pred_labels = np.argmax(test_pred,1)
acc=np.sum(pred_labels==y_valid)/len(y_valid)
print('Classification accuracy: %f%%' % (acc*100))
C1= confusion_matrix(y_valid, pred_labels)
print('Confusion matrix:')
print(C1)
xtick=[str(i) for i in range(10)]
ytick=[str(i) for i in range(10)]

plt.figure()
# C2=C1/C1.sum(axis=1)
sns.heatmap(C1,fmt='g', cmap='Blues',annot=True,cbar=False,xticklabels=xtick, yticklabels=ytick)
plt.xlabel('Predict label')
plt.ylabel('True label')
plt.title('Confusion_matrix')
plt.savefig('result/Confusion_matrix_2d.jpg')
plt.show()  
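
Besides the overall accuracy, the confusion matrix also gives a per-class view. The sketch below is an addition in plain Python, not part of the training script: the function names and the 3-class matrix are made up for illustration. It assumes the scikit-learn layout, where rows are true labels and columns are predicted labels.

```python
def per_class_recall(cm):
    # Row i holds all samples whose true label is i, so the diagonal entry
    # divided by the row sum is the recall of class i.
    recalls = []
    for i, row in enumerate(cm):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return recalls

def overall_accuracy(cm):
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

# Made-up 3-class confusion matrix, for illustration only.
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 0, 10]]
print(per_class_recall(cm))  # [0.8, 0.9, 1.0]
print(overall_accuracy(cm))  # 0.9
```

If the commented-out normalization in the script is ever enabled, note that with NumPy the row sums need keepdims=True, i.e. C1/C1.sum(axis=1, keepdims=True). Without it the division broadcasts across columns instead of rows.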

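5 Using the Model

This part just loads the trained model and runs inference, so there is not much to say. Wrapping it in a Qt interface with PyQt5 pads out the paper even better. The code is as follows: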

# coding: utf-8
# In[1]: Import the required libraries
import numpy as np
import tensorflow as tf
from model import SE_AttentionLayer
from PIL import Image

# In[] Load the data
# Load one sample to diagnose: the first test image, 0_2.jpg (image 0, class 2)
img0 = Image.open('image/test/0_2.jpg')
img0 = img0.resize((64,64))
img0=np.array(img0).astype(np.float64)/255  # np.float was removed in NumPy 1.24
x_test1=img0[np.newaxis,:,:,:]
# In[] Load the trained model
model=tf.keras.models.load_model('model/best',custom_objects = {'SE_AttentionLayer': SE_AttentionLayer})
test_pred = model.predict(x_test1)
pred_label = np.argmax(test_pred,1)
print('Predicted class:',pred_label[0])

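6 Conclusion

After the training in Section 4, the test-set accuracy is about 84%. This dataset can reach 98%-100% classification accuracy (it only uses bearing data at one frequency (48 kHz) and one load (0 HP), so it is an easy dataset). We did not get there because the hyperparameters were picked arbitrarily when the network was built. The parameters that have to be set are: learningrate, epochs, batch_size, k1_num, k1_size, k2_num, k2_size, k3_num, k3_size, k4_num, k4_size, k5_num, k5_size, fc1, fc2. Tuning all of them by hand until the network performs at its best is very hard, which leads to our next post, [Hyperparameter Optimization of the VGG13SE Network Based on the Snake Algorithm].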
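
To show what an automated search over these 15 values has to handle, here is a minimal random-sampling sketch in plain Python. It is plain random search, not the snake algorithm of the next post, and every bound in it is an assumption made for illustration. The only constraint taken from this post is that each ki_num is at least 4.

```python
import random

NAMES = ["learningrate", "epochs", "batch_size",
         "k1_num", "k1_size", "k2_num", "k2_size", "k3_num", "k3_size",
         "k4_num", "k4_size", "k5_num", "k5_size", "fc1", "fc2"]

# Assumed search bounds, for illustration only.
BOUNDS = {"learningrate": (1e-4, 1e-2), "epochs": (10, 100),
          "batch_size": (16, 128), "fc1": (10, 200), "fc2": (10, 200)}
for i in range(1, 6):
    BOUNDS[f"k{i}_num"] = (4, 64)   # at least 4 so that channels//4 >= 1 in the SE layer
    BOUNDS[f"k{i}_size"] = (1, 5)

def sample(rng):
    """Draw one candidate: a float learning rate and integers for everything else."""
    params = {}
    for name in NAMES:
        lo, hi = BOUNDS[name]
        if name == "learningrate":
            params[name] = rng.uniform(lo, hi)
        else:
            params[name] = rng.randint(lo, hi)  # both ends inclusive
    return params

candidate = sample(random.Random(0))
print(candidate)
```

Each candidate would then be passed to ProposedNet2D and the training loop, with the validation accuracy serving as the score to maximize.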

For more content, see the [column]. Your likes and bookmarks are what keep me updating [Python神经网络1000个案例分析].
