调整Activation Function参数对神经网络的影响

目录

介绍:

数据集:

[模型一(tanh) :](#模型一(tanh) :)

模型二(relu):

[模型三(sigmoid) :](#模型三(sigmoid) :)

模型四(多层tanh):

模型五(多层relu):

介绍:

Activation Function(激活函数)是一种非线性函数,应用在神经网络的每个节点(神经元)上,用来引入非线性变换,增加神经网络的表达能力。

在神经网络中,每个节点的输入是通过加权和计算得到的,然后通过激活函数进行非线性变换,得到输出。激活函数可以将输入的范围映射到一个固定的范围内,常用的范围是0, 1-1, 1。激活函数的引入可以使神经网络具有更强的表达能力,能够处理更复杂的输入数据。

常见的激活函数有:

  • Sigmoid函数:将输入映射到0, 1的范围内,具有平滑的非线性特性,但存在梯度消失的问题。
  • ReLU函数:将输入小于0的部分映射为0,大于0的部分保持不变,具有较好的非线性特性,但存在神经元死亡的问题。
  • Tanh函数:将输入映射到-1, 1的范围内,具有平滑的非线性特性,但也存在梯度消失的问题。
  • Leaky ReLU函数:在ReLU函数的基础上,将输入小于0的部分乘以一个小的斜率,解决了神经元死亡的问题。

选择合适的激活函数取决于具体的任务和数据特点,不同的激活函数在不同的情况下会有不同的表现。

数据集:

python 复制代码
# scatter plot of the circles dataset with points colored by class
from sklearn.datasets import make_circles
from numpy import where
from matplotlib import pyplot
# generate circles
X, y = make_circles(n_samples=1000, noise=0.1, random_state=1)
# select indices of points with each class label
for i in range(2):
	samples_ix = where(y == i)
	pyplot.scatter(X[samples_ix, 0], X[samples_ix, 1], label=str(i))
pyplot.legend()
pyplot.show()

模型一(tanh) :

python 复制代码
# mlp for the two circles classification problem
from sklearn.datasets import make_circles
from sklearn.preprocessing import MinMaxScaler
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import SGD
from keras.initializers import RandomUniform
from matplotlib import pyplot
# generate 2d classification dataset
X, y = make_circles(n_samples=1000, noise=0.1, random_state=1)
# scale input data to [-1,1]
scaler = MinMaxScaler(feature_range=(-1, 1))
X = scaler.fit_transform(X)
# split into train and test
n_train = 500
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
# define model
model = Sequential()
init = RandomUniform(minval=0, maxval=1)
model.add(Dense(5, input_dim=2, activation='tanh', kernel_initializer=init))
model.add(Dense(1, activation='sigmoid', kernel_initializer=init))
# compile model
opt = SGD(lr=0.01, momentum=0.9)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
# fit model
history = model.fit(trainX, trainy, validation_data=(testX, testy), epochs=500, verbose=0)
# evaluate the model
_, train_acc = model.evaluate(trainX, trainy, verbose=0)
_, test_acc = model.evaluate(testX, testy, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))
# plot training history
pyplot.plot(history.history['accuracy'], label='train')
pyplot.plot(history.history['val_accuracy'], label='test')
pyplot.legend()
pyplot.show()

模型二(relu):

python 复制代码
# define model
model = Sequential()
init = RandomUniform(minval=0, maxval=1)
model.add(Dense(5, input_dim=2, activation='relu', kernel_initializer=init))
model.add(Dense(1, activation='sigmoid', kernel_initializer=init))
# compile model
opt = SGD(lr=0.01, momentum=0.9)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
# fit model
history = model.fit(trainX, trainy, validation_data=(testX, testy), epochs=500, verbose=0)
# evaluate the model
_, train_acc = model.evaluate(trainX, trainy, verbose=0)
_, test_acc = model.evaluate(testX, testy, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))
# plot training history
pyplot.plot(history.history['accuracy'], label='train')
pyplot.plot(history.history['val_accuracy'], label='test')
pyplot.legend()
pyplot.show()

模型三(sigmoid) :

python 复制代码
# define model
model = Sequential()
init = RandomUniform(minval=0, maxval=1)
model.add(Dense(5, input_dim=2, activation='sigmoid', kernel_initializer=init))
model.add(Dense(1, activation='sigmoid', kernel_initializer=init))
# compile model
opt = SGD(lr=0.01, momentum=0.9)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
# fit model
history = model.fit(trainX, trainy, validation_data=(testX, testy), epochs=500, verbose=0)
# evaluate the model
_, train_acc = model.evaluate(trainX, trainy, verbose=0)
_, test_acc = model.evaluate(testX, testy, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))
# plot training history
pyplot.plot(history.history['accuracy'], label='train')
pyplot.plot(history.history['val_accuracy'], label='test')
pyplot.legend()
pyplot.show()

模型四(多层tanh):

python 复制代码
# define model
init = RandomUniform(minval=0, maxval=1)
model = Sequential()
model.add(Dense(5, input_dim=2, activation='tanh', kernel_initializer=init))
model.add(Dense(5, activation='tanh', kernel_initializer=init))
model.add(Dense(5, activation='tanh', kernel_initializer=init))
model.add(Dense(5, activation='tanh', kernel_initializer=init))
#Initializers define the way to set the initial random weights of Keras layers. The keyword arguments used for passing 
#initializers to layers depends on the layer.
model.add(Dense(5, activation='tanh', kernel_initializer=init))
model.add(Dense(1, activation='sigmoid', kernel_initializer=init))
# compile model
opt = SGD(lr=0.01, momentum=0.9)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
# fit model
history = model.fit(trainX, trainy, validation_data=(testX, testy), epochs=500, verbose=0)
# evaluate the model
_, train_acc = model.evaluate(trainX, trainy, verbose=0)
_, test_acc = model.evaluate(testX, testy, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))
# plot training history
pyplot.plot(history.history['accuracy'], label='train')
pyplot.plot(history.history['val_accuracy'], label='test')
pyplot.legend()
pyplot.show()

模型五(多层relu):

python 复制代码
# define model
model = Sequential()
model.add(Dense(5, input_dim=2, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(5, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(5, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(5, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(5, activation='relu', kernel_initializer='he_uniform')) 
#he_uniform . Draws samples from a uniform distribution within [-limit, limit] , where limit = sqrt(6 / fan_in) 
#( fan_in is the number of input units in the weight tensor).

model.add(Dense(1, activation='sigmoid'))
# compile model
opt = SGD(lr=0.01, momentum=0.9)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
# fit model
history = model.fit(trainX, trainy, validation_data=(testX, testy), epochs=500, verbose=0)
# evaluate the model
_, train_acc = model.evaluate(trainX, trainy, verbose=0)
_, test_acc = model.evaluate(testX, testy, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))
# plot training history
pyplot.plot(history.history['accuracy'], label='train')
pyplot.plot(history.history['val_accuracy'], label='test')
pyplot.legend()
pyplot.show()
相关推荐
诸葛务农9 分钟前
共沸脱水技术及其在光刻胶用PGMEA纯化中的应用(中)
linux·数据库·人工智能
roman_日积跬步-终至千里11 分钟前
【SDD】高风险场景下的 SDD 最佳实践:分层风控+分级落地,约束AI编程边界
大数据·人工智能·ai编程
小小高不懂写代码11 分钟前
Vibe Coding时代的自我鞭策
前端·人工智能
计算机安禾13 分钟前
【算法分析与设计】第36篇:计算几何基础:凸包问题的分治与扫描线解法
大数据·人工智能·算法·机器学习·剪枝
人员安全定位14 分钟前
喜报!品铂科技获2025年度电力建设科学技术进步奖
大数据·人工智能·科技
库拉大叔15 分钟前
GPT-5.5 新手快速上手与实战指南
网络·人工智能·gpt
AI智图坊18 分钟前
拒绝模板同质化:拆解自由生图功能,如何通过GPT-Image-2与Nano Banana Pro双模型驱动电商AIGC?
大数据·人工智能·gpt·ai作画·aigc
货拉拉技术18 分钟前
飞速发展的计算机视觉
人工智能·算法
Sevyn19 分钟前
# 做 AI 热点监控的小项目,我才明白 Agent 不能只靠聊天记录
人工智能
AI大法师19 分钟前
最小视觉系统怎么搭:Logo、颜色、字体、模板和品牌介绍先做什么
人工智能·设计模式·新媒体运营