Journal of Image and Signal Processing
Vol. 08 No. 03 (2019), Article ID: 31134, 11 pages
10.12677/JISP.2019.83016

A Research on Deep Learning Model for Face Emotion Recognition Based on Swish Activation Function

Lingjiao Wang, Qian Li, Hua Guo

College of Information Engineering, Xiangtan University, Xiangtan Hunan

Received: Jun. 11th, 2019; accepted: Jun. 27th, 2019; published: Jul. 3rd, 2019

ABSTRACT

In recent years, deep learning models have developed rapidly. Among them, deep convolutional neural networks have been widely used in computer vision. Many factors affect the performance of a deep learning model, and the choice of activation function and the structure of the neural network are among the most important. This paper analyses the advantages and disadvantages of traditional activation functions and the newer Swish activation function, introduces Swish into a deep learning model for facial emotion recognition, and proposes an improved back propagation algorithm. In the convolutional neural network, stacked small-size convolution modules replace large-size convolution modules in order to extract finer features. Together these changes form a new deep learning model for facial emotion recognition, Swish-FER-CNNs. The experimental results show that the model based on the Swish activation function achieves higher recognition accuracy than models based on ReLU, L-ReLU or P-ReLU. With the improved network structure, the recognition accuracy of Swish-FER-CNNs is 4.02% higher than that of the existing model.

Keywords: Activation Function, Back Propagation, Convolutional Neural Network, Deep Learning, Computer Vision

1. Introduction

2. Emotion Recognition Model Based on Convolutional Neural Networks

Figure 1. Principle diagram of emotion recognition model

2.1. Back Propagation Algorithm

$\frac{\text{d}z}{\text{d}x}=\frac{\text{d}z}{\text{d}y}\frac{\text{d}y}{\text{d}x}$ (1)

2.1.1. Analysis of Activation Functions

$f\left(x\right)=\frac{1}{1+{\text{e}}^{-x}}$ (2)

Figure 2. Sigmoid activation function

$f\left(x\right)=\mathrm{max}\left(0,x\right)$ (3)

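For $x>0$, the derivative of the ReLU activation function is constant, which simplifies computation during back propagation and speeds up convergence. For $x<0$, ReLU is hard-saturating: every input in this region produces an output of 0, the first-order gradient propagated back through the neuron is also 0, and the neuron no longer activates. This is known as neuron death, and it reduces the fitting capacity of the model. In addition, because ReLU outputs 0 for $x<0$, the mean of the neuron outputs is greater than 0, which hinders iterative computation. This problem is called mean shift: the input of each neuron is the output of the preceding one, and because those outputs are all non-negative, the input of the next neuron is restricted, the fitting capacity of the model decreases, and the performance of the deep model is constrained.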
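The two effects described above can be illustrated numerically. The following is a minimal sketch, not the paper's code: the zero-mean Gaussian pre-activations, the sample size and the random seed are assumptions chosen only for illustration, and the derivative at $x=0$ is set to 0 by convention.

```python
import random

def relu(x):
    """ReLU as in Eq. (3): max(0, x)."""
    return max(0.0, x)

def relu_grad(x):
    """Derivative of ReLU; the value 0 at x == 0 is a convention (assumption)."""
    return 1.0 if x > 0 else 0.0

# Hypothetical zero-mean pre-activations, for illustration only.
rng = random.Random(0)
pre_activations = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

outputs = [relu(x) for x in pre_activations]
grads = [relu_grad(x) for x in pre_activations]

mean_in = sum(pre_activations) / len(pre_activations)
mean_out = sum(outputs) / len(outputs)
# Fraction of inputs for which no gradient flows back through the unit.
zero_grad_fraction = grads.count(0.0) / len(grads)

print(f"mean input             = {mean_in:+.3f}")
print(f"mean output            = {mean_out:+.3f}")  # positive: mean shift
print(f"zero-gradient fraction = {zero_grad_fraction:.2f}")
```

Although the inputs are centred on 0, the outputs have a clearly positive mean, and roughly half of the inputs receive no gradient at all. A unit whose inputs stay in the negative region for all samples would therefore never be updated.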

Maas et al. introduced the L-ReLU activation function, which effectively addresses the mean shift problem. It is defined as:

$f\left(x\right)=\left\{\begin{array}{ll}x, & x\ge 0\\ \alpha x, & x<0\end{array}\right.$ (4)

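For $x\ge 0$, the first derivative of L-ReLU is constant, which is convenient for computation and matches the behaviour of ReLU. For $x<0$, the graph of L-ReLU lies below the x-axis, that is, its outputs are negative, which mitigates mean shift.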

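He Kaiming et al. introduced the P-ReLU activation function in order to obtain a negative-axis slope that fits the model better. It is defined as: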

$f\left(x\right)=\left\{\begin{array}{ll}x, & x\ge 0\\ {\alpha }_{i}x, & x<0\end{array}\right.$ (5)
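The difference between L-ReLU and P-ReLU is whether the negative slope is fixed or learned. The sketch below is illustrative only: the slope value 0.01, the initial value 0.25, the learning rate and the upstream gradient are all assumptions, not values taken from the paper.

```python
def leaky_relu(x, alpha=0.01):
    """Eq. (4)/(5): x for x >= 0, alpha * x otherwise."""
    return x if x >= 0 else alpha * x

def leaky_relu_grad_x(x, alpha=0.01):
    """Derivative w.r.t. the input: 1 for x >= 0, alpha otherwise."""
    return 1.0 if x >= 0 else alpha

def prelu_grad_alpha(x):
    """Derivative w.r.t. the slope alpha_i: x for x < 0, 0 otherwise."""
    return x if x < 0 else 0.0

# L-ReLU: the slope is a fixed hyperparameter (0.01 is an assumed value).
print(leaky_relu(-2.0), leaky_relu_grad_x(-2.0))

# P-ReLU: the slope alpha_i is a trainable parameter. One hypothetical
# gradient-descent step with made-up numbers:
alpha_i = 0.25       # assumed initial slope
x = -2.0             # assumed pre-activation
upstream_grad = 0.5  # assumed dLoss/df arriving from the next layer
lr = 0.1             # assumed learning rate
alpha_i -= lr * upstream_grad * prelu_grad_alpha(x)
print(alpha_i)
```

Because the gradient with respect to the input is `alpha` rather than 0 for negative inputs, the unit keeps receiving updates in that region. P-ReLU pays for its adaptive slope with one extra gradient computation per learned parameter.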

$f\left(x\right)=\alpha \cdot x\cdot \operatorname{sigmoid}\left(x\right)$ (6)

Figure 3. Swish activation function

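When $x>0$, the first derivative of the Swish activation function is easy to compute, which benefits model training. When $x<0$, Swish compares favourably with the other functions. Compared with ReLU, Swish balances the contributions of the positive and negative axes, which mitigates mean shift, and because it is not hard-saturating it avoids neuron death. Compared with L-ReLU, Swish is nonlinear and soft-saturating, which makes it more robust. Compared with P-ReLU, Swish does not need to compute the parameter $\alpha$, which reduces the amount of computation, and it is also more robust. On these grounds, the Swish activation function performs better than ReLU, L-ReLU and P-ReLU.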
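The closed-form derivative of Swish follows from the product rule: with $s=\operatorname{sigmoid}(x)$, $f'(x)=\alpha\left(s+x\,s\,(1-s)\right)$. The sketch below implements Eq. (6) and checks this derivative against a central finite difference. It treats $\alpha$ as a fixed constant with an assumed default of 1; the sample points and step size are also arbitrary choices.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function, Eq. (2)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def swish(x, alpha=1.0):
    """Eq. (6): alpha * x * sigmoid(x); alpha = 1 is an assumed default."""
    return alpha * x * sigmoid(x)

def swish_grad(x, alpha=1.0):
    """Product rule: alpha * (s + x * s * (1 - s)) with s = sigmoid(x)."""
    s = sigmoid(x)
    return alpha * (s + x * s * (1.0 - s))

h = 1e-6  # finite-difference step (arbitrary small value)
for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    numeric = (swish(x + h) - swish(x - h)) / (2 * h)
    print(f"x={x:+.1f}  f(x)={swish(x):+.4f}  "
          f"f'(x)={swish_grad(x):+.4f}  numeric={numeric:+.4f}")
```

For negative inputs the output is small and negative rather than exactly 0, and the gradient does not vanish over the whole region, which is the soft-saturating behaviour described above.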

2.1.2. Back Propagation Algorithm in Swish-FER-CNNs

${u}^{\left(i\right)}=\text{Swish}\left({A}^{\left(i\right)}\right)$ (7)

$\frac{\partial {u}^{\left(n\right)}}{\partial {u}^{\left(j\right)}}=\sum _{i:\,j\in Pa\left({u}^{\left(i\right)}\right)}\frac{\partial {u}^{\left(n\right)}}{\partial {u}^{\left(i\right)}}\frac{\partial {u}^{\left(i\right)}}{\partial {u}^{\left(j\right)}}$ (8)
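Back propagation through a computation graph accumulates, for each node, the gradient contributions of every node that takes it as a parent. The following toy example sketches that recursion with Swish nodes. The graph, the weights `W2` and `W3` and the input value are hypothetical and are not the Swish-FER-CNNs architecture.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def swish(a):
    return a * sigmoid(a)

def swish_grad(a):
    s = sigmoid(a)
    return s + a * s * (1.0 - s)

# Hypothetical toy graph (weights are made up):
#   u1 = x, u2 = Swish(W2 * u1), u3 = Swish(W3 * u1), u4 = u2 * u3
W2, W3 = 0.7, -1.3

def forward(x):
    u1 = x
    u2 = swish(W2 * u1)
    u3 = swish(W3 * u1)
    u4 = u2 * u3
    return u1, u2, u3, u4

def backward(x):
    """Return d u4 / d u_j for every node j by reverse accumulation."""
    u1, u2, u3, _ = forward(x)
    # parents[i] lists the nodes feeding u_i; local[(i, j)] is d u_i / d u_j.
    parents = {2: [1], 3: [1], 4: [2, 3]}
    local = {
        (2, 1): W2 * swish_grad(W2 * u1),
        (3, 1): W3 * swish_grad(W3 * u1),
        (4, 2): u3,
        (4, 3): u2,
    }
    grad = {1: 0.0, 2: 0.0, 3: 0.0, 4: 1.0}  # d u4 / d u4 = 1
    for i in (4, 3, 2):                      # reverse topological order
        for j in parents[i]:
            grad[j] += grad[i] * local[(i, j)]
    return grad

x0, h = 0.5, 1e-6
numeric = (forward(x0 + h)[3] - forward(x0 - h)[3]) / (2 * h)
print(backward(x0)[1], numeric)
```

Node `u1` feeds both `u2` and `u3`, so its gradient is the sum of two path contributions. The finite-difference value printed alongside serves as a sanity check on the accumulated result.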

2.2. Swish-FER-CNNs Network Model

Table 1. Swish-FER-CNNs network architecture

3. Dataset

4. Experimental Analysis

Figure 4. Training accuracy

Figure 5. Training loss function

${P}_{i}=\frac{T{P}_{i}}{T{P}_{i}+F{N}_{i}}=\frac{T{P}_{i}}{Su{m}_{i}}$ (9)
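Eq. (9) can be evaluated directly from a confusion matrix: $TP_i$ is the diagonal entry of row $i$ and $Sum_i$ is the row total. The sketch below assumes that rows index the true class and columns the predicted class. The three-class counts are made up for illustration and are not the paper's results.

```python
def per_class_accuracy(confusion):
    """Eq. (9): P_i = TP_i / Sum_i.

    confusion[i][j] is the number of class-i samples predicted as class j
    (rows = true class, columns = predicted class; an assumed layout).
    """
    rates = []
    for i, row in enumerate(confusion):
        total = sum(row)  # Sum_i = TP_i + FN_i
        rates.append(row[i] / total if total else 0.0)
    return rates

# Made-up counts for three hypothetical classes.
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
rates = per_class_accuracy(confusion)
for i, p in enumerate(rates):
    print(f"class {i}: P_i = {p:.2f}")
```

A class with no samples is reported as 0.0 here to avoid division by zero. That is a design choice of this sketch, not something specified in the source.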

Table 2 shows the recognition accuracy of the Swish-FER-CNNs model for each emotion class.

Table 2. Confusion matrix of emotion recognition accuracy of Swish-FER-CNNs model

Figure 6. Accuracy of test confusion matrix for each model

(10)

Table 3. Comparison of recognition accuracy

5. Conclusion

