Predictive models built from time-series data in electronic health records (EHRs) can play a major role in improving disease management. Because temporal data are sequentially correlated and have a high-dimensional feature space, traditional methods such as machine learning and non-deep neural networks struggle to predict disease accurately. Recent work shows that the long short term memory (LSTM) neural network outperforms most of these traditional methods on disease prediction problems. To further improve prediction accuracy, this study proposes a hybrid deep learning neural network framework that combines a convolutional neural network (CNN) with an LSTM. Empirical studies on real-world EHR datasets show that the proposed hybrid network significantly improves predictive performance compared with a support vector machine (SVM) model, a CNN, or an LSTM used alone.
Keywords: Electronic Health Record, Long Short Term Memory Neural Network, Convolutional Neural Network, Hybrid Deep Learning
Disease Prediction Models Based on Hybrid Deep Learning Strategy
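Min Liang (梁敏)1, Yuchang Mo (莫毓昌)1*, Dong Lin (林栋)2, Qian Lu (陆迁)1, Ningning Li (李宁宁)1
1 School of Mathematical Sciences, Huaqiao University, Fujian Province University Key Laboratory of Computational Science, Quanzhou, Fujian
2 College of Acupuncture and Moxibustion, Fujian University of Traditional Chinese Medicine, Fuzhou, Fujian
Received: December 30, 2019; Accepted: January 14, 2020; Published: January 21, 2020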
$$f_t = \sigma(W_{fh} \cdot h_{t-1} + W_{fx} \cdot x_t + b_f) \tag{1}$$
$$i_t = \sigma(W_{ih} \cdot h_{t-1} + W_{ix} \cdot x_t + b_i) \tag{2}$$
$$\tilde{C}_t = \tanh(W_{ch} \cdot h_{t-1} + W_{cx} \cdot x_t + b_c) \tag{3}$$
$$C_t = f_t \cdot C_{t-1} + i_t \cdot \tilde{C}_t \tag{4}$$
$$o_t = \sigma(W_{oh} \cdot h_{t-1} + W_{ox} \cdot x_t + b_o) \tag{5}$$
$$h_t = o_t \cdot \tanh(C_t) \tag{6}$$
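Here $C_t$, $C_{t-1}$ and $\tilde{C}_t$ denote the current cell state, the cell state at the previous time step, and the candidate update to the current cell state, respectively. The symbols $f_t$, $i_t$ and $o_t$ denote the forget gate, the input gate and the output gate. With suitable parameter settings, the output $h_t$ is computed from $\tilde{C}_t$ and $C_t$ according to Equations (4)-(6). Based on the difference between the output and the actual value, all weight matrices are updated by the back-propagation through time (BPTT) algorithm [22].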
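As a rough illustration of Equations (1)-(6), the following sketch computes one LSTM time step in plain Python. It is not the authors' implementation: the parameter dictionary, the helper names (`matvec`, `zero_params`, `lstm_step`) and the all-zero demo weights are assumptions made for this example, and the recurrent weight of the output gate is written `W_oh` by analogy with the other gates. Training with BPTT is not shown.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def zero_params(hidden, inputs):
    """Hypothetical helper: all-zero weights and biases, used only for the demo."""
    params = {}
    for g in ("f", "i", "c", "o"):
        params[f"W_{g}h"] = [[0.0] * hidden for _ in range(hidden)]
        params[f"W_{g}x"] = [[0.0] * inputs for _ in range(hidden)]
        params[f"b_{g}"] = [0.0] * hidden
    return params

def lstm_step(params, h_prev, c_prev, x_t):
    """One forward step of an LSTM cell following Equations (1)-(6)."""
    def gate(name, activation):
        # activation(W_h . h_{t-1} + W_x . x_t + b), applied element by element
        pre = [a + b + c for a, b, c in zip(
            matvec(params[f"W_{name}h"], h_prev),
            matvec(params[f"W_{name}x"], x_t),
            params[f"b_{name}"])]
        return [activation(z) for z in pre]

    f_t = gate("f", sigmoid)        # Eq. (1): forget gate
    i_t = gate("i", sigmoid)        # Eq. (2): input gate
    c_tilde = gate("c", math.tanh)  # Eq. (3): candidate cell state
    # Eq. (4): element-wise mix of the old state and the candidate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    o_t = gate("o", sigmoid)        # Eq. (5): output gate
    # Eq. (6): gated, squashed cell state becomes the output
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Demo with hidden size 2 and a single input feature (values chosen arbitrarily).
params = zero_params(hidden=2, inputs=1)
h_t, c_t = lstm_step(params, h_prev=[0.0, 0.0], c_prev=[1.0, -2.0], x_t=[3.0])
```

With all-zero weights every gate evaluates to 0.5 and the candidate state to 0, so the new cell state is simply half of the previous one; this makes the role of the forget gate in Equation (4) easy to check by hand.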
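A traditional MLNN builds the network between the input layer and the output layer with a fully connected strategy, which means that every output neuron can interact with every input neuron. With m input neurons and n output neurons, the weight matrix has m × n parameters. A CNN greatly reduces the number of weight parameters by using convolution kernels of size k × k. Two properties of the CNN improve the training efficiency of parameter optimization; at the same computational complexity, a CNN can train a neural network with more hidden layers, that is, a deep neural network.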
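The parameter counts mentioned above can be compared with a small calculation. The sizes used here (784 inputs, 100 outputs, a 3 × 3 kernel) are hypothetical values chosen for illustration and do not come from the paper; biases and multiple kernels per layer are ignored.

```python
def dense_param_count(m, n):
    """Weights in a fully connected layer with m inputs and n outputs."""
    return m * n

def conv_param_count(k):
    """Weights in a single k x k convolution kernel shared across all positions."""
    return k * k

# Hypothetical layer sizes, for illustration only.
m, n, k = 784, 100, 3
dense = dense_param_count(m, n)  # 78400 weights
conv = conv_param_count(k)       # 9 weights
```

The kernel's weight count depends only on k, not on the number of input or output neurons, which is why the saving grows with the size of the layer.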
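The temporal convolutional neural network introduces a special one-dimensional convolution suited to univariate time-series data. Instead of the k × k kernels used by a conventional CNN, a temporal CNN uses kernels of size k × 1. After the temporal convolution, the original univariate dataset can be expanded into a dataset with m-dimensional features. In this way the temporal CNN applies one-dimensional convolution to the time series and expands the univariate data into multi-dimensional extracted features (the first stage in Figure 2); the expanded multi-dimensional feature data are better suited to prediction with an LSTM.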
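The feature expansion described above can be sketched as follows. This is a minimal example under stated assumptions, not the paper's network: the function name `temporal_conv`, the sample series and the three fixed kernels are made up for illustration, m is taken to be the number of kernels, and the kernels are hand-picked rather than learned. As in most deep learning libraries, the kernel is slid over the series without flipping, with stride 1, no padding and no activation function.

```python
def temporal_conv(series, kernels):
    """Slide each length-k kernel over a univariate series.

    Returns one row per valid window position; each row holds one
    feature per kernel, so m kernels yield m-dimensional features.
    """
    k = len(kernels[0])
    steps = len(series) - k + 1  # number of valid window positions
    return [
        [sum(w * series[t + j] for j, w in enumerate(kernel)) for kernel in kernels]
        for t in range(steps)
    ]

# Hypothetical univariate series and m = 3 fixed kernels of size k = 3.
series = [1.0, 2.0, 4.0, 7.0, 11.0]
kernels = [
    [1.0, 0.0, 0.0],   # copies the first value in the window
    [-1.0, 1.0, 0.0],  # difference between the first two values
    [1.0, -2.0, 1.0],  # second difference across the window
]
features = temporal_conv(series, kernels)
# features == [[1.0, 1.0, 1.0], [2.0, 2.0, 1.0], [4.0, 3.0, 1.0]]
```

Each row of `features` is the 3-dimensional description of one time window; a sequence of such rows is the kind of multi-dimensional input that would then be passed to the LSTM stage.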
Liang, M., Mo, Y.C., Lin, D., Lu, Q. and Li, N.N. Disease Prediction Models Based on Hybrid Deep Learning Strategy [J]. Artificial Intelligence and Robotics Research, 2020, 09(01): 16-23. https://doi.org/10.12677/AIRR.2020.91003
References
[1] Wei, W.-Q., Teixeira, P.L., Mo, H., Cronin, R.M., Warner, J.L. and Denny, J.C. (2015) Combining Billing Codes, Clinical Notes, and Medications from Electronic Health Records Provides Superior Phenotyping Performance. Journal of the American Medical Informatics Association, 23, e20-e27. https://doi.org/10.1093/jamia/ocv130
[2] Henriksson, A., Zhao, J., Boström, H. and Dalianis, H. (2015) Modeling Heterogeneous Clinical Sequence Data in Semantic Space for Adverse Drug Event Detection. IEEE International Conference on Data Science and Advanced Analytics, Paris, 19-21 October 2015, 1-8. https://doi.org/10.1109/DSAA.2015.7344867
[3] Hsieh, T.J., Hsiao, H.F. and Yeh, W.C. (2011) Forecasting Stock Markets Using Wavelet Transforms and Recurrent Neural Networks: An Integrated System Based on Artificial Bee Colony Algorithm. Applied Soft Computing, 11, 2510-2525. https://doi.org/10.1016/j.asoc.2010.09.007
[4] Socher, R., Lin, C.C., Manning, C. and Ng, A.Y. (2011) Parsing Natural Scenes and Natural Language with Recursive Neural Networks. Proceedings of the 28th International Conference on Machine Learning, Bellevue, 28 June-2 July 2011, 129-136.
[5] Kong, W., Dong, Z.Y., Hill, D.J., Luo, F. and Xu, Y. (2018) Short-Term Residential Load Forecasting Based on Resident Behaviour Learning. IEEE Transactions on Power Systems, 33, 1087-1088. https://doi.org/10.1109/TPWRS.2017.2688178
[6] Yan, K., Du, Y. and Ren, Z. (2018) MPPT Perturbation Optimization of Photovoltaic Power Systems Based on Solar Irradiance Data Classification. IEEE Transactions on Sustainable Energy, 10, 514-521. https://doi.org/10.1109/TSTE.2018.2834415
[7] Du, Y., Yan, K., Ren, Z. and Xiao, W. (2018) Designing Localized MPPT for PV Systems Using Fuzzy-Weighted Extreme Learning Machine. Energies, 11, 2615. https://doi.org/10.3390/en11102615
[8] Funahashi, K.I. and Nakamura, Y. (1993) Approximation of Dynamical Systems by Continuous Time Recurrent Neural Networks. Neural Networks, 6, 801-806. https://doi.org/10.1016/S0893-6080(05)80125-X
[9] Krizhevsky, A., Sutskever, I. and Hinton, G.E. (2012) ImageNet Classification with Deep Convolutional Neural Networks. Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, 3-6 December 2012, 1097-1105.
[10] Almalaq, A. and Edwards, G.A. (2017) Review of Deep Learning Methods Applied on Load Forecasting. Proceedings of the 16th IEEE International Conference on Machine Learning and Applications, Cancun, 18-21 December 2017, 511-516. https://doi.org/10.1109/ICMLA.2017.0-110
[11] Wang, J., Yu, L.C., Lai, K.R. and Zhang, X. (2016) Dimensional Sentiment Analysis Using a Regional CNN-LSTM Model. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, 7-12 August 2016, 225-230. https://doi.org/10.18653/v1/P16-2037
[12] Kumar, U. and Jain, V. (2010) Time Series Models (Grey-Markov, Grey Model with Rolling Mechanism and Singular Spectrum Analysis) to Forecast Energy Consumption in India. Energy, 35, 1709-1716. https://doi.org/10.1016/j.energy.2009.12.021
[13] Wang, Y.-W., Shen, Z.-Z. and Jiang, Y. (2018) Comparison of ARIMA and GM(1,1) Models for Prediction of Hepatitis B in China. PLoS ONE, 13, e0201987. https://doi.org/10.1371/journal.pone.0201987
[14] Ma, X.M., Shi, L.B. and Qimuge (2018) Prediction of Monthly Syphilis Incidence Based on the Multiplicative Seasonal ARIMA Model and the Holt-Winters Seasonal Model. Journal of Zhengzhou University (Medical Sciences), 53(1), 79-84. (In Chinese)
[15] Zhang, Y.M., Luo, L. and Yang, J.C. (2019) A Hybrid ARIMA-SVR Approach for Forecasting Emergency Patient Flow. Journal of Ambient Intelligence and Humanized Computing, 10, 3315-3323. https://doi.org/10.1007/s12652-018-1059-x
[16] Kann, B.H., Aneja, S. and Loganadane, G.V. (2018) Pretreatment Identification of Head and Neck Cancer Nodal Metastasis and Extranodal Extension Using Deep Learning Neural Networks. Scientific Reports, 8, Article No. 14036. https://doi.org/10.1038/s41598-018-32441-y
[17] Gu, J.Y., Liang, L.Z. and Song, H.Q. (2019) A Method for Hand-Foot-Mouth Disease Prediction Using Geo Detector and LSTM Model in Guangxi, China. Scientific Reports, 9, Article No. 17928. https://doi.org/10.1038/s41598-019-54495-2
[18] Chae, S., Kwon, S. and Lee, D. (2018) Predicting Infectious Disease Using Deep Learning and Big Data. International Journal of Environmental Research and Public Health, 15, 1596. https://doi.org/10.3390/ijerph15081596
[19] Zeiler, M.D. (2012) ADADELTA: An Adaptive Learning Rate Method.
[20] Jozefowicz, R., Zaremba, W. and Sutskever, I. (2015) An Empirical Exploration of Recurrent Network Architectures. Proceedings of the 32nd International Conference on Machine Learning, Lille, 6-11 July 2015, 2342-2350.
[21] Hochreiter, S. and Schmidhuber, J. (1997) Long Short-Term Memory. Neural Computation, 9, 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735
[22] Werbos, P.J. (1990) Backpropagation through Time: What It Does and How to Do It. Proceedings of the IEEE, 78, 1550-1560. https://doi.org/10.1109/5.58337
[23] Ketkar, N. (2017) Convolutional Neural Networks. In: Deep Learning with Python, Springer, Berlin, 63-78. https://doi.org/10.1007/978-1-4842-2766-4_5
[24] Goodfellow, I., Bengio, Y. and Courville, A. (2016) Deep Learning. MIT Press, Cambridge, 1.