Analysis of Factors Influencing Hypertension in Different Ethnic Groups

Hans Journal of Data Mining
Vol. 06 No. 03 (2016), Article ID: 18388, 10 pages
10.12677/HJDM.2016.63013


Mengting Hu, Yu Fei*

School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming, Yunnan

Received: Aug. 1st, 2016; accepted: Aug. 21st, 2016; published: Aug. 24th, 2016

ABSTRACT

Hypertension has become increasingly common in recent years, and its complications are dangerous: they have gradually become one of the major threats to health in modern life. In this paper, we analyze United States health survey data from the UCI database. For each racial group, we examine the candidate factors using Logit classification and random forest classification, and reach the following conclusions: regardless of race, age has a significant effect on hypertension; the influence of the other factors on hypertension differs across ethnic groups.

Keywords: Hypertension, Logit Classification, Random Forest Classification

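1. Introduction

In 2006, Fu Chuanxi et al. analyzed hypertension risk factors using both logistic regression and classification tree analysis. They found that the main risk factors for hypertension were age, blood lipids, and obesity, and that classification tree analysis classified better than logistic regression [7].

In 2010, Yang Yang used a BP (back-propagation) artificial neural network to predict hypertension in the rural population of Zhangwu County, Liaoning Province, compared it with a logistic regression model, and evaluated the predictive performance of the neural network model using the ROC (receiver operating characteristic) curve [8].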

2. Data Source and Data Description

Table 1. Variable list

3. Empirical Study

3.1. Model Principles and Forms

1) Building the binary Logit model

(P is the probability of having hypertension) (1)
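As a rough illustration of a binary Logit model, the sketch below assumes the standard form ln(P / (1 - P)) = b0 + b1*x and fits it by gradient ascent on the log-likelihood. The toy data, the single standardized age predictor, and the names `sigmoid` and `fit_logit` are all hypothetical; none of them come from the survey data or the fitted equations of this paper.

```python
import math

def sigmoid(z):
    """Inverse of the logit link: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(xs, ys, lr=0.1, n_iter=5000):
    """Fit ln(P/(1-P)) = b0 + b1*x by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # observed outcome minus predicted probability
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Hypothetical toy data (assumption): standardized age and hypertension status (1 = yes).
ages = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
hyper = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logit(ages, hyper)
odds_ratio = math.exp(b1)   # multiplicative change in the odds per one-unit increase in x
p_at_mean = sigmoid(b0)     # predicted probability at x = 0
```

A positive coefficient `b1` corresponds to an odds ratio above 1, meaning the odds of hypertension rise as the predictor increases. In practice a statistics package would also report standard errors and significance tests, which this sketch omits.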

2) Building the random forest model
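The general idea of a random forest classifier can be sketched as follows: each tree is grown on a bootstrap sample, each node considers only a random subset of the features, and the forest predicts by majority vote. This is a deliberately small sketch under stated assumptions, not the implementation used in this paper; the toy (age, BMI) rows, the parameter values, and all function names are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, threshold) that lowers weighted Gini impurity most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, n_try, depth, rng):
    """Grow a tree; every node looks at a random subset of n_try features."""
    features = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, labels, features) if depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], n_try, depth - 1, rng),
            build_tree([r for r, _ in right], [y for _, y in right], n_try, depth - 1, rng))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's class."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def random_forest(rows, labels, n_trees=25, n_try=1, depth=3, seed=0):
    """Grow n_trees trees, each on its own bootstrap sample of the rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_try, depth, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Hypothetical toy data (assumption): (age, BMI) rows with hypertension labels.
rows = [(25, 21.0), (30, 22.0), (35, 23.0), (40, 24.0), (28, 20.0), (33, 22.5),
        (55, 29.0), (60, 30.0), (65, 31.0), (70, 32.0), (58, 28.0), (62, 33.0)]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

forest = random_forest(rows, labels)
young = predict_forest(forest, (22, 20.0))
old = predict_forest(forest, (68, 31.5))
```

Because each tree sees a different bootstrap sample and a different feature subset, the individual trees disagree near the class boundary, and the vote averages those disagreements out.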

3.2. Analysis of Model Results

1) Analysis of model results for the data where race is White

① Analysis of the Logit parameter estimates

Table 2. Effect of various factors on hypertension (white)

(2)

② Random forest results
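Classification results of this kind are commonly summarized as a confusion matrix together with a misclassification rate. A minimal sketch of that calculation, using made-up labels rather than the classification results reported in this paper (the names `confusion_matrix` and `error_rate` are hypothetical helpers):

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes=(0, 1)):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

def error_rate(matrix):
    """Share of observations that fall off the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total

# Made-up labels (assumption): 1 = hypertension, 0 = no hypertension.
actual    = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
predicted = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0]

cm = confusion_matrix(actual, predicted)   # [[5, 1], [1, 3]]
rate = error_rate(cm)                      # 2 of 10 misclassified
```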

2) Analysis of model results for the data where race is Black

① Analysis of the Logit parameter estimates

Figure 1. The importance of each factor to the influence of hypertension (white)

Table 3. Classification results of nhanes_1 data in random forest

Table 4. Effect of various factors on hypertension (black)

(3)

② Random forest results

3) Analysis of model results for the data where race is Other

① Analysis of the Logit parameter estimates

② Random forest results

Figure 2. The importance of each factor to the influence of hypertension (black)

Table 5. Classification results of nhanes_2 data in random forest

Table 6. Effect of various factors on hypertension (other)

Figure 3. The importance of each factor to the influence of hypertension (other)

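4. Conclusions

1) National Natural Science Foundation of China project "Regression Diagnostics under the Generalized Estimating Equations (GEE) Framework: A Study Based on Simultaneous Fitting of the Mean and Covariance Structures" (11561071).

2) 2015 Key Project of the Yunnan Provincial Philosophy and Social Sciences Research Base, "Research on a Competitiveness Indicator System for Sustainable Socio-Economic Development in Yunnan" (JD2015ZD20).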

Analysis of Factors Influencing Hypertension in Different Ethnic Groups [J]. Hans Journal of Data Mining, 2016, 06(03): 106-115. http://dx.doi.org/10.12677/HJDM.2016.63013

1. WHO (2002) Reducing Risks, Promoting Healthy Life. World Health Organization, Geneva, 1.

2. Sun Zhenqiu. Medical Statistics [M]. Beijing: People's Medical Publishing House, 2007: 333-341. (in Chinese)

3. Li Yinghua. Current Status and Prevalence of Hypertension [J]. Chinese Journal of Cardiology, 2004(7): 456. (in Chinese)

4. Tian, J.Y., Cheng, Q., Song, X.M., et al. (2006) Birthweight and Risk of Type-2 Diabetes, Abdominal Obesity and Hypertension among Chinese Adults. European Journal of Endocrinology, 155, 601-607. http://dx.doi.org/10.1530/eje.1.02265

5. Ning, G., Su, J., Li, Y., et al. (2006) Artificial Neural Network Based Model for Cardiovascular Risk Stratification in Hypertension. Medical and Biological Engineering and Computing, 44, 202-208. http://dx.doi.org/10.1007/s11517-006-0028-2

6. Ture, M., Kurt, I., Yavuz, E. and Kurum, T. (2005) Comparison of Multiple Prediction Models for Hypertension (Neural Networks, Logistic Regression and Flexible Discriminant Analyses). Anadolu Kardiyoloji Dergisi, 5, 24-28.

7. Fu Chuanxi, Ma Wenjun, Liang Jianhua. Logistic Regression and Classification Tree Analysis of Hypertension Risk Factors [J]. Chinese Journal of Disease Control & Prevention, 2006, 10(3): 652-952. (in Chinese)

8. Yang Yang. A Study on Predicting Essential Hypertension Using an Artificial Neural Network Model [D]: [Master's thesis]. Beijing: China Medical University, 2010. (in Chinese)

9. Wu Xizhi. Statistical Methods for Complex Data: Applications Based on R (2nd ed.) [M]. Beijing: China Renmin University Press, 2013: 63-65. (in Chinese)

*Corresponding author.