The Combining Technology of Data Mining Based on Clustering and Association Rules

Operations Research and Fuzziology
Vol. 07 No. 04 (2017), Article ID: 22884, 7 pages
10.12677/ORF.2017.74018


Han Li, Dongsheng Zhang*

College of Software, Henan University, Kaifeng, Henan

Received: Nov. 9th, 2017; accepted: Nov. 21st, 2017; published: Nov. 30th, 2017

ABSTRACT

Cluster analysis and association rule mining are two of the main methods used in data mining, but they differ in three respects. First, clustering operates on continuous data, whereas association rule mining operates on discrete data. Second, clustering serves the descriptive function of mining, whereas association rules serve the prediction/validation function. Third, clustering outputs clusters, whereas association rule mining outputs a list of rules. At the same time, the two methods complement each other, so this paper combines them. Cluster analysis is first run on the sample set, which gives each sample a category (cluster) attribute. Association rule mining is then run on the samples together with this attribute. The combined method reveals further latent knowledge, including the causes of cluster formation and the relationships between clusters. The experiment shows that the technique is effective and has practical application value.

Keywords: Clustering, Association Rules, Data Mining, Machine Learning

1. Introduction

2. Mining Technique Combining Clustering and Association Rules

2.1. Cluster Analysis

$S_t = \sum_{i=1}^{n_t} \left(x_{it} - \bar{x}_t\right)' \left(x_{it} - \bar{x}_t\right)$

$S = \sum_{t=1}^{k} S_t = \sum_{t=1}^{k} \sum_{i=1}^{n_t} \left(x_{it} - \bar{x}_t\right)' \left(x_{it} - \bar{x}_t\right)$
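As a minimal sketch of what the two sums compute, the Python below evaluates the scatter of one cluster ($S_t$) and the total over all clusters ($S$). It assumes that each $x_{it}$ is a numeric vector, that $\bar{x}_t$ is the mean of cluster $t$, and that the prime denotes a transpose, so each term is a squared Euclidean distance. The function names and the sample partition are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cluster_scatter(points: List[Vector]) -> float:
    """S_t: sum of squared deviations of one cluster's points from its mean."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(dim)]
    return sum((p[j] - mean[j]) ** 2 for p in points for j in range(dim))


def total_scatter(clusters: Dict[str, List[Vector]]) -> float:
    """S: sum of S_t over all k clusters."""
    return sum(cluster_scatter(pts) for pts in clusters.values())


# Hypothetical partition of 2-D samples into two clusters (illustrative values).
clusters = {
    "clust-1": [(1.0, 2.0), (3.0, 2.0)],
    "clust-2": [(10.0, 10.0), (10.0, 12.0), (10.0, 14.0)],
}
S = total_scatter(clusters)  # 2.0 + 8.0 = 10.0
```

A smaller $S$ for the same number of clusters means the samples sit closer to their cluster means.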

2.2. Association Rules

What counts as a "relatively high likelihood" is expressed in terms of support and confidence:
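The sketch below illustrates the two measures under their usual definitions, which are assumed here because the paper's own formulas are not preserved in this text. The support of a rule X ==> Y is the fraction of transactions that contain every item of X and Y, and its confidence is that support divided by the support of X. The function names and the toy transactions are hypothetical.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               transactions: List[Transaction]) -> float:
    """support(X and Y together) / support(X); 0.0 when X never occurs."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


# Toy transactions, for illustration only.
transactions = [
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c"}),
    frozenset({"a", "c"}),
    frozenset({"b"}),
]
sup = support(frozenset({"a", "b"}), transactions)                     # 2/4
conf = confidence(frozenset({"a"}), frozenset({"b"}), transactions)    # (2/4) / (3/4)
```

A rule is normally reported only when both values reach user-chosen minimum thresholds.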

L[1] = {large 1-itemsets};
for (k = 2; L[k-1] ≠ ∅; k = k + 1) do
    C[k] = apriori_gen(L[k-1]);    // generate the candidate itemsets
    for all transactions t ∈ D do
        C[t] = subset(C[k], t);    // find the candidates contained in transaction t
        for all C ∈ C[t] do C.sup = C.sup + 1; end for    // count support
    end for
    L[k] = {C ∈ C[k] | C.sup >= minsup};    // the large k-itemsets
end for
L = ∪[k] L[k];

insert into C[k]
select P[1], P[2], ..., P[k-1], Q[k-1]
from L[k-1] P, L[k-1] Q
where P[1] = Q[1], ..., P[k-2] = Q[k-2], P[k-1] < Q[k-1]

for all itemsets C ∈ C[k] do
    for all (k-1)-subsets S of C do
        if (S ∉ L[k-1]) then delete C from C[k]
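The pseudocode above (the main loop, the join that builds C[k], and the pruning of C[k]) can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation. It assumes that itemsets are stored as sorted tuples, that `minsup` is a support count rather than a fraction, and that candidates are counted by a plain scan of all transactions in place of the `subset` helper. The sample transactions are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Itemset = Tuple[str, ...]  # items are kept in sorted order


def apriori_gen(prev_large: Set[Itemset]) -> Set[Itemset]:
    """Build C[k] from L[k-1]: join, then prune."""
    candidates: Set[Itemset] = set()
    for p in prev_large:
        for q in prev_large:
            # Join: same first k-2 items, last item of p < last item of q.
            if p[:-1] == q[:-1] and p[-1] < q[-1]:
                candidates.add(p + (q[-1],))
    # Prune: drop any candidate that has a (k-1)-subset outside L[k-1].
    return {
        c for c in candidates
        if all(s in prev_large for s in combinations(c, len(c) - 1))
    }


def apriori(transactions: List[FrozenSet[str]], minsup: int) -> Dict[Itemset, int]:
    """Return every large itemset together with its support count."""
    counts: Dict[Itemset, int] = {}
    for t in transactions:
        for item in t:
            counts[(item,)] = counts.get((item,), 0) + 1
    large = {c: n for c, n in counts.items() if n >= minsup}  # L[1]
    result = dict(large)
    while large:  # stop when L[k-1] is empty
        candidates = apriori_gen(set(large))
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if t.issuperset(c):
                    counts[c] += 1
        large = {c: n for c, n in counts.items() if n >= minsup}  # L[k]
        result.update(large)
    return result


# Hypothetical transactions, for illustration only.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"a", "b", "c"}),
]
large_itemsets = apriori(transactions, minsup=3)
```

With these transactions every single item and every pair reaches the threshold, while the triple ("a", "b", "c") is generated as a candidate but occurs in only two transactions, so it is not large.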

2.3. Combined Use

3. Experimental Data and Methods

3.1. Sample Data

Table 1. Functional comparison of clustering and association rules

Table 2. Sample data

3.2. Data Transformation

3.3. Cluster Analysis

3.4. Association Rule Mining

4. Results and Discussion

Figure 1. Cluster analysis

Figure 2. Data mining results of association rules after clustering

Table 3. Analysis of the clustering results for the sample data

ques-B = 14.8-16.7 ==> Clust = clust-2

Clust = clust-2 ==> Teacher = D6203
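Rules of this form require each continuous attribute to be mapped to an interval and the cluster label to be carried along as an ordinary item. The sketch below shows one way such transactions could be built and how the confidence of the first rule would then be checked. It is an assumption-laden illustration: the bin edge 12.9, the sample values, and the helper names `interval_label` and `to_transaction` are hypothetical. Only the attribute name `ques-B`, the interval 14.8-16.7, and the label `clust-2` come from the rules above.

```python
from typing import Dict, FrozenSet, List, Tuple


def interval_label(value: float, edges: List[float]) -> str:
    """Map a continuous value to a 'low-high' label using sorted bin edges."""
    last = len(edges) - 2
    for i, (low, high) in enumerate(zip(edges, edges[1:])):
        # Bins are half-open, except the last one, which includes its upper edge.
        if low <= value < high or (i == last and value == high):
            return f"{low}-{high}"
    raise ValueError(f"{value} lies outside the bin edges")


def to_transaction(sample: Dict[str, float], cluster: str,
                   edges: Dict[str, List[float]]) -> FrozenSet[str]:
    """Turn one clustered sample into a set of 'attribute = value' items."""
    items = {f"{name} = {interval_label(v, edges[name])}" for name, v in sample.items()}
    items.add(f"Clust = {cluster}")  # the cluster label becomes an item
    return frozenset(items)


# Hypothetical bin edges and clustered samples, for illustration only.
edges = {"ques-B": [12.9, 14.8, 16.7]}
labelled: List[Tuple[Dict[str, float], str]] = [
    ({"ques-B": 15.5}, "clust-2"),
    ({"ques-B": 16.0}, "clust-2"),
    ({"ques-B": 13.2}, "clust-1"),
]
transactions = [to_transaction(s, c, edges) for s, c in labelled]

antecedent = "ques-B = 14.8-16.7"
consequent = "Clust = clust-2"
with_antecedent = [t for t in transactions if antecedent in t]
rule_confidence = sum(consequent in t for t in with_antecedent) / len(with_antecedent)
```

Because the cluster label is just another item, it can appear on either side of a rule, which is how both rules above can come out of a single mining run.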

5. Conclusion

The Combining Technology of Data Mining Based on Clustering and Association Rules [J]. Operations Research and Fuzziology, 2017, 07(04): 170-176. http://dx.doi.org/10.12677/ORF.2017.74018

