Personalized Recommendation of TV Users Based on Collaborative Filtering Algorithm

Xingxing Chen, Ruitao Li, Junhua Liao, Yanke Wu*

School of Mathematics and Computer Science, Guangdong Ocean University, Zhanjiang Guangdong

Received: Jul. 19th, 2019; accepted: Aug. 1st, 2019; published: Aug. 12th, 2019

ABSTRACT

To make better use of existing data and improve the marketing effectiveness of TV program products, this paper processes the viewing records of TV users and builds a user preference model that yields three preference types for each user. A collaborative filtering algorithm is then used to make personalized recommendations for individual users. In addition, the K-means and KNN algorithms are used to divide users into groups and to obtain a recommendation for each user group, thereby addressing the problem of personalized recommendation for users.

Keywords: K-means, KNN, Collaborative Filtering, Personalized Recommendation

1. Introduction

2. Collaborative Filtering Algorithm

Figure 1. Classification of television programs

1) Build the user model

2) Find the neighbors of the target user

$Sim(i,j)=\frac{\sum_{q=1}^{15}\left(R_{i,q}-\overline{R_i}\right)\left(R_{j,q}-\overline{R_j}\right)}{\sqrt{\sum_{q=1}^{15}\left(R_{i,q}-\overline{R_i}\right)^{2}}\sqrt{\sum_{q=1}^{15}\left(R_{j,q}-\overline{R_j}\right)^{2}}}$ (1)

3) Generate recommended products for the target user

$P_{i,q}=\overline{R_i}+\frac{\sum_{j\in U}Sim(i,j)\times\left(R_{j,q}-\overline{R_j}\right)}{\sum_{j\in U}\left|Sim(i,j)\right|}$ (2)
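Steps 2) and 3) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the similarity is computed as the standard Pearson correlation, with squared deviations under both square roots, and the prediction follows the form of Eq. (2). The function names `pearson_sim`, `nearest_neighbors`, and `predict`, the dictionary layout of `ratings`, and the fallback values used when a denominator is zero are all hypothetical choices made for this example.

```python
from math import sqrt


def pearson_sim(r_i, r_j):
    """Pearson correlation between two rating vectors of equal length."""
    if len(r_i) != len(r_j) or not r_i:
        raise ValueError("rating vectors must be non-empty and of equal length")
    mean_i = sum(r_i) / len(r_i)
    mean_j = sum(r_j) / len(r_j)
    num = sum((a - mean_i) * (b - mean_j) for a, b in zip(r_i, r_j))
    den = sqrt(sum((a - mean_i) ** 2 for a in r_i)) * sqrt(
        sum((b - mean_j) ** 2 for b in r_j)
    )
    # Assumption: a constant vector (zero variance) is treated as similarity 0.
    return num / den if den else 0.0


def nearest_neighbors(target, ratings, k):
    """Return the k users most similar to `target` (highest similarity first)."""
    others = [u for u in ratings if u != target]
    others.sort(key=lambda u: pearson_sim(ratings[target], ratings[u]), reverse=True)
    return others[:k]


def predict(target, ratings, q, neighbors):
    """Predicted rating of `target` for component q, in the form of Eq. (2)."""
    mean_t = sum(ratings[target]) / len(ratings[target])
    num = den = 0.0
    for j in neighbors:
        s = pearson_sim(ratings[target], ratings[j])
        mean_j = sum(ratings[j]) / len(ratings[j])
        num += s * (ratings[j][q] - mean_j)
        den += abs(s)
    # Assumption: fall back to the user's own mean when no neighbor carries weight.
    return mean_t + num / den if den else mean_t


if __name__ == "__main__":
    # Toy data with 3 components per user; the paper's vectors have 15.
    ratings = {"a": [1, 2, 3], "b": [2, 4, 6], "c": [3, 2, 1]}
    print(nearest_neighbors("a", ratings, 1))
    print(predict("a", ratings, 2, ["b", "c"]))
```

Dividing by the sum of absolute similarities keeps the correction term bounded by the largest neighbor deviation, even when some neighbors are negatively correlated with the target user.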

3. Data Source and Preprocessing

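1) Remove duplicate catch-up (replay) viewing records and duplicate video-on-demand records, records with an empty viewing time, and records with a viewing duration of less than one minute.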

2) Remove redundant attributes and attributes irrelevant to the mining process to obtain the processed data, as shown in Table 1.

Table 1. Form of the data after processing
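The two cleaning steps above can be sketched as follows. This is a minimal illustration, not the authors' pipeline. The field names `user_id`, `program`, `start_time`, and `duration_s` are hypothetical, since the source does not list the actual attributes. The sketch also assumes that two records are duplicates when they agree on all retained attributes.

```python
def preprocess(records, keep=("user_id", "program", "start_time", "duration_s")):
    """Filter raw viewing records and keep only the attributes named in `keep`."""
    seen = set()
    cleaned = []
    for rec in records:
        if not rec.get("start_time"):           # viewing time is empty
            continue
        if (rec.get("duration_s") or 0) < 60:   # watched for less than one minute
            continue
        row = tuple(rec.get(name) for name in keep)
        if row in seen:                         # duplicate of an earlier record
            continue
        seen.add(row)
        cleaned.append(dict(zip(keep, row)))    # drops all other attributes
    return cleaned


if __name__ == "__main__":
    raw = [
        {"user_id": 1, "program": "news", "start_time": "20:00", "duration_s": 600, "device": "x"},
        {"user_id": 1, "program": "news", "start_time": "20:00", "duration_s": 600, "device": "x"},
        {"user_id": 2, "program": "film", "start_time": "", "duration_s": 900},
        {"user_id": 2, "program": "film", "start_time": "21:00", "duration_s": 30},
    ]
    print(preprocess(raw))
```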

4. Building the User Preference Model

4.1. Constructing User Preference Types

Figure 2. The type of TV program

Table 2. Matrix: “user-type-duration”
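One way to read the "user-type-duration" matrix is as the total viewing time of each user per program type. The sketch below builds such a matrix. It assumes that each cleaned record has already been reduced to a `(user, program_type, minutes)` triple; this layout and the function name `user_type_duration` are hypothetical and do not come from the source.

```python
from collections import defaultdict


def user_type_duration(records):
    """Sum viewing minutes per (user, program type) into a nested dictionary."""
    matrix = defaultdict(lambda: defaultdict(float))
    for user, program_type, minutes in records:
        matrix[user][program_type] += minutes
    # Convert to plain dictionaries so that missing keys raise KeyError later.
    return {user: dict(row) for user, row in matrix.items()}


if __name__ == "__main__":
    records = [("u1", "news", 30), ("u1", "sports", 45), ("u1", "news", 15), ("u2", "film", 90)]
    print(user_type_duration(records))
```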

4.2. Degree of User Preference for Each Type

$interest_{(i,q)}=\frac{\sum_{k=1}^{n}t_{label(i,q)}[k]}{\sum_{q=1}^{m}\sum_{k=1}^{n}t_{label(i,q)}[k]}$ (3)

$Loyalty_{(i,q)}=\frac{\sum_{k=1}^{n}t_{label(i,q)}[k]}{\sum_{k=1}^{n}t_{program(i,q)}[k]}$ (4)

$Label_{(i,q)}=\omega_{1}Loyalty_{(i,q)}+\omega_{2}interest_{(i,q)}$ (5)
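Equations (3) to (5) can be sketched for a single user as follows. Several points are assumptions rather than facts from the source. `watched[q]` stands for the user's total viewing time on type q, which is the numerator of Eqs. (3) and (4). `program_len[q]` stands for the total running time of the corresponding programs, which is one reading of the denominator of Eq. (4). The weights default to 0.5 each only for illustration, because the source does not state their values. The helper `top_types` returns the three highest-scoring types, matching the three preference types per user mentioned in the abstract.

```python
def preference_scores(watched, program_len, w1=0.5, w2=0.5):
    """Combine loyalty and interest into one score per program type, as in Eq. (5)."""
    total = sum(watched.values())
    scores = {}
    for q, t in watched.items():
        interest = t / total if total else 0.0                      # Eq. (3)
        loyalty = t / program_len[q] if program_len.get(q) else 0.0  # Eq. (4)
        scores[q] = w1 * loyalty + w2 * interest                    # Eq. (5)
    return scores


def top_types(scores, n=3):
    """Return the n program types with the highest preference score."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


if __name__ == "__main__":
    watched = {"news": 30, "sports": 60, "film": 10}
    program_len = {"news": 60, "sports": 120, "film": 100}
    scores = preference_scores(watched, program_len)
    print(scores)
    print(top_types(scores))
```

Interest measures how a user's time is split across types, while loyalty measures how much of each program the user actually watches, so the weighted sum rewards types that are both frequent and watched through.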

Table 3. Three preference types for each user

5. Recommendation for Individual Users with the Collaborative Filtering Algorithm

5.1. Building the User Model

5.2. Finding the Neighbors of the Target User

5.3. Generating Recommended Types for the Target User

5.4. Implementing Program Recommendation for Users

Table 4. Recommended programs for individual users

6. Packaged Recommendation for Users

6.1. User Grouping with K-means Cluster Analysis and the KNN Algorithm

Table 5. Clustering center of K-means algorithm

Table 6. User group of each individual user

Table 7. Preference types of each user group
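The grouping step can be illustrated with a small pure-Python sketch. It shows one plausible way to combine the two algorithms and is not necessarily the authors' procedure: K-means clusters a set of user preference vectors, and KNN then assigns a further user to the group that is most common among that user's nearest clustered neighbors. The function names, the choice of the first k points as initial centers, and the toy two-dimensional data are assumptions made for this example.

```python
from collections import Counter
from math import dist


def kmeans(points, k, iters=100):
    """Plain K-means; returns (centers, labels)."""
    centers = [list(p) for p in points[:k]]  # assumption: first k points as seeds
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its closest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def knn_predict(x, points, labels, k=3):
    """Assign x to the majority group among its k nearest labeled points."""
    nearest = sorted(range(len(points)), key=lambda i: dist(x, points[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


if __name__ == "__main__":
    users = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
    centers, groups = kmeans(users, k=2)
    print(centers, groups)
    print(knn_predict((9, 9), users, groups))
```

In practice the number of groups and the initial centers affect the result, so several values of k would normally be compared before fixing the user groups.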

6.2. Packaging and Pushing Program Products

Table 8. Recommended program package for each user group

7. Conclusions and Outlook

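1) The time share of each program type on each TV channel is computed from the channel's regular broadcast schedule (separately for Monday to Friday and for weekends), so it deviates to some extent from the actual broadcast pattern of program types.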

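2) Recommending the higher-rated programs to target users often makes the pushed recommendations imprecise and leaves a large share of lower-rated programs with no viewers.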

