﻿ 双向切割单/双面英文碎纸片拼接复原算法设计 Algorithm Design of Restoring Two-Way Single/Double-Sized Shredded Documents

Vol.05 No.02(2016), Article ID:17493,7 pages
10.12677/AAM.2016.52021

Algorithm Design of Restoring Two-Way Single/Double-Sized Shredded Documents

Chen Zhang, Shiyun Wang

Science Department, Shenyang Aerospace University, Shenyang Liaoning

Received: Apr. 14th, 2016; accepted: May 2nd, 2016; published: May 5th, 2016

ABSTRACT

This paper designs an algorithm to restore English shredded documents no matter they are single- sized or double-sized text files which are cut both vertically and horizontally. Firstly, we cluster the fragments which were located in the same line in original text files according to the structural features of English letters and the row spacing. Then, using l1 norm difference model, we attach the fragments in the same class. By this way, the scraps of paper in the same line can be restored as a whole crosscutting shredded document. Finally, we should splice the crosscutting shredded documents into a complete image. In the numerical test part, taking the 2013 national mathematics model contest problem as examples, our algorithm restores 209 pieces of English shredded documents. Numerical results show that the correct rate of clustering is over 93% which demonstrates the efficiency of the algorithm.

Keywords:Peak Weight, Row Spacing Weight, Clustering Credibility, Jffreys & Matusita Distance, l1 Norm

1. 引言

2. 单面英文双向切割的碎纸片的拼接复原算法

2.1. 预处理

Principle 1：任意两张列相邻的碎纸片，其相邻的两个向量中，黑色像素灰度值以及分布位置相近。例如，给定碎片，其最右边的列向量为；其余纸片为，其最左边的列向量为。记，则模板碎片与左右相邻的可能性最大。

Principle 2：处于同一行的碎纸片，具有相似的空白行间距特征，构成英文字母的黑色像素具有分布特征。

Principle 3：任意两张行相邻的横切碎纸片，也具有两种相似特征。具体的，若给定模板碎片的最后一行向量为零向量(图像中的表现为空白行)，且该碎片底端的零向量共有行，而与之相邻的碎纸片的顶端前行也为零向量，根据空白行间距固定的特征，；另一方面，若的最后一行不为零向量，其余纸片为及其最上边的行向量为，若，则指定的横切碎片与待拼接的横切碎片上下相邻的可能性最大。

2.2. 同行碎纸片的聚类

2.2.1. 聚类原则

(1)

Figure 1. The horizontal sum of figure 081.bmp

Figure 2. The horizontal sum of figure 077.bmp

Figure 3. The positions of five inevitable peaks (081.bmp, 077.bmp)

(2)

2.2.2. 模板碎片的更新

2.3. 类碎片序列的列拼接

2.4. 横切碎片的行拼接

Table 1. The results of row-clustering of single-sized text file

3. 数值算例

4. 小结

Algorithm Design of Restoring Two-Way Single/Double-Sized Shredded Documents[J]. 应用数学进展, 2016, 05(02): 159-165. http://dx.doi.org/10.12677/AAM.2016.52021

1. 1. Prandtstetter, M. and Raidl, G.R. (2009) Meta-Heuristics for Reconstructing cross Cut Shredded Text Documents. In-stitute of Computer Graphics and Algorithms Vienna University of Technology, GECCO’09, 349-356. http://dx.doi.org/10.1145/1569901.1569950

2. 2. Butler, P., Chakraborty, P. and Ramakrishan, N. (2012) The De-shredder: A Visual Analytic Approach to Reconstructing Shredded Documents. IEEE Symposium on Visual Analytics Science and Technology, Seattle, 14-19 October 2012, 14-19. http://dx.doi.org/10.1109/vast.2012.6400560

3. 3. 鲁嘉琪. 基于文字信息的碎纸片拼接复原算法[J]. 现代电子技术, 2014, 37(4): 28-31.

4. 4. 尹玉萍, 刘万军, 张冲, 刘永超. 基于动态聚类的文档碎纸片自动拼接算法[J]. 计算机工程与应用, 2014, 50(18): 162-170.

5. 5. Sleit, A., Massad, Y. and Musaddaq, M. (2013) An Alternative Clustering Approach for Reconstructing cross Cut Shredded Text Documents. Telecommunication Systems, 52, 1491-1501. http://dx.doi.org/10.1007/s11235-011-9626-x

6. 6. 张宇, 刘雨东, 计钊. 向量相似度测量方法[J]. 声学技术, 2008, 28(4): 532-535.