本文针对目前流行的BLAST算法存在的种子检测效率不高的问题,提出了一种不同于常规查询策略的并行多种子检测算法,并基于脉动阵列结构设计和实现了一种具备同时多种子检测与并行扩展能力的算法加速器,实现了对NCBI BLAST程序前两个阶段的加速。加速器阵列采用流水工作模式,能够动态计算所有单词的匹配点位置,避免了索引结构的存储开销;采用了阵列分组和并行种子收集策略减少了有效种子数量;采用组内种子合并策略避免了冗余扩展操作;采用多种子并行扩展策略实现了无阻塞的数据库搜索,显著提高了数据库搜索效率。实验结果表明,本设计在阵列规模和时钟频率上都优于相关研究工作,相对于目前主流的通用微处理器可获得20倍以上的加速效果。<br/>BLAST adopts index-based approach to detect the matches between two substrings by looking up a large table and processing one match per query. In this paper, we propose a systolic array approach to detect string matches without using looking up tables. The pipelining systolic array is implemented as a multi-seeds detection and parallel extension engine to accelerate the first two stages of NCBI BLAST algorithm. Different from the index-based approach, our implementation con- sumes little memory resources and eliminates redundant string extensions. Our FPGA implemen-tation achieves superior performance results in both of processing element number and clock frequency over related works in the area of FPGA BLAST accelerators. The experimental results show a speedup factor of more than 20× over the NCBI BLAST software running on a current main- stream PC platform.
Execution time (ms) and speed-up for different querie
阵列 规模
BLASTp
BLASTn
软件
硬件
加速比
软件
硬件
加速比
128
1901
1047
1.82
6625
1115
5.58
512
5603
1090
5.14
8904
1147
7.76
1 K
9327
1157
8.06
9953
1180
8.43
3 K
25132
1487
16.9
12360
1260
9.81
4 K(*)
32469
1620
20.0
13,531
1316
10.3
8 K(*)
61162
2207
27.2
15,344
1393
11.0
表2. BLASTp和BLASTn算法加速效果(时间单位:毫秒)
搜索时间增长很缓慢。主要原因是随着序列长度的增加,软件建立索引表和执行搜索过程的开销都显著增加,而硬件搜索时间等于数据库流过阵列的时间加上停顿时间(即L + S + 停顿时间)。在这三个参数中,S为常量(等于数据库包含的字符数),与S相比,请求序列长度L的增加可以忽略不计,而停顿时间与种子数量直接相关。由于采用了多种优化策略施,在目标数据库确定的情况下,L的增加并不会导致种子数量和扩展开销急剧增加,所以加速比会相应增大。当输入序列长度为3072 bps时,使用算法加速器执行BLASTp和BLASTn程序搜索目标数据库,可分别获得约17倍和10倍的加速比;而模拟结果显示,当序列长度为8 Kbps时,可获得27倍和11倍的加速效果。
5.3. 性能功耗比
目前单个微处理器的平均功耗约为70 W~ 95 W [25] ,而使用Xilinx ISE Xpower工具的功耗分析结果表明,算法加速器功耗不超过10 W,XC5VLX330芯片的最大功耗小于30 W,仅为微处理器功耗的30%左右。因此就性能功耗比而言,基于FPGA平台的细粒度并行BLAST算法加速器方案明显优于传统的基于通用微处理器的解决方案。
夏 飞,金国庆,张春雷, (2015) 具有同时多种子检测与并行扩展能力的BLAST算法加速器研究The Research on BLAST Accelerator with Parallel Multi-Seeds Detection and Extension. 计算生物学,01,1-10. doi: 10.12677/HJCB.2015.51001
参考文献 (References)ReferencesAltschul, S.F., Gish, W., Miller, M., Myers, E.W. and Lipman, D.J. (1990) Basic local alignment search tool. Journal of Molecular Biology, 215, 403-410.Kim, T.K., et al. (2005) HGBS: A hardware-oriented grid blast system. Proceedings of 5th International Symposium on Cluster Computing and the Grid, 1, 520-526.Yutao, Q. and Feng, L. (2003) Cyberparablast: The parallelized blast web server. Proceedings of 2nd International Conference on Cyber-worlds (CW’03), 3-5 December 2003, 474-477.Farmerie, W.G., Hammer, J., Liu, L. and Schneider, M. (2003) Having a blast: Analyzing gene sequence data with blastquest. Proceedings of 14th International Workshop on Database and Expert Systems Applications (DEXA’03), Prague, 1-5 September 2003, 37-41.Bjornson, R., Sherman, A., Weston, S., Willard, N. and Wing, J. (2002) Turboblast: A parallel implementation of blast built on the turbo-hub. Proceedings of 16th International Parallel and Distributed Processing Symposium (IPDPS’02), Ft. Lauderdale, 15-19 April 2002, 183-190.Lin, H., Ma, X., Chandramohan, P., Geist, A. and Samatova, N. (2005) Efficient data access for parallel blast. Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS’05), 1, 8 p.Oehmen, C. and Nieplocha, J. (2006) Scalablast: A Scalable Implementation of BLAST for High-Performance Data- Intensive Bioinformatics Analysis. IEEE Transaction on Parallel and Distributed Systems, 17, 740-749.National Center for Biotechnology Information (2014) Notes on GenBankstatistics.
http://www.ncbi.nlm.nih.gov/genbank/statisticsBuhler, J.D., Lancaster, J.M., Jacob, A.C. and Chamberlain, R.D. (2007) Mercury Blastn: Faster DNA sequence comparison using a streaming hardware architecture. Proceedings of 3rd Annual Reconfigurable Systems Summer Institute (RSSI’07), Urbana, 17-20 July 2007, 10 p.Krishnanurthy, P., Buhler, J., Chamberlain, R., et al. (2007) Biosequence similarity search on the mercury system. Proceedings of 15th IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP’04), 365-375. Journal of VLSI Signal Process System, 49, 101-121.Lancaster, J.M., Buhler, J.D. and Chamberlain, R.D. (2009) Acceleration of ungapped extension in Mercury BLAST. Journal of Microprocessors & Microsystems, 33, 281-289.Jacob, A., Lancaster, J., Buhler, J. and Chamberlain, R.D. (2007) FPGA-accelerated seed generation in Mercury BLASTP. Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, Napa, 23-25 April 2007, 95-106.Muriki, K. and Underwood, K.D. and Sass, R. (2005) RC-BLAST: Towards a portable, cost-effective open source hardware implementation. Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium, Denver, 4-8 April 2005, 196-203.Herbordt, M.C., Model, J., Gu, Y.F., Sukhwani, B. and Van Court, T. (2006) Single pass, blast-like, approximate string matching on FPGAs. Proceedings of 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, Napa, 24-26 April 2006, 217-226.Lavenier, D., Xinchun, L. and Georges, G. (2006) Seed-based genomic sequence comparison using a FPGA/FLASH accelerator. Proceedings of IEEE International Conference on Field Programmable Technology, Bangkok, 13-15 December 2006, 41-48.Sotiriades, E. and Dollas, A. (2007) Design space exploration for the BLAST algorithm implementation. Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, Napa, 23-25 April 2007, 323-326.Sotiriades, E., Kozanitis, C. and Dollas, A. (2006) FPGA based architecture for DNA sequence comparison and database search. Proceedings of the 20th IEEE International Parallel and Distributed Processing Symposium, Rhodes Island, 25-29 April 2006, 186-193.Chang, C. (2012) BLAST implementation on BEE2. Technical Report, Electrical Engineering and Computer Science University of California at Berkeley. http://bee2.eecs.berkeley.eduCLC desktop Hardware-acceleration. White Paper on CLC Bioinformatics Cube, 2006. http://www.clccube.comMitrion. Inc. (2009) Ncbi blast accelerator. http://www.mitrionics.comTimelogic Biocomputing Solisions (2010) Timelogic decypher BLAST engine intro-duction.
http://www.timelogic.com/decypher_blast.htmlNCBI BLAST software download, 2010. ftp://ftp.ncbi.nih.gov/blast/European Bioinformatics Institute, EBI (2008) EBI database web.
http://www.ebi.ac.uk/uniprot/database/download.htmlNational Center for Biotechnology Information (2009) BLAST database web. http://blast.ncbi.nlm.nih.gov/Blast.cgiGeneral purpose micro-processor power dissipation statistic. List of CPU power dissipation figures, 2014.
http://en.wikipedia.org/wiki/ListofCPUpowerdissipation