1. Research Purpose and Significance (Literature Review)
1.1 Research Purpose and Significance
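In recent years, images have drawn growing attention from researchers as the most important carrier of experimental results in the biological literature. With the arrival of the big-data era, however, the volume of biological literature is growing exponentially year over year, and it is increasingly difficult for researchers to keep up manually with newly published papers. As data mining technology matures, using text mining to extract and organize the content of articles has become increasingly popular. At the same time, the variety of image types and the richness of the information they carry are prompting researchers to extract more useful information from images as well. Mining biological literature by combining textual and image information in order to build a structured database is therefore of great practical value.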
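This project takes large-scale biological literature as its research setting and combines image processing with text mining. Its focus is to mine information from both the text and the images of biological papers, and to build a structured biological literature knowledge base system from the mined information. In this project, images are first extracted from each input paper to obtain all of its figures. Image recognition then filters out figures that are not bar charts, and image segmentation separates the sub-figures identified as bar charts from each remaining figure. Text mining is then applied to extract information from the images and the text. Finally, the extracted information is structured, and the result is the constructed database.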
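The stages described above can be sketched as a small pipeline. This is only an illustrative skeleton under stated assumptions, not the project's implementation: the names `Figure`, `Record`, `classify_by_caption`, `split_panels`, and `run_pipeline` are hypothetical, and the classifier and segmenter are caption-based placeholders standing in for the real image recognition and image segmentation steps.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Figure:
    """One image extracted from a paper (all fields are illustrative)."""
    paper_id: str
    figure_id: str
    caption: str
    kind: str = "unknown"                            # label set by the classifier
    panels: List[str] = field(default_factory=list)  # sub-figure labels


@dataclass
class Record:
    """One structured row destined for the knowledge base."""
    paper_id: str
    figure_id: str
    panel_id: str
    caption: str


def classify_by_caption(fig: Figure) -> str:
    """Placeholder classifier: keyword match on the caption.

    A real system would classify the image itself; this stub only
    stands in for that step so the pipeline can run end to end.
    """
    text = fig.caption.lower()
    if "bar chart" in text or "bar graph" in text:
        return "bar_chart"
    return "other"


def split_panels(fig: Figure) -> List[str]:
    """Placeholder segmentation: read panel labels such as "(A)" from the caption."""
    labels = [c for c in "ABCDEFGH" if f"({c})" in fig.caption]
    return labels or ["A"]  # an unlabelled figure is treated as one panel


def run_pipeline(
    figures: Iterable[Figure],
    classify: Callable[[Figure], str] = classify_by_caption,
    segment: Callable[[Figure], List[str]] = split_panels,
) -> List[Record]:
    """Classify, filter, segment, and structure the extracted figures."""
    records: List[Record] = []
    for fig in figures:
        fig.kind = classify(fig)
        if fig.kind != "bar_chart":
            continue  # drop figures that are not bar charts
        fig.panels = segment(fig)
        for panel in fig.panels:
            records.append(Record(fig.paper_id, fig.figure_id, panel, fig.caption))
    return records


if __name__ == "__main__":
    # Hypothetical input: two figures from one made-up paper.
    demo = [
        Figure("paper-1", "fig1", "Bar chart of expression levels. (A) control (B) treated"),
        Figure("paper-1", "fig2", "Western blot of protein X"),
    ]
    for rec in run_pipeline(demo):
        print(rec)
```

Passing the classifier and segmenter as parameters keeps the control flow independent of the recognition method, so the placeholders can later be swapped for real image-based components without changing `run_pipeline`.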
2. Research Content and Approach
2.1 Objective (Overview of the System to Be Developed)
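This project takes large-scale biological literature as its research setting and combines image processing with text mining. Its focus is to mine information from biological papers and the images they contain, with the goal of building a data-mining-based structured knowledge base system for biological literature.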
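3. Research Plan and Schedule
(1) Weeks 1-2: Survey the relevant literature, become familiar with related research, and settle on the topic;
(2) Weeks 3-6: Read the literature in more depth, analyze and summarize it, complete reading notes on the references, settle on a preliminary technical approach, and complete and submit the proposal report;
(3) Weeks 7-12: Complete the requirements analysis; analyze, design, and implement the algorithms for each part of the system, then optimize them further;
(4) Weeks 13-14: Write the first draft of the thesis, revise it, finalize it, and submit it for review;
(5) Week 15: Prepare for the thesis defense.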