Author: Bamfa Ceesay (班法)
Thesis title: Event Extraction for Gene Regulation Network Using Statistical and Semantic Approaches
Advisor: 侯文娟
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2014
Academic year of graduation: 102
Language: English
Number of pages: 43
Keywords: Biological event extraction, Gene regulation network, Graph-based approach, Semantic approach, Statistical approach
Thesis type: Academic thesis
  • Genic regulation networks are the primary object of study in systems biology. They allow a better understanding of the relationship between molecular mechanisms and cellular behavior. However, one of the bottlenecks in systems biology is the acquisition of an accurate genetic regulation network. In recent years, the BioNLP community has produced systems for extracting genic interactions and Protein-Protein Interactions (PPI) from the literature. The sporulation network of the model bacterium Bacillus subtilis is very well studied. The automatic design of the gene regulation network is one of the main challenges in biology, because it is a crucial step forward in understanding the cellular regulation system.
    In this study, we describe a system for extracting a Gene Regulation Network (GRN) in bacteria, using data from the BioNLP’13 Shared Task (BioNLP-ST) on event extraction. We first propose a procedure for biological event extraction that combines a dependency graph-based method with a method based on semantic analysis in Natural Language Processing (NLP). We then experiment with a second design, a statistical approach using a Hidden Markov Model (HMM).
    Dependency parsing is a widely used approach to finding the dependency relationships between tokens, for example within a sentence. We use dependency features to identify and classify event trigger tokens with a multi-class Support Vector Machine (SVMLight Multi-class). However, dependency features alone are not sufficient to capture the semantic relationships between tokens within a sentence. Therefore, we develop a semantic analysis approach based on NLP techniques to capture more detailed information and improve our event extraction results.
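As an illustration of what dependency features for a candidate trigger token can look like, here is a minimal sketch. It is not the thesis's feature set, which the abstract does not specify: the `Token` type, the `dependency_features` helper, the feature names, the placeholder gene names, and the hand-written parse are all assumptions made for this example. In practice the parse would come from a dependency parser and the resulting features would be passed to a multi-class classifier.

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    text: str  # surface form of the token
    head: int  # index of the governing token, -1 for the root
    rel: str   # dependency label linking this token to its head


def dependency_features(sentence: List[Token], i: int) -> Dict[str, int]:
    """Binary features describing token i and its dependency neighbourhood.

    Hypothetical feature scheme, for illustration only.
    """
    tok = sentence[i]
    feats = {f"word={tok.text.lower()}": 1, f"rel_to_head={tok.rel}": 1}
    if tok.head >= 0:
        # The governing word often signals the role a token plays.
        feats[f"head_word={sentence[tok.head].text.lower()}"] = 1
    for child in sentence:
        if child.head == i:
            # Labels of outgoing edges hint at the arguments of a trigger.
            feats[f"child_rel={child.rel}"] = 1
    return feats


# Hand-written parse of "GeneA activates transcription of GeneB".
# Gene names and dependency labels are placeholders, not corpus data.
SENTENCE = [
    Token("GeneA", 1, "nsubj"),
    Token("activates", -1, "root"),
    Token("transcription", 1, "dobj"),
    Token("of", 4, "case"),
    Token("GeneB", 2, "nmod"),
]

features = dependency_features(SENTENCE, 1)  # features for "activates"
```

Each candidate token yields one sparse feature dictionary, which is the usual input shape for a linear multi-class classifier after mapping feature names to indices.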
    In our second design, we use a general statistical method based on Markov logic instead of developing dedicated inference and learning algorithms. Markov models have achieved significant recognition in Natural Language Processing, especially in the field of speech recognition.
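To make the idea of a Markov-model tagger concrete, the following sketch runs Viterbi decoding over a two-state Hidden Markov Model that labels each token as a trigger or not. It is a generic textbook HMM, not the model used in the thesis: the abstract gives neither the states nor the parameters, so the state names, all probabilities, the `UNSEEN` fallback value, and the example sentence are assumptions chosen only so the example runs.

```python
import math
from typing import List

# All values below are made-up illustration parameters, not learned from data.
STATES = ["O", "TRIGGER"]
START = {"O": 0.8, "TRIGGER": 0.2}
TRANS = {
    "O": {"O": 0.7, "TRIGGER": 0.3},
    "TRIGGER": {"O": 0.9, "TRIGGER": 0.1},
}
EMIT = {
    "O": {"genea": 0.3, "geneb": 0.3, "of": 0.3,
          "activates": 0.05, "transcription": 0.05},
    "TRIGGER": {"activates": 0.6, "transcription": 0.3, "inhibits": 0.1},
}
UNSEEN = 1e-6  # fallback emission probability for words a state never emits


def viterbi(words: List[str]) -> List[str]:
    """Return the most probable state sequence for the given words."""
    if not words:
        return []
    obs = [w.lower() for w in words]

    def emit(state: str, word: str) -> float:
        return math.log(EMIT[state].get(word, UNSEEN))

    # scores[t][s]: best log-probability of any path ending in state s at step t
    scores = [{s: math.log(START[s]) + emit(s, obs[0]) for s in STATES}]
    back = []  # back[t-1][s]: best predecessor of state s at step t
    for word in obs[1:]:
        prev = scores[-1]
        row, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            row[s] = prev[best] + math.log(TRANS[best][s]) + emit(s, word)
            ptr[s] = best
        scores.append(row)
        back.append(ptr)

    # Follow the back-pointers from the best final state.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


SENTENCE = ["GeneA", "activates", "transcription", "of", "GeneB"]
path = viterbi(SENTENCE)
```

Working in log space avoids numerical underflow on long sentences. With these toy parameters, only "activates" is decoded as a trigger, because the low TRIGGER-to-TRIGGER transition probability penalises two adjacent triggers.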
    Our results show that the graph-based approach performs better on event extraction and produces a much better regulation network than the semantic analysis method. Combining the two approaches gives a slightly better result than either approach alone. Moreover, the proposed statistical approach achieves a much better result than both the individual and the combined results of our graph-based and semantic analysis approaches.

    Abstract
    Acknowledgments
    Table of Contents
    List of Tables
    List of Figures
    1. Introduction
    2. Related Works
    3. Architecture Overview
        3.1 Overall Architecture
        3.2 Experimental Data
    4. Event Extraction Using Graph-Based Feature Sets
        4.1 Parsing and Preprocessing
        4.2 Graph Representation
        4.3 Trigger Detection
        4.4 Edge Detection
        4.5 Semantic Processing
        4.6 Experimental Result and Discussion
            4.6.1 Evaluation Metrics
            4.6.2 Result of the Event Extraction Using Graph-Based Feature Sets
    5. Semantic Annotation Based on Natural Language Processing
        5.1 Parsing and Preprocessing
        5.2 Syntactic Annotation
        5.3 Syntactic Analysis
        5.4 Semantic Analysis
            5.4.1 Triggers in a Noun Form
            5.4.2 Triggers in a Verb Form
        5.5 Experimental Results and Discussion
            5.5.1 Result of Semantic Annotation Based on Natural Language Processing
    6. Statistical Approach for Biological Event Extraction Using Markov’s Method
        6.1 Preprocessing
        6.2 Markov’s Logic Network for Event Prediction
        6.3 Extraction of Events Using Logical Formulae
        6.4 Experimental Results and Discussion
    7. Conclusion
    References

    BioNLP Shared Task. Available from http://2013.bionlp-st.org/

    Björne, Jari, Heimonen, Juho, Ginter, Filip, Airola, Antti, Pahikkala, Tapio and Salakoski, Tapio (2011). Extracting Contextualized Complex Biological Events with Rich Graph-based Feature Sets, Computational Intelligence, 27 (4), pages 541–557, 2011.

    Bui, Quoc-Chinh and Sloot, Peter M.A. (2011). Extracting Biological Events from Text Using Simple Syntactic Patterns, Proceedings of BioNLP Shared Task 2011 Workshop, pages 143–146, 2011.

    Gene Ontology. Available from http://www.geneontology.org/

    Huang, Yu-Ting, Yeh, Hsiang-Yuan, Cheng, Shih-Wu, Tu, Chien-Chih, Kuo, Chi-Li and Soo, Von-Wun (2006). Automatic Extraction of Information about the Molecular Interactions in Biological Pathways from Texts Based on Ontology and Semantic Processing, IEEE International Conference on Systems, Man, and Cybernetics, pages 3679–3684, 2006.

    Makhoul, John, Kubala, Francis, Schwartz, Richard and Weischedel, Ralph (1999). Performance Measures for Information Extraction, Proceedings of DARPA Broadcast News Workshop, pages 249–252, 1999.
    Martinez, David and Baldwin, Timothy (2011). Word Sense Disambiguation for Event Trigger Word Detection in Biomedicine, Proceedings of the Fourth International Workshop on Data and Text Mining in Biomedical Informatics (DTMBio 2010), BMC Bioinformatics 2011, 12 (Suppl 2): S4, 2011.
    McClosky, David, Riedel, Sebastian, Surdeanu, Mihai, McCallum, Andrew and Manning, Christopher D. (2012). Combining Joint Models for Biomedical Event Extraction, BMC Bioinformatics 2012, 13(Suppl 11): S9, 2012.
    McClosky, David, Surdeanu, Mihai and Manning, Christopher D. (2011). Event Extraction as Dependency Parsing, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 1626–1635, 2011.
    MeSH. Available from http://www.ncbi.nlm.nih.gov/mesh
    PubMed. Available from http://www.ncbi.nlm.nih.gov/pubmed
    Richardson, Matthew and Domingos, Pedro (2006). Markov Logic Networks, Machine Learning, Volume 62, pages 107–136, 2006.
    Riedel, Sebastian, Sætre, Rune, Chun, Hong-Woo, Takagi, Toshihisa and Tsujii, Jun’ichi (2011). Bio-Molecular Event Extraction with Markov Logic, Computational Intelligence, 27 (4), pages 558–582, 2011.
    Stanford full parser. Available from http://nlp.stanford.edu/software/lex-parser.shtml
    Stopword List. Available from http://jmlr.org/papers/volume5/lewis04a/aa11-smart-stop-list/english.stop
    SVMLight Multi-class. Available from http://www.cs.cornell.edu/people/tj/svm_light/svm_multiclass.html
    Tsochantaridis, Ioannis, Hofmann, Thomas, Joachims, Thorsten and Altun, Yasemin (2004). Support Vector Machine Learning for Interdependent and Structured Output Spaces, Proceedings of the Twenty-first International Conference on Machine Learning (ICML’04), ACM, Banff, Canada, pages 104–111, 2004.
    WordNet. Available from http://wordnet.princeton.edu/
