| 研究生 (Graduate Student) | 石敬弘 Shih, Chin-Hong |
|---|---|
| 論文名稱 (Thesis Title) | 基於類神經之關聯詞向量表示於文本分類任務之研究 Neural Relevance-aware Embedding For Text Categorization |
| 指導教授 (Advisor) | 陳柏琳 Chen, Berlin |
| 學位類別 (Degree) | 碩士 Master |
| 系所名稱 (Department) | 資訊工程學系 Department of Computer Science and Information Engineering |
| 論文出版年 (Year of Publication) | 2017 |
| 畢業學年度 (Academic Year of Graduation) | 105 (ROC calendar) |
| 語文別 (Language) | 中文 Chinese |
| 論文頁數 (Number of Pages) | 75 |
| 中文關鍵詞 (Chinese Keywords) | 文本分類、表示學習、深度學習、連體網路、生成式對抗網路 |
| 英文關鍵詞 (English Keywords) | Text Categorization, Representation Learning, Deep Learning, Siamese Networks, Generative Adversarial Networks |
| DOI URL | https://doi.org/10.6345/NTNU202202684 |
| 論文種類 (Thesis Type) | 學術論文 Academic thesis |
With the vigorous development of the Internet, the demand for accessing text data online grows by the day, and text categorization has therefore become a very active research topic in natural language processing. At present, the most central problem in text categorization is the choice of feature representation: most studies use the bag-of-words (BoW) model to represent documents, but the BoW model cannot effectively capture the relationships between words and therefore loses much of the text's semantics.
In this thesis, we employ two novel neural network architectures, Siamese networks (Siamese Nets) and generative adversarial networks (Generative Adversarial Nets), so that the model learns more robust and semantically rich feature representations during training. The experiments are carried out on well-known classification benchmarks, the IMDB movie-review dataset and the 20 Newsgroups dataset; a series of sentiment-analysis and topic-classification results shows that the feature representations learned by these neural networks can effectively improve text categorization performance.
With the rapid growth of text data accessible on the Internet, text categorization (or classification) has emerged as an important and active research topic in the natural language processing (NLP) community, with many applications. Currently, the foremost problem in text categorization is feature representation, which is commonly based on the bag-of-words (BoW) model, where word unigrams, bigrams (n-grams), or specifically designed patterns are typically extracted as component features. It has been noted that the loss of word order caused by BoW representations is particularly problematic for document categorization.
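The word-order problem can be made concrete with a small illustration (not taken from the thesis): the Python sketch below builds unigram BoW vectors for two sentences that differ only in word order and shows that the model assigns them identical representations.

```python
from collections import Counter

def bow_vector(text, vocabulary):
    """Count unigram occurrences over a fixed vocabulary (bag-of-words)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

# Two sentences with opposite meanings but identical word counts.
sent_a = "the movie was good not bad"
sent_b = "the movie was bad not good"

vocab = sorted(set(sent_a.lower().split()) | set(sent_b.lower().split()))
vec_a = bow_vector(sent_a, vocab)
vec_b = bow_vector(sent_b, vocab)

print(vocab)           # ['bad', 'good', 'movie', 'not', 'the', 'was']
print(vec_a == vec_b)  # True: the BoW model cannot distinguish the two sentences
```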
In order to leverage the influence of word order and proximity information on text categorization tasks, we explore a novel use of Siamese networks and generative adversarial networks for document representation and text categorization. In experiments conducted on two benchmark text categorization tasks, viz. IMDB and 20 Newsgroups, we take advantage of these architectures to learn distributed vector representations of documents that reflect semantic relatedness, and the results show that they effectively improve categorization performance.
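The abstract above does not include implementation details, so the following PyTorch sketch is only a hypothetical illustration of the Siamese idea it describes: two documents are encoded by the same weight-shared network, and a contrastive-style objective pulls same-class pairs together while pushing different-class pairs apart. All layer sizes, names, and hyperparameters are assumptions; the generative-adversarial variant mentioned above is not sketched here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DocEncoder(nn.Module):
    """Shared encoder: averaged word embeddings followed by a small MLP.
    Vocabulary size and dimensions are illustrative assumptions."""
    def __init__(self, vocab_size=20000, embed_dim=128, hidden_dim=64):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, token_ids):
        return self.mlp(self.embed(token_ids))

def contrastive_loss(z1, z2, same_class, margin=1.0):
    """Pull same-class document pairs together; push different-class pairs
    at least `margin` apart (a Chopra et al.-style contrastive objective)."""
    dist = F.pairwise_distance(z1, z2)
    pos = same_class * dist.pow(2)
    neg = (1 - same_class) * F.relu(margin - dist).pow(2)
    return (pos + neg).mean()

# Toy usage: two batches of documents given as token-id matrices.
encoder = DocEncoder()
doc_a = torch.randint(0, 20000, (8, 50))        # 8 documents, 50 tokens each
doc_b = torch.randint(0, 20000, (8, 50))
same_class = torch.randint(0, 2, (8,)).float()  # 1 = same label, 0 = different

loss = contrastive_loss(encoder(doc_a), encoder(doc_b), same_class)
loss.backward()  # gradients flow into the single encoder shared by both branches
```

Sharing one encoder across both branches is what makes the learned document vectors comparable under a single distance metric; the class labels enter only through the pairing signal, so the resulting representation can later be fed to any downstream text classifier.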