| Graduate Student: | 周伯冠 JHOU, BO-GUAN |
|---|---|
| Thesis Title: | 使用深度學習方法於產品評論之建議探勘 Mining Suggestions from Product Reviews by Deep Learning Methods |
| Advisor: | 侯文娟 Hou, Wen-Juan |
| Degree: | Master |
| Department: | 資訊工程學系 Department of Computer Science and Information Engineering |
| Year of Publication: | 2019 |
| Academic Year of Graduation: | 107 |
| Language: | Chinese |
| Number of Pages: | 71 |
| Keywords (Chinese): | 深度學習、產品評論、建議探勘、詞向量、詞頻率 |
| Keywords (English): | Deep Learning, Product Reviews, Suggestion Mining, Word Vector, Word Frequency |
| DOI URL: | http://doi.org/10.6345/NTNU201900799 |
| Type of Thesis: | Academic thesis |
| Access Counts: | Views: 155, Downloads: 23 |
With the spread of the Internet, information has become ever easier to find. In particular, product reviews are now shared openly in online communities as social networks have grown. These communities include forums, social networking sites, and official product websites; some organizations even specialize in collecting such reviews into dedicated review websites, where the reviews are categorized for users to browse. On these sites, consumers can examine how a product performs and whether it meets their needs before deciding to buy; product providers can likewise keep gathering users' experiences from the reviews, iterating on the design and improving the product to meet general demand.

This study classifies review sentences into suggestions and non-suggestions. The Stanford CoreNLP word segmenter is used to process the text at the word level. Two word representations are used: word vectors and word frequencies. The models are deep neural networks: Fully Connected Neural Networks (FCNN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) networks. This study standardizes the word-frequency representation with the Z-Score and trains it with a fully connected network; this architecture performs about as well as word vectors fed to a CNN or an LSTM while being much faster. In addition, this study combines the word-vector and word-frequency representations, trains them on the three models above, and compares the results. Performance is evaluated with Precision, Recall, and the F1-measure (F1).
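As a minimal sketch of the word-frequency pathway described above, the snippet below builds term-count features, standardizes them with the Z-Score, and trains a small fully connected network. It uses scikit-learn and Keras (TensorFlow); the toy reviews, vocabulary handling, layer sizes, and training settings are illustrative assumptions rather than the configuration used in the thesis.

```python
# Sketch: Z-Score-normalized word-frequency features + fully connected network.
# Layer sizes, optimizer, and the toy data are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import StandardScaler
from tensorflow.keras import layers, models

reviews = ["I wish the battery lasted longer",          # suggestion
           "The camera quality is great",                # non-suggestion
           "It would be better with a faster charger",   # suggestion
           "I bought it last week"]                      # non-suggestion
labels = np.array([1, 0, 1, 0])

# Word-frequency representation: raw term counts per review.
counts = CountVectorizer().fit_transform(reviews).toarray().astype("float32")

# Z-Score standardization: (x - mean) / std for every vocabulary term.
features = StandardScaler().fit_transform(counts)

# Small fully connected (dense) classifier.
model = models.Sequential([
    layers.Input(shape=(features.shape[1],)),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(features, labels, epochs=10, batch_size=2, verbose=0)
```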
With the popularity of the Internet, much information can be searched easily. Information related to product reviews is also published in online communities, such as forums, social media, and official product websites. Some organizations even collect these comments to build review websites and classify the comments for potential customers. Customers can read about the usage and experience of a product to determine whether it satisfies their needs and then decide whether to purchase it. Product providers can also collect users' usage and experience through the comments, iterating on the product design to improve the product and satisfy customers' needs.

This study divided comments into two classes, suggestion and non-suggestion. The Stanford CoreNLP toolkit segmented the corpus into word units. The word representations were divided into two types, word vectors and word frequencies. The models are deep neural networks, including Fully Connected Neural Networks (FCNN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) networks. This study normalized the word frequencies with the Z-Score and trained them with a fully connected neural network. The performance of this architecture is similar to that of word vectors fed to convolutional or long short-term memory networks, but it is much faster. In addition, this study combines the word-vector and word-frequency representations, trains them on the three models above, and compares the results. Performance is evaluated with Precision, Recall, and the F1-measure (F1).
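For the word-vector pathway and the evaluation measures mentioned above, the following sketch feeds integer word indices through an embedding layer into a small 1-D CNN and scores the predictions with Precision, Recall, and F1. The TextVectorization vocabulary size, embedding width, filter settings, and toy data are assumptions for illustration only; the thesis' actual word vectors, hyperparameters, and data are not reproduced here.

```python
# Sketch: word-vector pathway with a small 1-D CNN, scored by Precision/Recall/F1.
# Vocabulary size, embedding width, filters, and the toy data are assumptions.
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score
from tensorflow.keras import layers, models

reviews = np.array(["I wish the battery lasted longer",
                    "The camera quality is great",
                    "It would be better with a faster charger",
                    "I bought it last week"])
labels = np.array([1, 0, 1, 0])            # 1 = suggestion, 0 = non-suggestion

# Map words to integer indices and pad every review to a fixed length.
vectorize = layers.TextVectorization(max_tokens=1000, output_sequence_length=12)
vectorize.adapt(reviews)
seqs = vectorize(reviews)

# Embedding layer learns the word vectors; Conv1D scans word-level n-grams.
model = models.Sequential([
    layers.Input(shape=(12,), dtype="int32"),
    layers.Embedding(input_dim=1000, output_dim=50),
    layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(seqs, labels, epochs=10, batch_size=2, verbose=0)

# Evaluate with the three measures used in the study.
pred = (model.predict(seqs, verbose=0) > 0.5).astype(int).ravel()
print("Precision:", precision_score(labels, pred, zero_division=0))
print("Recall:   ", recall_score(labels, pred, zero_division=0))
print("F1:       ", f1_score(labels, pred, zero_division=0))
```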