
Author: 洪振恩 (Hung, Jhen-EN)
Thesis title: 旅遊飯店評論關注面向分析 (Analysis of Aspects from Travel Hotel Reviews)
Advisor: 侯文娟 (Hou, Wen-Juan)
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: Chinese
Number of pages: 64
Chinese keywords: 旅遊評論, 面向意見探勘, 監督式學習, 自然語言處理, 機器學習, 特徵選取
English keywords: Travel reviews, Aspect opinion mining, Supervised learning, NLP, Machine learning, Feature selection
DOI URL: http://doi.org/10.6345/NTNU201900478
Thesis type: Academic thesis
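  • In an era of increasingly pervasive Internet access, consumer reviews are easy to browse online, and many people habitually look up reviews before shopping, booking a room, or reserving a table so that what they buy meets their needs. Businesses likewise hope that customers will leave feedback after a purchase or visit, because reviews attract attention and show where quality should be maintained or improved. Each review carries the user's opinions and usually touches on several different aspects. The primary goal of this study is therefore to automatically identify, from a large volume of user reviews, the aspects of concern that characterize each business. A second goal is to automatically classify aspect terms using a range of features, so that consumers can see which aspect category each key aspect term in a business's reviews belongs to.
    The data come from TripAdvisor, an international travel review website. The training data are drawn from seven well-known hotels in Changhua City, and the test data from two hotels in Taipei. The reviews are first segmented with the Academia Sinica Chinese word segmentation system, and key features are then extracted from them. The study has two purposes. The first is to extract the aspect terms and adjectives from each review, automatically assign them to four aspect categories, and tally which category the aspects mentioned for each hotel fall into, so that consumers can quickly see each hotel's strengths.
    The second purpose is to classify aspect terms automatically: all aspect terms that appear in the reviews are mapped into a vector space and combined with the adjectives that co-occur near each term. A user who wants to read reviews about a particular aspect then does not have to read every review, but can quickly find the relevant ones through the four automatically classified topic aspects. For the aspect-of-concern task, an SVM is used to train the model and predict results, and it achieves good accuracy.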
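The first purpose described above (assigning aspect terms to four categories and tallying them per hotel) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the thesis trains an SVM, whereas this sketch substitutes a plain lexicon lookup to show only the tallying step. The category names (`room`, `service`, `location`, `food`), the lexicon entries, the POS tags `N` and `A`, the hotel names, and the function names are all hypothetical, since the record does not list the actual categories or features. The input is assumed to be already segmented and POS-tagged.

```python
from collections import Counter, defaultdict

# Hypothetical seed lexicon: aspect term -> aspect category.
# The four category names are placeholders, not taken from the thesis.
ASPECT_LEXICON = {
    "房間": "room",
    "床": "room",
    "服務": "service",
    "櫃台": "service",
    "位置": "location",
    "車站": "location",
    "早餐": "food",
    "餐廳": "food",
}


def count_aspect_categories(reviews):
    """Count aspect-category mentions per hotel.

    `reviews` is an iterable of (hotel_name, tokens) pairs, where `tokens`
    is a list of (word, pos_tag) pairs produced by a word segmenter.
    """
    counts = defaultdict(Counter)
    for hotel, tokens in reviews:
        for word, pos in tokens:
            # Only nouns found in the lexicon are treated as aspect terms.
            if pos == "N" and word in ASPECT_LEXICON:
                counts[hotel][ASPECT_LEXICON[word]] += 1
    return counts


def dominant_category(counts, hotel):
    """Return the most frequently mentioned category for a hotel, or None."""
    hotel_counts = counts.get(hotel)
    if not hotel_counts:
        return None
    return hotel_counts.most_common(1)[0][0]


# Toy, made-up reviews (already segmented and tagged).
reviews = [
    ("Hotel A", [("房間", "N"), ("乾淨", "A"), ("早餐", "N"), ("好吃", "A")]),
    ("Hotel A", [("房間", "N"), ("很", "D"), ("大", "A")]),
    ("Hotel B", [("服務", "N"), ("親切", "A")]),
]
counts = count_aspect_categories(reviews)
print(dominant_category(counts, "Hotel A"))
print(dominant_category(counts, "Hotel B"))
```

In a supervised setup like the one the abstract describes, the lexicon lookup would be replaced by a trained classifier; the per-hotel tally would stay the same.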

    With the rapid development of the Internet, consumer reviews have become easy to browse. Many people habitually look up reviews online before making a purchase, booking a room, or reserving a table; the reviews inform the final decision and help ensure that a purchase meets their needs. Businesses also hope that customers will leave feedback after a purchase or visit, because reviews attract attention and show where quality should be maintained or improved. Each review contains the user's opinions and usually covers several different aspects. The first goal of this study is therefore to automatically identify, from a large number of user reviews, the aspects of concern for each business. The second goal is to find suitable features for automatically categorizing aspect terms, so that consumers can see which aspect category each key aspect term in a business's reviews belongs to.
    The dataset used in this thesis comes from TripAdvisor, an international travel review website. The training data are drawn from 7 well-known hotels in Changhua, and the test data from 2 hotels in Taipei. The reviews are first segmented with the Academia Sinica Chinese word segmentation system, and key features are then extracted from them. The study has two purposes. The first is to extract the aspect terms and adjectives from each review, automatically classify them into four aspect categories, and count which category each hotel's aspects fall into, so that consumers can quickly see each hotel's strengths.
    The second purpose is to automatically classify the aspect terms: all aspect terms that appear in the reviews are mapped into a vector space and combined with the adjectives that appear near each aspect term. Users who want to read reviews about a specific aspect do not need to read every review. Instead, they can quickly find the relevant reviews through the four automatically classified categories and examine each aspect in more detail. For aspect categorization, an SVM is used to train the models and predict results, and it reaches good accuracy.
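The idea of representing each aspect term by the adjectives that appear near it can be sketched as follows. This is an illustrative simplification, not the thesis's method: the thesis maps aspect terms into a vector space with Word2vec and classifies them with an SVM, whereas this sketch swaps in raw co-occurrence counts and cosine similarity. The window size, the POS tag `A` for adjectives, the sample sentences, and the function names are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def adjective_context_vectors(sentences, aspect_terms, window=2):
    """Map each aspect term to a Counter of adjectives within `window` tokens.

    `sentences` is a list of token lists; each token is a (word, pos_tag) pair.
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, (word, _pos) in enumerate(tokens):
            if word not in aspect_terms:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                neighbor, neighbor_pos = tokens[j]
                # Count only neighboring adjectives, never the term itself.
                if j != i and neighbor_pos == "A":
                    vectors[word][neighbor] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy, made-up sentences (already segmented and tagged).
sentences = [
    [("房間", "N"), ("乾淨", "A")],
    [("浴室", "N"), ("乾淨", "A")],
    [("服務", "N"), ("親切", "A")],
]
vectors = adjective_context_vectors(sentences, {"房間", "浴室", "服務"})
print(cosine(vectors["房間"], vectors["浴室"]))
print(cosine(vectors["房間"], vectors["服務"]))
```

Aspect terms that attract the same adjectives end up with similar vectors, which is the intuition behind using nearby adjectives as features; a trained classifier would then take such vectors as input.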

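    Abstract (Chinese)
    Abstract (English)
    List of Tables
    List of Figures
    Chapter 1: Introduction
        Section 1: Research Background and Motivation
        Section 2: Research Objectives
        Section 3: Thesis Organization
    Chapter 2: Literature Review
        Section 1: Chinese Word Segmentation Systems
        Section 2: Sentiment Analysis
        Section 3: SemEval-2014 Task 4
        Section 4: Support Vector Machine
    Chapter 3: Research Methods and Procedure
        Section 1: Framework of the Research Method
        Section 2: Experimental Corpus
        Section 3: Word Segmentation and Part-of-Speech Tagging
        Section 4: Sentiment Analysis
        Section 5: Feature Extraction
        Section 6: Word2vec
        Section 7: Classification with Support Vector Machine (SVM)
    Chapter 4: Experimental Results and Analysis
        Section 1: Experimental Corpus
        Section 2: Evaluation Criteria
        Section 3: Experimental Results I: Aspect Category Analysis
        Section 4: Vectorization of Aspect Terms
        Section 5: Experimental Results II: Aspect Term Prediction Analysis
    Chapter 5: Conclusions and Future Work
    References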

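    I. Chinese-language references
    Academia Sinica Chinese Word Segmentation System (CKIP): http://ckipsvr.iis.sinica.edu.tw/
    李孟潔. (2009). 利用機器學習作法之中文意見分析 [Chinese opinion analysis using machine learning approaches]. Degree thesis, Department of Computer Science and Information Engineering, Tsing Hua University, 1-34.
    林彤. (2017). 分析旅遊評論中之極性不一致性問題 [Analyzing the polarity inconsistency problem in travel reviews]. Master's thesis, Department of Computer Science and Information Engineering, National Taiwan Normal University.
    許先緯. (2018). 旅遊評論關注面向與不一致分析研究 [A study of aspects of concern and inconsistency in travel reviews]. Master's thesis, Department of Computer Science and Information Engineering, National Taiwan Normal University.
    Jieba Chinese word segmentation system: https://github.com/fxsjy/jieba
    楊登堯. (2017). 利用臉書資訊探討網路新聞的吸引度及極性分析 [Analyzing the attractiveness and polarity of online news using Facebook information]. Master's thesis, Department of Computer Science and Information Engineering, National Taiwan Normal University.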

    II. English-language references
    Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan), 993-1022.
    Chen, Y. S., Chen, L. H., & Takama, Y. (2015, November). Proposal of lda-based sentiment visualization of hotel reviews. In Data Mining Workshop (ICDMW), 2015 IEEE International Conference on (pp. 687-693). IEEE.
    Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine learning, 20(3), 273-297.
    Esuli, A., & Sebastiani, F. (2006). “SentiWordNet: A publicly available resource for opinion mining”. In Proceedings of the 6th international conference on Language Resources and Evaluation (LREC’06), (pp.417–422).
    Gui, L., Yuan, L., Xu, R., Liu, B., Lu, Q., and Zhou, Y. (2014). Emotion Cause Detection with Linguistic Construction in Chinese Weibo Text. In Natural Language Processing and Chinese Computing (pp. 457-464).
    Hatzivassiloglou, V. and McKeown, K.R. (1997). Predicting the Semantic Orientation of Adjectives. In Proceedings of the 35th annual meeting of the association for computational linguistics and eighth conference of the European chapter of the association for computational linguistics (pp. 174-181).
    Hu, M., & Liu, B. (2004). Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 168-177).
    Kiritchenko, S., Zhu, X., Cherry, C., & Mohammad, S. (2014). NRC-Canada-2014: Detecting aspects and sentiment in customer reviews. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) (pp. 437-442).
    Ku, L. W., & Chen, H. H. (2007). Mining opinions from the Web: Beyond relevance retrieval. Journal of the Association for Information Science and Technology, 58(12), 1838-1850.
    Lu, B., & Tsou, B. K. (2010, July). Combining a large sentiment lexicon and machine learning for subjectivity classification. In Machine Learning and Cybernetics (ICMLC), 2010 International Conference on (Vol. 6, pp. 3311-3316). IEEE.
    Manek, A. S., & Shenoy, P. D. (2017). Aspect term extraction for sentiment analysis in large movie reviews using Gini Index feature selection method and SVM classifier. World Wide Web, 20(2), 135-154.
    Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems (pp. 3111-3119).
    Mubarok, M. S., Adiwijaya, & Aldhi, M. D. (2017). Aspect-based Sentiment Analysis to Review Products Using Naïve Bayes. In AIP Conference Proceedings 1867 (pp. 601-608).
    Pang, B., Lee, L., & Vaithyanathan, S. (2002). Sentiment classification using machine learning techniques. In Proceedings of the ACL-02 conference on Empirical methods in natural language processing-Volume (pp. 79-86).
    Pontiki, M., Galanis, D., Papageorgiou, H., Manandhar, S., Pavlopoulos, J., & Androutsopoulos, I. (2014). Semeval-2014 task 4: Aspect based sentiment analysis. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) (pp. 27-35).
    Raut, V. B., & Londhe, D. D. (2014, November). Opinion mining and summarization of hotel reviews. In Computational Intelligence and Communication Networks (CICN), 2014 International Conference on (pp. 556-559). IEEE.
    Rathan, M., Vishwanath, R. H., Venugopal, K. R., & Patnaik, L. M. (2017). Consumer insight mining: Aspect based Twitter opinion mining of mobile phone reviews. Applied Soft Computing, 68, 765-773.
    Shi, H. X., & Li, X. J. (2011, July). A sentiment analysis model for hotel reviews based on supervised learning. In Machine Learning and Cybernetics (ICMLC), 2011 International Conference on (Vol. 3, pp. 950-954). IEEE.
    Singh, V. K., Piryani, R., Uddin, A., & Waila, P. (2013, March). Sentiment analysis of movie reviews: A new feature-based heuristic for aspect-level sentiment classification. In Automation, computing, communication, control and compressed sensing (iMac4s), 2013 international multi-conference on (pp. 712-717). IEEE.
    Turney, P. D. (2002, July). Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th annual meeting on association for computational linguistics (pp. 417-424). Association for Computational Linguistics.
    Tan, S., & Zhang, J. (2007). An empirical study of sentiment analysis for Chinese documents. Expert Systems with Applications, 34(4), 2622-2699.
    Wagner, J., Arora, P., Cortes, S., Barman, U., Bogdanova, D., Foster, J., & Tounsi, L. (2014). Dcu: Aspect-based polarity classification for semeval task 4. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) (pp. 223-229).

    The full text of this thesis is not authorized for public release.