Graduate student: | 楊登堯 Yang, Deng-Yao |
---|---|
Thesis title: | 利用臉書資訊探討網路新聞的吸引度及極性分析 News Attraction and Polarity Analysis Using Facebook Information |
Advisor: | 侯文娟 Hou, Wen-Juan |
Degree: | 碩士 Master |
Department: | 資訊工程學系 Department of Computer Science and Information Engineering |
Year of publication: | 2017 |
Academic year of graduation: | 105 |
Language: | Chinese |
Number of pages: | 70 |
Keywords (Chinese): | 自然語言處理, 情感分析, 中文剖析器, 語意字典 |
Keywords (English): | NLP, sentiment analysis, Chinese parser, semantic dictionary |
DOI URL: | https://doi.org/10.6345/NTNU202202809 |
Thesis type: | Academic thesis |
Usage: | Views: 148, Downloads: 30 |
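In the past, people could obtain information only through conversation, books, newspapers, magazines, and similar media, so information was gathered slowly and in limited quantity. Today, thanks to the spread of the Internet and advances in technology, the convenience and reach of the network have turned this into an information society.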
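The rise of social networking sites (for example, Facebook and Twitter) has led many people to use these platforms to spread news quickly or to exchange and discuss everyday knowledge. Traditional media such as newspapers and magazines have also begun publishing through online platforms.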
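In an era of information explosion, however, how can people find the information they want or like among this large volume of reports? How should media outlets write news text so that it attracts readers and makes them enjoy reading a report? And what can these reports reveal about people's current news preferences? These questions merit investigation.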
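This study first uses sentiment analysis techniques to analyze which words or phrases frequently used in online news text can trigger readers' emotional responses and thereby increase their interest in reading. Second, to better understand the trend in news polarity, that is, whether positive or negative news is more popular, the study segments the text into words, identifies keywords by their TF-IDF values, and matches them against a corpus to obtain positive-word and negative-word information. It then uses the number of likes provided by Facebook as supporting evidence to show which kind of news people currently prefer.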
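The results show that readers pay more attention to negative news. The study also compares its findings with the experimental results of Marc Trussler and Stuart Soroka at McGill University in Canada. The findings are consistent with that research, which was conducted from a psychological perspective, and this in turn supports the credibility of the study.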
In the past, people could obtain information only through conversation, books, newspapers, and other media, so collecting information was slow and the amount available was limited. Today, thanks to the development of network technologies, vast amounts of information can be retrieved conveniently from the Internet.
Social networking sites (such as Facebook and Twitter) have led many people to use these platforms to disseminate news rapidly or to exchange everyday knowledge. Traditional media such as newspapers and magazines have also begun to publish their reports on online platforms.
In an era of information explosion, how people can find the information they want or like among these extensive reports, and how news can attract readers, are questions worth investigating. Whether readers tend to prefer negative or positive news is also an important question.
This study first applies sentiment analysis to online news, extracting the frequently used words or phrases that increase people's interest in reading. Second, to further understand the trend in news polarity, that is, whether positive news is more popular than negative news, the study segments the text into words, finds keywords using TF-IDF values, and matches the keywords against a semantic dictionary to obtain polarity information. Finally, using the number of likes provided by Facebook as corroboration, it shows which news polarity people prefer.
The results show that readers are more concerned with negative news. This is consistent with the psychological research of Trussler and Soroka at McGill University in Canada, which further supports the credibility of this study.
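The pipeline outlined in the abstract (word segmentation, TF-IDF keyword selection, dictionary matching, and comparison against like counts) might be sketched roughly as follows. This is only an illustration under several assumptions, not the thesis's implementation: the input is taken to be already segmented into tokens, the tiny English lexicon is a hypothetical stand-in for the semantic dictionary, averaging likes per polarity label is one possible way to use the like counts, and all function names are invented for the example.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; the thesis's semantic dictionary is not reproduced here.
POSITIVE = {"win", "celebrate", "rescue"}
NEGATIVE = {"crash", "fraud", "injured"}


def tfidf_keywords(docs, top_k=3):
    """Return the top_k tokens of each tokenized document, ranked by TF-IDF."""
    n_docs = len(docs)
    # Document frequency: how many documents contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    keywords = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            tok: (count / len(doc)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords.append(ranked[:top_k])
    return keywords


def polarity(keywords):
    """Label a keyword list by counting positive and negative lexicon hits."""
    score = sum(k in POSITIVE for k in keywords) - sum(k in NEGATIVE for k in keywords)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def mean_likes_by_polarity(posts, top_k=3):
    """posts is a list of (tokens, like_count); returns {label: mean likes}."""
    all_keywords = tfidf_keywords([tokens for tokens, _ in posts], top_k)
    totals, counts = Counter(), Counter()
    for kws, (_, likes) in zip(all_keywords, posts):
        label = polarity(kws)
        totals[label] += likes
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in counts}


if __name__ == "__main__":
    # Made-up posts: whitespace splitting stands in for Chinese word segmentation.
    sample = [
        ("bus crash leaves ten injured".split(), 120),
        ("local team win final and celebrate".split(), 40),
        ("bank fraud case goes to court".split(), 90),
    ]
    print(mean_likes_by_polarity(sample, top_k=4))
```

For Chinese news text, the whitespace split would be replaced by a real segmenter, and the lexicon by a proper sentiment dictionary. The comparison of mean likes is only meaningful with far more posts than this toy sample.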