Author: 楊登堯 (Yang, Deng-Yao)
Thesis title: 利用臉書資訊探討網路新聞的吸引度及極性分析 (News Attraction and Polarity Analysis Using Facebook Information)
Advisor: 侯文娟 (Hou, Wen-Juan)
Degree: Master
Department: Department of Computer Science and Information Engineering (資訊工程學系)
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: Chinese
Number of pages: 70
Chinese keywords: 自然語言處理, 情感分析, 中文剖析器, 語意字典
English keywords: NLP, sentiment analysis, Chinese parser, semantic dictionary
DOI URL: https://doi.org/10.6345/NTNU202202809
Thesis type: Academic thesis
  • 過去人們獲取資訊的途徑只有從談話、書籍、報章雜誌等媒體,資訊量的收集速度緩慢且數量有限,然而現今網路的發達以及科技改良所賜,網路的方便性及發達帶給了這個社會資訊化。
    社群網站的興起(例如:facebook、twitter),讓許多人開始透過這些網路平台,迅速傳播新聞資訊或就生活上的知識進行交流與溝通。報紙雜誌等傳統媒體,也開始透過網路平台進行發佈。
    然而在資訊爆炸的時代,人們該如何從這些大量的報導中獲取想要或者喜歡的資訊,而媒體又該如何從新聞內文中適當的撰寫以便吸引閱聽人,讓人們能夠喜歡閱讀該報導,並且可以從這些報導當中,發掘現今人們的新聞喜好傾向,這些都是目前值得研究者探討且著墨的地方。
    本研究將首先利用情緒分析的技術,分析現在網路新聞內文經常使用哪些詞彙或語句,可激發閱聽人的情緒反應以增加其閱讀興趣。其次,為了更進一步了解新聞極性的趨勢,也就是正向的新聞比較受歡迎還是負向的新聞比較受歡迎,會先進行斷詞之後,利用TF-IDF值尋找出關鍵字,然後利用語料庫進行比對,得到正向詞與負向詞的資訊,接著再利用Facebook提供的讚數當作佐證,就可以看出現在人們是喜歡哪一類的新聞。
    研究結果發現,閱聽者比較常關注負向新聞,並且本研究利用圖斯勒(Marc Trussler)和索羅卡(Stuart Soroka)在加拿大麥基爾大學(McGill University)的實驗結果相比對,顯示和從心理系角度所做的研究,有相符的結果,進而佐證本研究的可信度。

    In the past, people could obtain information only through conversation, books, newspapers, magazines, and similar media, so information was gathered slowly and in limited quantity. Today, thanks to the development of network technologies, vast amounts of information can be retrieved conveniently from the Internet.

    The rise of social networking sites (such as Facebook and Twitter) has led many people to use these platforms to spread news rapidly or to exchange everyday knowledge. Traditional media such as newspapers and magazines have also begun publishing their reports on these platforms.

    In an era of information explosion, two questions are worth investigating: how people can find the information they want or like among these extensive reports, and how news media can write articles that attract readers. Whether readers prefer negative or positive news is an equally important question.

    This study first applies sentiment analysis to online news to identify the words and phrases that appear frequently in news text and that can trigger readers' emotional responses, thereby increasing their interest in reading. Second, to further understand the trend in news polarity, that is, whether positive or negative news is more popular, the study segments the text into words, finds keywords using TF-IDF values, and then matches the keywords against a semantic dictionary to obtain polarity information. Finally, the number of "likes" provided by Facebook is used as corroboration to show which polarity of news people prefer.
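    The pipeline described above (segment, score keywords with TF-IDF, match against a polarity lexicon, compare like counts) can be sketched roughly as follows. This is only an illustrative sketch, not the thesis's implementation: it assumes the articles are already segmented (for example by jieba or the CKIP segmenter), it assumes the common tf × log(N/df) weighting, and the word lists, article tokens, and like counts are all invented toy values that say nothing about the real data. The names `ARTICLES`, `POSITIVE`, `NEGATIVE`, and every function name are hypothetical.

```python
import math
from collections import Counter

# Toy, pre-segmented articles with invented like counts (illustrative only).
# In practice the tokens would come from a word segmenter.
ARTICLES = [
    {"tokens": ["車禍", "死亡", "悲劇", "警方"], "likes": 900},
    {"tokens": ["得獎", "開心", "慶祝", "警方"], "likes": 300},
    {"tokens": ["詐騙", "受害", "悲劇", "民眾"], "likes": 700},
]

# Hypothetical stand-ins for a sentiment dictionary such as NTUSD.
POSITIVE = {"得獎", "開心", "慶祝"}
NEGATIVE = {"死亡", "悲劇", "詐騙", "受害"}


def tf_idf(docs):
    """Return one {term: score} dict per document, score = tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(score, k=3):
    """Pick the k highest-scoring terms; ties are broken by term order."""
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def polarity(keywords):
    """Label a keyword list by counting positive and negative lexicon hits."""
    pos = sum(word in POSITIVE for word in keywords)
    neg = sum(word in NEGATIVE for word in keywords)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def mean_likes_by_polarity(articles):
    """Average the like counts of the articles within each polarity label."""
    scores = tf_idf([article["tokens"] for article in articles])
    totals, counts = Counter(), Counter()
    for article, score in zip(articles, scores):
        label = polarity(top_keywords(score))
        totals[label] += article["likes"]
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in counts}


if __name__ == "__main__":
    print(mean_likes_by_polarity(ARTICLES))
```

    Counting lexicon hits among the top keywords is the simplest possible polarity rule; a weighted score or a larger keyword window would be equally reasonable choices under the same description.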

    The results show that readers pay more attention to negative news. This finding is consistent with the psychological research of Marc Trussler and Stuart Soroka at McGill University in Canada, which supports the credibility of this study.

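    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1: Introduction
        Section 1: Research Background
        Section 2: Research Motivation
        Section 3: Research Objectives
        Section 4: Thesis Organization
    Chapter 2: Related Work
        Section 1: Big Data
        Section 2: Word Segmentation Systems
        Section 3: Polarity Classification
        Section 4: Sentiment Analysis
        Section 5: NTUSD
        Section 6: TF-IDF
    Chapter 3: Methodology
        Section 1: Overview
        Section 2: Research Data
        Section 3: Method and Architecture
        Section 4: Method Description
    Chapter 4: Experimental Results and Analysis
        Section 1: Collecting 蘋果新聞 (Apple news) Data from Facebook
        Section 2: Word Segmentation of Article Text
        Section 3: Adding High-Attraction and Low-Attraction Corpora and Expanding the Dictionary
        Section 4: TF-IDF Keywords in News Article Text
    Chapter 5: Conclusion and Future Work
    References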
