
Author: 王涵 (Wang, Han)
Thesis title: Timeline Summarization for Event-related Facts and Public Issues on Chinese Social Media Platform
Advisor: 柯佳伶 (Koh, Jia-Ling)
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: English
Number of pages: 62
Keywords: Timeline summarization, Sub-event detection, Text data mining
DOI URL: https://doi.org/10.6345/NTNU202203015
Thesis type: Academic thesis
  • No Chinese abstract

    In this paper, we propose an approach that automatically generates a timeline summarization of the sub-event discussions related to a query event, without supervised learning. To select event-related sentences, we designed a two-stage method that extracts representative entity terms from the event-related discussions and filters out most sentences that are semantically unrelated to the query event. A rule-based method is then applied to extract the sentences that describe sub-events. Next, the discussions are assigned to their corresponding sub-events according to a semantic relatedness measure. Finally, the timeline summarization is organized according to the time at which each sub-event occurred.
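
    The last two steps of the abstract (assigning discussions to sub-events by semantic relatedness, then ordering sub-events by time) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the relatedness measure, so cosine similarity over averaged word vectors is assumed here, and the names `build_timeline`, `sentence_vector`, the `threshold` value, and the toy two-dimensional embeddings are all hypothetical.

```python
import math
from collections import defaultdict
from datetime import date


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sentence_vector(tokens, embeddings):
    """Average the vectors of known tokens; None if no token is known."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def build_timeline(sub_events, discussions, embeddings, threshold=0.5):
    """Assign each discussion to its most related sub-event, then sort by date.

    sub_events: list of (date, tokens) pairs; discussions: list of token lists.
    The threshold is an assumed cut-off below which a discussion is dropped.
    """
    event_vecs = [sentence_vector(tokens, embeddings) for _, tokens in sub_events]
    assigned = defaultdict(list)
    for disc in discussions:
        dvec = sentence_vector(disc, embeddings)
        if dvec is None:
            continue  # no known tokens, so relatedness cannot be measured
        best_idx, best_sim = None, threshold
        for idx, evec in enumerate(event_vecs):
            if evec is None:
                continue
            sim = cosine(dvec, evec)
            if sim >= best_sim:
                best_idx, best_sim = idx, sim
        if best_idx is not None:
            assigned[best_idx].append(disc)
    # Order sub-events chronologically to form the timeline.
    order = sorted(range(len(sub_events)), key=lambda i: sub_events[i][0])
    return [(sub_events[i][0], sub_events[i][1], assigned[i]) for i in order]


# Toy data (hypothetical): two sub-events and three discussion sentences.
embeddings = {
    "earthquake": [1.0, 0.0],
    "rescue": [0.9, 0.1],
    "donation": [0.0, 1.0],
    "charity": [0.1, 0.9],
}
sub_events = [
    (date(2016, 2, 8), ["donation"]),
    (date(2016, 2, 6), ["earthquake"]),
]
discussions = [["rescue"], ["charity"], ["unknown"]]
timeline = build_timeline(sub_events, discussions, embeddings)
for day, event_tokens, related in timeline:
    print(day.isoformat(), event_tokens, related)
```

    In this sketch a discussion whose best similarity falls below the threshold is left unassigned, which loosely mirrors the noise filtering described above; a real system would use trained word vectors and a tuned threshold.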

    We evaluated the proposed method on real-world datasets. The experimental results show that each processing step performs effectively; in particular, the proposed method filters out most noise sentences. Moreover, user ratings of the final timeline summarization indicate that it is useful for understanding the discussion trend of a sub-event.

    ABSTRACT
    ACKNOWLEDGEMENT
    CONTENT
    LIST OF FIGURE
    LIST OF TABLE
    Chapter 1 Introduction
      1.1 Motivation
      1.2 Goal
      1.3 Challenge
      1.4 Method
      1.5 Organization
    Chapter 2 Related Works
      2.1 Entity Extraction and Ranking
      2.2 Event Detection and Classification
      2.3 Document Summarization and Timeline Generation
    Chapter 3 Event-related Sentence Selection
      3.1 Entity Term Extraction
      3.2 Topic Clustering of Entity Terms
      3.3 Sentence Selection
    Chapter 4 Sub-event Matching and Summarization
      4.1 Sub-event Sentence Selection
      4.2 Sub-event Matching
      4.3 Aspect Discovering
      4.4 Timeline Summarization
    Chapter 5 Performance Evaluation
      5.1 Experiment Set Up
      5.2 Performance Evaluation of Sentence Selection
      5.3 Performance Evaluation of Discussion Summarization
    Chapter 6 Conclusion and Future Work
    Reference

