
Graduate Student: Thompson, Keenan Nathaniel (唐科南)
Thesis Title: Diversity and Quality: Comparing Decoding Methods with PEGASUS for Text Summarization
Advisor: Chen, Berlin (陳柏琳)
Oral Defense Committee: Chen, Kuan-Yu (陳冠宇); Chen, Berlin (陳柏琳); Liu, Shi-Hung (劉士弘)
Date of Oral Defense: 2021/10/24
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of Publication: 2021
Graduation Academic Year: 109 (ROC calendar)
Language: English
Number of Pages: 35
Keywords (English): summarization, diverse decoding, PEGASUS, ROUGE, lexical diversity
Research Method: Experimental design
DOI URL: http://doi.org/10.6345/NTNU202101759
Thesis Type: Academic thesis
Abstract: This thesis offers three major contributions. (1) It considers a number of diverse decoding methods that address degenerate repetition in model output text and investigates what can be done to mitigate the loss in summary quality associated with the use of such methods. (2) It provides evidence that the measure of textual lexical diversity (MTLD) is as viable a tool as perplexity for comparing text diversity in this context. (3) It presents a detailed analysis of the strengths and shortcomings of ROUGE, particularly with regard to abstractive summarization. To explore these issues, the thesis analyzes the results of experiments run on the CNN/DailyMail dataset with the PEGASUS model.
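    The contributions above hinge on a handful of standard formulas that the thesis presents in Sections 2.2 and 2.3 (see the table of contents below). The thesis body is not reproduced in this record, but the usual literature definitions those "Formula" entries point to (Lin, 2004; Holtzman et al., 2020) can be sketched in LaTeX as follows; the thesis's own notation may differ:

        % ROUGE-N: n-gram recall of a candidate summary against the
        % reference summaries S in the reference set Refs (Lin, 2004).
        \text{ROUGE-N} =
          \frac{\sum_{S \in \text{Refs}} \sum_{g_n \in S} \text{Count}_{\text{match}}(g_n)}
               {\sum_{S \in \text{Refs}} \sum_{g_n \in S} \text{Count}(g_n)}

        % Temperature t rescales the logits u_l before the softmax;
        % t < 1 sharpens the distribution, t > 1 flattens it.
        p(x_i = V_l \mid x_{1:i-1}) =
          \frac{\exp(u_l / t)}{\sum_{l'} \exp(u_{l'} / t)}

        % Top-k sampling keeps only the k most probable tokens and
        % renormalizes; nucleus (top-p) sampling instead keeps the
        % smallest set V^{(p)} whose cumulative probability mass
        % reaches p (Holtzman et al., 2020):
        V^{(p)} = \text{the smallest } V' \subseteq V
          \text{ such that } \sum_{x \in V'} P(x \mid x_{1:i-1}) \ge p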
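    Contribution (2) compares MTLD with perplexity as a diversity measure. As a rough illustration of how MTLD works (McCarthy and Jarvis, 2010), the following minimal Python sketch (not the thesis's code) scans the text left to right, counts a "factor" each time the running type-token ratio falls to the conventional 0.72 threshold, and divides the token count by the factor count, averaging a forward and a backward pass:

        def _mtld_pass(tokens, threshold=0.72):
            # One directional pass: count factors, i.e. stretches of text
            # over which the running type-token ratio (TTR) stays above
            # the threshold before being reset.
            factors = 0.0
            types, count = set(), 0
            for tok in tokens:
                count += 1
                types.add(tok)
                if len(types) / count <= threshold:
                    factors += 1.0
                    types, count = set(), 0
            if count > 0:
                # Partial credit for the unfinished final segment.
                ttr = len(types) / count
                factors += (1.0 - ttr) / (1.0 - threshold)
            return len(tokens) / factors if factors else float("inf")

        def mtld(tokens, threshold=0.72):
            # Average the forward and backward passes, as in the
            # original MTLD formulation.
            return (_mtld_pass(tokens, threshold) +
                    _mtld_pass(tokens[::-1], threshold)) / 2.0

        print(mtld("the cat sat on the mat and the dog sat too".split()))

    Higher MTLD means the text sustains a high type-token ratio for longer, i.e. is lexically more diverse.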

    Table of Contents:
    Abstract 2
    Table of Contents 3
    1. Introduction 5
        1.1 Background and Motivation 5
    2. Related Works 7
        2.1 Approaches and Models 7
            2.1.1 Extractive, Abstractive, and Hybrid Approaches 7
            2.1.2 PEGASUS 9
                2.1.2.1 Diagram | PEGASUS Architecture 10
        2.2 Metrics 11
            2.2.1 Summary Quality 11
                2.2.1.1 Formula | ROUGE-N 12
                2.2.1.2 Formula | ROUGE-L 12
            2.2.2 Lexical Diversity 15
        2.3 Neural Text Degeneration 16
            2.3.1 Formula | Top-k 18
            2.3.2 Formula | Temperature 18
            2.3.3 Formula | Nucleus Sampling 19
    3. Methodology 20
        3.1 Environment 20
        3.2 Metrics 20
            3.2.1 ROUGE 20
            3.2.2 MTLD 20
        3.3 Dataset 21
        3.4 Models 21
            3.4.1 PEGASUS 21
        3.5 Experiments 22
            3.5.1 Baselines 22
            3.5.2 Diverse Decoding Strategies 22
    4. Results and Discussion 24
        4.1 Table | Large | All 24
        4.2 Table | Fine-tuned | All 25
        4.3 Table | PEGASUS Paper Results 26
        4.4 Sample | Fine-tuned | k = 40 27
        4.5 Table | Fine-tuned | Nucleus Sampling 28
        4.6 Sample | Fine-tuned | k = 640, k = 640 and t = 0.7 30
        4.7 Sample | Fine-tuned | p = 0.80, 0.85, 0.90, 0.95 31
    5. Conclusion 33
    Bibliography 34
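    The sample entries in Chapter 4 compare a beam-search baseline against top-k, temperature, and nucleus-sampling outputs (e.g. k = 40, t = 0.7, p = 0.95). A minimal sketch of how such a comparison can be run with the Hugging Face transformers library is given below; the checkpoint name and generation settings are illustrative assumptions, not the thesis's exact configuration:

        # Minimal sketch, assuming the public transformers library and its
        # PEGASUS checkpoint fine-tuned on CNN/DailyMail; the parameter
        # values mirror the sample labels in Chapter 4 but are not the
        # thesis's exact setup.
        from transformers import PegasusForConditionalGeneration, PegasusTokenizer

        name = "google/pegasus-cnn_dailymail"  # assumed checkpoint
        tokenizer = PegasusTokenizer.from_pretrained(name)
        model = PegasusForConditionalGeneration.from_pretrained(name)

        article = "(a CNN/DailyMail article goes here)"
        inputs = tokenizer(article, truncation=True, return_tensors="pt")

        strategies = {
            "beam search (baseline)": dict(num_beams=8, do_sample=False),
            "top-k, k=40":            dict(do_sample=True, top_k=40),
            "temperature, t=0.7":     dict(do_sample=True, top_k=0, temperature=0.7),
            "nucleus, p=0.95":        dict(do_sample=True, top_k=0, top_p=0.95),
        }

        for label, kwargs in strategies.items():
            ids = model.generate(**inputs, max_length=128, **kwargs)
            print(label + ":", tokenizer.decode(ids[0], skip_special_tokens=True))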

    Bibliography:
    Deutsch, Daniel, and Dan Roth. “Understanding the Extent to Which Summarization Evaluation Metrics Measure the Information Quality of Summaries.” ArXiv:2010.12495 [Cs], Oct. 2020. arXiv.org, http://arxiv.org/abs/2010.12495.
    Ganesan, Kavita. “ROUGE 2.0: Updated and Improved Measures for Evaluation of Summarization Tasks.” ArXiv:1803.01937 [Cs], Mar. 2018. arXiv.org, http://arxiv.org/abs/1803.01937.
    Holtzman, Ari, et al. “The Curious Case of Neural Text Degeneration.” ArXiv:1904.09751 [Cs], Feb. 2020. arXiv.org, http://arxiv.org/abs/1904.09751.
    Huang, Dandan, et al. “What Have We Achieved on Text Summarization?” ArXiv:2010.04529 [Cs], Oct. 2020. arXiv.org, http://arxiv.org/abs/2010.04529.
    Ippolito, Daphne, et al. “Comparison of Diverse Decoding Methods from Conditional Language Models.” ArXiv:1906.06362 [Cs], June 2019. arXiv.org, http://arxiv.org/abs/1906.06362.
    Lin, Chin-Yew. “ROUGE: A Package for Automatic Evaluation of Summaries.” Text Summarization Branches Out, Association for Computational Linguistics, 2004, pp. 74–81.
    McCarthy, Philip M., and Scott Jarvis. “MTLD, Vocd-D, and HD-D: A Validation Study of Sophisticated Approaches to Lexical Diversity Assessment.” Behavior Research Methods, vol. 42, no. 2, May 2010, pp. 381–92. DOI.org (Crossref), https://doi.org/10.3758/BRM.42.2.381.
    Ng, Jun-Ping, and Viktoria Abrecht. “Better Summarization Evaluation with Word Embeddings for ROUGE.” ArXiv:1508.06034 [Cs], Aug. 2015. arXiv.org, http://arxiv.org/abs/1508.06034.
    See, Abigail, et al. “Get To The Point: Summarization with Pointer-Generator Networks.” ArXiv:1704.04368 [Cs], Apr. 2017. arXiv.org, http://arxiv.org/abs/1704.04368.
    Welleck, Sean, Ilia Kulikov, Jaedeok Kim, et al. “Consistency of a Recurrent Language Model With Respect to Incomplete Decoding.” ArXiv:2002.02492 [Cs, Stat], Oct. 2020. arXiv.org, http://arxiv.org/abs/2002.02492.
    Welleck, Sean, Ilia Kulikov, Stephen Roller, et al. “Neural Text Generation with Unlikelihood Training.” ArXiv:1908.04319 [Cs, Stat], Sept. 2019. arXiv.org, http://arxiv.org/abs/1908.04319.
    Zhang, Jingqing, et al. “PEGASUS: Pre-Training with Extracted Gap-Sentences for Abstractive Summarization.” ArXiv:1912.08777 [Cs], July 2020. arXiv.org, http://arxiv.org/abs/1912.08777.
