
Author: Ju, Man-Ling (朱曼綾)
Thesis title: Humor Detection in Chinese Online Comments (中文網路留言之幽默偵測)
Advisor: Chen, Cheng-Hsien (陳正賢)
Oral defense committee: Chen, Cheng-Hsien (陳正賢); Chang, Yu-Yun (張瑜芸); Hsieh, Chen-Yu (謝承諭)
Oral defense date: 2023/07/25
Degree: Master (碩士)
Department: Department of English (英語學系)
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: English
Number of pages: 88
Keywords (Chinese): 中文網路論壇, 網路留言, 對話幽默, 深度學習, BERT
Keywords (English): Chinese online forum, online comments, conversational humor, deep learning, BERT
Research method: Experimental design
DOI URL: http://doi.org/10.6345/NTNU202301004
Thesis type: Academic thesis
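  • Abstract (translated from the Chinese): This study examines the use of deep learning models to classify humorous Chinese text in tuiwen (推文, the forum's term for comments) on PTT (批踢踢實業坊, hereafter the PTT forum), a Taiwanese online forum. It combines Incongruity Theory, Disparagement Theory, and Release Theory, and proposes that humor arises from violations of behavioral principles (behavioral violation) and of logical/communicative principles (logical/maxim violation). The study looks for behavioral and communicative violations at two levels of context in order to classify humor: the local context within a comment, and the global context formed by the whole post and its interaction with the comments. The results show that, compared with a traditional machine learning model using bag-of-words features, a BERT model that uses local context information improves performance. When the BERT model uses the global context, the way the context information is extracted affects performance differently: extracting the raw global context information brings no improvement, whereas reweighting the parts of the post through an attention mechanism brings a slight improvement. Post-hoc analyses yield three further findings. First, the local context of humorous comments frequently features certain discussion topics. Second, comments are indeed more closely linked to certain parts of the post. Third, coherence between posts and humorous comments is lower, which supports the role of incongruity in humor, namely that humor is created by shifting the perspective from which things are viewed.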

    This study explores deep-learning-based humor classification of Chinese online comments collected from PTT, a popular Taiwanese online discussion forum. It draws on incongruity theory, disparagement theory, and release theory to treat humor as a combination of behavioral and logical/maxim violations. To capture these violations, the research considers two levels of context: the local context within the comment itself, and the global context of the original post to which the comment responds. The findings indicate that a BERT model using local contextual information significantly outperforms a traditional machine learning model using bag-of-words features, whereas the effect of the global context depends on how it is incorporated. Feeding in the original, unweighted global context contributes little to the task, while reweighting the global context with an attention mechanism mildly improves model performance. Post-hoc analyses highlight common topics that emerge from the local context of humorous comments, a partial connection between posts and comments, and lower coherence between posts and humorous comments. The last finding supports the concept of incongruity in humor and emphasizes the role of shifting perspectives.
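    This record does not describe the thesis's actual architecture, so the following is only a minimal sketch of what "reweighting global context by the attention mechanism" could look like. It assumes the comment and each segment of the post have already been encoded as fixed-length vectors (for example by BERT), and it uses scaled dot-product attention as a stand-in technique. The function name `reweight_global_context` and the toy vectors are hypothetical and do not come from the thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def reweight_global_context(comment_vec, segment_vecs):
    """Hypothetical helper: weight post segments by relevance to a comment.

    comment_vec  -- embedding of the comment (local context), assumed given
    segment_vecs -- embeddings of the post's segments (global context)
    Returns (weights, pooled): one attention weight per segment and the
    weighted sum of the segment vectors.
    """
    scale = math.sqrt(len(comment_vec))
    # Scaled dot-product score between the comment and each post segment.
    scores = [dot(comment_vec, seg) / scale for seg in segment_vecs]
    weights = softmax(scores)
    # Weighted sum: segments closer to the comment contribute more.
    pooled = [
        sum(w * seg[i] for w, seg in zip(weights, segment_vecs))
        for i in range(len(comment_vec))
    ]
    return weights, pooled


# Toy 2-dimensional vectors, purely illustrative.
comment = [1.0, 0.0]
segments = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, pooled = reweight_global_context(comment, segments)
```

    In a classifier of this kind, the pooled vector would typically be concatenated with the comment vector before the final prediction layer. The unweighted alternative mentioned in the abstract would correspond to using the post representation without this relevance-based weighting.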

    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    1 Introduction
        1.1 Motivation
        1.2 Theoretical Framework
        1.3 Research Objectives and Questions
        1.4 Contributions and Implications
        1.5 Organization of the Study
    2 Literature Review
        2.1 Three Perspectives of Interpreting Humor
            2.1.1 Incongruity Theory
            2.1.2 Disparagement Theory
            2.1.3 Release Theory
            2.1.4 Integration of the Three Theoretical Perspectives
            2.1.5 Interim Summary
        2.2 Humor and Pragmatics
        2.3 Computational Humor Research
            2.3.1 Shallow Machine Learning
            2.3.2 Deep Learning
        2.4 This Study
    3 Methodology
        3.1 Data Collection
        3.2 Operational Definition
        3.3 Data Preprocessing
        3.4 Algorithm
        3.5 Research Questions
        3.6 Experimental Setup
            3.6.1 Model Comparisons
            3.6.2 Model Evaluation
    4 Results and Discussion
        4.1 Results
        4.2 Discussions
            4.2.1 Word-based vs. Local Context Cues
            4.2.2 Limitations of Global Context Cues
            4.2.3 Partial Contribution of Global Context Cues
            4.2.4 General Discussion
    5 Conclusion
        5.1 Summary of the Study
        5.2 Limitations and Future Study
    References

