| Field | Value |
|---|---|
| Graduate Student | 鄧旭廷 Deng, Syu-Ting |
| Thesis Title | 語境化詞嵌入的視覺化解釋 (Visual Interpretation for Contextualized Word Representation) |
| Advisor | 王科植 Wang, Ko-Chih |
| Committee Members | 紀明德 Chi, Ming-Te; 王超 Wang, Chao; 王科植 Wang, Ko-Chih |
| Oral Defense Date | 2022/09/22 |
| Degree | Master (碩士) |
| Department | 資訊工程學系 Department of Computer Science and Information Engineering |
| Publication Year | 2022 |
| Graduation Academic Year | 110 (ROC calendar) |
| Language | English |
| Pages | 51 |
| Keywords | Data visualization, Model interpretability, Contextualized word representation |
| DOI URL | http://doi.org/10.6345/NTNU202201808 |
| Document Type | Academic thesis |
Transformer-based models have achieved excellent results on natural language tasks because they are an effective realization of contextualized word representation, but their structure is complex and therefore difficult to understand. Without a thorough understanding of the model, users find it hard to interact with it, to make full use of it, or to understand why an error occurs. Because the model contains many complex parameters that cannot be interpreted directly, this problem cannot be solved by simple parameter inspection or mathematical analysis alone. We therefore propose a visual analysis tool for this model structure that helps users understand the model in detail, including the influence of the input data on the model and the operation of each of the model's layers. We focus on the model's decision-making process on natural language tasks, so our tool is built around such tasks. We design a complete workflow in which users can examine the details of each step of the model, analyze the input data, and interact directly with the model's internals. Users can formulate their own hypotheses and verify them within the tool.
Keywords: Data visualization, Model interpretability, Contextualized word representation
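To make the notion of a contextualized word representation concrete: unlike a static embedding, a Transformer encoder assigns the same word a different vector in every sentence, and every layer produces its own set of vectors, which is exactly the per-layer state such a tool visualizes. The sketch below is a minimal illustration, not the thesis tool itself; it assumes the HuggingFace Transformers library and the `bert-base-uncased` checkpoint (illustrative choices, not necessarily the thesis's setup), and the `word_vector` helper is hypothetical.

```python
# Minimal sketch (illustrative, not the thesis tool): extract per-layer
# contextualized embeddings with HuggingFace Transformers and show that
# the same word gets a different vector in each context.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def word_vector(sentence: str, word: str, layer: int = -1) -> torch.Tensor:
    """Embedding of `word`'s token at the given layer (hypothetical helper)."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # outputs.hidden_states is a tuple of (num_layers + 1) tensors,
    # each of shape (batch, sequence_length, hidden_size).
    hidden = outputs.hidden_states[layer][0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

# "bank" receives a different final-layer vector in each context.
v_river = word_vector("He sat on the bank of the river.", "bank")
v_money = word_vector("She deposited cash at the bank.", "bank")
sim = torch.cosine_similarity(v_river, v_money, dim=0)
print(f"cosine similarity of 'bank' across contexts: {sim.item():.3f}")
```

The tool described in the abstract can be thought of as exposing these per-layer hidden states, together with the attention weights that produce them, interactively rather than through ad-hoc scripts like this one.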