Graduate student: | 任賓森 Robinson, Mark James |
---|---|
Thesis title: | 多義使役動詞「讓」之二元分類 Binary classification of polysemous ràng as a periphrastic causative verb |
Advisor: | 陳正賢 Chen, Alvin Cheng-Hsien |
Oral defense committee: | 張瑜芸 Chang, Yu-Yun; 許展嘉 Hsu, Chan-Chia; 陳正賢 Chen, Alvin Cheng-Hsien |
Oral defense date: | 2024/01/15 |
Degree: | Master |
Department: | Department of English |
Year of publication: | 2024 |
Academic year of graduation: | 112 (ROC calendar) |
Language: | English |
Number of pages: | 78 |
English keywords: | machine translation, polysemy, word sense disambiguation, machine learning, ràng, periphrastic causatives |
Research methods: | Secondary data analysis, comparative study |
DOI URL: | http://doi.org/10.6345/NTNU202400348 |
Thesis type: | Academic thesis |
Polysemy poses a significant challenge for language comprehension, particularly in natural language processing. This has led to the development of word sense disambiguation tasks, which attempt to determine which sense of a word is invoked in a given sentence or context. Advances in machine learning and other computational techniques have produced significant success in this field. Word sense disambiguation methods have proven useful in translation, although various challenges persist. This thesis explores one such challenge. The Mandarin Chinese periphrastic causative verb ràng is polysemous and can take two causative forms: strong and weak. This thesis used translations of ràng drawn from an open-source corpus, OpenSubtitles, to produce an automatically annotated dataset. This dataset was then used to train three machine learning models that classify the two forms of the verb. A bag-of-words model, a feature-engineered model, and a BERT transformer model achieved approximately 79%, 78%, and 84% accuracy, respectively. These results indicate a potentially useful approach to machine translation research. The models also yielded new insights into syntactic patterns that favor certain interpretations of ràng. Such insights support the claim that the methods used in this thesis have the potential to improve machine translation and can inform word sense disambiguation methodology.
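The translation-based annotation described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the thesis's actual procedure: the abstract does not say which English verbs signal each form, so the cue lists (`make` for strong, `let` for weak), the function names, and the example sentence pairs are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical cue words: the abstract does not list the English verbs used.
STRONG_CUES = {"make", "makes", "made", "making"}
WEAK_CUES = {"let", "lets", "letting"}


def label_from_translation(english):
    """Return 'strong', 'weak', or None when the translation is uninformative."""
    tokens = set(re.findall(r"[a-z']+", english.lower()))
    has_strong = bool(tokens & STRONG_CUES)
    has_weak = bool(tokens & WEAK_CUES)
    if has_strong == has_weak:  # neither cue, or both: skip as ambiguous
        return None
    return "strong" if has_strong else "weak"


def bag_of_words(chinese_tokens):
    """Count token occurrences, ignoring order (a bag-of-words representation)."""
    return Counter(chinese_tokens)


# Invented sentence pairs for illustration only (pre-segmented Chinese tokens).
pairs = [
    (["他", "讓", "我", "哭", "了"], "He made me cry."),
    (["媽媽", "讓", "我", "出去", "玩"], "Mom let me go out to play."),
    (["讓", "我", "想想"], "I need to think."),
]

# Keep only pairs whose translation yields an unambiguous label.
dataset = []
for tokens, english in pairs:
    label = label_from_translation(english)
    if label is not None:
        dataset.append((bag_of_words(tokens), label))
```

Each entry in `dataset` pairs word counts for a Chinese sentence with a label inferred from its English translation. Sentences whose translation contains neither cue, or both, are dropped rather than guessed at.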