National Taiwan Normal University Theses and Dissertations Full-Text System

Author:	任賓森 Robinson, Mark James
Thesis title:	多義使役動詞「讓」之二元分類 Binary classification of polysemous ràng as a periphrastic causative verb
Advisor:	陳正賢 Chen, Alvin Cheng-Hsien
Oral defense committee:	張瑜芸 Chang, Yu-Yun; 許展嘉 Hsu, Chan-Chia; 陳正賢 Chen, Alvin Cheng-Hsien
Oral defense date:	2024/01/15
Degree:	Master's
Department:	英語學系 Department of English
Year of publication:	2024
Academic year of graduation:	112 (ROC calendar)
Language:	English
Number of pages:	78
Keywords (English):	machine translation, polysemy, word sense disambiguation, machine learning, ràng, periphrastic causatives
Research methods:	secondary data analysis, comparative study
DOI URL:	http://doi.org/10.6345/NTNU202400348
Publication type:	Academic thesis

Polysemy is a significant challenge for language comprehension, particularly in natural language processing. It has motivated word sense disambiguation tasks, which attempt to determine which sense of a word is invoked in a given sentence or context. Advances in machine learning and other computational techniques have produced considerable success in this field, and word sense disambiguation methods have proven useful in translation, although distinct challenges persist. This thesis explores one such challenge. The Mandarin Chinese periphrastic causative verb ràng is polysemous and can take two causative forms, strong and weak. Translations of ràng drawn from an open-source corpus, OpenSubtitles, were used to produce an automatically annotated dataset. This dataset was then used to train three machine learning models to classify the two forms of the verb. A bag-of-words model, a feature-engineered model, and a BERT transformer model achieved approximately 79%, 78%, and 84% accuracy, respectively. These results indicate a potentially useful approach to machine translation research. The models also yielded new insights into syntactic patterns that favor certain interpretations of ràng. Such insights support the claim that the methods used in this thesis have the potential to improve machine translation and can inform the methodology of word sense disambiguation tasks.
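The idea of deriving labels from translations can be illustrated with a minimal sketch. It is not the annotation procedure of the thesis, which this record does not describe in detail. The cue-word lists (forms of "make" for the strong reading, forms of "let" and "allow" for the weak reading), the function names `label_from_translation` and `annotate`, and the sentence pairs are all assumptions invented for illustration.

```python
import re

# Hypothetical English cue words; the thesis's actual annotation rules
# are not given in this record.
STRONG_CUES = {"make", "makes", "made", "making"}
WEAK_CUES = {"let", "lets", "letting", "allow", "allows", "allowed", "allowing"}


def label_from_translation(english):
    """Return 'strong', 'weak', or None when the translation is uninformative."""
    tokens = set(re.findall(r"[a-z]+", english.lower()))
    has_strong = bool(tokens & STRONG_CUES)
    has_weak = bool(tokens & WEAK_CUES)
    if has_strong == has_weak:
        # Neither cue or both cues present: too ambiguous to label automatically.
        return None
    return "strong" if has_strong else "weak"


def annotate(pairs):
    """Keep Mandarin sentences containing 讓 whose translation yields a label."""
    dataset = []
    for mandarin, english in pairs:
        if "讓" not in mandarin:
            continue
        label = label_from_translation(english)
        if label is not None:
            dataset.append((mandarin, label))
    return dataset


# Invented sentence pairs for illustration only (not drawn from OpenSubtitles).
pairs = [
    ("他讓我很生氣", "He made me very angry."),
    ("媽媽讓我出去玩", "Mom let me go out and play."),
    ("讓我想想", "Give me a moment to think."),
]
dataset = annotate(pairs)
```

In this sketch, pairs whose translation contains neither cue or both cues are discarded rather than guessed, which trades dataset size for cleaner labels. A labeled list of this shape could then feed any of the three model types named in the abstract.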

Abstract
Table of Contents
List of Tables
List of Figures

Chapter 1: Introduction
1	Polysemy and Word Sense Disambiguation
2	Periphrastic constructions and the polysemy of ràng
3	Examples of machine translations of ràng
4	Research focus and research questions
5	Thesis outline

Chapter 2: Literature review
1	Semantics of ràng
1.1	Force Dynamics
1.2	Binary semantic status of ràng
2	Syntax of ràng
3	The pertinence of emotion predicates
4	Relevant computational methods
4.1	Traditional machine learning
4.2	Deep Learning
4.3	Transformers
4.4	Purpose of exploring various models

Chapter 3: Methodology
1	Describing the raw data
2	Data preparation and preprocessing
3	Model selection and training
3.1	BOW model
3.2	FE Model
3.3	Chinese pre-trained BERT model
4	Evaluation of models

Chapter 4: Results and Discussion
1	Summary of all three models’ performance metrics
2	BOW and FE significant features
3	LIME examples and BERT sample errors
4	Discussion

Chapter 5: Conclusion
1	Summary
2	Significance of the research
3	Limitations of the research

References

