Graduate student: | 鄭諺祺 Yen-Chi Cheng |
---|---|
Thesis title: | 漢語音譯用字傾向的語料庫研究:以臺灣與中國大陸新聞為例 A Corpus-based Analysis of Character Usage in Chinese Transliteration: A Case Study of Newspapers in Taiwan and Mainland China |
Advisor: | 高照明 Gao, Zhao-Ming |
Degree: | Master |
Department: | 翻譯研究所 Graduate Institute of Translation and Interpretation |
Year of publication: | 2013 |
Academic year of graduation: | 101 |
Language: | Chinese |
Number of pages: | 123 |
Chinese keywords: | 音譯, 用字, 選字, 語料庫, 對數概似比檢定 |
English keywords: | transliteration, character usage, character choosing, corpus, log-likelihood-ratio test |
Thesis type: | Academic thesis |
Chinese transliteration differs from transliteration in other languages in one key respect: after deciding which syllables will represent the source-language sounds, the translator must still pick one character from among many homophonous characters for each syllable. Previous studies have discussed the principles behind this character selection, but most were qualitative analyses based on reference works such as transliteration dictionaries; few were quantitative analyses of first-hand data. This study takes a corpus-based, statistical approach to identify the productive transliteration characters in contemporary Chinese and to describe the norms of character usage in contemporary Chinese transliteration. Using a program, the researcher collected articles from four news websites in Taiwan and Mainland China and compiled news corpora from them. Transliterated words accompanied by their source-language originals in parentheses were then extracted and annotated as “person” (with gender tags where the gender could be determined), “place,” or “other entity” according to the nature of their referents, yielding four transliteration sub-corpora. The researcher examined how the characters in the sub-corpora are distributed relative to all Chinese characters in the news corpora, and used log-likelihood-ratio tests to compare character usage under various conditions. The results show that roughly 80% of transliteration characters are shared across all categories of transliterated words, while the remaining 20% or so tend significantly to be used under specific conditions, which indicates that the norm of character usage in Chinese transliteration is not internally homogeneous.
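The two computational steps named in the abstract, extracting transliterations that carry a parenthesized source word and comparing character frequencies with a log-likelihood-ratio test, can be sketched roughly as follows. This is only an illustration, not the thesis's actual program: the regular expression, the function names, the sample sentence, and the frequencies are all assumptions made here, and the statistic uses a common two-term form of the log-likelihood ratio for comparing one item across two corpora, which may differ from the exact formulation used in the thesis.

```python
import math
import re
from collections import Counter

# Assumed pattern: a run of Han characters (optionally containing a middle
# dot) immediately followed by a Latin-script source word in parentheses.
# Both half-width and full-width parentheses are accepted.
PAREN_PATTERN = re.compile(
    r"([\u4e00-\u9fff·‧]{2,})[((]([A-Za-z][A-Za-z .'\-]*)[))]"
)


def extract_transliterations(text):
    """Return (Chinese form, source word) pairs found in text.

    The whole Han run before the parenthesis is captured, so preceding
    context (titles, verbs) would still need trimming in real use.
    """
    return [(m.group(1), m.group(2).strip())
            for m in PAREN_PATTERN.finditer(text)]


def char_counts(strings):
    """Count Han characters over an iterable of strings."""
    counts = Counter()
    for s in strings:
        counts.update(ch for ch in s if "\u4e00" <= ch <= "\u9fff")
    return counts


def log_likelihood(a, b, c, d):
    """Log-likelihood ratio for one character.

    a, b: frequency of the character in corpus 1 and corpus 2
    c, d: total character counts of corpus 1 and corpus 2
    """
    e1 = c * (a + b) / (c + d)  # expected frequency in corpus 1
    e2 = d * (a + b) / (c + d)  # expected frequency in corpus 2
    total = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            total += observed * math.log(observed / expected)
    return 2 * total


# Made-up sample sentence and hypothetical frequencies, for illustration only.
sample = "出席者:歐巴馬(Barack Obama)、梅克爾(Angela Merkel)。"
pairs = extract_transliterations(sample)
translit_chars = char_counts(zh for zh, _ in pairs)
print(pairs)
print(translit_chars.most_common(3))
print(log_likelihood(30, 10, 1000, 1000))
```

A larger value of the statistic means the character's frequency differs more between the two corpora than their sizes alone would predict. In a comparison like the one the abstract describes, corpus 1 and corpus 2 could be, for example, a transliteration sub-corpus and the full news corpus, or two sub-corpora annotated with different tags.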