
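Author: 王贊育 (Wang, Tsan-Yu)
Thesis title: 探討華語為第二語的語詞統計學習
An Investigation of Statistical Learning of Words in Chinese as a Second Language
Advisor: 陳振宇 (Chen, Jenn-Yeu)
Degree: Doctoral
Department: 華語文教學系 (Department of Chinese as a Second Language)
Year of publication: 2019
Academic year of graduation: 107 (ROC academic year)
Language: Chinese
Number of pages: 109
Chinese keywords: 統計學習 (statistical learning), 斷詞 (word segmentation), 華語 (Chinese), 語詞學習 (word learning), 銜接概率 (transitional probability)
English keywords: statistical learning, Chinese, word segmentation, word learning, CSL
DOI URL: http://doi.org/10.6345/NTNU201900957
Thesis type: Academic thesis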
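  • Statistical learning is the human ability to seek out and compute statistical information among signal units and to induce the regularities by which those units combine. Although Chinese text is typeset with spaces between characters, it lacks explicit word-boundary information, so readers face a word segmentation challenge while reading. Previous research on Chinese word segmentation has mostly examined the outcome of readers' segmentation rather than how readers segment. This study hypothesizes that learners of Chinese as a second language can use a statistical learning mechanism to compute the transitional probability (TP) between adjacent linguistic units and use it as the basis for segmentation.

    Experiments 1 to 5 used a paradigm adapted from Saffran et al. (1997). In Experiment 1, 6 Chinese syllables were combined into 6 disyllabic Chinese words, forming a continuous stream of 3,600 syllables. The TP between adjacent syllables within a word was .46 to 1, and the TP between adjacent syllables across words was 0 to .29. These transitional probabilities were the only cue to segmentation. After listening to the material, participants selected the combinations they had heard in a test. Mean accuracy in Experiment 1 was .57, indicating that participants could locate the boundaries of the syllable words on the basis of transitional probability. Experiments 2 to 4 presented Chinese character strings with the same statistical distribution in the visual modality. Statistical learning performance in each of the three experiments was only marginally significant (Experiment 2: .53; Experiment 3: .53; Experiment 4: .52), but the mean accuracy of the pooled data crossed the significance threshold, indicating that participants can segment words through visual statistical learning. Experiment 5 examined whether the prior experience of native Chinese speakers affects their statistical learning performance. The results showed that when the statistical information in new material is inconsistent with that of prior learning experience, prior experience does not help in accumulating the new statistical information.

    Experiments 6 to 8 used a paradigm adapted from Fiser and Aslin (2002). In Experiment 6, 12 abstract shapes formed a sequence of 288 shapes. The TP between adjacent shapes within a shape word was 1, and the TP between adjacent shapes across words was .33. Experiment 6 showed that participants could capture the combinatorial regularities of the abstract shape sequence (.77). Experiment 7 replaced the material with Korean letters and found that participants could detect the statistical regularities of Korean letter words (.65). Experiment 8 used Chinese character strings with more complex statistical information; the results showed that, given sufficient processing time, participants could grasp the complex statistical information among character units and segment words accordingly (.67).

    Taken together, the results indicate that learners can use a statistical learning mechanism to grasp the statistical information among consecutive Chinese characters, locate word boundaries, and segment words on that basis. The study also discusses factors that may affect the effectiveness of statistical learning and proposes a Chinese teaching program grounded in the statistical learning of words.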

    Statistical learning is a pattern-induction ability that tracks and computes statistical information in the input. Previous research has demonstrated statistical learning with auditory linguistic input and visual nonlinguistic input, but none has used the visual symbols of a real language (letters or characters) as material. Chinese text provides no physical cue marking the boundaries between words, so readers of Chinese face a word segmentation problem much like that of listening to a continuous speech stream. In this study, we hypothesize that readers use the statistical information of adjacent Chinese characters to identify word boundaries. We employed two types of statistical learning paradigm to investigate the statistical learning of words by CSL learners. The paradigm of Exp. 1-5 was adapted from Saffran, Newport, Aslin, Tunick, and Barrueco (1997). The material was a continuous Chinese syllable stream or an unspaced Chinese character string. The transitional probabilities between adjacent syllables/characters were the only cue to word boundaries. Results showed that CSL learners could segment a continuous, natural-language-like syllable/character string into smaller units by computing its statistical information. However, participants' well-established statistical knowledge of the units diluted learning when the new material was composed of language units already familiar to them. In Exp. 6-8, abstract symbols, Korean letters, and Chinese characters were used in the visual statistical learning (VSL) paradigm of Fiser and Aslin (2002). The results suggested that participants could segment units from continuous visual input under different paradigm settings, but that efficiency seemed to depend on how the material was presented. Together, these experiments demonstrated that readers could extract the statistical information of adjacent characters from unspaced Chinese text by reading.
The possible constraints on visual statistical learning of Chinese words, as well as some teaching insights based on the results, are also discussed in the thesis.
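
The transitional probability that both abstracts rely on can be illustrated with a short sketch. It assumes the usual forward definition, TP(y | x) = count(xy) / count(x), and a toy stream built from three hypothetical two-unit "words" (AB, CD, EF); these are not the thesis materials. The fixed threshold used to place boundaries is likewise an illustrative assumption, not a claim about how learners or the author's analysis decide where words end.

```python
from collections import Counter

def transitional_probabilities(stream):
    """Return {(x, y): P(y | x)} for every adjacent pair in the stream."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # Count each unit only where it has a successor (all but the last position).
    first_counts = Counter(stream[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}

def segment(stream, tps, threshold):
    """Place a boundary wherever the forward TP drops below the threshold."""
    words, current = [], [stream[0]]
    for x, y in zip(stream, stream[1:]):
        if tps[(x, y)] < threshold:
            words.append("".join(current))
            current = []
        current.append(y)
    words.append("".join(current))
    return words

# Hypothetical toy material: three two-unit "words" in a fixed presentation order.
toy_words = ["AB", "CD", "EF"]
order = [0, 1, 2, 0, 2, 1, 0, 1, 2, 1, 0, 2]
stream = [unit for i in order for unit in toy_words[i]]

tps = transitional_probabilities(stream)
segmented = segment(stream, tps, threshold=0.9)  # threshold is an assumption
```

In this toy stream every within-word transition has TP = 1 and every between-word transition has a lower TP, so the dips in TP line up with the word boundaries. The thesis materials are less clear-cut (within-word TPs of .46 to 1 versus between-word TPs of 0 to .29 in Experiment 1), which is part of what makes the learning task harder.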

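    Chapter 1 Research Background and Literature Review
        Section 1 Introduction
        Section 2 Theories of Language Acquisition Built on Statistical Regularities
        Section 3 Studies by Saffran and Colleagues on Language Learning with Transitional Probability as the Statistical Regularity
        Section 4 Characteristics of the Statistical Learning Mechanism
        Section 5 Factors That May Affect Statistical Learning
        Section 6 The Relationship Between the Statistical Learning Mechanism and the Language Learning Mechanism
        Section 7 Chinese Reading and Statistical Learning
    Chapter 2 Research Questions
    Chapter 3 Experiments
        Section 1 Experiment 1: Statistical Learning of Chinese Syllables by Foreign Beginning Learners of Chinese
        Section 2 Experiment 2: Visual Statistical Learning of Chinese Words by Foreign Beginning Learners of Chinese
        Section 3 Experiment 3: Increasing Beginning Learners' Exposure Time to the Learning Material
        Section 4 Experiment 4: Using Three-Stroke Chinese Characters as Material to Reduce Participants' Visual Load
        Section 5 Summary of Experiments 1 to 4
        Section 6 Experiment 5: Statistical Learning of Chinese Words by Proficient Users of Chinese
        Section 7 Experiment 6: Replicating the Visual Statistical Learning Experiment of Fiser and Aslin
        Section 8 Experiment 7: Statistical Learning of Korean Letters by Chinese Participants
        Section 9 Experiment 8: Statistical Learning of Words by Foreign Participants Reading Chinese Character Strings
    Chapter 4 General Discussion
        Section 1 Summary of the Results of Experiments 1 to 8
        Section 2 Main Findings and Discussion
    Chapter 5 Conclusion and Teaching Recommendations
        Section 1 Teaching Implications of the Statistical Learning of Words in Chinese as a Second Language
        Section 2 A Chinese Word Learning Program Designed from a Statistical Learning Perspective
    References
        Chinese References
        English References
    Appendix 1
        Abstract Shape Materials Used in Experiment 6 to Replicate the Procedure of Fiser and Aslin (2002)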

    I. Chinese References
    工業技術研究院資訊與通訊研究所(2011)。工研院文字轉語音Web服務【網頁資料】。2013年5月31日,取自http://tts.itri.org.tw/
    周文帥、馮速(2006)。漢語分詞技術研究現狀與應用展望。山西師範大學學報,20(1),25-29。
    林千翔、張嘉惠、陳貞伶(2010)。結合長詞優先與序列標記之中文斷詞研究。中文計算語言學期刊,15(3-4),161-179。doi:10.30019/IJCLCLP.201009.0001
    林仁一(2012)。物理的斷詞線索對一般大學生閱讀效率的影響:一個眼球軌跡追蹤的研究。國立成功大學認知科學研究所,台南市(未出版之碩士論文)。
    林昱成(2009)。詞間空格對國小正常及閱讀困難學生閱讀效率之影響。國立成功大學認知科學研究所,台南市(未出版之碩士論文)。
    孫茂松、鄒嘉彥(2001)。漢語自動分詞研究評述。當代語言學,3(1),22-32。
    陳家興、蔡介立(2016)。詞彙邊界線索影響閱讀中文表現的眼動證據。中華心理學刊,58(1),19-44。doi:10.6129/CJP.20160304
    陳振宇(2013)。學語言是學到了什麼?從語言的多面向樣貌探討語言教學的新路徑。臺灣華語教學研究,7,1-12。
    陳稼興、謝佳倫、許芳誠(2000)。以遺傳演算法為基礎的中文斷詞研究。資訊管理研究,2(2),27-44。doi:10.6188/JEB.2000.2(2).02
    彭瑞元(2003)。探討加入詞間空格對於中文閱讀效率的影響。國立中正大學心理學研究所,嘉義縣(未出版之碩士論文)。
    黃昌寧、趙海(2007)。中文分詞十年回顧。中文信息學報,21(3),8-19。
    楊憲明(1998)。中文詞間、詞內空格調整對閱讀的影響。台南師院學報,31,303-326。
    劉英茂、葉重新、王聯慧、張迎桂(1974)。詞單位對閱讀效率的影響。中華心理學刊,16,25-32。
    謝佳倫(1999)。遺傳演算法應用於中文斷詞之研究。中央大學資訊管理學研究所,桃園市(未出版之碩士論文)。

    II. English References
    Abla, D., & Okanoya, K. (2009). Visual statistical learning of shape sequences: An ERP study. Neuroscience Research, 64, 185-190.
    Adams, M. (1990). Beginning to Read: Thinking and Learning about Print. Cambridge, MA: MIT Press.
    Apfelbaum, K. S., Hazeltine, E., & McMurray, B. (2013). Statistical learning in reading: Variability in irrelevant letters helps children learn phonics skills. Developmental Psychology, 49(7), 1348-1365.
    Arciuli, J., & Simpson, I. C. (2012). Statistical learning is lasting and consistent over time. Neuroscience Letters, 517(2), 133-135.
    Arciuli, J. (2018). Reading as statistical learning. Language, Speech, and Hearing Services in Schools, 49(3S), 634-643.
    Aslin, R. N., Saffran, J. R., & Newport, E. L. (1999). Statistical learning in linguistic and nonlinguistic domains. In B. MacWhinney (Ed.), Emergentist Approaches to Language. Hillsdale, New Jersey: Erlbaum.
    Baldwin, D., Andersson, A., Saffran, J., & Meyer, M. (2008). Segmenting dynamic human action via statistical structure. Cognition, 106, 1382-1407.
    Barlow, M., & Kemmer, S. (Eds.). (2000). Usage-Based Models of Language. Stanford, CA: CSLI Publications.
    Behrens, H. (2009). Usage-based and emergentist approaches to language acquisition. Linguistics, 47(2), 383-411.
    Bonatti, L. L., Peña, M., Nespor, M., & Mehler, J. (2005). Linguistic constraints on statistical computations: The role of consonants and vowels in continuous speech processing. Psychological Science, 16(6), 451-459.
    Bybee, J. (2006). From usage to grammar: The mind's response to repetition. Language, 82(4), 711-733.
    Campbell, K. L., Zimerman, S., Healey, M. K., Lee, M., & Hasher, L. (2012). Age differences in visual statistical learning. Psychology and Aging, 27(3), 650-656.
    Chen, K. J., & Bai, M. H. (1998). Unknown word detection for Chinese by a corpus-based learning method. International Journal of Computational Linguistics & Chinese Language Processing, 3(1), 27-44.
    Chen, K. J., & Liu, S. H. (1992). Word identification for Mandarin Chinese sentences. Proceedings of the 14th Conference on Computational Linguistics, 1, 101-107. doi:10.3115/992066.992085
    Chen, K. J., & Ma, W. Y. (2002). Unknown word extraction for Chinese documents. Proceedings of the 19th International Conference on Computational Linguistics, 1, 1-7. doi:10.3115/1072228.1072277
    Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
    Chomsky, N. (1972). Language and Mind. New York, NY: Harcourt Brace Jovanovich.
    Conway, C. M., & Christiansen, M. H. (2005). Modality-constrained statistical learning of tactile, visual, and auditory sequences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(1), 24-39.
    Coyle, D., Hood, P., & Marsh, D. (2010). Content and Language Integrated Learning. Cambridge, UK: Cambridge University Press.
    Dalton-Puffer, C., Nikula, T., & Smit, U. (Eds.). (2010). Language Use and Language Learning in CLIL Classrooms (Vol. 7). Amsterdam, Netherlands: John Benjamins Publishing.
    Ehrich, J., & Meuter, R. (2009). Acquiring an artificial logographic orthography: The beneficial effects of a logographic L1 background and bilinguality. Journal of Cross-Cultural Psychology, 40, 711-747.
    Emberson, L. L., Conway, C. M., & Christiansen, M. H. (2011). Timing is everything: Changes in presentation rate have opposite effects on auditory and visual implicit statistical learning. The Quarterly Journal of Experimental Psychology, 64(5), 1021-1040.
    Erickson, L. C., & Thiessen, E. D. (2015). Statistical learning of language: theory, validity, and predictions of a statistical learning account of language acquisition. Developmental Review, 37, 66-108.
    Evans, J. L., Saffran, J. R., & Robe-Torres, K. (2009). Statistical learning in children with specific language impairment. Journal of Speech, Language, and Hearing Research, 52, 321-335.
    Fiser, J., & Aslin, R. N. (2002). Statistical learning of higher-order temporal structure from visual shape sequences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28(3), 458-467.
    Frost, R., Armstrong, B. C., Siegelman, N., & Christiansen, M. H. (2015). Domain generality versus modality specificity: the paradox of statistical learning. Trends in Cognitive Sciences, 19(3), 117-125.
    Gervain, J., & Mehler, J. (2010). Speech perception and language acquisition in the first year of life. Annual Review of Psychology, 61, 191-218.
    Gómez, R. L. (2002). Variability and detection of invariant structure. Psychological Science, 13(5), 431-436.
    Graf Estes, K., Edwards, J., & Saffran, J. R. (2011). Phonotactic constraints on infant word learning. Infancy, 16(2), 180-197.
    Graf Estes, K., Evans, J. L., Alibali, M. W., & Saffran, J. R. (2007). Can infants map meaning to newly segmented words? Statistical segmentation and word learning. Psychological Science, 18(3), 254-260.
    Grunow, H., Spaulding, T. J., Gómez, R. L., & Plante, E. (2006). The effects of variation on learning word order rules by adults with and without language-based learning disabilities. Journal of Communication Disorders, 39(2), 158-170.
    Hay, J. F., & Lany, J. (2012). Sensitivity to statistical information begets learning in early language development. In P. Rebuschat & J. N. Williams (Eds.), Statistical Learning and Language Acquisition (pp. 91-118). Berlin, Germany: Walter de Gruyter.
    Hoosain, R. (1992). Psychological reality of the word in Chinese. Advances in Psychology, 90, 111-130.
    Hsu, H. J., & Bishop, D. V. (2010). Grammatical difficulties in children with specific language impairment: Is learning deficient? Human Development, 53(5), 264-277.
    Hsu, H. J., Tomblin, J. B., & Christiansen, M. H. (2014). Impaired statistical learning of non-adjacent dependencies in adolescents with specific language impairment. Frontiers in Psychology, 5, 1-10.
    Huang, C., & Zhao, H. (2007). Chinese word segmentation: A decade review. Journal of Chinese Information Processing, 21(3), 8-20.
    Kemmer, S., & Barlow, M. (2000). Introduction: A usage-based conception of language. In M. Barlow & S. Kemmer (Eds.), Usage-Based Models of Language (pp. 7-28). Stanford, CA: CSLI Publications.
    Kim, R., Seitz, A., Feenstra, H., & Shams, L. (2009). Testing assumptions of statistical learning: Is it long-term and implicit? Neuroscience Letters, 461, 145-149.
    Kirkham, N. Z., Slemmer, J. A., & Johnson, S. P. (2002). Visual statistical learning in infancy: Evidence for a domain general learning mechanism. Cognition, 83, B35-B42.
    Kuhl, P. K. (2000). A new view of language acquisition. Proceedings of the National Academy of Sciences, 97(22), 11850-11857. doi:10.1073/pnas.97.22.11850
    Kuhl, P. K. (2004). Early language acquisition: cracking the speech code. Nature Reviews Neuroscience, 5(11), 831-843. doi:10.1038/nrn1533
    Lany, J., & Gómez, R. L. (2008). Twelve-month-old infants benefit from prior experience in statistical learning. Psychological Science, 19(12), 1247-1252.
    Lany, J., & Saffran, J. R. (2010). From statistics to meaning: Infants' acquisition of lexical categories. Psychological Science, 21(2), 284-291.
    Lany, J., & Saffran, J. R. (2013). Statistical learning mechanisms in infancy. In J. L. R. Rubenstein & P. Rakic (Eds.), Comprehensive Developmental Neuroscience: Neural Circuit Development and Function in the Brain (pp. 231-248). Amsterdam: Elsevier.
    Luo, X., Sun, M., & Tsou, B. K. (2002). Covering ambiguity resolution in Chinese word segmentation based on contextual information. Proceedings of the 19th International Conference on Computational Linguistics, 1, 1-7. doi:10.3115/1072228.1072283
    Ma, W. Y., & Chen, K. J. (2003). A bottom-up merging algorithm for Chinese unknown word extraction. Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, 17, 31-38. doi:10.3115/1119250.1119255
    Macmillan, N. A., & Kaplan, H. L. (1985). Detection theory analysis of group data: estimating sensitivity from average hit and false-alarm rates. Psychological Bulletin, 98(1), 185-199.
    Marcus, G. F., & Berent, I. (2003). Are there limits to statistical learning? Science, 300(5616), 53-55.
    Marcus, G. F., Vijayan, S., Rao, S. B., & Vishton, P. M. (1999). Rule learning by seven-month-old infants. Science, 283(5398), 77-80.
    Marsh, D., & Langé, G. (2000). Using Languages to Learn and Learning to Use Languages. Finland: University of Jyväskylä.
    McBride, C. (2016). Is Chinese special? Four aspects of Chinese literacy acquisition that might distinguish learning Chinese from learning alphabetic orthographies. Educational Psychology Review, 28(3), 523-549.
    McBride-Chang, C., Zhou, Y., Cho, J. R., Aram, D., Levin, I., & Tolchinsky, L. (2011). Visual spatial skill: A consequence of learning to read? Journal of Experimental Child Psychology, 109(2), 256-262.
    Mehisto, P., Marsh, D., & Frigols, M. J. (2008). Uncovering CLIL Content and Language Integrated Learning in Bilingual and Multilingual Education. UK: Macmillan Education.
    Mehler, J., Peña, M., Nespor, M., & Bonatti, L. (2006). The “soul” of language does not use statistics: Reflections on vowels and consonants. Cortex, 42(6), 846-854.
    Mirman, D., Magnuson, J. S., Estes, K. G., & Dixon, J. A. (2008). The link between statistical segmentation and word learning in adults. Cognition, 108(1), 271-280.
    Misyak, J. B., & Christiansen, M. H. (2012). Statistical learning and language: An individual differences study. Language Learning, 62(1), 302-331.
    Misyak, J. B., Christiansen, M. H., & Tomblin, J. B. (2010). Sequential expectations: The role of prediction‐based learning in language. Topics in Cognitive Science, 2(1), 138-153.
    Musz, E., Weber, M. J., & Thompson-Schill, S. L. (2015). Visual statistical learning is not reliably modulated by selective attention to isolated events. Attention, Perception, & Psychophysics, 77(1), 78-96.
    Nespor, M., Peña, M., & Mehler, J. (2003). On the different roles of vowels and consonants in speech processing and language acquisition. Lingue e Linguaggio, 2(2), 203-229.
    Newport, E. L., & Aslin, R. N. (2004). Learning at a distance I. Statistical learning of non-adjacent dependencies. Cognitive Psychology, 48(2), 127-162.
    Pacton, S., & Perruchet, P. (2008). An attention-based associative account of adjacent and nonadjacent dependency learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34(1), 80-96.
    Pelucchi, B., Hay, J. F., & Saffran, J. R. (2009). Learning in reverse: Eight-month-old infants track backward transitional probabilities. Cognition, 113, 244-247.
    Peña, M., Bonatti, L. L., Nespor, M., & Mehler, J. (2002). Signal-driven computations in speech processing. Science, 298(5593), 604-607.
    Romberg, A. R., & Saffran, J. R. (2010). Statistical learning and language acquisition. Wiley Interdisciplinary Reviews: Cognitive Science, 1(6), 906-914. doi:10.1002/wcs.78
    Saffran, J. R. (2001). Words in a sea of sounds: The output of infant statistical learning. Cognition, 81, 149-196.
    Saffran, J. R. (2002). Constraints on statistical language learning. Journal of Memory and Language, 47, 172-196.
    Saffran, J. R., & Wilson, D. P. (2003). From syllables to syntax: Multilevel statistical learning by 12-month-old infants. Infancy, 4(2), 273-284.
    Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928.
    Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27-52.
    Saffran, J. R., Newport, E. L., Aslin, R. N., Tunick, R. A., & Barrueco, S. (1997). Incidental language learning: Listening (and learning) out of the corner of your ear. Psychological Science, 8(2), 101-105.
    Saffran, J. R., Senghas, A., & Trueswell, J. C. (2001). The acquisition of language by children. Proceedings of the National Academy of Sciences, 98(23), 12874-12875.
    Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and Second Language Instruction (pp. 3-32). Cambridge: Cambridge University Press.
    Seidenberg, M. S., MacDonald, M. C., & Saffran, J. R. (2002). Does grammar start where statistics stop? Science, 298(5593), 553-554.
    Shukla, M., Gervain, J., Mehler, J., & Nespor, M. (2012). Linguistic constraints on statistical learning in early language acquisition. In P. Rebuschat & J. N. Williams (Eds.), Statistical Learning and Language Acquisition (pp. 171-202). Berlin: Mouton de Gruyter.
    Thiessen, E. (2009). Statistical learning. In E. Bavin (Ed.), The Cambridge Handbook of Child Language (pp. 35-50). Cambridge, England: Cambridge University Press.
    Thiessen, E. D., & Saffran, J. R. (2003). When cues collide: Use of stress and statistical cues to word boundaries. Developmental Psychology, 39(4), 706-716.
    Thiessen, E. D., Kronstein, A. T., & Hufnagle, D. G. (2013). The extraction and integration framework: A two-process account of statistical learning. Psychological Bulletin, 139(4), 792-814.
    Tomasello, M. (2000). First steps toward a usage-based theory of language acquisition. Cognitive Linguistics, 11(1-2), 61-82.
    Tomasello, M. (2005). Constructing a Language: A Usage-Based Theory of Language Acquisition. Cambridge, MA: Harvard University Press.
    Tomasello, M. (2009). The usage-based theory of language acquisition. In The Cambridge Handbook of Child Language (pp. 69-87). Cambridge, England: Cambridge University Press.
    Tong, X., & McBride-Chang, C. (2010). Chinese-English biscriptal reading: Cognitive component skills across orthographies. Reading and Writing: An Interdisciplinary Journal, 23(3), 293-314.
    Toro, J. M., Sinnett, S., & Soto-Faraco, S. (2005). Speech segmentation by statistical learning depends on attention. Cognition, 97, B25-B34.
    Turk-Browne, N. B., Jungé, J. A., & Scholl, B. J. (2005). The automaticity of visual statistical learning. Journal of Experimental Psychology: General, 134(4), 552.
    von Koss Torkildsen, J., Dailey, N. S., Aguilar, J. M., Gómez, R., & Plante, E. (2013). Exemplar variability facilitates rapid learning of an otherwise unlearnable grammar by individuals with language-based learning disability. Journal of Speech, Language, and Hearing Research, 56(2), 618-629.
    Wang, M., Koda, K., & Perfetti, C. (2003). Alphabetic and nonalphabetic L1 effects in English word identification: a comparison of Korean and Chinese English L2 learners. Cognition, 87, 129-149.
    Wang, Y., & McBride, C. (2016). Character reading and word reading in Chinese: Unique correlates for Chinese kindergarteners. Applied Psycholinguistics, 37, 371-386.
    Weiss, D. J., Gerfen, C., & Mitchel, A. D. (2010). Colliding cues in word segmentation: The role of cue strength and general cognitive processes. Language and Cognitive Processes, 25(3), 402-422.
    Yim, D., & Rudoy, J. (2013). Implicit statistical learning and language skills in bilingual children. Journal of Speech, Language, and Hearing Research, 56, 310-322.
    Zhang, M. Y., Lu, Z. D., & Zou, C. Y. (2004). A Chinese word segmentation based on language situation in processing ambiguous words. Information Sciences, 162(3-4), 275-285.
