Graduate student: | 單益章 Shan Yi-Chang |
---|---|
Thesis title: | 電腦輔助記憶系統之研究及製作~自動組句之研究 A Research and Implementation for Computer-Assisted Mnemonic System: Research of Automatic Sentence Composition |
Advisor: | 林順喜 Lin, Shun-Shii |
Degree: | 碩士 Master |
Department: | 資訊教育研究所 Graduate Institute of Information and Computer Education |
Year of publication: | 2002 |
Academic year of graduation: | 91 (ROC calendar) |
Language: | Chinese |
Number of pages: | 88 |
Chinese keywords: | 中文斷詞、中文構詞、中文剖析、聲韻學、助憶系統 |
English keywords: | word segmentation, morphology, Chinese parsing, syllable, mnemonic system |
Thesis type: | Academic thesis |
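This thesis proposes and builds a Chinese-language mnemonic system for numbers. The system converts a meaningless string of digits into meaningful Chinese words and sentences to help users memorize it. Using the 80,000-entry Chinese lexicon of the CKIP group (詞庫小組) at Academia Sinica, we analyze the initials and finals of the Zhuyin (注音) phonetic symbols and identify 114 Zhuyin syllables that sound similar to the digits 0 to 9, assigning each a different score. Matching these syllables against the lexicon yields 12,680 words whose pronunciation approximates digits. A string transformation mechanism then segments the digit string and combines lexicon words whose pronunciation approximates each segment.
In this study the matching is handled by dynamic programming, with time complexity O(n^3*(m+y)) and space complexity O(n^2*m), where n is the length of the digit string, m is the number of sentences to be selected, and y is the number of failed combinations encountered while selecting the m sentences. For sentence composition, an evaluation function F assigns different weights; after natural-language analysis, the m candidates with the highest evaluation values are selected as the output sentences. A binary parsing tree is generated automatically during composition to support follow-up research. Experimental results show that the system has a good mnemonic effect for most users.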
In this thesis, we develop a Computer-Assisted Mnemonic System (CAMS) that helps people memorize meaningless numeric strings by converting them into Chinese sentences. We analyze the phonetics of the digits 0 to 9 and derive 114 rules for similar-sounding syllables. The rules are matched against the lexicon established by the CKIP (Chinese Knowledge Information Processing) Group of Academia Sinica in the Republic of China, yielding 12,680 lexical entries that can be used to compose sentences. A transformation mechanism segments a numeric string and composes sentences from the matching entries.
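The rule-matching step can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the thesis works with Zhuyin syllables and 114 scored rules that this record does not list, so the sketch uses a handful of romanized syllables, made-up scores, and a four-word toy lexicon. The names `RULES`, `LEXICON`, `encode_word`, and `build_index` are hypothetical.

```python
# Hypothetical rule table: syllable -> (digit, similarity score).
# The real system uses 114 Zhuyin-based rules; these entries are illustrative only.
RULES = {
    "ling": ("0", 1.0), "yi": ("1", 1.0), "er": ("2", 1.0), "san": ("3", 1.0),
    "si": ("4", 1.0), "wu": ("5", 1.0), "liu": ("6", 1.0), "qi": ("7", 1.0),
    "ba": ("8", 1.0), "jiu": ("9", 1.0),
    # Near matches get a lower (made-up) score.
    "shan": ("3", 0.8), "shi": ("4", 0.8),
}

# Toy lexicon: word -> list of syllables (stand-in for the CKIP lexicon).
LEXICON = {
    "巴士": ["ba", "shi"],
    "舞衣": ["wu", "yi"],
    "山": ["shan"],
    "電腦": ["dian", "nao"],
}


def encode_word(syllables):
    """Return (digit_string, mean_score), or None if a syllable has no rule."""
    if not syllables:
        return None
    digits, total = [], 0.0
    for syllable in syllables:
        rule = RULES.get(syllable)
        if rule is None:
            return None  # the word cannot stand for a digit string
        digits.append(rule[0])
        total += rule[1]
    return "".join(digits), total / len(syllables)


def build_index(lexicon):
    """Map each digit string to the words that sound like it, with scores."""
    index = {}
    for word, syllables in lexicon.items():
        encoded = encode_word(syllables)
        if encoded is not None:
            key, score = encoded
            index.setdefault(key, []).append((word, score))
    return index


INDEX = build_index(LEXICON)
```

With this toy data, `INDEX` has entries for the digit strings "84", "51", and "3", while 電腦 is dropped because neither of its syllables matches a rule.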
We implement the unification process of CAMS with dynamic programming in O(n^3*(m+y)) time and O(n^2*m) space, where n is the length of the input numeric string, m is the number of composed sentences to output, and y is the number of times the system produces an unsatisfactory composition. Among the satisfactory compositions, an evaluation function judges whether each composed sentence is suitable for memorization. During this process, the system generates a binary parsing tree for each sentence to support future research. Experimental results show that the system has a high degree of mnemonic effect for most users.
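The idea of keeping the m best compositions during dynamic programming can be sketched as follows. This is a simplified stand-in, not the thesis's algorithm: it scores a composition by summing hypothetical per-word scores instead of applying the evaluation function F with natural-language analysis, it builds no parsing tree, and it does not reproduce the complexity bounds stated above. The names `TOY_INDEX` and `top_m_compositions` and all scores are assumptions.

```python
import heapq

# Hypothetical index: digit substring -> [(word, score)], with made-up scores.
TOY_INDEX = {
    "5": [("舞", 2)],
    "8": [("爸", 2)],
    "4": [("寺", 1)],
    "58": [("舞伴", 5)],
    "84": [("巴士", 6)],
}


def top_m_compositions(digits, index, m):
    """Return up to m best (score, words) segmentations of the digit string.

    best[i] holds the top-m partial compositions covering digits[:i].
    Because the stand-in score is additive, keeping only the top m per
    prefix is enough to recover the top m for the whole string.
    """
    n = len(digits)
    best = [[] for _ in range(n + 1)]
    best[0] = [(0, ())]  # the empty prefix has one composition: no words
    for end in range(1, n + 1):
        candidates = []
        for start in range(end):
            chunk = digits[start:end]
            for word, score in index.get(chunk, []):
                for prev_score, prev_words in best[start]:
                    candidates.append((prev_score + score, prev_words + (word,)))
        best[end] = heapq.nlargest(m, candidates, key=lambda c: c[0])
    return best[n]


results = top_m_compositions("584", TOY_INDEX, m=2)
```

On the toy data, `results` contains the segmentations 舞 + 巴士 (score 8) and 舞伴 + 寺 (score 6), in that order; a digit string with no matching words, such as "7", returns an empty list.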