
Author: 林信雄
Lin, Xinxiong
Thesis title: 可回溯的電腦化適性測驗
Computerized adaptive tests with limiting answer review
Advisor: 何榮桂
Ho, Rong-Guey
Degree: Master
Department: Graduate Institute of Information and Computer Education
Year of publication: 2007
Academic year of graduation: 95
Language: Chinese
Number of pages: 67
Keywords (Chinese): 項目反應理論, 回溯, 電腦化適性測驗
Keywords (English): item response theory (IRT), review, computerized adaptive testing (CAT)
Document type: Academic thesis
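  • Out of concern for the accuracy of the ability-estimation algorithm, computerized adaptive tests (CAT) do not allow examinees to review items they have already answered. This implicitly raises examinees' test anxiety and makes their true ability harder to measure precisely. This study proposes a simple and reasonable algorithm (the RCAT algorithm) for recomputing the ability estimate in a CAT that permits review. The algorithm is evaluated by simulation.
    The results show that a reviewable CAT using the RCAT algorithm measures examinees' true ability more precisely, and that the test standard error at the end of an RCAT is comparable to that of CAT. The cost is that RCAT needs a longer test to reach the same level of test standard error as CAT.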

    The popularity of computerized adaptive testing (CAT) has been increasing year by year. Item review is not permitted in many adaptive tests because test makers consider that it violates the logic on which adaptive tests are based. Studies show that the no-revision policy increases examinees' anxiety. This anxiety can in turn lower examinees' performance on adaptive tests, which increases the error in their ability estimates. The purpose of this study is to propose and evaluate an RCAT algorithm for re-estimating an examinee's ability in CAT with limited answer review. The RCAT algorithm was examined through a simulation study.
    The results show that RCAT is more precise than CAT in estimating examinees' ability. A comparison of the RCAT and CAT conditions yields no significant difference in estimated measurement error. However, RCAT needs a longer test than CAT to reach the preset standard error level. This is the cost that RCAT pays.
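The abstract does not spell out the RCAT algorithm, so the following is not the thesis's method. It is a minimal Python sketch of what re-estimating ability after an answer is revised can look like, assuming a three-parameter logistic (3PL) IRT model and expected a posteriori (EAP) scoring with a standard normal prior. The item parameters, the response patterns, and the names `p_correct` and `eap_estimate` are all hypothetical.

```python
import math

def p_correct(theta, a, b, c=0.0):
    """3PL probability of a correct answer; c=0 reduces it to the 2PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def eap_estimate(items, responses, lo=-4.0, hi=4.0, steps=801):
    """Return (EAP ability estimate, posterior SD) under a standard normal prior.

    items: list of (a, b, c) tuples; responses: list of 0/1 scores.
    The posterior is approximated on an evenly spaced grid over [lo, hi].
    """
    thetas = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    weights = []
    for t in thetas:
        log_w = -0.5 * t * t  # log of the unnormalized N(0, 1) prior
        for (a, b, c), u in zip(items, responses):
            p = p_correct(t, a, b, c)
            log_w += math.log(p if u == 1 else 1.0 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    mean = sum(t * w for t, w in zip(thetas, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(thetas, weights)) / total
    return mean, math.sqrt(var)

# Hypothetical item bank excerpt: (discrimination a, difficulty b, guessing c).
items = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (0.8, 0.5, 0.2), (1.5, 1.0, 0.2)]

responses_before = [1, 1, 0, 0]  # scores as first answered
responses_after = [1, 1, 1, 0]   # examinee reviews item 3 and corrects it

# Rescore the whole response vector after the revision.
theta_before, se_before = eap_estimate(items, responses_before)
theta_after, se_after = eap_estimate(items, responses_after)

print(f"before review: theta={theta_before:.3f}, SE={se_before:.3f}")
print(f"after review:  theta={theta_after:.3f}, SE={se_after:.3f}")
```

Rescoring the full response vector, rather than patching the previous estimate, keeps the result independent of the order in which answers were changed. The posterior standard deviation stands in here for the test standard error that the abstract compares between RCAT and CAT.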

    Chinese Abstract
    English Abstract
    List of Figures
    List of Tables
    Chapter 1 Introduction
        Section 1 Research Background
        Section 2 Research Motivation and Purpose
        Section 3 Research Scope and Limitations
    Chapter 2 Literature Review
        Section 1 Item Response Theory
        Section 2 Computerized Adaptive Testing
        Section 3 Computerized Adaptive Testing with Answer Review
        Section 4 Simulation Environment Settings for RCAT
    Chapter 3 Experimental Design and Research Tools
        Section 1 Experimental Design
        Section 2 Research Tools
    Chapter 4 Results and Discussion
        Section 1 Comparison of Ability Estimate Accuracy
        Section 2 Comparison of Test Standard Errors
        Section 3 Comparison of Test Lengths Required to Reach the Target Standard Error
    Chapter 5 Conclusions and Recommendations
        Section 1 Conclusions
        Section 2 Recommendations
    References
    Appendix 1 Item Parameter Table for the Simulated Item Bank
    Appendix 2 Simulation Screenshots

