Graduate student: | 林信雄 Lin, Xinxiong |
---|---|
Thesis title: | 可回溯的電腦化適性測驗 Computerized adaptive tests with limiting answer review |
Advisor: | 何榮桂 Ho, Rong-Guey |
Degree: | Master |
Department: | 資訊教育研究所 Graduate Institute of Information and Computer Education |
Year of publication: | 2007 |
Academic year of graduation: | 95 (ROC academic year) |
Language: | Chinese |
Number of pages: | 67 |
Chinese keywords: | 項目反應理論, 回溯, 電腦化適性測驗 |
English keywords: | item response theory (IRT), review, computerized adaptive testing (CAT) |
Thesis type: | Academic thesis |
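To protect the accuracy of the ability estimation algorithm, computerized adaptive tests (CAT) do not allow examinees to go back and review items they have already answered. This implicitly raises examinees' test anxiety and makes their true ability harder to measure precisely. This study proposes a simple and reasonable algorithm (the RCAT algorithm) for recomputing the ability estimate in a CAT that permits review. The quality of the algorithm is evaluated by simulation.
The results show that a reviewable CAT using the RCAT algorithm measures examinees' true ability more precisely, and that the standard error of measurement at the end of an RCAT is comparable to that of a CAT. The cost is that RCAT requires a longer test to reach the same level of standard error as CAT.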
The popularity of computerized adaptive testing (CAT) has been increasing year by year. Many adaptive tests do not permit item review because test makers consider it inconsistent with the logic on which adaptive tests are based. Studies show that this no-revision policy increases examinees' anxiety. This anxiety can in turn lower examinees' performance on adaptive tests, which increases the error in their ability estimates. The purpose of this study is to propose and evaluate an RCAT algorithm for re-estimating an examinee's ability in a CAT with limited answer review. The RCAT algorithm was examined through a simulation study.
The results of this study show that RCAT is more precise than CAT in estimating examinees' ability. Comparison between the RCAT and CAT conditions yields no significant difference in estimated measurement error. However, RCAT needs a longer test than CAT to reach the standard error level we set. This is the cost that RCAT pays.
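The idea of re-estimating ability after an answer is changed can be illustrated with a small sketch. This is not the RCAT algorithm itself, which the abstract does not specify. It assumes a Rasch (one-parameter logistic) model, expected a posteriori (EAP) estimation over a grid with a standard normal prior, and hypothetical item difficulties and responses. The function names `rasch_prob` and `eap_estimate` are illustrative only.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap_estimate(difficulties, responses, grid_min=-4.0, grid_max=4.0, steps=161):
    """Return (EAP ability estimate, posterior SD) using a N(0, 1) prior on a grid."""
    thetas = [grid_min + i * (grid_max - grid_min) / (steps - 1) for i in range(steps)]
    weights = []
    for theta in thetas:
        w = math.exp(-0.5 * theta * theta)  # unnormalised standard normal prior
        for b, u in zip(difficulties, responses):
            p = rasch_prob(theta, b)
            w *= p if u == 1 else (1.0 - p)  # likelihood of each scored response
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(thetas, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(thetas, weights)) / total
    return mean, math.sqrt(var)

# Hypothetical item difficulties and responses (assumptions, not thesis data).
difficulties = [0.0, 0.5, -0.3, 0.8, 0.2]
original = [1, 0, 1, 0, 1]

# The examinee reviews the second item and changes it to a correct answer.
revised = list(original)
revised[1] = 1

# Re-estimate ability from the full, revised response vector.
theta_before, se_before = eap_estimate(difficulties, original)
theta_after, se_after = eap_estimate(difficulties, revised)
print(f"before review: theta={theta_before:.3f}, SE={se_before:.3f}")
print(f"after review:  theta={theta_after:.3f}, SE={se_after:.3f}")
```

In this sketch the estimate is simply recomputed from all responses after the change. The items administered before the review were selected for the earlier provisional estimate, so they may be less informative at the revised one, which is one plausible reason a reviewable test could need more items to reach a given standard error.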