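Graduate student: | 陳麗如 Chen, Li-Ju |
---|---|
Thesis title: | BRR-CAT對能力估計與測驗焦慮之影響 (The Effect of BRR-CAT on Ability Estimation and Test Anxiety) |
Advisor: | 何榮桂 Ho, Rong-Grey |
Degree: | 博士 Doctor |
Department: | 資訊教育研究所 Graduate Institute of Information and Computer Education |
Year of publication: | 2009 |
Academic year of graduation: | 97 (ROC calendar) |
Language: | English |
Number of pages: | 108 |
Chinese keywords: | 項目反應理論 (item response theory), 電腦化適性測驗 (computerized adaptive testing), 分區式回溯 (block review), 分區 (block), 測驗焦慮 (test anxiety) |
English keywords: | Item response theory (IRT), computer adaptive testing (CAT), block review, rearrangement procedure, test anxiety |
Thesis type: | Academic thesis |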
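In a conventional computerized adaptive test (CAT), a block-review (BR) mechanism gives examinees a chance to correct misread questions, input errors, and calculation errors. It helps examinees draw on their partial knowledge and perform at their best, meets their need for opportunities to review and change answers, and eases the frustration and anxiety caused by being unable to do so. However, once an examinee changes answers, the resulting response pattern may become unreasonable and in turn affect the re-estimation of ability. Is it feasible to apply a rearrangement procedure to the changed responses so as to adjust them into a reasonable response pattern? Does the rearranged response pattern improve the precision of ability re-estimation? Both are open problems in BR-CAT.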
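This study proposes adding a response rearrangement procedure to BR-CAT, assesses its feasibility, and examines the effect of the resulting block-review-with-rearrangement CAT (BRR-CAT) on the precision and efficiency of ability estimation and on test anxiety. Phase 1 was a simulation experiment comparing BRR-CAT, BR-CAT, and conventional CAT in ability estimates, standard error of measurement, and test length. The results show that, compared with conventional CAT, the ability estimates from the BRR-CAT and BR-CAT algorithms were closer to examinees' true ability for examinees of middle and high ability, while the standard error (SE) of measurement did not differ significantly. In addition, under a looser stopping criterion (e.g., SE .35), the test lengths of BRR-CAT and BR-CAT did not differ significantly from that of CAT. Phase 2 was an empirical experiment: based on the simulation results, online testing systems for BRR-CAT, BR-CAT, and CAT were built to collect response patterns and test anxiety data from real examinees, and to compare the three in ability estimation and testing time, as well as in examinees' test anxiety. The analysis shows that the ability estimates and SE from the BRR-CAT algorithm did not differ significantly from those of CAT and BR-CAT, and that students' levels of worry and tension were lower when taking BRR-CAT and BR-CAT.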
The block-review computerized adaptive test (BR-CAT) offers a review-and-change mechanism that gives examinees the opportunity to check their answers and correct key-in errors or other mistakes after completing a block of items. It also helps examinees mitigate their test anxiety. However, changing answers could yield an unsuitable estimate when ability is re-estimated if the new response pattern is unreasonable. Is it practicable to incorporate a rearrangement procedure into BR-CAT so as to put the new response pattern into a reasonable order and improve the precision of ability re-estimation? This study seeks a solution to this problem.
In this study, the BRR-CAT algorithm, which incorporates the rearrangement procedure into BR-CAT, was designed. A two-phase experiment was carried out to investigate the precision and efficiency of BRR-CAT in ability estimation and the effect of BRR-CAT on examinees' test anxiety. In Phase 1 (a simulation experiment), compared with CAT, the BRR-CAT estimates were closer to examinees' true ability, with an equal SE, for examinees of middle and high ability. Moreover, the precision of BRR-CAT and BR-CAT was equal. The efficiency of BRR-CAT did not decrease when a moderate SE (SE .35) was set as the stopping criterion. In Phase 2 (an empirical experiment), three versions of the testing system were implemented, and response patterns and test anxiety records were collected from 112 participants. The analysis showed that the ability estimates and SE of BRR-CAT were equal to those of BR-CAT and CAT. Additionally, compared with CAT, participants showed lower worry and tension in a reviewable CAT environment such as BRR-CAT or BR-CAT.
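The abstract does not spell out how the rearrangement step works, so the following Python sketch is only an illustration under stated assumptions: responses are scored under a Rasch (1PL) model, "rearrangement" is taken to mean sorting the reviewed responses by item difficulty, and ability is re-estimated sequentially with an EAP estimator until the posterior SD reaches the stopping SE (0.35 here, echoing the criterion mentioned above). The function names, the model, and the estimator are hypothetical choices, not the thesis's implementation.

```python
import math

def p_correct(theta, b):
    """Rasch (1PL) probability of a correct response; b is item difficulty."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap_estimate(responses, grid=None):
    """EAP ability estimate and posterior SD under a standard normal prior.

    responses: list of (difficulty, score) pairs, score in {0, 1}.
    The prior and the grid are assumptions made for this sketch.
    """
    if grid is None:
        grid = [-4.0 + 0.1 * k for k in range(81)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalised N(0, 1) prior
        for b, u in responses:
            p = p_correct(theta, b)
            w *= p if u == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)

def rearrange_and_reestimate(responses, se_stop=0.35):
    """Sort reviewed responses by difficulty (assumed rearrangement rule),
    then re-estimate ability on growing prefixes of the sorted list,
    stopping once the posterior SD is at or below se_stop."""
    ordered = sorted(responses, key=lambda r: r[0])
    theta, se, used = 0.0, float("inf"), 0
    for used in range(1, len(ordered) + 1):
        theta, se = eap_estimate(ordered[:used])
        if se <= se_stop:
            break
    return theta, se, used

# Hypothetical post-review response pattern: (item difficulty, score).
reviewed = [(0.5, 1), (-1.0, 1), (1.2, 0), (0.0, 1), (-0.5, 1)]
theta_hat, se_hat, n_used = rearrange_and_reestimate(reviewed)
```

With all responses included, the estimate does not depend on item order; in this sketch the ordering matters only because re-estimation may stop early on a prefix of the sorted responses.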