Graduate student: | 陶君浩 Chun-Hao Tao |
---|---|
Thesis title: | 局部試題依賴偵測方法之偵測效果比較 Compare the Detection Result by Using Different Local Item Dependent Detection Methods |
Advisor: | 陳柏熹 Chen, Po-Hsi |
Degree: | Master |
Department: | Department of Educational Psychology and Counseling |
Year of publication: | 2012 |
Academic year of graduation: | 101 (ROC calendar) |
Language: | Chinese |
Number of pages: | 71 |
Keywords (Chinese): | 題組, 局部試題依賴, 題組效果 |
Keywords (English): | testlet, local item dependence, testlet effect |
Thesis type: | Academic thesis |
This study compares how well three methods detect local item dependence (LID): testlet-effect estimates from the Rasch testlet model, testlet-residual-based principal component analysis, and the Q3 statistic. It comprises two sub-studies. Study 1 is a simulation study that manipulated the size of the testlet effect (high/low), the sample size (500/1,500 examinees), and the number of items within a testlet (2/4/6/8), and simulated item responses under each condition. Each of the three methods was then used to detect LID in every testlet. The methods were compared on three criteria: the recovery of the testlet-effect parameters, the detection results of the testlet-residual-based principal component analysis and the Q3 statistic, and Spearman's ρ between each method's detection results and the true testlet effects. Study 2 is an empirical study that applied the same three methods to testlet data from the English subject of the Basic Competence Test for Junior High School Students, 2004-2009 (ROC years 93-98), and compared their detection results.
The main findings are as follows:
1. As the sample size and the number of items within a testlet increased, recovery of the testlet-effect parameters improved; that is, the testlet-effect estimates of the Rasch testlet model became more accurate. Under the high testlet effect condition, however, the estimates were relatively less precise.
2. Under every simulated condition, the Q3 statistic consistently detected LID better than the other two methods.
3. For the English testlets of the 2004-2009 Basic Competence Test for Junior High School Students, the three methods produced different detection results, and the testlet-effect estimates of the Rasch testlet model differed most from the other two.
4. According to the Q3 statistic, the English testlets of the 2004-2009 Basic Competence Test for Junior High School Students were largely free of LID; only testlets 93-2-3, 93-2-5, 93-2-6, 94-1-8, and 97-2-3 may exhibit LID.
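To make the simulation logic concrete, the following is a minimal sketch of one condition, not the procedure or software used in the thesis. It assumes the Rasch testlet model in the form logit P(x = 1) = θ − b_i + γ_d(i), with a person-specific testlet effect γ drawn from a normal distribution, and it computes Q3 as the correlation between item residuals across examinees. The item difficulties, the testlet standard deviations (0, 0.5, 1.5), the function names, and the shortcut of treating item difficulties as known when estimating ability are all illustrative assumptions.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n_persons, difficulties, testlet_of, testlet_sd, seed=1):
    """Draw 0/1 responses with logit P = theta - b_i + gamma_{n, d(i)}."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)
        # One person-specific random effect per testlet (SD 0 means no LID).
        gamma = [rng.gauss(0.0, sd) for sd in testlet_sd]
        data.append([int(rng.random() < logistic(theta - b + gamma[d]))
                     for b, d in zip(difficulties, testlet_of)])
    return data


def estimate_theta(row, difficulties, n_iter=20):
    """Newton-Raphson ML ability estimate; difficulties assumed known."""
    theta = 0.0
    for _ in range(n_iter):
        probs = [logistic(theta - b) for b in difficulties]
        grad = sum(x - p for x, p in zip(row, probs))
        info = sum(p * (1.0 - p) for p in probs)
        # Clip so that perfect and zero scores stay finite.
        theta = max(-4.0, min(4.0, theta + grad / info))
    return theta


def pearson(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def q3_matrix(data, difficulties):
    """Q3 for every item pair: correlation of residuals x - P(theta_hat)."""
    thetas = [estimate_theta(row, difficulties) for row in data]
    resid = [[x - logistic(t - b) for x, b in zip(row, difficulties)]
             for row, t in zip(data, thetas)]
    cols = [[r[i] for r in resid] for i in range(len(difficulties))]
    return [[pearson(ci, cj) for cj in cols] for ci in cols]


def mean_q3_by_testlet(q3, testlet_of):
    """Average Q3 over all item pairs that belong to the same testlet."""
    out = {}
    for d in sorted(set(testlet_of)):
        items = [i for i, t in enumerate(testlet_of) if t == d]
        pairs = [q3[i][j] for a, i in enumerate(items) for j in items[a + 1:]]
        out[d] = sum(pairs) / len(pairs)
    return out


def spearman_rho(a, b):
    """Spearman's rho as the Pearson correlation of ranks (assumes no ties)."""
    def ranks(values):
        order = sorted(range(len(values)), key=values.__getitem__)
        r = [0.0] * len(values)
        for rank, idx in enumerate(order, start=1):
            r[idx] = float(rank)
        return r
    return pearson(ranks(a), ranks(b))


# Hypothetical condition: 500 examinees, 3 testlets of 4 items each.
difficulties = [-1.0, -0.5, 0.5, 1.0] * 3
testlet_of = [0] * 4 + [1] * 4 + [2] * 4
testlet_sd = [0.0, 0.5, 1.5]  # true testlet effects (illustrative values)

data = simulate(500, difficulties, testlet_of, testlet_sd)
q3 = q3_matrix(data, difficulties)
mean_q3 = mean_q3_by_testlet(q3, testlet_of)
rho = spearman_rho(testlet_sd, [mean_q3[d] for d in sorted(mean_q3)])
print(mean_q3, rho)
```

In this sketch the testlet with the largest simulated effect should show the highest mean within-testlet Q3, while the testlet without any effect should sit near or slightly below zero. A full replication would repeat each condition many times and estimate item and person parameters jointly with dedicated IRT software rather than fixing the difficulties.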