Graduate student: | 吳冠瑩 |
---|---|
Thesis title: | A Comparison of Three Polytomous DIF Detection Methods |
Advisor: | 蔡容青 (Tsai, Rung-Chin) |
Degree: | Master |
Department: | Department of Mathematics |
Year of publication: | 2004 |
Academic year of graduation: | 92 |
Language: | Chinese |
Number of pages: | 49 |
English keywords: | Cognitive Abilities Screening Instrument (CASI), Dementia, DFIT, Differential Item Functioning, Graded Response Model, Likelihood Ratio Test, Logistic Regression Procedure |
Thesis type: | Academic thesis |
The performance of three procedures, the Logistic Regression procedure
(Logi; French & Miller, 1996), the Likelihood Ratio test (LR; Thissen,
Steinberg, & Gerard, 1986), and the Differential Functioning of Items and
Tests procedure (DFIT; Flowers, Oshima, & Raju, 1999), in detecting
differential item functioning (DIF) under the graded response model (GRM)
was compared in a simulation study. Factors manipulated included sample
size, differences in the ability distributions between the focal and the
reference groups, and four different percentages of DIF items contained in
a test. For each of the sixteen combinations, 100 replications of DIF
detection were simulated. All three DIF procedures adhered to nominal
Type I error rates under most conditions. LR was the most powerful of the
three in all conditions. DFIT was less powerful than LR but was still
useful for DIF detection, especially when the groups differed in ability
distribution and the percentage of DIF items was relatively large. Logi,
with mean power below 0.4 in all conditions, appeared to be sensitive only
to items with large DIF. In addition, the three procedures were used to
assess DIF in the Cognitive Abilities Screening Instrument (CASI), and the
results of the DIF analysis were compared with those of previous studies.
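The simulation design summarized above can be illustrated with a minimal sketch. It generates graded responses for a reference and a focal group under the logistic form of the GRM and summarizes replications as a rejection rate (Type I error for a DIF-free item, power for a DIF item). The abstract does not report the generating values, so every number here (sample size, ability means, discrimination, thresholds, the 0.5 threshold shift) and every function name is a hypothetical choice for illustration, not the thesis's actual setup.

```python
import math
import random


def grm_category_probs(theta, a, b):
    """Category probabilities for one examinee under the graded response model.

    theta: ability; a: discrimination; b: increasing list of K-1 thresholds.
    Returns a list of K probabilities that sum to 1.
    """
    # Boundary curves P*(X >= k | theta) for k = 1..K-1 (logistic form).
    p_star = [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b]
    # Pad with P*(X >= 0) = 1 and P*(X >= K) = 0, then take adjacent differences.
    bounds = [1.0] + p_star + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(b) + 1)]


def simulate_item(thetas, a, b, rng):
    """Draw one graded score (0..K-1) per examinee."""
    categories = range(len(b) + 1)
    return [
        rng.choices(categories, weights=grm_category_probs(t, a, b))[0]
        for t in thetas
    ]


def rejection_rate(p_values, alpha=0.05):
    """Share of replications in which the item is flagged at level alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


# Hypothetical condition: unequal ability distributions and a uniform
# threshold shift of 0.5 for the focal group (all values are assumptions).
rng = random.Random(0)
n = 500
theta_ref = [rng.gauss(0.0, 1.0) for _ in range(n)]
theta_foc = [rng.gauss(-1.0, 1.0) for _ in range(n)]
a, b_ref = 1.5, [-1.0, 0.0, 1.0]
b_foc = [bk + 0.5 for bk in b_ref]
x_ref = simulate_item(theta_ref, a, b_ref, rng)
x_foc = simulate_item(theta_foc, a, b_foc, rng)
```

In a full study, each replication would pass `x_ref` and `x_foc` (together with the remaining test items) to a DIF procedure, collect the resulting p-value, and feed the p-values from the 100 replications to `rejection_rate`.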
Andersen, E. B., & Madsen, M. (1977). Estimating the parameters of a latent population
distribution. Psychometrika, 42, 357-374.
Bock, R. D., & Aitkin, M. (1981). Maximum likelihood estimation of item parameters: an
application of the EM algorithm. Psychometrika, 46, 443-459.
Bolt, D. M. (2002). A Monte Carlo comparison of parametric and nonparametric polytomous
DIF detection methods. Applied Measurement in Education, 15, 113-141.
Baker, F. B. (1993). EQUATE2: Computer program for equating two metrics in item
response theory [Computer program]. Madison: University of Wisconsin, Laboratory of
Experimental Design.
Camilli, G., & Shepard, L. A. (1994). Methods for Identifying Biased Test Items. Thousand
Oaks, CA: Sage.
Chang, H. H., & Mazzeo, J. (1994). The unique correspondence of the item response function
and item category response functions in polytomously scored item response models.
Psychometrika, 59, 391-404.
Chang, H. H., Mazzeo, J., & Roussos, L. (1996). Detecting DIF for polytomously scored items:
An adaptation of the SIBTEST procedure. Journal of Educational Measurement, 32,
79-96.
Cohen, A. S., Kim, S. H., & Baker, F. B. (1993). Detection of differential item functioning
in the graded response model. Applied Psychological Measurement, 17(4), 335-350.
Crane, P. K., van Belle, G., & Larson, E. B. (2004). Test bias in a cognitive test: differential
item functioning in the CASI. Statistics in Medicine, 23, 241-256.
Embretson, S. E., & Reise, S. P. (2000). Item Response Theory for Psychologists. Mahwah,
NJ: Lawrence Erlbaum.
Flowers, C. P., Oshima, T. C., & Raju, N. S. (1999). A description and demonstration of the
polytomous-DFIT framework. Applied Psychological Measurement, 23, 309-326.
French, A. W., & Miller, T. R. (1996). Logistic regression and its use in detecting differential
item functioning in polytomous items. Journal of Educational Measurement, 33,
315-332.
Holland, P. W., & Thayer, D. T. (1988). Differential item performance and the
Mantel-Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129-145).
Hillsdale, NJ: Lawrence Erlbaum.
Jodoin, M. G. (2001). Evaluating Type I error and power rates using an effect size measure
with the logistic regression procedure for DIF detection. Applied Measurement in
Education, 14(4), 329-349.
Kim, S. H., & Cohen, A. S. (1991). A comparison of two area measures for detecting
differential item functioning. Applied Psychological Measurement, 15(3), 269-278.
Kim, S. H., & Cohen, A. S. (1998). Detection of differential item functioning under the graded
response model with the likelihood ratio test. Applied Psychological Measurement, 22,
345-355.
Lin, K. N., Wang, P. N., Liu, C. H., Chen, W. T., Lee, Y. C., & Liu, H. C. (2002). Cutoff scores
of the cognitive abilities screening instrument, Chinese version in screening dementia.
Dementia and Geriatric Cognitive Disorders, 14, 176-182.
Lord, F. M. (1980). Applications of item response theory to practical testing problems.
Hillsdale, NJ: Erlbaum.
Maldonado, G., & Greenland, S. (1993). Simulation study of confounder-selection strategies.
American Journal of Epidemiology, 138, 923-936.
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.
Miller, T. R., & Spray, J. A. (1993). Logistic discriminant function analysis for DIF
identification of polytomously scored items. Journal of Educational Measurement, 30, 107-122.
Millsap, R. E., & Everson, H. T. (1993). Methodology review: statistical approaches for
assessing measurement bias. Applied Psychological Measurement, 17, 297-334.
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm.
Applied Psychological Measurement, 16, 159-176.
Narayanan, P., & Swaminathan, H. (1994). Performance of the Mantel-Haenszel and
simultaneous item bias procedures for detecting differential item functioning. Applied
Psychological Measurement, 18, 315-338.
Narayanan, P., & Swaminathan, H. (1996). Identification of items that show nonuniform
DIF. Applied Psychological Measurement, 20, 257-274.
Oshima, T. C., McGinty, D., & Flowers, C. P. (1994). Differential item functioning for a test
with a cutoff score: use of limited closed-interval measures. Applied Measurement in
Education, 7(3), 195-209.
Oshima, T. C., Raju, N. S., & Flowers, C. P. (1997). Development and demonstration of
multidimensional IRT-based internal measures of differential functioning of items and
tests. Journal of Educational Measurement, 34, 253-272.
Penfield, R. D., & Lam, T. C. M. (2000). Assessing differential item functioning in
performance assessment: Review and recommendations. Educational Measurement: Issues
and Practice, 19, 5-15.
Potenza, M. T., & Dorans, N. J. (1995). DIF assessment for polytomously scored items:
a framework for classification and evaluation. Applied Psychological Measurement, 19,
23-37.
Raju, N. S., van der Linden, W. J., & Fleer, P. F. (1995). IRT-based internal measures
of differential functioning of items and tests. Applied Psychological Measurement, 19,
353-368.
Rogers, H. J., & Swaminathan, H. (1993). A comparison of the logistic regression and
Mantel-Haenszel procedures for detecting differential item functioning. Applied Psychological
Measurement, 17, 105-116.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores.
Psychometrika Monograph, No. 17.
Shealy, R., & Stout, W. (1993). A model-based standardization approach that separates true
bias/DIF from group ability differences and detects test bias/DTF as well as item
bias/DIF. Psychometrika, 58, 159-194.
Stroud, A. H., & Sechrest, D. (1966). Gaussian quadrature formulas. New York: Prentice
Hall.
Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic
regression procedures. Journal of Educational Measurement, 27, 361-370.
Thissen, D. (2003). MULTILOG. In M. du Toit (Ed.), IRT from SSI (pp. 345-409).
Lincolnwood, IL: Scientific Software International, Inc.
Thissen, D., Steinberg, L., & Gerard, M. (1986). Beyond mean group difference: The concept
of item bias. Psychological Bulletin, 99, 118-128.
Zumbo, B. D. (1999). A Handbook on the Theory and Methods of Differential Item
Functioning (DIF): Logistic Regression Modelling as a Unitary Framework for Binary and
Likert-type (Ordinal) Item Scores. Ottawa, Ont.: Directorate of Human Resources Research
and Evaluation, Department of National Defense.
Zwick, R., Donoghue, J. R., & Grima, A. (1993). Assessment of differential item functioning
for performance tasks. Journal of Educational Measurement, 30, 233-251.