
Author: 鄭惠文 (Huei-Wen Cheng)
Thesis Title: 運用寫作評量軟體批改高中生英文作文之研究
Utilizing Automated Writing Evaluation Software in Correcting High School Students' Compositions
Advisor: 陳浩然 (Chen, Hao-Jan)
Degree: Master
Department: Department of English
Year of Publication: 2012
Graduation Academic Year: 101
Language: English
Number of Pages: 81
Chinese Keywords: 錯誤回饋, 自動寫作評量, 英文作文
English Keywords: error feedback, automated writing evaluation, English composition
Document Type: Academic thesis
    In the field of English learning, English writing plays an increasingly important role with the development of the Internet and the influence of globalization. Students hope to have more practice so that they can cope with the various future situations in which they will need to write in English. In Taiwan's senior high schools, however, classes are large, and correcting compositions and giving appropriate feedback is a heavy burden for English teachers. In recent years, some automated writing evaluation programs have become able to provide students with immediate correction and feedback, which can lighten teachers' load. As more and more people use such programs, however, further research on their strengths and weaknesses is needed. This study examines the error feedback on writing provided by a new writing evaluation program, Correct English, and identifies the program's strengths and weaknesses. The participants were 90 12th-grade students from two public senior high schools. The study collected and analyzed the error feedback the program provided on 146 high school students' compositions, and further compared the program's feedback with the feedback provided by teachers. The results showed that the program offered 40 different types of feedback messages, but about one third of the feedback consisted of false alarms. In addition, compared with the teachers' feedback, the system still failed to detect many common student errors, such as errors in prepositions, verb tense, word form, and articles, and it could not rewrite sentences. This study suggests that teachers who use this kind of writing evaluation software should pay attention to how it is used in instruction. By incorporating teacher guidance and detailed instructions, English teachers can make good use of the available functions of the software to help students write.

    In English learning, the ability to write well has become increasingly important with the growth of the Internet and the trend of globalization. Learners expect more practice to prepare themselves for the various occasions in which they will have to write in English. However, in Taiwan's senior high schools, where classes often have more than 40 students, grading and giving feedback on students' writing has been a heavy burden for English teachers. In recent years, a number of automated writing evaluation (AWE) systems have been developed to provide learners with computerized feedback. Because these systems promise immediate scores and feedback, they appear to offer teachers an alternative way of handling essay correction. Targeting a newly developed AWE system, this study investigates the use of the system in correcting high school students' compositions and examines whether it can provide usable feedback for its users to revise their writing.
    A total of 90 12th-grade students from two senior high schools in Taipei were recruited for the study. Each student was asked to write two compositions on the assigned topics. An automated writing evaluation program called Correct English was used to generate the computerized feedback messages. At the same time, two human raters corrected and commented on the compositions. Afterwards, the computer-generated feedback messages on writing errors were compared with those of the human raters.
    The results showed that Correct English provided 40 types of error feedback. About one third of the error feedback messages provided by the AWE system were false alarms, which would confuse learners. In addition, compared with the errors identified by the human raters, many common errors, such as errors in prepositions, verb tense, word form, and articles, were still left untreated by the AWE system. Moreover, the human raters rewrote sentences or provided suggestions when expressions were unclear or ungrammatical, whereas the AWE system was unable to offer feedback at the sentence level.
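To make the accuracy analysis concrete, the minimal Python sketch below (not part of the thesis; the example data and variable names are purely hypothetical) illustrates how feedback messages that have been hand-labeled as correct detections or false alarms could be tallied into per-error-type accuracy rates of the kind summarized in Chapter Four.

    # Illustrative sketch with hypothetical data: tally hand-labeled AWE feedback
    # messages into per-error-type accuracy rates.
    from collections import defaultdict

    # Each item: (error type reported by the AWE system,
    #             whether human raters judged the message a correct detection).
    labeled_feedback = [
        ("Spelling", True),
        ("Spelling", True),
        ("Preposition", False),
        ("Subject-verb agreement", True),
        ("Subject-verb agreement", False),
    ]

    counts = defaultdict(lambda: {"correct": 0, "false_alarm": 0})
    for error_type, is_correct in labeled_feedback:
        counts[error_type]["correct" if is_correct else "false_alarm"] += 1

    for error_type, tally in counts.items():
        total = tally["correct"] + tally["false_alarm"]
        # Accuracy rate = correct detections / all messages of this error type.
        accuracy = tally["correct"] / total
        print(f"{error_type}: {tally['correct']}/{total} correct ({accuracy:.0%})")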
    Based on the major findings of this study, it is suggested that language teachers pay close attention to how AWE systems are used in class. Teacher guidance, specific instructions, and follow-up activities should be incorporated so that instructors can make the best use of the available functions and help learners become better writers.

List of Tables vi
List of Figures viii
Chapter One Introduction 1
1.1 Background 1
1.2 Purpose of the Study 4
1.3 Research Questions 4
1.4 Significance of the Study 5
1.5 Organization of the Thesis 5
Chapter Two Literature Review 7
2.1 Issues in Error Correction 7
2.1.1 The Case against Grammar Correction 7
2.1.2 The Case for Grammar Correction 8
2.1.3 To Correct or Not to Correct? 9
2.2 Brief Introduction of the Development of Automated Writing Evaluation 10
2.3 Previous Studies of Using AWE Systems in EFL Contexts 12
2.3.1 Previous Studies on Students' Perceptions 12
2.3.2 Previous Studies concerning AWE Grammar Feedback Accuracy 18
2.3.3 Previous Studies about the Comparison between AWE and Peer Feedback 21
2.4 Overall Strengths and Weaknesses of AWE Systems 24
2.5 Summary of Chapter Two 26
Chapter Three Methodology 27
3.1 Subjects 27
3.2 Instrument 28
3.3 Procedure 32
Chapter Four 34
4.1 Types and Numbers of Corrective Feedback Identified by Correct English and Human Raters 34
4.2 The Analysis of the Error Feedback Types Provided by Correct English 36
4.2.1 The Top Twenty Error Types Identified by Correct English and Their Accuracy Rates 37
4.2.2 Discussion of the Performance of Correct English 48
4.2.3 Other Findings in the Error Feedback Provided by Correct English 51
4.3 The Analysis of the Error Feedback Types Identified by Human Raters 53
4.3.1 The Analysis of the Top Twenty Error Types Identified by Human Raters and the Comparison with Feedback Provided by Correct English 54
4.3.2 Comparison of the Top Twenty Error Feedback Types Provided by Human Raters and Correct English 60
4.3.3 Discussion of the Error Feedback Messages Provided by Human Raters and Correct English 61
4.4 Summary of Chapter Four 63
Chapter Five 65
5.1 Summary 65
5.2 Pedagogical Implications 66
5.3 Limitations of the Present Study and Suggestions for Further Research 68
REFERENCES 70
Appendix: Teacher's Rewrite 75

List of Tables
Table 1. The Number and Error Types Provided by Correct English and Human Raters 35
Table 2. The Types and Numbers of Error Messages Provided by Correct English 36
Table 3. Examples of Correct Detection and False Alarms for Spelling Errors by Correct English 37
Table 4. Examples of Correct Detection and False Alarms for Clause Errors by Correct English 38
Table 5. Examples of Correct Detection and False Alarms for Subject-Verb Agreement Errors by Correct English 39
Table 6. Examples of Correct Detection and False Alarms for Word Form Errors by Correct English 39
Table 7. Examples of Correct Detection and False Alarms for Punctuation Errors by Correct English 40
Table 8. Examples of Correct Detection and False Alarms for Noun Phrase Consistency Errors by Correct English 40
Table 9. Examples of Correct Detection and False Alarms for Infinitive or -ing Forms by Correct English 41
Table 10. Examples of Correct Detection and False Alarms for Verb Group Consistency Errors by Correct English 42
Table 11. Examples of Correct Detection and False Alarms for Weak / Non-standard Modifiers by Correct English 42
Table 12. Examples of Correct Detection and False Alarms for Adverb Placement Errors by Correct English 43
Table 13. Examples of Correct Detection and False Alarms for Capitalization Errors by Correct English 43
Table 14. Examples of Correct Detection and False Alarms for Missing/Unnecessary/Incorrect Articles by Correct English 44
Table 15. Examples of Correct Detection and False Alarms for Wordy Expressions by Correct English 44
Table 16. Examples of Correct Detection and False Alarms for Redundant Expressions by Correct English 45
Table 17. Examples of Correct Detection and False Alarms for A vs. An Errors by Correct English 46
Table 18. Examples of False Alarms for Vague Quantifiers by Correct English 46
Table 19. Examples of Correct Detection and False Alarms for Preposition Errors by Correct English 47
Table 20. Examples of Correct Detection and False Alarms for Nouns: Mass or Count Errors by Correct English 47
Table 21. Examples of Correct Detection for Open vs. Closed Spelling Errors by Correct English 48
Table 22. Examples of Correct Detection and False Alarms for Word Confusion by Correct English 48
Table 23. Summary of the Accuracy Rates of the Top Twenty Error Feedback Messages by Correct English 49
Table 24. Examples of Correct Detection and False Alarms for Homonyms by Correct English and Human Raters 51
Table 25. Examples of False Alarms for Passive Voice Usages by Correct English 52
Table 26. Examples of False Alarms for Clichés by Correct English 52
Table 27. The Types and Numbers of Error Messages Identified by Human Raters 53
Table 28. Examples of Teacher's Rewrite by Human Raters and the Comparison with Feedback by Correct English 55
Table 29. Examples of Word Confusion Errors Identified by Human Raters 57
Table 30. Examples of Misused Words Detected by Human Raters 57
Table 31. Examples of Detection of Idiomatic Expressions Errors by Correct English and Human Raters 58
Table 32. Examples of Detection for Relative Pronoun Errors by Human Raters 59
Table 33. Examples of Detection for Word Order by Human Raters and Correct English 59
Table 34. Examples of Detection for Run-on Sentences by Human Raters 59
Table 35. Examples of Detection for Unclear Meaning by Human Raters 60
Table 36. The Top Twenty Types and Numbers of Error Messages Identified by Correct English and Human Raters 60
Table 37. The Number and Error Types Shared by Correct English and Human Raters 62

List of Figures
Figure 1. Basic Check Marked by Correct English 29
Figure 2. Grammar and Usage Feedback Provided by Correct English 30
Figure 3. Style Choice Suggested by Correct English 30
Figure 4. The Function of Writing Help in Correct English 31
Figure 5. The Function of Reference in Correct English 32
Figure 6. Occurrence of the Error Types Provided by Correct English 37
Figure 7. The Distribution of Accuracy Rate for the Top Twenty Error Types Detected by Correct English 50
Figure 8. Occurrence of the Error Types Provided by Human Raters 54

    Attali, Y. (2004). Exploring the feedback and revision features in Criterion. Paper presented at the National Council on Measurement in Education (NCME), April 12-16, 2004, San Diego, CA.
    Attali, Y. & Burstein, J. (2006). Automated essay scoring with e-rater V.2.0. Journal of Technology, Learning and Assessment, 4(3). Retrieved September 29, 2009 from http://www.jtla.org.
    Attali, Y., Bridgeman, B., & Trapani, C. (2010). Performance of a Generic Approach in Automated Essay Scoring. Journal of Technology, Learning and Assessment, 10(3). Retrieved July 1, 2011 from http://www.jtla.org
    Ben-Simon, A. & Bennett, R.E. (2007). Toward more substantively meaningful automated essay scoring. Journal of Technology, Learning and Assessment, 6(1). Retrieved September 29, 2009 from http://www.jtla.org.
    Burstein, J. & Chodorow, M. (1999). Automated essay scoring for nonnative English speakers. In M. B. Olson (Ed.), Proceedings of the ACL99 workshop on computer mediated language assessment and evaluation of natural language processing. College Park, MD: Association of Computational Linguistics.
    Burstein, J. & Chodorow, M. (2002). Directions in automated essay analysis. In R. Kaplan (Ed.), The Oxford handbook of applied linguistics (pp. 487-497). New York: Oxford University Press.
    Burstein, J., Chodorow, M., & Leacock, C. (2003). CriterionSM: Online essay evaluation: An application for automated evaluation of student essays. Proceedings of the Fifteenth Annual Conference on Innovative Applications of Artificial Intelligence, Acapulco, Mexico.
    Burstein, J. & Chodorow, M. (2003). The e-rater scoring engine: Automated Essay Scoring with natural language processing. In M. D. Shermis and J. C. Burstein (Eds.), Automated Essay Scoring: A cross disciplinary approach (pp. 113-121). Mahwah, NJ: Lawrence Erlbaum Associates.
    Burstein, J., Chodorow, M., & Leacock, C. (2004). Automated essay evaluation: The Criterion online writing service. AI Magazine, 25(3), 27-36.
    Burston, J. (2001). Computer-mediated feedback in composition correction. CALICO Journal, 19(1), 37-50.
    Calfee, R. (2000) To grade or not to grade. IEEE Intelligent Systems, 15 (5), 35-37.
    Chen, C-F. & Cheng, W-Y. (2006). The Use of a Computer-based Writing Program: Facilitation or Frustration? Paper presented at the 23rd International Conference on English Teaching and Learning in the R.O.C. May 27, 2006. Kaohsiung: Wenzao Ursuline College of Languages.
    Chen, C-F. & Cheng, W-Y. (2008). Beyond the Design of Automated Writing Evaluation: Pedagogical Practices and Perceived Learning Effectiveness in EFL Writing Classes. Language Learning & Technology, 12(2), 94-112.
    Chen, H. J. (2006). Examining the Scoring Mechanism and Feedback Quality of My Access. Proceedings of Tamkang University Conference on Second Language Writing.
    Cheville, J. (2004) Automated scoring technologies and the rising influence of error. English Journal, 93 (4), 47-52
    Cohen, Y., Ben-Simon, A. & Hovav, M. (2003) The effect of specific language features on the complexity of systems for automated essay scoring. Paper presented at the 29th Annual Conference, Manchester, UK.
    Dikli, S. (2006). An Overview of automated essay scoring. Journal of Technology, Learning and Assessment, 5(1). Retrieved September 29, 2009 from http://www.jtla.org.
    Elliot, S. (2003) Intellimetric: from here to validity. In M. D. Shermis and J. C. Burstein (Eds.) Automated Essay Scoring: A cross disciplinary approach (pp. 71-86). Mahwah, NJ: Lawrence Erlbaum Associates.
    Ferris, D. (1999). The Case for Grammar Correction in L2 Writing Classes: A Response to Truscott. Journal of Second Language Writing, 8(1), 1-11.
    Ferris, D. (2002). Treatment of Error in Second Language Student Writing. Ann Arbor, MI: University of Michigan Press.
    Ferris, D. (2004) The “Grammar Correction” Debate in L2 Writing: Where are we, and where do we go from here? (and what do we do in the meantime…?) Journal of Second Language Writing, 13, 49-62.
    Ferris, D. (2006). Does error feedback help student writers? New evidence on the short- and long-term effects of written error correction. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing: Contexts and issues (pp. 81–104). Cambridge: Cambridge University Press.
    Francis, N. (2007). Corrective feedback in L2 assessment: negative evidence and interactive practice. In Y. N. Leung (Ed.), Selected papers from the 16th International Symposium on English Teaching (pp. 376-386). Taipei: English Teachers Association.
    Goldstein, L. (2006). Feedback and revision in second language writing: Contextual, teacher, and student variables. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing: Contexts and issues (pp. 185–205). Cambridge: Cambridge University Press.
    Guenette, D. (2007). Is feedback pedagogically correct? Research design issues in studies of feedback on writing. Journal of Second Language Writing, 16, 40-53.
    Herrington, A. & Moran, C. (2001) What happens when machines read our students’ writing? College English, 63 (4), 480-499
    Hutchison, D. (2007) An evaluation of computerized essay marking for national curriculum assessment in the UK for 11-year-olds. British Journal of Educational Technology, 38 (6), 977-989.
    Grimes, D. & Warschauer, M. (2010). Utility in a Fallible Tool: A Multi-Site Case Study of Automated Writing Evaluation. Journal of Technology, Learning and Assessment, 8(6). Retrieved August 30, 2010 from http://www.jtla.org.
    Kellogg, R.T., Whiteford, A.P. & Quinlan, T. (2010) Does automated feedback help students learn to write? Journal of Educational Computing Research, 42 (2), 173-196.
    Kukich, K. (2000) Beyond automated essay grading. IEEE Intelligent Systems, 15(5), 22-27.
    Lai, Y. H. (2009). Which do students prefer to evaluate their essays: Peers or computer program? British Journal of Educational Technology, 38(6), 1-23.
    Otoshi, J. (2005). An Analysis of the Use of Criterion in a Writing Classroom in Japan. The JALT CALL Journal, 1(1), 30-38.
    Page, E. B. (2003) Project Essay Grade: PEG. In M. D. Shermis and J. C. Burstein (Eds.) Automated Essay Scoring: A cross disciplinary approach (pp. 43-54). Mahwah, NJ: Lawrence Erlbaum Associates.
    Rudner, L. M., Garcia, V., & Welch, C. (2006). An Evaluation of the Intellimetric Essay Scoring System. Journal of Technology, Learning, and Assessment, 4(4). Available from http://www.jtla.org
    Powers, D. E., Burstein, J. C., Chodorow, M. E., & Kukich, K. (2002). Stumping e-rater: challenging the validity of automated essay scoring. Computers in Human Behavior, 18, 103-134.
    Scharber, C., Dexter, S., & Riedel, E. (2008). Students' experiences with an automated essay scorer. Journal of Technology, Learning and Assessment, 7(1). Retrieved September 29, 2009 from http://www.jtla.org
    Shermis, M. D. & Burstein, J. C. (2003) Automated Essay Scoring: A cross disciplinary approach. Mahwah, NJ: Lawrence Erlbaum Associates.
    Shermis, M. D., Rayment, M. V., & Barrera, F. (2003) Assessing writing through the curriculum with automated essay scoring. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL.
    Shermis, M.D., Shneyderman, A., & Attali, Y. (2008) How important is content in the ratings of essay assessments? Assessment in Education: Principles, Policy & Practice, 15(1), 91-105.
    Shermis, M.D., Garvan, G. W., & Diao, Y. (2008) The impact of automated essay scoring on writing outcomes. Paper presented at the Annual Meetings of the National Council on Measurement in Education, New York, NY.
    Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46, 327–369.
    Truscott, J. (1999). The case for “The case against grammar correction in L2 writing classes”: A response to Ferris. Journal of Second Language Writing, 8(2), 111-122.
    Truscott, J. (2007). The effect of error correction on learners’ ability to write accurately. Journal of Second Language Writing, 16, 255-272.
    Truscott, J. & Hsu, A. Y. (2008). Error correction, revision, and learning. Journal of Second Language Writing, 17, 292-305.
    Wang, J. & Brown, M. (2007). Automated essay scoring versus human scoring: a comparative study. Journal of Technology, Learning and Assessment, 6(2). Retrieved September 29, 2009 from http://www.jtla.org
    Ware, P. & Warschauer, M. (2006). Electronic feedback and second language writing. In K. Hyland & F. Hyland (Eds.), Feedback in second language writing: Contexts and issues (pp. 105-122). Cambridge: Cambridge University Press.
    Ware, P. & Warschauer, M. (2006) Automated writing evaluation: defining the classroom research agenda. Language Teaching Research, 10 (2), 157-180.
    Warschauer, M. & Grimes, D. (2008) Automated writing assessment in the classroom. Pedagogies: An International Journal, 3, 22-36
    Yang, N. D. (2004). Using My Access in EFL writing. The proceedings of 2004 International Conference and Workshop on TEFL & Applied Linguistics (pp. 550-564). Taipei: Ming Chuan University.
    Yeh, Y., Liou, H. C. & Yu, Y. T. (2007). The influence of automated essay evaluation and bilingual concordancing on EFL students’ writing. English Teaching & Learning, 31 (1), 117-160.
