
Author: 林柏陞
Lin, Po-Sheng
Thesis title: 英文寫作中的人工智能:四種大型語言模型驅動寫作工具與Grammarly的語法錯誤更正性能和反饋提供的比較分析
Artificial Intelligence in English Writing: A Comparative Analysis of Error Correction Performance and Feedback Provision Across Four LLM-Powered Writing Tools and Grammarly
Advisor: 陳浩然
Chen, Hao-Jan
Oral defense committee: 陳浩然
Chen, Hao-Jan
王宏均
Wang, Hung-Chun
賴淑麗
Lai, Shu-Li
Defense date: 2024/06/14
Degree: Master's
Master
Department: 英語學系
Department of English
Publication year: 2024
Graduation academic year: 112
Language: English
Number of pages: 134
Chinese keywords (translated): large language models, ChatGPT, Gemini, Claude 3, common grammatical errors, grammatical error explanation
英文關鍵詞: Large Language Models, ChatGPT, Gemini, Claude 3, grammatical error, grammatical error explanation
DOI URL: http://doi.org/10.6345/NTNU202401552
Thesis type: Academic thesis
Table of contents:
ACKNOWLEDGEMENTS i
摘要 (Chinese Abstract) ii
ABSTRACT iii
TABLE OF CONTENTS iv
LIST OF TABLES vi
LIST OF FIGURES vii
CHAPTER ONE INTRODUCTION 1
  1.1 Background 1
  1.2 Purpose of the Study 9
  1.3 Research Questions 10
  1.4 Significance of the Study 11
  1.5 Organization of the Thesis 13
CHAPTER TWO LITERATURE REVIEW 14
  2.1 Evolution of Large Language Models 14
  2.2 The Four LLM-powered Writing Tools and Grammarly 17
  2.3 Strengths of Using LLM-powered Writing Tools in the Classroom 25
  2.4 Weaknesses of Using LLM-powered Writing Tools in the Classroom 31
CHAPTER THREE METHODOLOGY 35
  3.1 The Dataset 37
  3.2 Four LLM-powered Writing Tools and Grammarly for Evaluation 41
    3.2.1 ChatGPT-3.5 41
    3.2.2 ChatGPT-4 43
    3.2.3 Gemini 45
    3.2.4 Claude 3 47
    3.2.5 Grammarly Premium 49
  3.3 Testing of Four LLM-powered Writing Tools and Grammarly Premium 51
  3.4 Data Analysis Procedure 53
CHAPTER FOUR RESULTS 66
  4.1 Overall Accuracy Rate of Each Tool 66
  4.2 Inaccurate Corrections 67
  4.3 False Alarms 68
  4.4 Error Detection Performance on Different Error Types 71
  4.5 Detailed Explanations 81
CHAPTER FIVE DISCUSSION 86
  5.1 Summary of Findings 86
  5.2 Discussion on Research Findings 88
    5.2.1 Overall Performance 88
    5.2.2 Error Types that the Five Tools Can and Cannot Identify 91
    5.2.3 Detailed Explanation Provided by the Four LLM-powered Writing Tools 94
  5.3 Pedagogical Implications 102
  5.4 Limitations of the Study 105
References 107
Appendix A The Dataset 120
Appendix B Detailed Explanations 129

    Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F. L., Almeida, D., Altenschmidt, J., Altman, S., & Anadkat, S. (2023). GPT-4 technical report. arXiv preprint, arXiv:2303.08774. https://doi.org/10.48550/arXiv.2303.08774
    Al-Garaady, J., & Mahyoob, M. (2023). ChatGPT's capabilities in spotting and analyzing writing errors experienced by EFL learners. Arab World English Journal, (9), 3-17. https://doi.org/10.24093/awej/call9.1
    Allen, L. K., Jacovina, M. E., & McNamara, D. S. (2016). Computer-based writing instruction. In Handbook for Writing Research (2nd ed., pp. 316-329). The Guilford Press.
    Anthropic. (2024). Introducing the next generation of Claude. https://www.anthropic.com/news/claude-3-family
    Barrot, J. S. (2023). Using automated written corrective feedback in the writing classrooms: effects on L2 writing accuracy. Computer Assisted Language Learning, 36(4), 584-607. https://doi.org/10.1080/09588221.2021.1936071
    Baskara, F. R. (2023). Integrating ChatGPT into EFL writing instruction: Benefits and challenges. International Journal of Education and Learning, 5(1), 44-55. https://doi.org/10.31763/ijele.v5i1.858
    Bibi, Z., & Atta, A. (2024). The role of ChatGPT as AI English writing assistant: A study of student’s perceptions, experiences, and satisfaction. Annals of Human and Social Sciences, 5(1), 433-443. https://doi.org/10.35484/ahss.2024(5-I)39
    Bok, E., & Cho, Y. (2023). Examining Korean EFL college students’ experiences and perceptions of using ChatGPT as a writing revision tool. Journal of English Teaching through Movies and Media, 24(4), 15-27. https://doi.org/10.16875/stem.2023.24.4.15
    Borji, A., & Mohammadian, M. (2023). Battle of the wordsmiths: Comparing ChatGPT, GPT-4, Claude, and Bard. SSRN preprint. http://dx.doi.org/10.2139/ssrn.4476855
    Chapelle, C. A., Cotos, E., & Lee, J. (2015). Validity arguments for diagnostic assessment using automated writing evaluation. Language Testing, 32(3), 385-405. https://doi.org/10.1177/0265532214565386
    Chen, S., Nassaji, H., & Liu, Q. (2016). EFL learners’ perceptions and preferences of written corrective feedback: a case study of university students from Mainland China. Asian-Pacific Journal of Second and Foreign Language Education, 1(1), 5. https://doi.org/10.1186/s40862-016-0010-y
    Chen, X., Ye, J., Zu, C., Xu, N., Zheng, R., Peng, M., Zhou, J., Gui, T., Zhang, Q., & Huang, X. (2023). How robust is GPT-3.5 to predecessors? A comprehensive study on language understanding tasks. arXiv preprint arXiv:2303.00293. https://doi.org/10.48550/arXiv.2303.00293
    Coyne, S., Sakaguchi, K., Galván-Sosa, D., Zock, M., & Inui, K. (2023). Analyzing the performance of GPT-3.5 and GPT-4 in grammatical error correction. arXiv preprint arXiv:2303.14342. https://doi.org/10.48550/arXiv.2303.14342
    Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. https://doi.org/10.48550/arXiv.1810.04805
    Dikli, S., & Bleyle, S. (2014). Automated Essay Scoring feedback for second language writers: How does it compare to instructor feedback? Assessing Writing, 22, 1-17. https://doi.org/10.1016/j.asw.2014.03.006
    Dodigovic, M., & Tovmasyan, A. (2021). Automated writing evaluation: The accuracy of Grammarly’s feedback on form. International Journal of TESOL Studies, 3(2), 71-87. https://doi.org/10.46451/ijts.2021.06.06
    Hugging Face. (2024). Chatbot Arena leaderboard. Retrieved May 28, 2024, from https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard
    Fang, T., Yang, S., Lan, K., Wong, D. F., Hu, J., Chao, L. S., & Zhang, Y. (2023). Is ChatGPT a highly fluent grammatical error correction system? A comprehensive evaluation. arXiv preprint arXiv:2304.01746. https://doi.org/10.48550/arXiv.2304.01746
    Ferris, D. (2006). Does error feedback help student writers? New evidence on the short- and long-term effects of written error correction. In F. Hyland & K. Hyland (Eds.), Feedback in Second Language Writing: Contexts and Issues (pp. 81-104). Cambridge University Press. https://doi.org/10.1017/CBO9781139524742.007
    Ferris, D., & Kurzer, K. (2019). Does error feedback help L2 writers?: Latest evidence on the efficacy of written corrective feedback. In F. Hyland & K. Hyland (Eds.), Feedback in Second Language Writing: Contexts and Issues (2 ed., pp. 106-124). Cambridge University Press. https://doi.org/10.1017/9781108635547.008
    Fitikides, T. J. (2000). Common mistakes in English: With exercises. Longman.
    Fitria, T. N. (2021). Grammarly as AI-powered English writing assistant: Students’ alternative for writing English. Metathesis: Journal of English Language, Literature, and Teaching, 5(1), 65-78. http://dx.doi.org/10.31002/metathesis.v5i1.3519
    Ghufron, M. A., & Rosyida, F. (2018). The role of Grammarly in assessing English as a Foreign Language (EFL) writing. Lingua Cultura, 12(4), 395-403. https://doi.org/10.21512/lc.v12i4.4582
    Guo, K., & Wang, D. (2024). To resist it or to embrace it? Examining ChatGPT’s potential to support teacher feedback in EFL writing. Education and Information Technologies, 29(7), 8435–8463. https://doi.org/10.1007/s10639-023-12146-0
    Guo, Q., Feng, R., & Hua, Y. (2022). How effectively can EFL students use automated written corrective feedback (AWCF) in research writing? Computer Assisted Language Learning, 35(9), 2312-2331. https://doi.org/10.1080/09588221.2021.1879161
    Han, J., Yoo, H., Kim, Y., Myung, J., Kim, M., Lim, H., Kim, J., Lee, T. Y., Hong, H., & Ahn, S.-Y. (2023). RECIPE: How to integrate ChatGPT into EFL writing education. arXiv preprint arXiv:2305.11583. https://doi.org/10.48550/arXiv.2305.11583
    Hartshorn, K. J., & Evans, N. W. (2015). The effects of dynamic written corrective feedback: A 30-week study. Journal of Response to Writing, 1(2), 2. https://scholarsarchive.byu.edu/journalrw/vol1/iss2/2
    Hoffmann, J., Borgeaud, S., Mensch, A., Buchatskaya, E., Cai, T., Rutherford, E., Casas, D. d. L., Hendricks, L. A., Welbl, J., & Clark, A. (2022). Training compute-optimal large language models. arXiv preprint arXiv:2203.15556. https://doi.org/10.48550/arXiv.2203.15556
    Huang, H.-W., Li, Z., & Taylor, L. (2020). The effectiveness of using Grammarly to improve students' writing skills. Proceedings of the 5th International Conference on Distance Education and Learning, 122-127. https://doi.org/10.1145/3402569.3402594
    Hyland, K., & Hyland, F. (2019). Contexts and issues in feedback on L2 writing. In F. Hyland & K. Hyland (Eds.), Feedback in second language writing: Contexts and issues (2 ed., pp. 1-22). Cambridge University Press. https://doi.org/10.1017/9781108635547.003
    John, P., & Woll, N. (2020). Using grammar checkers in an ESL context: An Investigation of automatic corrective feedback. CALICO Journal, 37(2), 169-192. https://doi.org/10.1558/cj.36523
    Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. arXiv preprint arXiv:2001.08361. https://doi.org/10.48550/arXiv.2001.08361
    Karim, K., & Nassaji, H. (2019). The effects of written corrective feedback: A critical synthesis of past and present research. Instructed Second Language Acquisition, 3(1), 28-52. https://doi.org/10.1558/isla.37949
    Khoshnevisan, B. (2019). The affordances and constraints of automatic writing evaluation (AWE) tools: A case for Grammarly. ARTESOL EFL Journal, 2(2), 12-25.
    Kim, Y., Choi, B., Kang, S., Kim, B., & Yun, H. (2020). Comparing the effects of direct and indirect synchronous written corrective feedback: Learning outcomes and students' perceptions. Foreign Language Annals, 53(1), 176-199. https://doi.org/10.1111/flan.12443
    Link, S., Mehrzad, M., & Rahimi, M. (2022). Impact of automated writing evaluation on teacher feedback, student revision, and writing improvement. Computer Assisted Language Learning, 35(4), 605-634. https://doi.org/10.1080/09588221.2020.1743323
    Mahapatra, S. (2024). Impact of ChatGPT on ESL students’ academic writing skills: A mixed methods intervention study. Smart Learning Environments, 11(1), 9. https://doi.org/10.1186/s40561-024-00295-9
    Meyer, J. G., Urbanowicz, R. J., Martin, P. C., O’Connor, K., Li, R., Peng, P.-C., Bright, T. J., Tatonetti, N., Won, K. J., & Gonzalez-Hernandez, G. (2023). ChatGPT and large language models in academia: Opportunities and challenges. BioData Mining, 16(1), 20. https://doi.org/10.1186/s13040-023-00339-9
    Nguyen Thi Thu, H. (2023). EFL teachers’ perspectives toward the use of ChatGPT in writing classes: A case study at Van Lang University. International Journal of Language Instruction, 2(3), 1-47. https://doi.org/10.54855/ijli.23231
    Ningrum, S. (2023). ChatGPT’s impact: The AI revolution in EFL writing. Borneo Engineering & Advanced Multidisciplinary International Journal, 2(Special Issue (TECHON 2023)), 32-37. https://beam.pmu.edu.my/index.php/beam/article/view/109
    O'Neill, R., & Russell, A. M. (2019). Grammarly: Help or hindrance? Academic learning advisors’ perceptions of an online grammar checker. Journal of Academic Language and Learning, 13(1), A88-A107. https://journal.aall.org.au/index.php/jall/article/view/591
    O'Neill, R., & Russell, A. (2019). Stop! Grammar time: University students’ perceptions of the automated feedback program Grammarly. Australasian Journal of Educational Technology, 35(1), 42-56. https://doi.org/10.14742/ajet.3795
    OpenAI. (2023). GPT. https://platform.openai.com/docs/guides/gpt
    Pfau, A., Polio, C., & Xu, Y. (2023). Exploring the potential of ChatGPT in assessing L2 writing accuracy for research purposes. Research Methods in Applied Linguistics, 2(3), 100083. https://doi.org/10.1016/j.rmal.2023.100083
    Ranalli, J. (2018). Automated written corrective feedback: how well can students make use of it? Computer Assisted Language Learning, 31(7), 653-674. https://doi.org/10.1080/09588221.2018.1428994
    Rasul, T., Nair, S., Kalendra, D., Robin, M., de Oliveira Santini, F., Ladeira, W. J., Sun, M., Day, I., Rather, R. A., & Heathcote, L. (2023). The role of ChatGPT in higher education: Benefits, challenges, and future research directions. Journal of Applied Learning and Teaching, 6(1), 41-56. https://doi.org/10.37074/jalt.2023.6.1.29
    Rudolph, J., Tan, S., & Tan, S. (2023). ChatGPT: Bullshit spewer or the end of traditional assessments in higher education? Journal of Applied Learning and Teaching, 6(1), 342-363. https://doi.org/10.37074/jalt.2023.6.1.9
    Sahu, S., Vishwakarma, Y. K., Kori, J., & Thakur, J. S. (2020). Evaluating performance of different grammar checking tools. International Journal of Advanced Trends in Computer Science and Engineering, 9(2), 2227-2233. https://doi.org/10.30534/ijatcse/2020/201922020
    Schmidt-Fajlik, R. (2023). ChatGPT as a grammar checker for Japanese English language learners: A comparison with Grammarly and ProWritingAid. AsiaCALL Online Journal, 14(1), 105-119. https://doi.org/10.54855/acoj.231417
    Shanahan, M. (2024). Talking about large language models. Communications of the ACM, 67(2), 68-79. https://doi.org/10.1145/3624724
    Sinha, T. S., & Nassaji, H. (2022). ESL learners’ perception and its relationship with the efficacy of written corrective feedback. International Journal of Applied Linguistics, 32(1), 41-56. https://doi.org/10.1111/ijal.12378
    Song, C., & Song, Y. (2023). Enhancing academic writing skills and motivation: Assessing the efficacy of ChatGPT in AI-assisted language learning for EFL students. Frontiers in Psychology, 14, 1260843. https://doi.org/10.3389/fpsyg.2023.1260843
    Steiss, J., Tate, T., Graham, S., Cruz, J., Hebert, M., Wang, J., Moon, Y., Tseng, W., Warschauer, M., & Olson, C. B. (2024). Comparing the quality of human and ChatGPT feedback of students’ writing. Learning and Instruction, 91, 101894. https://doi.org/10.1016/j.learninstruc.2024.101894
    Su, Y., Lin, Y., & Lai, C. (2023). Collaborating with ChatGPT in argumentative writing classrooms. Assessing Writing, 57, 100752. https://doi.org/10.1016/j.asw.2023.100752
    Pichai, S., & Hassabis, D. (2023, June 2). Introducing Gemini: Our largest and most capable AI model. Google Blog. https://blog.google/technology/ai/google-gemini-ai/#sundar-note
    Truscott, J. (2007). The effect of error correction on learners’ ability to write accurately. Journal of Second Language Writing, 16(4), 255-272. https://doi.org/10.1016/j.jslw.2007.06.003
    Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. arXiv preprint, arXiv:1706.03762. https://arxiv.org/abs/1706.03762
    Wilson, J., & Czik, A. (2016). Automated essay evaluation software in English Language Arts classrooms: Effects on teacher feedback, student motivation, and writing quality. Computers & Education, 100, 94-109. https://doi.org/10.1016/j.compedu.2016.05.004
    Wirantaka, A. (2022). Effective written corrective feedback on EFL students’ academic writing. Jurnal Pendidikan Bahasa Asing Dan Sastra, 6(2), 387-399. https://doi.org/10.26858/eralingua.v6i2.34996
    Wu, H., Wang, W., Wan, Y., Jiao, W., & Lyu, M. R. (2023). ChatGPT or Grammarly? Evaluating ChatGPT on grammatical error correction benchmark. arXiv preprint, arXiv:2303.13648. https://doi.org/10.48550/arXiv.2303.13648
    Xu, L., & Zhang, T. (2023). Engaging with multiple sources of feedback in academic writing: postgraduate students’ perspectives. Assessment & Evaluation in Higher Education, 48(7), 995-1008. https://doi.org/10.1080/02602938.2022.2161089
    Yan, D. (2023). Impact of ChatGPT on learners in a L2 writing practicum: An exploratory investigation. Education and Information Technologies, 28, 13943-13967. https://doi.org/10.1007/s10639-023-11742-4
    Younis, H. A., Alyasiri, O. M., Muthmainnah, Sahib, T. M., Akhtom, D. A., Hayder, I. M., Salisu, S., & Shahid, M. (2023). ChatGPT evaluation: Can it replace Grammarly and Quillbot tools? British Journal of Applied Linguistics, 3(2), 34-46. https://doi.org/10.32996/bjal.2023.3.2.4
    Zhang, J., Zorluel Özer, H., & Bayazeed, R. (2020). Grammarly vs. face-to-face tutoring at the writing center: ESL student writers' perceptions. Praxis: A Writing Center Journal, 17(2), 33-47. http://dx.doi.org/10.26153/tsw/8523
    Zhao, W. X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., Min, Y., Zhang, B., Zhang, J., & Dong, Z. (2023). A survey of large language models. arXiv preprint arXiv:2303.18223. https://doi.org/10.48550/arXiv.2303.18223
    Zhou, J.-L. (2022). A study on the comparison and accuracy evaluation of grammar auto-detection tools (Master's thesis). National Taiwan Normal University. Taiwan Dissertation and Thesis Knowledge Value-Added System. https://hdl.handle.net/11296/k24npd

    Full text availability: electronic full text embargoed until 2025/09/01.