
Author: 林柏陞
Lin, Po-Sheng
Thesis title: 英文寫作中的人工智能:四種大型語言模型驅動寫作工具與Grammarly的語法錯誤更正性能和反饋提供的比較分析
Artificial Intelligence in English Writing: A Comparative Analysis of Error Correction Performance and Feedback Provision Across Four LLM-Powered Writing Tools and Grammarly
Advisor: 陳浩然
Chen, Hao-Jan
Oral defense committee: 陳浩然
Chen, Hao-Jan
王宏均
Wang, Hung-Chun
賴淑麗
Lai, Shu-Li
Defense date: 2024/06/14
Degree: Master's
Master
Department: 英語學系
Department of English
Publication year: 2024
Graduation academic year: 112
Language: English
Number of pages: 134
Chinese keywords (translated): large language models, ChatGPT, Gemini, Claude 3, common grammatical errors, grammatical error explanation
英文關鍵詞: Large Language Models, ChatGPT, Gemini, Claude 3, grammatical error, grammatical error explanation
DOI URL: http://doi.org/10.6345/NTNU202401552
Thesis type: Academic thesis
Table of contents:
ACKNOWLEDGEMENTS i
摘要 (Chinese Abstract) ii
ABSTRACT iii
TABLE OF CONTENTS iv
LIST OF TABLES vi
LIST OF FIGURES vii
CHAPTER ONE INTRODUCTION 1
  1.1 Background 1
  1.2 Purpose of the Study 9
  1.3 Research Questions 10
  1.4 Significance of the Study 11
  1.5 Organization of the Thesis 13
CHAPTER TWO LITERATURE REVIEW 14
  2.1 Evolution of Large Language Models 14
  2.2 The Four LLM-powered Writing Tools and Grammarly 17
  2.3 Strengths of Using LLM-powered Writing Tools in the Classroom 25
  2.4 Weaknesses of Using LLM-powered Writing Tools in the Classroom 31
CHAPTER THREE METHODOLOGY 35
  3.1 The Dataset 37
  3.2 Four LLM-powered Writing Tools and Grammarly for Evaluation 41
    3.2.1 ChatGPT-3.5 41
    3.2.2 ChatGPT-4 43
    3.2.3 Gemini 45
    3.2.4 Claude 3 47
    3.2.5 Grammarly Premium 49
  3.3 Testing of Four LLM-powered Writing Tools and Grammarly Premium 51
  3.4 Data Analysis Procedure 53
CHAPTER FOUR RESULTS 66
  4.1 Overall Accuracy Rate of Each Tool 66
  4.2 Inaccurate Corrections 67
  4.3 False Alarms 68
  4.4 Error Detection Performance on Different Error Types 71
  4.5 Detailed Explanations 81
CHAPTER FIVE DISCUSSION 86
  5.1 Summary of Findings 86
  5.2 Discussion on Research Findings 88
    5.2.1 Overall Performance 88
    5.2.2 Error Types that the Five Tools Can and Cannot Identify 91
    5.2.3 Detailed Explanation Provided by the Four LLM-powered Writing Tools 94
  5.3 Pedagogical Implications 102
  5.4 Limitations of the Study 105
References 107
Appendix A The Dataset 120
Appendix B Detailed Explanations 129

    Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F. L., Almeida, D., Altenschmidt, J., Altman, S., & Anadkat, S. (2023). GPT-4 technical report. arXiv preprint, arXiv:2303.08774. https://doi.org/10.48550/arXiv.2303.08774
    Al-Garaady, J., & Mahyoob, M. (2023). ChatGPT's capabilities in spotting and analyzing writing errors experienced by EFL learners. Arab World English Journal, (9), 3-17. https://doi.org/10.24093/awej/call9.1
    Allen, L. K., Jacovina, M. E., & McNamara, D. S. (2016). Computer-based writing instruction. In Handbook for Writing Research (2nd ed., pp. 316-329). The Guilford Press.
    Anthropic. (2024). Introducing the next generation of Claude. https://www.anthropic.com/news/claude-3-family
    Barrot, J. S. (2023). Using automated written corrective feedback in the writing classrooms: effects on L2 writing accuracy. Computer Assisted Language Learning, 36(4), 584-607. https://doi.org/10.1080/09588221.2021.1936071
    Baskara, F. R. (2023). Integrating ChatGPT into EFL writing instruction: Benefits and challenges. International Journal of Education and Learning, 5(1), 44-55. https://doi.org/10.31763/ijele.v5i1.858
    Bibi, Z., & Atta, A. (2024). The role of ChatGPT as AI English writing assistant: A study of student’s perceptions, experiences, and satisfaction. Annals of Human and Social Sciences, 5(1), 433-443. https://doi.org/10.35484/ahss.2024(5-I)39
    Bok, E., & Cho, Y. (2023). Examining Korean EFL college students’ experiences and perceptions of using ChatGPT as a writing revision tool. Journal of English Teaching through Movies and Media, 24(4), 15-27. https://doi.org/10.16875/stem.2023.24.4.15
    Borji, A., & Mohammadian, M. (2023). Battle of the wordsmiths: Comparing ChatGPT, GPT-4, Claude, and Bard. SSRN preprint. http://dx.doi.org/10.2139/ssrn.4476855
    Chapelle, C. A., Cotos, E., & Lee, J. (2015). Validity arguments for diagnostic assessment using automated writing evaluation. Language Testing, 32(3), 385-405. https://doi.org/10.1177/0265532214565386
    Chen, S., Nassaji, H., & Liu, Q. (2016). EFL learners’ perceptions and preferences of written corrective feedback: a case study of university students from Mainland China. Asian-Pacific Journal of Second and Foreign Language Education, 1(1), 5. https://doi.org/10.1186/s40862-016-0010-y
    Chen, X., Ye, J., Zu, C., Xu, N., Zheng, R., Peng, M., Zhou, J., Gui, T., Zhang, Q., & Huang, X. (2023). How robust is GPT-3.5 to predecessors? A comprehensive study on language understanding tasks. arXiv preprint arXiv:2303.00293. https://doi.org/10.48550/arXiv.2303.00293
    Coyne, S., Sakaguchi, K., Galván-Sosa, D., Zock, M., & Inui, K. (2023). Analyzing the performance of GPT-3.5 and GPT-4 in grammatical error correction. arXiv preprint arXiv:2303.14342. https://doi.org/10.48550/arXiv.2303.14342
    Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. https://doi.org/10.48550/arXiv.1810.04805
    Dikli, S., & Bleyle, S. (2014). Automated Essay Scoring feedback for second language writers: How does it compare to instructor feedback? Assessing Writing, 22, 1-17. https://doi.org/10.1016/j.asw.2014.03.006
    Dodigovic, M., & Tovmasyan, A. (2021). Automated writing evaluation: The accuracy of Grammarly’s feedback on form. International Journal of TESOL Studies, 3(2), 71-87. https://doi.org/10.46451/ijts.2021.06.06
    Hugging Face. (2024). Chatbot Arena leaderboard. Retrieved May 28, 2024, from https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard
    Fang, T., Yang, S., Lan, K., Wong, D. F., Hu, J., Chao, L. S., & Zhang, Y. (2023). Is ChatGPT a highly fluent grammatical error correction system? A comprehensive evaluation. arXiv preprint arXiv:2304.01746. https://doi.org/10.48550/arXiv.2304.01746
    Ferris, D. (2006). Does error feedback help student writers? New evidence on the short- and long-term effects of written error correction. In F. Hyland & K. Hyland (Eds.), Feedback in Second Language Writing: Contexts and Issues (pp. 81-104). Cambridge University Press. https://doi.org/10.1017/CBO9781139524742.007
    Ferris, D., & Kurzer, K. (2019). Does error feedback help L2 writers?: Latest evidence on the efficacy of written corrective feedback. In F. Hyland & K. Hyland (Eds.), Feedback in Second Language Writing: Contexts and Issues (2 ed., pp. 106-124). Cambridge University Press. https://doi.org/10.1017/9781108635547.008
    Fitikides, T. J. (2000). Common mistakes in English: With exercises. Longman.
    Fitria, T. N. (2021). Grammarly as AI-powered English writing assistant: Students’ alternative for writing English. Metathesis: Journal of English Language, Literature, and Teaching, 5(1), 65-78. http://dx.doi.org/10.31002/metathesis.v5i1.3519
    Ghufron, M. A., & Rosyida, F. (2018). The role of Grammarly in assessing English as a Foreign Language (EFL) writing. Lingua Cultura, 12(4), 395-403. https://doi.org/10.21512/lc.v12i4.4582
    Guo, K., & Wang, D. (2024). To resist it or to embrace it? Examining ChatGPT’s potential to support teacher feedback in EFL writing. Education and Information Technologies, 29(7), 8435–8463. https://doi.org/10.1007/s10639-023-12146-0
    Guo, Q., Feng, R., & Hua, Y. (2022). How effectively can EFL students use automated written corrective feedback (AWCF) in research writing? Computer Assisted Language Learning, 35(9), 2312-2331. https://doi.org/10.1080/09588221.2021.1879161
    Han, J., Yoo, H., Kim, Y., Myung, J., Kim, M., Lim, H., Kim, J., Lee, T. Y., Hong, H., & Ahn, S.-Y. (2023). RECIPE: How to integrate ChatGPT into EFL writing education. arXiv preprint arXiv:2305.11583. https://doi.org/10.48550/arXiv.2305.11583
    Hartshorn, K. J., & Evans, N. W. (2015). The effects of dynamic written corrective feedback: A 30-week study. Journal of Response to Writing, 1(2), 2. https://scholarsarchive.byu.edu/journalrw/vol1/iss2/2
    Hoffmann, J., Borgeaud, S., Mensch, A., Buchatskaya, E., Cai, T., Rutherford, E., Casas, D. d. L., Hendricks, L. A., Welbl, J., & Clark, A. (2022). Training compute-optimal large language models. arXiv preprint arXiv:2203.15556. https://doi.org/10.48550/arXiv.2203.15556
    Huang, H.-W., Li, Z., & Taylor, L. (2020). The effectiveness of using Grammarly to improve students' writing skills. Proceedings of the 5th International Conference on Distance Education and Learning, 122-127. https://doi.org/10.1145/3402569.3402594
    Hyland, K., & Hyland, F. (2019). Contexts and issues in feedback on L2 writing. In F. Hyland & K. Hyland (Eds.), Feedback in second language writing: Contexts and issues (2 ed., pp. 1-22). Cambridge University Press. https://doi.org/10.1017/9781108635547.003
    John, P., & Woll, N. (2020). Using grammar checkers in an ESL context: An Investigation of automatic corrective feedback. CALICO Journal, 37(2), 169-192. https://doi.org/10.1558/cj.36523
    Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling laws for neural language models. arXiv preprint arXiv:2001.08361. https://doi.org/10.48550/arXiv.2001.08361
    Karim, K., & Nassaji, H. (2019). The effects of written corrective feedback: A critical synthesis of past and present research. Instructed Second Language Acquisition, 3(1), 28-52. https://doi.org/10.1558/isla.37949
    Khoshnevisan, B. (2019). The affordances and constraints of automatic writing evaluation (AWE) tools: A case for Grammarly. ARTESOL EFL Journal, 2(2), 12-25.
    Kim, Y., Choi, B., Kang, S., Kim, B., & Yun, H. (2020). Comparing the effects of direct and indirect synchronous written corrective feedback: Learning outcomes and students' perceptions. Foreign Language Annals, 53(1), 176-199. https://doi.org/10.1111/flan.12443
    Link, S., Mehrzad, M., & Rahimi, M. (2022). Impact of automated writing evaluation on teacher feedback, student revision, and writing improvement. Computer Assisted Language Learning, 35(4), 605-634. https://doi.org/10.1080/09588221.2020.1743323
    Mahapatra, S. (2024). Impact of ChatGPT on ESL students’ academic writing skills: A mixed methods intervention study. Smart Learning Environments, 11(1), 9. https://doi.org/10.1186/s40561-024-00295-9
    Meyer, J. G., Urbanowicz, R. J., Martin, P. C., O’Connor, K., Li, R., Peng, P.-C., Bright, T. J., Tatonetti, N., Won, K. J., & Gonzalez-Hernandez, G. (2023). ChatGPT and large language models in academia: Opportunities and challenges. BioData Mining, 16(1), 20. https://doi.org/10.1186/s13040-023-00339-9
    Nguyen Thi Thu, H. (2023). EFL teachers’ perspectives toward the use of ChatGPT in writing classes: A case study at Van Lang University. International Journal of Language Instruction, 2(3), 1-47. https://doi.org/10.54855/ijli.23231
    Ningrum, S. (2023). ChatGPT’s impact: The AI revolution in EFL writing. Borneo Engineering & Advanced Multidisciplinary International Journal, 2(Special Issue (TECHON 2023)), 32-37. https://beam.pmu.edu.my/index.php/beam/article/view/109
    O'Neill, R., & Russell, A. M. (2019). Grammarly: Help or hindrance? Academic learning advisors’ perceptions of an online grammar checker. Journal of Academic Language and Learning, 13(1), A88-A107. https://journal.aall.org.au/index.php/jall/article/view/591
    O'Neill, R., & Russell, A. (2019). Stop! Grammar time: University students’ perceptions of the automated feedback program Grammarly. Australasian Journal of Educational Technology, 35(1), 42-56. https://doi.org/10.14742/ajet.3795
    OpenAI. (2023). GPT. https://platform.openai.com/docs/guides/gpt
    Pfau, A., Polio, C., & Xu, Y. (2023). Exploring the potential of ChatGPT in assessing L2 writing accuracy for research purposes. Research Methods in Applied Linguistics, 2(3), 100083. https://doi.org/10.1016/j.rmal.2023.100083
    Ranalli, J. (2018). Automated written corrective feedback: how well can students make use of it? Computer Assisted Language Learning, 31(7), 653-674. https://doi.org/10.1080/09588221.2018.1428994
    Rasul, T., Nair, S., Kalendra, D., Robin, M., de Oliveira Santini, F., Ladeira, W. J., Sun, M., Day, I., Rather, R. A., & Heathcote, L. (2023). The role of ChatGPT in higher education: Benefits, challenges, and future research directions. Journal of Applied Learning and Teaching, 6(1), 41-56. https://doi.org/10.37074/jalt.2023.6.1.29
    Rudolph, J., Tan, S., & Tan, S. (2023). ChatGPT: Bullshit spewer or the end of traditional assessments in higher education? Journal of Applied Learning and Teaching, 6(1), 342-363. https://doi.org/10.37074/jalt.2023.6.1.9
    Sahu, S., Vishwakarma, Y. K., Kori, J., & Thakur, J. S. (2020). Evaluating performance of different grammar checking tools. International Journal of Advanced Trends in Computer Science and Engineering, 9(2), 2227-2233. https://doi.org/10.30534/ijatcse/2020/201922020
    Schmidt-Fajlik, R. (2023). ChatGPT as a grammar checker for Japanese English language learners: A comparison with Grammarly and ProWritingAid. AsiaCALL Online Journal, 14(1), 105-119. https://doi.org/10.54855/acoj.231417
    Shanahan, M. (2024). Talking about large language models. Communications of the ACM, 67(2), 68-79. https://doi.org/10.1145/3624724
    Sinha, T. S., & Nassaji, H. (2022). ESL learners’ perception and its relationship with the efficacy of written corrective feedback. International Journal of Applied Linguistics, 32(1), 41-56. https://doi.org/10.1111/ijal.12378
    Song, C., & Song, Y. (2023). Enhancing academic writing skills and motivation: Assessing the efficacy of ChatGPT in AI-assisted language learning for EFL students. Frontiers in Psychology, 14, 1260843. https://doi.org/10.3389/fpsyg.2023.1260843
    Steiss, J., Tate, T., Graham, S., Cruz, J., Hebert, M., Wang, J., Moon, Y., Tseng, W., Warschauer, M., & Olson, C. B. (2024). Comparing the quality of human and ChatGPT feedback of students’ writing. Learning and Instruction, 91, 101894. https://doi.org/10.1016/j.learninstruc.2024.101894
    Su, Y., Lin, Y., & Lai, C. (2023). Collaborating with ChatGPT in argumentative writing classrooms. Assessing Writing, 57, 100752. https://doi.org/10.1016/j.asw.2023.100752
    Pichai, S., & Hassabis, D. (2023, June 2). Introducing Gemini: Our largest and most capable AI model. Google Blog. https://blog.google/technology/ai/google-gemini-ai/#sundar-note
    Truscott, J. (2007). The effect of error correction on learners’ ability to write accurately. Journal of Second Language Writing, 16(4), 255-272. https://doi.org/10.1016/j.jslw.2007.06.003
    Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. arXiv preprint, arXiv:1706.03762. https://arxiv.org/abs/1706.03762
    Wilson, J., & Czik, A. (2016). Automated essay evaluation software in English Language Arts classrooms: Effects on teacher feedback, student motivation, and writing quality. Computers & Education, 100, 94-109. https://doi.org/10.1016/j.compedu.2016.05.004
    Wirantaka, A. (2022). Effective written corrective feedback on EFL students’ academic writing. Jurnal Pendidikan Bahasa Asing Dan Sastra, 6(2), 387-399. https://doi.org/10.26858/eralingua.v6i2.34996
    Wu, H., Wang, W., Wan, Y., Jiao, W., & Lyu, M. R. (2023). ChatGPT or Grammarly? Evaluating ChatGPT on grammatical error correction benchmark. arXiv preprint, arXiv:2303.13648. https://doi.org/10.48550/arXiv.2303.13648
    Xu, L., & Zhang, T. (2023). Engaging with multiple sources of feedback in academic writing: postgraduate students’ perspectives. Assessment & Evaluation in Higher Education, 48(7), 995-1008. https://doi.org/10.1080/02602938.2022.2161089
    Yan, D. (2023). Impact of ChatGPT on learners in a L2 writing practicum: An exploratory investigation. Education and Information Technologies, 28, 13943-13967. https://doi.org/10.1007/s10639-023-11742-4
    Younis, H. A., Alyasiri, O. M., Muthmainnah, Sahib, T. M., Akhtom, D. A., Hayder, I. M., Salisu, S., & Shahid, M. (2023). ChatGPT evaluation: Can it replace Grammarly and Quillbot tools? British Journal of Applied Linguistics, 3(2), 34-46. https://doi.org/10.32996/bjal.2023.3.2.4
    Zhang, J., Zorluel Özer, H., & Bayazeed, R. (2020). Grammarly vs. face-to-face tutoring at the writing center: ESL student writers' perceptions. Praxis: A Writing Center Journal, 17(2), 33-47. http://dx.doi.org/10.26153/tsw/8523
    Zhao, W. X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., Min, Y., Zhang, B., Zhang, J., & Dong, Z. (2023). A survey of large language models. arXiv preprint arXiv:2303.18223. https://doi.org/10.48550/arXiv.2303.18223
    Zhou, J.-L. (2022). A study on the comparison and accuracy evaluation of grammar auto-detection tools (Master's thesis). National Taiwan Normal University. Taiwan Dissertation and Thesis Knowledge Value-Added System. https://hdl.handle.net/11296/k24npd

    Full text availability: electronic full text embargoed until 2025/09/01.