A Corpus-Based Multidimensional Analysis of Linguistic Features between Human-Authored and ChatGPT-Generated Compositions
Abstract
This study presents a corpus-based multidimensional comparative analysis of linguistic features in human-authored and ChatGPT-generated English compositions, with a focus on four core dimensions: lexical difficulty, syntactic complexity, textual cohesion, and error patterns. A total of 120 compositions were analyzed, 60 produced by ChatGPT-4 and 60 authored by Chinese L2 English learners drawn from the Ten-thousand English Compositions of Chinese Learners corpus, equally distributed across three educational levels: primary, secondary, and tertiary. Quantitative analyses indicate that human-authored compositions exhibit a progressive increase in lexical complexity aligned with educational advancement, while ChatGPT-generated texts demonstrate limited differentiation between primary and secondary levels, followed by a sharp rise in lexical complexity at the tertiary level. This pattern suggests an algorithmic reliance on generalized discourse rather than sensitivity to developmental variation. In terms of syntactic complexity, ChatGPT consistently produces structurally uniform texts with heavy use of subordinate clauses and logical subordination, whereas human writing displays greater contextual flexibility, albeit with occasional simplification. Regarding textual cohesion, ChatGPT-generated compositions, particularly at the tertiary level, rely heavily on overt logical connectors and referential markers, resulting in structurally coherent but stylistically formulaic discourse. In contrast, human-authored texts, while sometimes lacking explicit cohesion markers, employ more nuanced devices such as collocations and implicit semantic links. Error analysis reveals a near-absence of grammatical, lexical, and orthographic errors in ChatGPT outputs, contrasting with the relatively high error frequency in human compositions, especially at lower proficiency levels. These findings highlight ChatGPT’s strengths in producing grammatically accurate and syntactically complex texts, yet also underscore its limitations in mimicking authentic learner development and stylistic variability. The study concludes that while generative AI can serve as an effective auxiliary tool in L2 writing instruction, its pedagogical integration should be carefully calibrated to avoid undermining learners’ development of rhetorical sensitivity, authorial voice, and context-appropriate expression.
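The abstract does not specify the study's actual analytic instruments, so the following is a minimal illustrative sketch only: it assumes rough proxy measures (type-token ratio, sentence length, subordinator and connective counts) and hypothetical helpers `proxy_metrics` and `compare_groups` to show the kind of per-composition metrics a comparison across lexical diversity, syntactic complexity, and explicit cohesion marking might compute.

```python
import re
from statistics import mean

# Hypothetical, illustrative marker lists; not taken from the study.
SUBORDINATORS = {"because", "although", "which", "that", "when", "while", "if"}
CONNECTIVES = {"however", "therefore", "moreover", "furthermore", "thus", "consequently"}

def proxy_metrics(text):
    """Rough per-composition proxies for three of the four dimensions."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens or not sentences:
        return {}
    return {
        # Lexical diversity proxy (type-token ratio).
        "type_token_ratio": len(set(tokens)) / len(tokens),
        # Syntactic complexity proxies.
        "mean_sentence_length": len(tokens) / len(sentences),
        "subordinators_per_sentence": sum(t in SUBORDINATORS for t in tokens) / len(sentences),
        # Explicit cohesion proxy.
        "connectives_per_100_words": 100 * sum(t in CONNECTIVES for t in tokens) / len(tokens),
    }

def compare_groups(human_texts, chatgpt_texts):
    """Average each proxy over the compositions in each group."""
    def group_mean(texts):
        rows = [m for m in (proxy_metrics(t) for t in texts) if m]
        return {k: mean(r[k] for r in rows) for k in rows[0]}
    return {"human": group_mean(human_texts), "chatgpt": group_mean(chatgpt_texts)}
```

In practice, a study of this kind would rely on validated lexical-sophistication, syntactic-complexity, and cohesion indices rather than these rough proxies; the sketch is only meant to make the four dimensions concrete.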
Article information
Journal: International Journal of Linguistics, Literature and Translation
Volume (Issue): 8 (5)
Pages: 102-110
Published: 2025
Copyright (c) 2025 JinLiang Wu
Open access

This work is licensed under a Creative Commons Attribution 4.0 International License.