A MULTI-FEATURE ACCURATE DETECTION (MFAD) APPROACH FOR LARGE LANGUAGE MODEL-GENERATED TEXT
International Journal of Intelligent Computing and Information Sciences
Volume 25, Issue 3, September 2025, Pages 107-122
Document Type: Original Article
DOI: 10.21608/ijicis.2025.410515.1416
Authors
Doaa Ahmed Sayed* 1; Sally Saad Ismail 2; Mostafa Aref 3
1 Computer Science, Faculty of Computer and Information, Ain Shams University
2 Computer Science, Faculty of Computer and Information Science, Ain Shams University, Cairo, Egypt
3 Department of Computer Science, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
Abstract
Advanced Large Language Models (LLMs) generate highly complex text that closely resembles human writing. However, their rapid development raises significant concerns, such as misinformation and academic cheating. As the responsible use of LLMs becomes increasingly important, detecting LLM-generated content has emerged as a critical challenge. Existing detection methods often rely on single-feature analysis, traditional feature extraction techniques, and conventional classification models. Many also require full access to the underlying models and are sensitive to variations in text length, which limits their overall effectiveness. This paper proposes a novel Multi-Feature Accurate Detection (MFAD) approach that identifies LLM-generated text by integrating syntactic and statistical attributes with high-level semantic representations. A case study using the Human ChatGPT Comparison Corpus (HC3) is conducted to evaluate the proposed architecture. MFAD comprises six phases: text preprocessing, syntactic and statistical feature extraction, text representation, semantic feature extraction, feature concatenation, and text classification. Results show that MFAD effectively distinguishes between human-written and LLM-generated text, achieving a peak confidence score of 98%, which highlights its reliability and strong performance.
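The six phases listed in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the features, the text representation, or the classifier, so every concrete choice here is an assumption made for illustration only: the four hand-picked statistics, the hashed bag-of-words vector standing in for a learned representation, the L2 normalisation standing in for a semantic encoder, and the hypothetical `NearestCentroid` classifier. It is not the authors' implementation.

```python
import math
import re
import zlib

DIM = 32  # size of the hashed representation (arbitrary choice for this sketch)

def preprocess(text):
    """Phase 1: lowercase the text and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def stat_features(text):
    """Phase 2: a few simple syntactic/statistical features (assumed set)."""
    words = re.findall(r"[a-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    return [
        sum(len(w) for w in words) / n_words,               # mean word length
        len(set(words)) / n_words,                          # type-token ratio
        n_words / max(len(sentences), 1),                   # mean sentence length in words
        sum(c in ",;:" for c in text) / max(len(text), 1),  # share of , ; : characters
    ]

def represent(text):
    """Phase 3: hashed bag-of-words vector (stand-in for a learned embedding)."""
    vec = [0.0] * DIM
    for w in re.findall(r"[a-z']+", text):
        vec[zlib.crc32(w.encode()) % DIM] += 1.0
    return vec

def semantic_features(vec):
    """Phase 4: L2-normalise the representation (stand-in for a neural encoder)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def featurize(text):
    """Phases 1-5: run the pipeline and concatenate both feature groups."""
    clean = preprocess(text)
    return stat_features(clean) + semantic_features(represent(clean))

class NearestCentroid:
    """Phase 6: toy classifier that assigns the label of the closest class mean."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda l: math.dist(x, self.centroids[l]))
```

In use, one would call `featurize` on each labelled training text, fit the classifier on the resulting vectors, and call `predict` on the vector of an unseen text. A realistic system would replace the stand-ins for phases 3, 4, and 6 with trained models.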
Keywords
Large language models (LLMs); Machine-generated text; AI-generated text; Feature-based detection