A Preliminary Comparative Study on the Diagnostic Accuracy of Machine Learning AI Systems in Medical Diagnosis

Authors : Prashant Kumar Jha

DOI : 10.62502/ijmi/v3i1art5

Volume : 3

Issue : 1

Year : 2026

Page No : 21-24

Background: Artificial intelligence (AI) and machine learning (ML) systems are increasingly being explored in medical imaging to support radiological diagnosis.

Aim: To perform a preliminary comparative assessment of the diagnostic accuracy of an ML-based ChatGPT reporting system against manual radiologist interpretation in general radiography.

Materials and Methods: A prospective study was conducted on 30 radiographic examinations (n = 30), comprising chest X-rays (PA view) and spine (AP and lateral), upper extremity, and lower extremity radiographs, acquired on an X-Tech 500 mA X-ray machine between 2 January 2026 and 16 January 2026. Each image was first reported by a radiologist and then independently analyzed by a ChatGPT-based ML system. Both reports were then reviewed by a senior radiologist, whose assessment served as the reference standard.

Results: Manual radiologist interpretation showed higher diagnostic accuracy (96.7%, 29/30) than the ML system (80.0%, 24/30), a statistically significant difference (p < 0.05). The ML system performed better on chest radiographs but showed reduced sensitivity in musculoskeletal imaging. It frequently missed subtle findings such as hairline and non-displaced fractures, for which its sensitivity was significantly lower than that of manual interpretation (33.3% vs 100%, p < 0.01).

Conclusion: The ML-based ChatGPT system demonstrated moderate diagnostic performance but was inferior to manual radiologist interpretation, particularly for subtle skeletal injuries.

Keywords: Machine Learning, Artificial Intelligence, ChatGPT, General Radiography
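
The accuracy figures in the Results follow directly from the reported counts (29/30 and 24/30). The minimal Python sketch below reproduces that arithmetic and, purely as an illustration, adds a Wilson score interval to show how much uncertainty a sample of 30 carries. The interval is an assumption of this sketch: the abstract does not report confidence intervals or name the statistical test behind its p-values, and the function names are hypothetical.

```python
import math


def accuracy_percent(correct: int, total: int) -> float:
    """Diagnostic accuracy as a percentage of correctly reported examinations."""
    return 100.0 * correct / total


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%).

    Illustrative only: the abstract does not report confidence intervals.
    """
    p = correct / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Counts taken from the Results: 29/30 (radiologist) and 24/30 (ML system).
readers = {"radiologist": (29, 30), "ml_system": (24, 30)}

for name, (correct, total) in readers.items():
    low, high = wilson_interval(correct, total)
    print(f"{name}: {accuracy_percent(correct, total):.1f}% "
          f"(approx. 95% interval {100 * low:.1f}% to {100 * high:.1f}%)")
```

Because both readers interpreted the same 30 images, a formal comparison would normally use a paired method, which needs the per-image agreement counts that the abstract does not give. The sketch therefore stops at per-reader accuracy.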

