AI Is Getting ‘Dumber’—Here’s the Proof!
A study finds that leading AI models show cognitive deficits on a dementia screening test, with patterns likened to posterior cortical atrophy, a variant of Alzheimer’s disease.
AI dominates the present moment and is expected to simplify complex problems such as diagnosis, treatment planning, and healthcare facilities. But what if AI shows signs of cognitive decline, just as humans do?
- A study published in an international medical journal in December 2024 found that AI models are not as flawless as they seem, particularly in the medical field: over time, they show signs of cognitive decline. Both chatbots and large language models (LLMs) were included in the test.
This study comes at a time when AI tools are increasingly relied upon in medical diagnosis. They are often regarded as machines that simplify complex medical terminology.
By evaluating the cognitive abilities of ChatGPT versions 4 and 4o, Claude 3.5 Sonnet (developed by Anthropic), and Gemini versions 1 and 1.5 (developed by Alphabet), researchers reached the conclusion that AI "ages" over time. They did so by administering the Montreal Cognitive Assessment (MoCA) test.
- The results revealed that older model versions scored lower than newer ones, the same pattern of decline seen in aging humans.
What is the MoCA test?
- The MoCA test is used to detect early signs of dementia.
For LLMs, as for humans, it includes questions on attention, memory, language, spatial skills, and executive function.
For humans, a score of 26 out of 30 is regarded as a passing grade, confirming that there is no cognitive impairment.
- The results showed that only ChatGPT 4o achieved the threshold of 26, while ChatGPT 4 and Claude fell just short with 25 points. Gemini 1.0 scored the lowest among the models, with 16 points.
For humans, the attention test requires patients to tap whenever they hear the letter A in a series of letters read aloud by the physician.
Since these models lack auditory and motor functions, researchers provided them with the letters in written form and asked each model to mark the letter "A" with asterisks or by printing out "tap".
- Some of the models required explicit instructions, while others performed the task without further prompting.
Following the MoCA guidelines, researchers set the passing score at 26/30.
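To make the adaptation concrete, here is a minimal Python sketch of how such a text-based attention task might be presented and scored. The prompt wording and the `build_attention_prompt` and `score_attention_response` helpers are illustrative assumptions, not the study's actual materials.

```python
import random
import string

def build_attention_prompt(n_letters: int = 30, seed: int = 0) -> tuple[str, list[int]]:
    """Generate a letter sequence and a prompt asking the model to mark every 'A'.

    Returns the prompt text and the ground-truth positions of 'A'
    (1-based, matching how positions are written in the prompt).
    """
    rng = random.Random(seed)
    letters = [rng.choice(string.ascii_uppercase) for _ in range(n_letters)]
    truth = [i + 1 for i, ch in enumerate(letters) if ch == "A"]
    prompt = (
        "Below is a sequence of letters. Reply with the position of every "
        "letter 'A', separated by commas. This stands in for tapping when "
        "you hear an 'A'.\n\n"
        + " ".join(f"{i + 1}:{ch}" for i, ch in enumerate(letters))
    )
    return prompt, truth

def score_attention_response(response: str, truth: list[int]) -> bool:
    """Pass only if the model flagged exactly the positions that contain 'A'."""
    flagged = {int(tok) for tok in response.replace(",", " ").split() if tok.isdigit()}
    return flagged == set(truth)

# Example usage with a hand-written stand-in for a model's reply.
prompt, truth = build_attention_prompt()
print(prompt)
print(score_attention_response(", ".join(str(p) for p in truth), truth))  # True
```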
Highlighting Poor Skills
The study found that all of the chatbots performed poorly on skills such as the trail making exercise and clock drawing.
In the trail making exercise, one must connect encircled numbers and letters in ascending order. Clock drawing involves sketching a clock face that shows a specific time.
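As a rough illustration of the rule behind the trail making task (alternating 1, A, 2, B, 3, C, ... in ascending order), the sketch below checks whether a proposed sequence follows it; the `is_valid_trail` function and the text-only format are assumptions made for illustration, not how the study presented the task.

```python
def is_valid_trail(sequence: list[str]) -> bool:
    """Check that a trail alternates numbers and letters in ascending order: 1, A, 2, B, ..."""
    for i, item in enumerate(sequence):
        if i % 2 == 0:  # even slots must hold the numbers 1, 2, 3, ...
            if item != str(i // 2 + 1):
                return False
        else:           # odd slots must hold the letters A, B, C, ...
            if item != chr(ord("A") + i // 2):
                return False
    return True

print(is_valid_trail(["1", "A", "2", "B", "3", "C"]))  # True
print(is_valid_trail(["1", "2", "A", "B"]))            # False
```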
- Some models, such as Gemini, failed to complete the delayed recall test, in which one has to remember a five-word sequence (a minimal scoring sketch for this task appears at the end of this section).
As noted above, ChatGPT 4o scored the highest at 26/30, ChatGPT 4 and Claude scored 25, and Gemini 1.0 ranked lowest with 16.
None of the models achieved a full score, which the authors interpret as a sign of cognitive impairment. The study challenges the idea that AI will replace human doctors in the future, and it is likely to dent patient confidence in AI.
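For the delayed recall task mentioned above, scoring a model's reply might look like the following sketch; the word list and the counting criterion are illustrative assumptions rather than the study's exact protocol.

```python
import re

def score_delayed_recall(target_words: list[str], response: str) -> int:
    """Count how many of the target words appear anywhere in the model's free-text reply."""
    mentioned = set(re.findall(r"[a-z]+", response.lower()))
    return sum(1 for word in target_words if word.lower() in mentioned)

# Illustrative five-word list and a reply that forgets two of the five words.
words = ["face", "velvet", "church", "daisy", "red"]
reply = "The words were face, velvet and red."
print(score_delayed_recall(words, reply))  # 3 of 5 recalled
```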
Limitations of this research
Future development may enhance the performance of AI in cognitive and visuospatial tasks.
Even so, the differences between human and AI cognition are likely to persist despite such development.
The study adds that these terms are applied to AI purely as a metaphor; it does not mean that an AI model or computer program can develop neurodegenerative diseases the way humans do.