Observed Impulse: Streamlined Minds: An Analogy Between Compressed AI Models and Forms of Intellectual Disability

Artificial intelligence engineers work hard to streamline language models in order to make them cheaper, faster, and less energy intensive. Techniques such as quantization, pruning, and distillation shrink a model without sacrificing too much performance. The recently released GPT-4o mini is an example of this, and many customers prefer it for its speed and lower cost.

I see a number of forms of neuropathology in a similar light. I believe that certain neurological and psychological disorders could represent a streamlining of intelligence. In previously published articles I have called this evolutionary neuropathology, and in the new book I’m writing I refer to the phenomenon as a “cognitive razor”: an evolutionary force that, like Occam’s razor, excises the superfluous.

In AI and computer science these small models are very important because they are suitable for many uses even though they require much less energy, training, and money. I talk about this analogy between computers and human mental disorders in my new book Adaptive Neurodiversity, a very early version of which can be found here:

www.adaptiveneurodiversity.com

The work at Adaptive Neurodiversity attempts to show that neurodiverse conditions may have unappreciated and undiscovered adaptive qualities both in the ancestral past and today.

Neurodiversity refers to the idea that brain differences—whether in intellectual capacity, sensory processing, or emotional regulation—are natural variations rather than deficits. These variations could have conferred evolutionary advantages in certain environments, especially in small, cooperative groups. For example, individuals with working memory impairments might still excel in repetitive or routine tasks, which require focus on the present rather than complex problem-solving or future planning.

People with smaller brains or reduced intellectual capacity often perform well in tasks requiring concrete, routine, or emotionally focused processing. A streamlined cognitive system could be less prone to overthinking or distraction. Brain size does not always correlate directly with intelligence or practical functionality. Smaller brains might have evolved for energy efficiency, balancing performance with lower metabolic costs.

Just as AI systems are built to handle a diversity of problems using different architectures (some optimized for speed and efficiency, others for depth and complexity), human cognitive diversity allows different strengths to emerge in different contexts. In a diverse, cooperative society, individuals with more streamlined cognitive processes can take on roles that suit their abilities, enhancing group resilience by providing specialized support. Strengthening this analogy, people with neurodiverse conditions are usually taught, instructed, and "programmed" by parents and family members who do not have these conditions, much as a smaller model is trained by a larger one.

Individuals with working memory impairments might struggle with complex multitasking or holding many details in mind simultaneously, but they often excel in tasks requiring repetition or focus on the present. This could be referred to as cognitive efficiency, which could be defined as prioritizing important tasks while discarding or minimizing less relevant information. It is also related to potential terms such as cognitive narrowing, cognitive specialization, minimalist cognition, or adaptive cognitive reduction.

Before we consider individual mental disorders and how they may represent a form of model compression or downsizing, let’s first talk about how this works in AI. Specifically, let’s focus on model distillation.

Model Compression in Artificial Intelligence

Model distillation is a machine learning technique used to transfer knowledge from a larger, more complex model (often called the teacher model) into a smaller, simpler model (called the student model) without sacrificing much of the original model's performance. The teacher is typically trained on a very large dataset and may have billions of parameters, making it “resource heavy.” The student is then trained to reproduce the teacher's outputs rather than learning from the raw data alone. This technique is primarily used to create more efficient models that can be deployed in environments where computational resources are limited (e.g., mobile devices, edge computing, and embedded systems). Smaller models can be more adaptable to specialized tasks, as well as easier to maintain, scale, upgrade, and fine-tune.
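To make the teacher and student relationship concrete, here is a minimal sketch of a commonly used distillation loss, written in plain Python. It is an illustration under stated assumptions, not the method of any particular product: the function names, the example logits, and the `temperature` and `alpha` values are all hypothetical choices made for this example. The student is penalized both for disagreeing with the teacher's softened predictions and for missing the true label.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature softens them."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    # Soft-target term: distance between the softened student and teacher outputs.
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    soft_term = kl_divergence(soft_teacher, soft_student) * temperature ** 2
    # Hard-target term: ordinary cross-entropy against the ground-truth label.
    hard_term = -math.log(softmax(student_logits)[true_label])
    # alpha (an assumed value here) balances imitating the teacher vs. the label.
    return alpha * soft_term + (1 - alpha) * hard_term

# Hypothetical logits for a three-class problem where class 0 is correct.
teacher = [4.0, 1.5, 0.2]
close_student = [3.5, 1.2, 0.1]   # roughly agrees with the teacher
far_student = [0.1, 3.0, 1.0]     # disagrees with the teacher

loss_close = distillation_loss(close_student, teacher, true_label=0)
loss_far = distillation_loss(far_student, teacher, true_label=0)
print("student close to teacher:", loss_close)
print("student far from teacher:", loss_far)
```

The student that mirrors the teacher receives the lower loss, which is the pressure that lets a small model absorb what a large one has learned.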

The smaller models are computationally efficient while retaining high performance. They run faster, which is beneficial in real-time applications such as video processing, autonomous driving, voice assistants, or recommendation systems where latency matters. Smaller models also use less energy and are more sustainable, making them ideal for environments where power consumption needs to be minimized (e.g., battery-powered devices). These smaller models can learn the essence of what the teacher learned, internalizing the complex patterns without unnecessary details. Because distillation acts as a form of regularization, they may also generalize better and overfit less to the training data. They also learn intricate relationships from the teacher model that they would not be able to capture on their own.
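Distillation is only one of the compression techniques named at the outset. Quantization reaches a similar goal by storing each weight with fewer bits. The following is a rough sketch of symmetric 8-bit quantization in plain Python; the function names and example weights are hypothetical, and real toolkits use more elaborate schemes.

```python
def quantize(weights, num_bits=8):
    """Map floating-point weights to small signed integers plus one shared scale."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    largest = max(abs(w) for w in weights)
    scale = largest / qmax if largest > 0 else 1.0
    ints = [round(w / scale) for w in weights]
    return ints, scale

def dequantize(ints, scale):
    """Recover approximate floating-point weights from the integers."""
    return [q * scale for q in ints]

# Hypothetical weights from one layer of a model.
weights = [0.82, -0.31, 0.05, -1.27, 0.44]
ints, scale = quantize(weights)
restored = dequantize(ints, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("integers:", ints)
print("largest rounding error:", max_error)
```

Each weight now fits in a single byte instead of four or more, at the cost of a rounding error no larger than about half the scale.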

Now that we have discussed how this works in AI, let’s relate it to human mental disorders.

Natural Cognitive Aging and Alzheimer’s Disease

As people age, the brain tends to shed or prune unnecessary connections (synaptic pruning) and prioritize efficiency over raw computational power, similar to how AI engineers reduce model complexity without drastically sacrificing performance. This may serve an adaptive purpose. Evolution may have selected for this streamlining process because it allows older individuals to conserve energy while maintaining enough cognitive function to navigate daily tasks and social roles. This would be especially beneficial in hunting and gathering environments where efficiency was crucial for survival. In other words, cognitive aging might be a form of adaptive streamlining that mirrors AI's quantization and distillation techniques.

The focus might shift from high-complexity tasks, such as rapid problem-solving or learning new skills, toward knowledge-based tasks like pattern recognition, wisdom, and long-term memory retrieval. This could result in what we observe as crystallized intelligence (accumulated knowledge and wisdom) improving or staying stable, while fluid intelligence (problem-solving and new learning) declines.

[Figure: what the pruning process can look like in the domain of AI and neural networks]
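In code, pruning in a neural network can be sketched as follows. This is a minimal illustration of magnitude pruning in plain Python, assuming a tiny hypothetical weight matrix and input; the function names are made up for this example. The smallest weights are set to zero first, and the output is compared with the unpruned network at several levels of sparsity.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute values."""
    positions = [(abs(w), i, j)
                 for i, row in enumerate(weights)
                 for j, w in enumerate(row)]
    positions.sort()                       # smallest magnitudes first
    n_prune = int(len(positions) * sparsity)
    pruned = [row[:] for row in weights]   # copy so the original stays intact
    for _, i, j in positions[:n_prune]:
        pruned[i][j] = 0.0
    return pruned

def forward(weights, inputs):
    """One linear layer: each output is a weighted sum of the inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

# Hypothetical 3x3 layer and input vector.
weights = [[0.9, -0.05, 0.4],
           [0.02, -0.7, 0.1],
           [-0.3, 0.01, 0.8]]
inputs = [1.0, 2.0, 3.0]
baseline = forward(weights, inputs)

errors = []
for sparsity in (0.0, 0.3, 0.6, 0.9):
    output = forward(magnitude_prune(weights, sparsity), inputs)
    error = max(abs(a - b) for a, b in zip(baseline, output))
    errors.append(error)
    print(f"sparsity {sparsity:.1f}: largest output change {error:.2f}")
```

Removing the weakest connections barely changes the output, while removing nearly all of them changes it substantially, which is the trade-off between efficiency and function that this section is concerned with.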

While normal cognitive aging might resemble adaptive streamlining or distillation, Alzheimer’s disease is a pathological process where the "streamlining" goes too far, resulting in the loss of critical functionality. This is akin to over-distilling an AI model to the point where it no longer performs well or loses its ability to generalize. Just as AI engineers balance model performance and resource efficiency, evolution might have favored a brain that naturally reduces resource demands over time, at least in non-pathological aging.

You can read much more about this in the article I wrote on the topic here:

https://behavioralandbrainfunctions.biomedcentral.com/articles/10.1186/1744-9081-5-13

Intellectual Disability and Neuropathology

In some intellectual disabilities, cognitive functioning might be “streamlined” in the sense that the brain may prioritize some aspects of adaptive functionality (e.g., social bonding, routine behavior, basic survival skills) while reducing capacity in other areas, such as abstract reasoning, memory, or learning new complex tasks.

A brain that is more simplified, similar to a “distilled” AI model, may function with less cognitive noise or fewer distractions, allowing focus on repetitive tasks, concrete experiences, or specific social roles. In this way, intellectual disability could be seen as retaining critical adaptive functions while sacrificing more complex or unnecessary (for survival) cognitive operations. This may be an example of simplicity as a strength.

In cases where intellectual disability is nonsyndromic (without specific identifiable features like those found in syndromes), the brain might still exhibit a “cheaper model” in terms of capacity, but one that is more generalized rather than specialized in certain strengths. Here, the trade-off might be a more global reduction in cognitive complexity without any significant compensatory strengths.

Like AI model distillation, this process of cognitive simplification might have been selected for under certain evolutionary pressures, where conserving energy and focusing on critical survival functions outweighed the need for broad, abstract reasoning or novel problem-solving. Understanding intellectual disabilities in this way could provide new perspectives on support, education, and interventions aimed at enhancing the quality of life for individuals with these conditions.

You can find out more about this in my article here:

https://www.sciencedirect.com/science/article/abs/pii/S030698770600185X?via%3Dihub

Schizophrenia and Stress

Schizophrenia is characterized by disturbances in cognition, perception, and emotion. These disturbances often include paranoia, delusions, impulsivity, and impaired working memory. If we view these symptoms through the lens of cognitive streamlining, it might suggest that the brain, under extreme stress or threat, prioritizes certain functions—such as heightened vigilance or rapid emotional reactions—over more complex, slower forms of reasoning and memory processing. Individuals with schizophrenia often exhibit impairments in working memory, which could be seen as the brain reducing its cognitive load by focusing on immediate survival rather than long-term planning or complex decision-making.

One plausible biological mechanism behind this idea is the role of cortisol, a stress hormone. Chronic exposure to high cortisol levels, particularly in the womb and in early childhood, has been associated with an elevated risk of schizophrenia. Through epigenetic mechanisms, prolonged cortisol exposure can alter gene expression and potentially lead to changes in brain function, particularly in regions involved in memory, emotion regulation, and the fight-or-flight response.

If the environment is inherently dangerous, such as in war zones or predator-rich areas, a brain adapted to anticipate danger, even where it might not be immediately present, could theoretically be advantageous. Paranoia and hypervigilance, often maladaptive in modern, stable environments, could have helped individuals in ancient or hostile settings where threats were constant and unpredictable. Schizophrenia is also associated with cognitive disorganization and impaired executive functioning. These deficits might be seen as a form of reduction in cognitive complexity, where the brain narrows its focus to immediate concerns and responses, while forgoing higher-order cognitive processes that are not immediately necessary for survival in stressful situations.

For more information you might want to peruse my article on schizophrenia here:

https://www.sciencedirect.com/science/article/abs/pii/S0306987707000254

If you found this interesting, please visit aithought.com. The site explores my model of working memory and its application to artificial intelligence, demonstrating how human thought patterns can be emulated to achieve machine consciousness and superintelligence. With over 50 detailed figures, the article offers a visually compelling examination of how bridging psychology and neuroscience can pave the way for the future of intelligent machines.

Thursday, October 24, 2024