Monday, March 23, 2020

Artificial Intelligence Needs to Utilize the Process of Myelination




Modern AI is capable of some fantastic feats, yet is still very limited compared to the human mind. The disciplines of machine learning and deep learning have shown us that with a powerful computer and a bunch of neurons organized into a network we can throw easy psychological problems at these networks and expect good answers. However, when researchers try to construct more complex networks, to tackle more cognitively complex problems, the networks fail to deliver. This is because they have not yet used a simple trick that animals have been using for hundreds of millions of years: gradual and progressive myelination.   

As we progress from infancy through childhood our brains undergo various biological changes. These changes cause our level of analysis to slowly progress from analyzing brief sensory experiences to analyzing complex, abstract scenarios. We begin our lives only being able to notice and attend to interactions occurring on short time scales. By adulthood, with the prefrontal cortex fully developed, we find ourselves able to follow interactions occurring on long time scales. In order to develop the ability to think about complex things we had to spend almost two decades gradually altering our brain’s processing strategy. It is a scaffolding process where we focus on the simplest things first, and use basic knowledge about them to advance incrementally to more complex things. The fact that all humans, and mammals in general, do this strongly suggests that it plays a role in the acquisition of advanced intelligence. In this entry I will argue that this developmental process will be instrumental in training superintelligent AI.

This gradual process of brain development is made possible by myelination. Myelin is a fatty substance surrounding the connections between neurons (axons). Vertebrate animals use it to speed up information transmission between cells. The myelin increases the rate at which the electrical impulses travel. But vertebrates aren’t born with all the myelin that they will need as adults. Instead myelin develops slowly in specific areas, one at a time. Once a brain area has developed valuable, reliable, and consistent knowledge the connections formed by learning are solidified by the introduction of myelin.

The order of brain areas affected by myelin is consistent across all mammals. The early sensory areas are the first cortical areas to develop myelin. One of these, the primary visual area, starts to myelinate shortly after birth as the infant gains visual experiences. These early visual areas are responsible for basic visual perception and don’t rely on trial and error interactions with the environment. Rather they involve responding to visual stimuli that are presented simultaneously without any time delay between appearances. This happens when you see a picture of a house; you generally see the roof, windows, and door all at once without experiencing much of a time delay between these stimuli.

The last areas to myelinate are the association cortices and the prefrontal cortex (PFC). The PFC does not generally finish myelinating until one reaches the age of 18 or older. This means that the PFC does not “trust” that it has been wired up correctly until almost two decades into life, whereas the visual system “trusts” that it has been wired correctly within the first two years. This is because sensory stimuli are generally honest and all show up at the same time, whereas complex events are constructed from stimuli that are separated from each other by delays in time. Understanding the relationships between events that are not simultaneous requires careful, logical inferences about causality. For example, the sale of a house is an abstract concept that involves parties, contracts, and delays that can last for weeks or months. This is why children aren’t licensed to sell houses.

It takes time to learn to make complex inferences that involve delays in time. It is probably the case that the process of myelination during development involves the progressive accumulation of knowledge that supports and buttresses more complex knowledge. In other words, as simple things are mastered in early cortical areas they provide the basis for new learning in the late cortical areas. In the same way, many brief, simple experiences create the knowledge base needed to start understanding long, complex experiences with more advanced probabilistic structures. The layers at the bottom of the hierarchy must be trained before the higher layers can find regularities and statistical structure within them. But as you can see in the diagram below, the top of the hierarchy falls in the middle, between sensory input and motor output. To properly train sensory input and motor output it is imperative that they be connected to each other, and that they can interact with each other to drive behavior, long before the association areas interposed between them are brought to the table.



Many AI researchers point out that the things that AI and neural network systems today can accomplish are things that can generally be accomplished by an adult human brain in under a second. This means that they can only do things that we do unconsciously, such as near instantaneous pattern recognition. Today’s AIs can recognize houses but could not recognize, understand, or broker the sale of a house. What AI is able to do are the kinds of things that we are able to do with our primary sensory and motor areas. This is because these systems are designed like primary cortical areas. They do not feature reciprocal interactions between various structures organized into a brain-like hierarchy. Very few AI architectures exist today that connect primary areas with association areas and a PFC. Those that do don’t use anything like the process of myelination. Rather, in existing AI all of the areas, from the simple to the advanced, come online at the same time. I think these systems should use something analogous to the process of myelination because it would help them in their acquisition of knowledge. If they did, here’s how they should go about it:

First you would need a number of neural networks of pattern recognizing nodes. These networks must take inputs from the environment, each corresponding to a different sensory modality. These early networks must be linked to one another. Then these would have to be linked together in a hierarchy where unimodal networks form inputs to multimodal networks, which then form inputs themselves to even more densely multimodal networks above them. This “multimodal fusing” is depicted in the figure. The nodes of the densely multimodal networks would be the association networks and at the top of this hierarchy would be the PFC which would also be connected directly to the early motor networks. The nodes of the association and PFC networks would exhibit sustained firing. Importantly this sustained firing, the activity of the association networks, and their influence over ongoing processing elsewhere would start out extremely meager, and increase over time. These capacities could be increased as the system exhibits proficiency at simple tasks, such as object recognition, scene classification, and simple motor movements. As the association areas are added to the system a capacity to plan, and make higher order inferences and classifications could be expected.  
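To make the staged unlocking described above a little more concrete, here is a minimal sketch in Python. It does not train any networks. It only shows one possible gating rule, in which the influence ("gain") of each later-maturing stage grows only after the stage below it is fully online and has shown proficiency. The names Stage and MyelinationSchedule, the 0.9 proficiency threshold, and the 0.25 gain step are all assumptions made for illustration; none of them come from the article linked at the end of this entry.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Stage:
    """One level of the hierarchy (hypothetical name, for illustration only)."""
    name: str
    gain: float = 0.0         # influence over ongoing processing, from 0 to 1
    proficiency: float = 0.0  # measured accuracy on this stage's tasks, 0 to 1


@dataclass
class MyelinationSchedule:
    """Stages ordered from earliest-maturing to latest-maturing."""
    stages: List[Stage]
    threshold: float = 0.9  # assumed proficiency required before the next stage grows
    step: float = 0.25      # assumed gain increment per developmental update

    def update(self) -> Dict[str, float]:
        # The earliest (sensory) stage is fully active from the start.
        self.stages[0].gain = 1.0
        # A later stage gains influence only once the stage below it is
        # both fully online and proficient at its own, simpler tasks.
        for lower, upper in zip(self.stages, self.stages[1:]):
            if lower.gain >= 1.0 and lower.proficiency >= self.threshold:
                upper.gain = min(1.0, upper.gain + self.step)
        return {stage.name: stage.gain for stage in self.stages}


schedule = MyelinationSchedule(
    [Stage("sensory"), Stage("multimodal"), Stage("association"), Stage("pfc")]
)
schedule.update()                       # only the sensory stage is active
schedule.stages[0].proficiency = 0.95   # pretend object recognition is mastered
for _ in range(4):
    gains = schedule.update()           # the multimodal stage ramps up step by step
print(gains)
```

In a real system the proficiency values would come from evaluating each stage on its tasks (object recognition, scene classification, simple motor movements), and the gain would scale how strongly that stage's sustained activity influences processing elsewhere.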

One important concept that I haven’t explained yet is that the first areas to myelinate in the brain, the sensory areas, have neurons of a single modality (e.g. either vision or hearing) that fire for short durations. The association areas and the PFC on the other hand have multimodal neurons (e.g. both vision and hearing) that fire for long durations. As in the mammalian brain (Huttenlocher & Dabholkar, 1997), sensory areas should mature (myelinate) early in development, and association areas should mature late. This will cause the capacity for sustained firing to start low, but increase over developmental time.

Postponing the initialization of association networks in this way would allow the formation of low-order associations between causally linked events that typically occur close together in time. This would focus the system on easy-to-predict aspects of its reality (e.g. correlations between occurrences in close temporal proximity). The consequent learning would erect a reliable scaffolding of highly probable associations that can be used to substantiate higher-order, time-delayed associations later in development (Reser, 2016). In other words, the rate of iterative updating from one state to the next (Fig. 9) would start very high. This rate would then fall over the course of weeks to years as an increasing capacity for working memory is folded into the system.
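The relationship between sustained firing and the delays a system can bridge can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that a node's activity decays geometrically: it keeps a fraction `persistence` of its activity at each time step, so the updating rate is one minus that fraction. It also assumes that two events can be associated only if the trace of the first is still above some floor when the second arrives. The function names, the linear maturation schedule, and every number here are hypothetical choices and are not taken from the article.

```python
def trace_after_delay(persistence: float, delay: int) -> float:
    """Activity remaining after `delay` steps, assuming geometric decay."""
    return persistence ** delay


def max_bridgeable_delay(persistence: float, floor: float = 0.1,
                         limit: int = 1000) -> int:
    """Longest delay at which the earlier event's trace is still >= floor."""
    delay = 0
    while delay < limit and trace_after_delay(persistence, delay + 1) >= floor:
        delay += 1
    return delay


def persistence_at(age: float, full_age: float,
                   start: float = 0.2, end: float = 0.95) -> float:
    """Assumed linear maturation of sustained firing from `start` to `end`."""
    fraction = min(1.0, max(0.0, age / full_age))
    return start + (end - start) * fraction


# Early in development the updating rate is high and only short delays can
# be bridged; later the updating rate falls and longer delays become reachable.
for age in (0, 6, 12, 18):
    p = persistence_at(age, full_age=18)
    print(age, round(1 - p, 3), max_bridgeable_delay(p))
```

Under these assumptions the bridgeable delay grows as persistence rises, which is the qualitative pattern described above: early learning is confined to events that occur close together in time, and time-delayed associations become available only later.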

Nature has found that it doesn’t pay to let the multimodal neurons capable of sustained firing come online until the basics have been learned. I strongly suspect that AI network engineers will find this too. For the sake of progress I just hope that this myelination/development feature is implemented and perfected sooner rather than later. Given the rapid processing in computers, and the sheer amount of data available to them, I don’t think that this process will take 18 years in an AI as it does in a human. But I strongly believe that it is necessary for any developing thinker to start with the elementary inferences first.


An article that I wrote which can be found here explains this in more detail. 

https://www.sciencedirect.com/science/article/pii/S0031938416308289

Here is an excerpt from that article. 



"Due to their sustained activity, neurons in the PFC can span a wide delay time or input lag between associated occurrences [35][89] and thereby allow elements of prior events to become coactive with elements of subsequent events. Sustained activity allows neurons that would otherwise never fire together to both fire and wire together, and also allows features that never co-occur in the environment to be present together in topographic imagery. Thus, it may be reasonable to assume that SSC underlies the brain's ability to make internally derived associations between representations that never occur simultaneously in the environment. The longer sustained firing in association cortex lasts, the better the animal will be at capturing information about causally linked stimuli that present apart in time. The longer the sustained firing, the longer the delay can be. The same regularity may happen persistently in the environment, where a stimulus is followed several seconds later by another stimulus, concern, or opportunity; however, if the animal lacks sufficient sustained firing, this statistical regularity will not be captured by the neocortical system because the ensembles for them will never be exposed to each other.
Few if any mammals have evolved a human-like capacity for sustained firing in PFC neurons, and thus the mental lives of most mammals likely involve associations made between temporally proximate stimuli and concepts. This may suggest that in most ecological niches it is not helpful to create memories for relationships between stimuli that occur in delayed succession and instead it is better to focus on analyzing stimuli that present in quick succession [68][72]. There may therefore be two strategies, on opposite ends of a continuum, for holding recent information active: immediate and delayed succession strategies. The delayed succession strategy, involving high sustained firing and a low rate of working memory updating, is optimal for environmental scenarios that are prolonged over time, where temporally distant cues may retain contextual relevance. This strategy is likely associated with certain ecological or life-history conditions such as low extrinsic mortality, intergenerational resource flows, meme transference, and the K-selection strategy in general.
How can the brain trust that an association between two concepts that are removed in time and never co-occur simultaneously in the environment is valid? Each of the contents of working memory contribute to the selection of the next addition to working memory, and this may help to ensure that the contents held in working memory at any moment are veridically concordant rather than incongruous. This is because the system is narrowly constrained to only combining ensembles that have been highly associated in the past. If this is true, it suggests that at an early age the first associations are between stimuli that are nearly simultaneous, but that these can create foundational knowledge upon which to base reliable inferences about associations between stimuli that are removed from each other by a delay in time.
Because the frontal lobes of infants are underdeveloped, their brains probably exhibit far less continuity between brain states. Very young children can trust the connections that their early sensory areas have made concerning the spatiotemporal associations between near simultaneous features because these events show high order and regularity. This may be why sensory areas myelinate so early in life. Perhaps association areas are programmed genetically not to finish myelinating until early adulthood because it is a time-intensive process to form and test higher-order hypotheses about relationships between constructs that are more distributed through time."


