I have developed a cognitive architecture for a form of artificial intelligence that I think could become conscious and could exhibit capabilities above and beyond what humans are capable of. If implemented and trained correctly, I think this system could exhibit the qualities of "general AI," "strong AI," and superintelligence. To find out more, read below, or visit http://www.jaredreser.com/ai.html to see the related patent application.
ARTIFICIAL INTELLIGENCE SOFTWARE STRUCTURED TO SIMULATE MENTAL CONTINUITY
Jared Edward Reser Ph.D.
University of Southern California, Brain and Cognitive Sciences
16380 Meadow Ridge Road Encino, CA 91436
This article presents a hierarchically organized artificial intelligence (AI) architecture that features reciprocating transformations between a working memory updating function and multiple imagery generation systems. The system couples these components by embedding them within a multilayered neural network of pattern-recognizing nodes. Nodes low in the hierarchy are trained to recognize and represent sensory features and are capable of combining individual features or patterns into composite, topographical maps or images. Nodes high in the hierarchy are multimodal and have a capacity for sustained activity, allowing the maintenance of pertinent, high-level features through elapsing time. The higher-order nodes select new features from each mapping to add to the store of temporarily maintained features. This updated set of features, which the higher-order nodes maintain, is fed back into lower-order sensory nodes, where it is continually used to guide the construction of successive topographic maps. Like information processing in the cerebral cortex, this system will demonstrate a gradual shift in the distribution of coactive representations. The present article describes and explores how this architecture can lead to mental continuity between processing states, and thus to human-like cognition. Multilayered neural networks of pattern-recognizing nodes are connected to emulate the prefrontal cortex and its interactions with early sensory and motor cortex, in an effort to capture the imagery guidance functions of the human brain.
“In order for a mind to think, it has to juggle fragments of its mental states.”
Marvin Minsky, 1985.
The present article introduces a novel processing architecture, to be implemented by a neural network, that aims to simulate human intelligence. The method involves emulating the mammalian cerebral cortex using a system of pattern-recognizing nodes for: selecting priority stimulus features, temporarily maintaining these features in a limited-capacity working memory store, and allowing them to direct imagery generation for as long as they remain active. The sustained firing of higher-order nodes allows representations to be maintained over multiple perception-action cycles, permitting complex sequences of interrelated mental states. The overall distribution of active nodes in the neural network will shift gradually during contextual updating because the activity of certain neural nodes will persist. This ensures that the activity of prioritized, goal- or motor-relevant representations will be uninterrupted over time. The representations that demonstrate this continuity are a subset of the active representations from the previous state and may act as referents to which newly introduced representations of succeeding states relate. The limited-capacity store of coactive representations in association areas is updated as follows: 1) nodes that continue to receive sufficient spreading activation energy are maintained; 2) nodes that receive reduced energy are released from activation; 3) new nodes, tuned so as to receive sufficient energy from the current constellation of coactivates, are converged upon and incorporated into the remaining pool of active nodes from the previous cycle.
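The three-part updating rule above can be sketched as a minimal simulation. This is an illustrative sketch only: the node names, the activation threshold, and the spreading-activation weights are assumptions made for demonstration, not parts of the proposed device.

```python
# Sketch of the three-part working-memory update described above.
# Threshold and weights are illustrative assumptions.

def update_working_memory(active, weights, threshold=1.0):
    """One update cycle over a store of coactive nodes.

    active  -- set of currently active node names
    weights -- dict mapping (source, target) -> spreading activation
    returns -- the updated set of active nodes
    """
    # Total activation each node receives from the current coactivates.
    energy = {}
    for (src, dst), w in weights.items():
        if src in active:
            energy[dst] = energy.get(dst, 0.0) + w

    # 1) Maintain nodes that still receive sufficient activation energy.
    maintained = {n for n in active if energy.get(n, 0.0) >= threshold}
    # 2) Nodes receiving reduced energy are released (simply not kept).
    # 3) Recruit new nodes that now receive sufficient converging energy.
    recruited = {n for n, e in energy.items()
                 if n not in active and e >= threshold}
    return maintained | recruited
```

With weights arranged so that the coactivates B, C, D, and E converge on a new node F while only B continues to receive support, one call yields the maintained subset plus the newly recruited node, mirroring the gradual shift described in the text.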
The general intention of the present article is to propose a qualitative model delineating the fundamental processes involved in mental continuity and to explore how these could be simulated in a neural network. Properly integrated with existing AI technology, this method may have the potential to enhance the capabilities of problem solving agents with respect to pattern recognition, analytics, prediction, adaptive control, decision making, and response to query.
Modeling Mental Continuity
Continuity is defined as being uninterrupted in time. As proposed here, “mental continuity” involves a process where a gradually changing collection of mental representations held in attention/working memory exhibits a measure of uninterrupted activity across time and over sequential processing states. Because a number of neural nodes can be sustained continuously, each brain state is embedded recursively in the previous state, amounting to an iterative process that can progress toward a complex result. The term short-term continuity (STC) will be used to refer to the activity of the neural nodes responsible for these representations when sustained in a continuous way during the span of several seconds (analogous to human short-term memory (STM)).
If it were not for the phenomenon of persistent neural activity, instantaneous information processing states would be time-locked and isolated (as in most serial and parallel computing architectures), rather than continuous with the states before and after them. This article explores how sustained neural firing in association areas allows goal-relevant representations to be maintained over multiple perception-action cycles, in order to direct complex sequences of interrelated mental states. The individual states in a sequence of such states are interrelated because they share representational content. The associations linking the shared contents are saved to memory, impacting future searches, and ultimately resulting in semantic knowledge, planning, and systemizing.
The field of AI research is concerned with creating computing systems capable of emulating functions traditionally associated with intelligent human behavior. Most early AI systems were capable of responding only in the manner that their human programmers provided for when the program was written. It became recognized that it would be valuable to have a computer that does not respond in a preprogrammed manner (Moravec, 1988). AI systems capable of adaptive learning have since become important. Neural networks attempt to get around the programming problem by using layers of artificial neurons, or nodes. Neural networks and genetic algorithms are widely implemented in research and industry for their capabilities in adaptive learning and advanced pattern recognition. However, they are used for processing tasks that are narrowly constrained and highly specialized, and no strong form of intelligence has yet been derived from them. There are currently no neural networks, or AI systems of any kind, structured to model the primate neocortex in order to guide the progressive generation of successive topographic maps. The present neural network software architecture is structured around identifying potentially goal-relevant information and holding it online to inform reciprocal cycles of imagery generation and feature extraction for the purpose of systemizing the environment.
Information Processing in the Mammalian Neocortex
The present model is consistent with connectionism and parallel distributed processing in that it conceptualizes mental representations as being built from interconnected networks of decentralized, semi-hierarchically organized, pattern-recognizing nodes that have multiple inputs and outputs (Gurney, 2009; Johnson-Laird, 1998). Like other biologically plausible neural network models, it envisions these nodes as microscopic, modular neural units and assumes that each individual unit represents an elementary feature or stable “microrepresentation” of LTM (Meyer & Damasio, 2009). Like other models (Cowan, 2005; Moscovich, 1992), this model views cognition as a system responsible for using active representations from LTM to guide goal-directed processing (Postle, 2007).
The structure of the cerebral cortex is highly repetitive and is marked by the employment of millions of nearly identical structures called cortical minicolumns (Lansner, 2009). Minicolumns are composed of closely connected neural cell bodies and span the six layers of grey matter in the neocortex. These minicolumns share the same basic structure and are thought to employ the same cortical algorithm (Fuji et al., 1998). There are an estimated 20 million minicolumns in the human cortex, each of which is about 30 to 40 micrometers in diameter and comprises perhaps 80-120 neurons (Lansner, 2009). Each column has its own inputs and outputs, and each performs neural computation to determine whether its inputs from other columns are sufficient to activate its outputs to other columns (Rochester et al., 1956). Columns and other similar groups of neurons with the same tuning properties are often referred to as cell assemblies, and this term will be used here. Most neurons in an assembly share very similar receptive fields; thus, even though they may play different roles within the assembly, they all contribute to the assembly's ability to encode a unitary feature (Moscovich et al., 2007). Such an assembly of neurons is thought to embody a stable microrepresentation, or fragment, of long-term memory. All of the millions of pattern recognizers in the neocortex are simultaneously considering their inputs and continually determining whether or not to fire. In general, when a neuron or assembly fires, the pattern that it represents has been recognized. Assemblies, like the neurons that compose them, function as "coincidence detectors" or "pattern recognition nodes" (Fuji et al., 1998). The spread of activity in the cortex involves many-to-one (convergence) and one-to-many (divergence) interactions within a massively interconnected network of assemblies.
Assemblies in lower-order sensory areas identify sensory features from the environment and combine them into composite representations that mirror the geometric and topographic orientations present in the sensory input. The early visual system uses retinotopic maps organized with a geometry identical to that of the retina, and the auditory system uses tonotopic maps, where the mapping of stimuli is organized by tone frequency (Moscovich, 2007). Early sensory areas create topographic mappings from patterns recognized in the external environment, but they also combine top-down inputs from higher association cortex into internally derived imagery (Damasio, 1989). This internally derived imagery, such as that seen in the "mind's eye," is also topographically organized because it is created by the same lower-order networks. Moving up the neocortical hierarchy, from posterior sensory areas to anterior association areas, assemblies code for patterns that are progressively more abstract. This is because higher-order assemblies have larger receptive fields, retain features from larger spatial areas, and involve longer stretches of time (Fuster, 2009). Because cortical assemblies are essentially pattern recognition nodes organized in a hierarchical system, they should be able to be modeled by computers. The best way to do this with modern technology is to use an artificial neural network.
Information Processing in Artificial Neural Networks
An artificial neural network is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing based on a connectionist approach to computation (Russell et al., 2003). Like most neural networks, the present network should be an adaptive system capable of complex global behavior that alters its own structure based on the nonlinear processing of either external or internal information flowing through the network. The software would require a massively parallel, distributed computing architecture, though it could be run on a conventional computer. Most neural networks achieve intelligent behavior through parallel computations, without employing formal rules or logical structures, and thus can be used for pattern matching, classification, and other non-numeric, nonmonotonic problems (Nilsson, 1998). The applications of the present device could be widened if it were designed to accept and process formal rules.
The traditional neural network is a multilayer system composed of computational elements (nodes) and weighted links (arcs). These networks are based on the human brain, where the nodes are analogous to neurons or neural assemblies and the arcs are analogous to axons and dendrites. Each node receives signals from specific other nodes, processes these signals, and then decides whether to "fire" at the nodes it sends output to. Like the artificial neurons first described by McCulloch and Pitts (1943), the nodes in the present system could feature a number of excitatory inputs whose weights range between 0 and 1 and inhibitory inputs whose weights range between -1 and 0. Each incoming input is multiplied by its corresponding network weight, and the products are summed to give an activity level. If this activity level exceeds the neuron's firing threshold, the neuron fires. The neuron can be made to learn from its experience: its processing activity causes either the threshold or the weights to be changed. Neural networks are typically defined by three types of parameters: 1) the interconnection pattern between different layers of neurons; 2) the learning process for updating the weights of the interconnections; and 3) the activation function that converts a neuron's weighted input to its output activation.
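A threshold unit of the kind just described can be written in a few lines. The specific weights and threshold below are illustrative assumptions; the text leaves open how they would be set or learned in the actual device.

```python
# A minimal threshold unit in the spirit of McCulloch and Pitts (1943):
# excitatory inputs carry weights in (0, 1], inhibitory inputs in [-1, 0).

class ThresholdNode:
    def __init__(self, weights, threshold):
        self.weights = weights      # one weight per input, in [-1, 1]
        self.threshold = threshold  # firing threshold

    def fire(self, inputs):
        """Sum the weighted inputs and compare with the firing threshold."""
        activity = sum(w * x for w, x in zip(self.weights, inputs))
        return activity >= self.threshold
```

For example, a node with two excitatory inputs (weight 0.6 each) and one inhibitory input (weight -0.5) and a threshold of 1.0 fires when both excitatory inputs are active, but is silenced when the inhibitory input is active as well.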
DESCRIPTION OF THE ARCHITECTURE
The present architecture is a modular, hierarchically organized, artificial intelligence (AI) system that features reciprocating transformations between a working memory updating function and an imagery generation system. This device features a recursive, algorithmic, imagery guidance process to be implemented by a multilayered neural network of pattern recognizing nodes. The software models a large set of programming constructs or nodes that work together to continually determine, in real time, which from their population should be newly activated, which should be deactivated and which should remain active to best inform imagery generation.
The device necessitates a highly interconnected neural network that features a hierarchically organized collection of pattern recognizers capable of both transient and sustained activity. These pattern recognition nodes mimic assemblies (minicolumns) of cells in the mammalian neocortex and are arranged with a similar connection geometry. Like neural assemblies, the nodes exhibit a continuous gradient from low-order nodes that code for sensory features to high-order nodes that code for temporally or spatially extended relationships between such features. The lower-order nodes are organized into modules by sensory modality. In each module, nodes work both competitively and cooperatively to create topographic maps. Nodes are grouped according to the feature they are being trained to recognize. These maps can be generated by external input, by internal input from higher-order nodes, or by a mix of the two. The architecture will feature backpropagation, self-organizing maps, bidirectionality, and Hebbian learning, as well as a combination of principal-components learning and competitive learning. The program should have an embedded processing hierarchy composed of many content feature nodes between the input modalities and its output functions.
Nodes lower in the hierarchy are trained to recognize and represent sensory features and are capable of combining individual features or patterns into metric, topographical maps or images. Lower-order nodes are unimodal and organized by sensory modality (visual, auditory, somatosensory, etc.) into individual modules. Nodes high in the hierarchy are multimodal, module independent, and have a capacity for sustained activity allowing the conservation of pertinent, high-level features through elapsing time. The higher nodes are integrated into the architecture in a way that makes them capable of identifying a plurality of goal-relevant features from both internal imagery and environmental input, and of temporarily maintaining these as a form of prioritized information. The system is structured to allow repetitive, reciprocal interactions between the lower, bottom-up and higher, top-down nodes. The features that the higher nodes encode are utilized as inputs that are fed back into lower-order sensory nodes, where they are continually used for the construction of successive topographic maps. The higher nodes select new features from each mapping to add to the store of temporarily maintained features. Thus the most salient or goal-relevant features from the last several mappings are maintained. The group of active, higher-order nodes is constantly updated: some nodes are newly added, some are removed, yet a relatively large number are retained. This updated list is then used to construct the next sensory image, which will be necessarily similar, but not identical, to the previous image. The differential, sustained activity of a subset of high-order nodes allows thematic continuity to persist over sequential processing states.
All of the nodes within the device function as a continuous whole and are highly interconnected, but they can be decomposed into separate, modular neural networks. Nodes belonging to an individual module are highly interconnected with each other. These modules consist of a bottom layer of input cells, succeeded by alternating layers of local-feature-extracting cells, and a top layer of output cells. The individual neural networks interface through the connections between top-layer output cells and bottom-layer input cells. Each network is organized so that multiple lower-order nodes can converge on higher-order nodes, and single higher-order nodes can diverge upon multiple lower-order nodes. The component neural networks range from unimodal, feature representation nodes to multimodal, concept representation nodes. Multiple interfacing neural networks could be arranged biomimetically as in figure 8 below.
A plausible biomimetic arrangement of interfacing neural networks.
The bottom-up to top-down reciprocations are organized into very precise oscillations that propagate in regularly timed intervals across the network so that they do not interfere with each other. The oscillations reciprocate back and forth at just the right speed so that each area has time to process its inputs and send an output before the next complement of inputs arrives. It is important to carefully structure timing mechanisms in the present device so that messaging is not muddled or noisy. It is also important to structure the architecture so that continuity in the representations held active by the buffer can be disrupted when attention shifts. Repeated loops of conserved, higher-order features can be ended when attention is captured by an object or concept that competes for attention. The ability to free up higher-order node resources to attend to a new stimulus will be established through training. Before proper training is accomplished, the system may not be able to reallocate its resources properly when its attention shifts.
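The alternating, regularly timed reciprocation described above can be sketched as a simple two-phase schedule. The rigid alternation and the stand-in pass functions are assumptions made purely to illustrate the timing discipline; in the actual device, the timing would need to be tuned or learned.

```python
# Sketch of regularly timed bottom-up / top-down reciprocation: each
# phase completes before the next complement of inputs arrives, so
# messages never overlap. The pass functions are placeholders.

def run_cycles(n_cycles, bottom_up, top_down, state):
    """Alternate bottom-up and top-down passes over the shared state,
    recording which phase produced each intermediate result."""
    trace = []
    for _ in range(n_cycles):
        state = bottom_up(state)   # sensory layers -> association layers
        trace.append(("bottom_up", state))
        state = top_down(state)    # association layers -> sensory layers
        trace.append(("top_down", state))
    return trace
```

Because each phase runs to completion before the other begins, the trace strictly alternates phases, which is the property the text identifies as necessary to keep messaging from becoming muddled or noisy.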
The present model could be used to inspire a neurocomputational AI architecture for deep neural reasoning. Realizing it is largely an engineering issue, and the considerations below sketch what it could involve.
Computer programmers design AI systems to hold information online only for as long as they will need it. They hold data in a temporary store merely to compute what they are programmed to compute or to execute whatever process they have pending. These systems are limited, however, because they are generally programmed to solve a narrow range of problems. The mammalian brain, on the other hand, has a strategy dedicated to holding potentially relevant information online because there is a high probability that it will be useful in the near future. It wagers that this information will be used in processing without yet knowing how. It is not decided beforehand how long items should remain active; rather, this is re-decided every second and during each state.
Soft computing approaches using the state-space approach contain incidental aspects of iteration, and other architectures such as neural networks use spreading activation. However, no machine yokes these together to create iterative updating and polyassociationism. Doing so provides a clear way to structure a system that does not suspend its activity every time it finishes a task. It will be important for the system to exhibit continuous endogenous processing and a working memory that updates continuously to allow for uninterrupted learning.
Typical AI systems are designed to perceive the environment, evaluate objects therein, select an action, act, and record the action, along with its efficacy and the results thereof to memory. There are no forms of artificial intelligence that do this using a succession of maps guided by a continually updating buffer of salient features. The present invention will do this with a novel information processing approach based on the architecture of the human brain, but implemented with available computer hardware and input/output devices.
As discussed above, the fundamental units of representation in the brain appear to be cortical assemblies, perhaps congruent with cortical minicolumns: all of the cells in an assembly share tuning properties and together constitute "coincidence detectors" or "pattern recognizers." Because cortical assemblies are essentially pattern recognition nodes organized in a hierarchical system, they should be amenable to being modeled by computers, and the best way to do this with modern technology is to use an artificial neural network of the kind described earlier (Russell et al., 2003; Nilsson, 1998).
To create a strong form of AI, it is necessary to understand what allows intelligence, thought, cognition, consciousness, or working memory to move through space and time, or, in a word, to "propagate." Such an understanding must be grounded in physics because it must explain how the physical substrate of intelligence operates through space and time (Chalmers, 2010). The human brain is just such an intelligent physical system, one that AI researchers have attempted to understand and replicate using a biomimetic approach (Gurney, 2009). Features of the biological brain have been key in the evolution of neural networks, but the brain holds other information processing principles that have not yet been harnessed by AI efforts. The present device will be constructed to mimic this biological system.
Description of the Device
The agent discussed here would be capable of integrating multiple specialized AI programs into a larger composite of coordinated systems. To do this, it would be necessary to interface these systems with the input side of the imagery generation system. Existing AI technology could be integrated with the system described here in order to more quickly expand its behavioral repertoire and knowledge base. For example, databases and encyclopedic content could be used as sensory input, and the functions of other AI, adaptive control, and robotics systems could be added to its repertoire of available motor outputs and premotor representations. The system should have open access to a memory bank of text including dictionaries, thesauri, newswire articles, literary works, and encyclopedic entries. The system should be able to integrate with multiple applications such as rule-based systems, expert systems, fuzzy logic systems, genetic algorithms, and archived digital text. The present system would benefit from the integration of existing programs for input and output, e.g., visual perception programs and robotic movement programs. This patent does not describe these components in detail because they already exist in well-developed forms.
1. A modular, hierarchically organized, artificial intelligence (AI) system that features a working memory updating function and the capacity for imagery generation. The system comprises an algorithmic, imagery guidance process to be implemented by neural network software that will simulate the neurocognitive functioning of the mammalian prefrontal cortex.
2. The network’s connectivity allows reciprocating cross-talk between fleeting bottom-up imagery in early sensory networks and lasting top-down priming in association and PFC networks. The features that are maintained over time by sustained neural firing are used to create and guide the construction of topographic maps (imagery). The PFC and other association area neural networks direct progressive sequences of mental imagery in the visual, auditory and somatosensory networks.
3. Cognitive control stems from the active maintenance of features/patterns in the PFC module that allow the orchestration of processing and the generation of imagery in accordance with internally selected priorities.
4. The network contains nodes that are capable of “sustained firing,” allowing them to bias network activity, transmit their weights, or otherwise contribute to network processing for several seconds at a time (generally 1-30 seconds).
5. The network is an information processing system able to maintain a large list of representations that is constantly in flux: new representations are added, some are removed, and still others are retained. This distinct pattern of activity, in which some individual nodes persist during processing, ensures that particular features of the overall pattern are uninterrupted, or conserved, over time.
6. Because nodes in the PFC network are sustained, and do not fade away before the next instantiation of topographic imagery, there is a continuous and temporally overlapping pattern of features that mimics consciousness and the psychological juggling of information in working memory. This also allows consecutive topographic maps to have related and progressive content.
7. If this sustained firing is programmed to happen at even longer intervals, in even larger numbers of nodes, the system will exhibit even more mental continuity over elapsing time. This would increase the ability of the network to make associations between temporally distant stimuli and allow its actions to be informed by more temporally distant features and occurrences.
FIG. 1 is a diagram depicting how high-level features are displaced, newly activated, and coactivated in the neural network to form a "stream" or "train" of thought. Each feature is represented by a letter. 1) Shows that feature A has already been deactivated and that B, C, D and E are now coactivated. When coactivated, these features spread their activation energy, resulting in the convergence of activity onto a new feature, F. Once F is active it immediately becomes a coactivate, restarting the cycle. 2) Shows that feature B has been deactivated, that C, D, E and F are coactivated, and that G is newly activated. 3) Shows that feature D, but not C, has been deactivated. In other words, what is deactivated is not necessarily what entered first, but what has proven, within the network, to receive the least converging activity. C, E, F and G coactivate and converge on H, causing it to become active.
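The bookkeeping in FIG. 1 can be reproduced in a short simulation. Note that in the described network, which feature converges next and which coactivate drops out would emerge from converging activation; the lookup tables below hardcode those outcomes purely to trace the figure's sequence.

```python
# Toy reproduction of the FIG. 1 stream-of-thought sequence.
# NEXT_FEATURE and DROPPED are illustrative stand-ins for the
# convergence and displacement dynamics described in the text.

NEXT_FEATURE = {frozenset("BCDE"): "F",
                frozenset("CDEF"): "G",
                frozenset("CEFG"): "H"}
DROPPED = {frozenset("BCDE"): "B",   # B loses support after F arrives
           frozenset("CDEF"): "D"}   # D, not C, is deactivated next

def step(coactive):
    """One updating cycle: recruit the converged-upon feature and
    release the coactivate receiving the least converging activity."""
    key = frozenset(coactive)
    new = NEXT_FEATURE[key]
    dropped = DROPPED.get(key)
    return (coactive - {dropped}) | {new}

state = set("BCDE")
state = step(state)   # B drops out, F recruited
state = step(state)   # D drops out, G recruited
```

After two steps the coactive set is {C, E, F, G}, matching panel 3 of the figure; a third step converges on H.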
FIG. 2 is a diagram depicting the reciprocal transformations of information between lower-order sensory nodes and higher-order PFC nodes. Sensory areas can only create one sensory image at a time, whereas the PFC is capable of holding the salient or goal-relevant features of several sequential images at the same time.
FIG. 3 is a diagram depicting the behavior of features that are held active in the PFC. 1) Shows that features B, C, D and E which are held active in the PFC all spread their activation energy to lower-order sensory areas where a composite image is built that is based on prior experience with these features. 2) Shows that features involved in the retinotopic imagery from time sequence 1 converge on the PFC neurons responsible for feature F. Feature B drops out of activation, and C, D, E and F remain active and diverge back onto visual cortex. 3) Shows that this same process leads to G being activated and D being deactivated.
FIG. 4 is a list of processes involved in the central AI algorithm implemented by the present device.
1) Either sensory information from the environment, or top-down internally held specifications, or both are sent to low-order sensory neural network layers that contain feature extracting cells. This includes either feedforward sensory information from sense receptors (experiential perception) or from downstream retroactivation from higher-level nodes (internally guided imagery).
2) A topographic sensory map is made by each low-order, sensory neural network. These topographic maps represent the network's best attempt at integrating and reconciling the disparate stimulus and feature specifications into a single composite, topographic depiction. The map that is created is based on prior probability and training experience with these features.
3) In order to integrate the disparate features into a meaningful image, the map making neurons will usually be forced to introduce new features. The salient or goal-relevant features that have been introduced are extracted through a perceptual process where active, lower-order nodes spread their activity to higher-order nodes. As the new features pass through the neural networks, some are given priority and are used to update the limited-capacity, working memory, storage buffer that is composed of active high-level nodes.
4) Salient features that cohere with features that are already active in the higher-order nodes are added to the active features there. The least relevant, least fired upon features in higher-order areas are dropped from activation. The sustained firing of a subset of higher-order nodes allows the important features of the last few maps to be maintained in an active state.
5) At this point it is necessary for the system to implement a program that allows it to decide whether it will continue operating on the previously held nodes or redirect its attention to the newly introduced nodes. Each time the new features garnered from the topographic maps are used to update the working memory store, the agent must decide what percentage of previously active higher-order nodes should be deactivated in order to reallocate processing resources to the newest set of salient features. Prior probability, established through saliency training, will determine the extent to which previously active nodes remain active.
6) The updated subset of higher-order nodes will then spread its activity backwards toward lower-order sensory nodes in order to activate a different set of low-order nodes, culminating in a different topographic sensory map.
7) A. The process repeats.
B. Salient sensory information from the actual environment interrupts the process. The lower-order nodes and their imagery, as well as the higher-order nodes and their priorities, are refocused on the new incoming stimuli.
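The loop in steps 1 through 7 can be sketched in a few lines of code. The following is a minimal, illustrative simulation, not the device itself: the layer sizes, the random weight matrix, and the use of upward drive as a stand-in for learned salience are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
N_LOW, N_HIGH, CAPACITY = 32, 16, 5      # illustrative layer sizes and buffer limit

W = rng.normal(size=(N_HIGH, N_LOW))     # stands in for trained reciprocal weights

def cycle(active_high, sensory_input):
    """One pass of steps 1-6: build a composite map, then update the buffer."""
    # Steps 1-2: top-down specifications (plus any input) drive a composite map.
    top_down = np.zeros(N_HIGH)
    top_down[list(active_high)] = 1.0
    low_map = np.tanh(W.T @ top_down + sensory_input)
    # Step 3: the map spreads its activity back up to the higher-order nodes.
    salience = W @ low_map
    # Steps 4-5: retain the CAPACITY most strongly driven high-order nodes;
    # the least fired-upon features are dropped from activation.
    return set(np.argsort(salience)[-CAPACITY:].tolist())

buffer = set(rng.choice(N_HIGH, CAPACITY, replace=False).tolist())
for _ in range(10):                      # step 7A: the process repeats
    new_buffer = cycle(buffer, 0.1 * rng.normal(size=N_LOW))
    carried_over = buffer & new_buffer   # overlap between consecutive states
    buffer = new_buffer
```

Because the buffer is updated rather than replaced wholesale, consecutive states typically share members, which is the continuity property the architecture is built around.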
FIG. 5 demonstrates the architecture of the interfacing neural networks.
FIG. 6 illustrates how relevant features can be maintained through time using nodes with sustained firing. The figure compares the number of past nodes that remain active at the present time period (“now”) in a normal human, a human with PFC dysfunction, and the hypothetical AI agent. The AI agent is able to maintain a larger number of higher-order nodes through a longer time span, ensuring that its perceptions and actions, now, will be informed by a larger amount of recent information. Note how the lower-order sensory and motor features are the same in each graph with respect to their number and duration, yet those in association areas are highest in both number and duration for agent C.
FIG. 7 depicts an octopus within a brain in an attempt to communicate how continuity is made possible in the brain and in the present device. When an octopus exhibits seafloor walking, it places most of its arms on the sand and gradually repositions arms in the direction of its movement. Similarly, the mental continuity exhibited by the present device is made possible because even though some representations are constantly being newly activated and others deactivated, a large number of representations remain active together. This process allows the persistence of “cognitive” content over elapsing time, and thus over machine processing states.
BACKGROUND OF THE ARCHITECTURE
The Artificial PFC: Continuity Through Sustained Activity
To create a strong form of AI it is necessary to have an understanding of what is taking place that allows intelligence, thought, cognition, consciousness or working memory to move through space and time, or, in a word, to “propagate.” Such an understanding must be grounded in physics because it must explain how the physical substrate of intelligence operates through space and time (Chalmers, 2010). The human brain is just such an intelligent physical system, one that AI researchers have attempted to understand and replicate using a biomimetic approach (Gurney, 2009). Features of the biological brain have been key in the evolution of neural networks, but the brain may hold information processing principles that have not yet been harnessed by A.I. efforts (Reser 2011, 2012, 2013).
The mammalian PFC and other association cortices have neurons that are specialized for “sustained firing,” allowing them to generate action potentials at elevated rates for several seconds at a time (generally 1-30 seconds) (Fuster, 2009). In contrast, neurons in other brain areas, including cortical sensory areas, remain active only for milliseconds unless sustained input from association areas makes their continued activity possible (Fuster, 2009). In the mammalian brain, prolonged activity of neurons in association areas, especially prefrontal and parietal areas, allows for the maintenance of specific features, patterns, and goals (Baddeley, 2007). Working memory, executive processing and cognitive control are widely thought to stem from the active maintenance of patterns of activity in the PFC that represent goal-relevant features (Goldman-Rakic, 1995). The temporary persistence of these patterns ensures that they continue to transmit their effects on network weights as long as they remain active, biasing other processing, and affecting the interpretation of subsequent stimuli that occur during their episode of continual firing.
The pattern of activity in the brain is constantly changing, but because some individual neurons persist during these changes, particular features of the overall pattern will be continuous, uninterrupted, or conserved over time. In other words, the distribution of active neurons in the brain transfigures gradually and incrementally from one configuration to another, instead of changing all at once. If it were not for the phenomena of sustained firing and cortical priming, instantaneous mental states would be discrete and isolated rather than continuous with the states before and after them. Thus the human brain is an information processing system that maintains a large list of representations that is constantly in flux: new representations are continually added, some are removed, and still others are maintained. The present device will be constructed to mimic this biological phenomenon.
Although its limits are presently being debated, the human neocortex is clearly capable of holding numerous neural representations active over numerous points in time. The quantity of mental continuity is directly proportional to the number of such sustained representations and the length of time of their activity (Reser, 2011, 2012, 2013).
Graphical depiction of STC. Each bracket represents the active time span of a neural representation. The x axis represents time and the y axis demarcates the cortical area where the representation is active. Red brackets denote representations that have exhibited uninterrupted activity from the point when they became active, whereas blue brackets denote representations that have not been sustained. In time sequence 1 representations B, C, D and E have remained active until t1. In time sequence 2 B has deactivated, C, D and E have remained active, and F is newly active. The figure depicts a system with STC because more than one representation (C, D, and E) has been maintained over more than one point in time (t1 and t2). Sensory and association areas do not exhibit continuity between the two time sequences shown although they would on shorter time intervals.
In Figure 1 above, representations B, C, D, and E are active during time 1, and C, D, E and F are active during time 2. Thus representations C, D, and E demonstrate STC because they exhibit continuous and uninterrupted activity from time 1 through time 2. The brain state at time 1 and the brain state at time 2 share C, D, and E in common and, because of this, can probably be expected to share other commonalities including: similar information processing operations, similar memory search parameters, similar mental imagery, similar cognitive and declarative aspects, and similar experiential and phenomenal characteristics.
A simplified graphical representation of STC depicting it as a gradually shifting, stream-like distribution. Figure 2 extends Figure 1 over multiple time intervals revealing a repeating pattern: remnants from the preceding state are consistently carried over to the next state. If this distributional plot were modeling neurons rather than representations there might be thousands of units per time period rather than four, but the repeating pattern should be conserved.
Computational operations that take place as a computer executes lines of code (rule-based, if-then operations) to transform input into output have discrete, predetermined starting and stopping points. For this reason computers do not exhibit continuity in their information processing. There are no existing forms of artificial intelligence that use mental continuity as described here. There are computing architectures with limited forms of continuity, where the current state is a function of the previous state and where data is entered into a limited-capacity buffer to inform other processes. However, such a memory buffer is not multimodal, is not positioned at the top of a hierarchical system, and does not inform and interact with topographic imagery.
The mammalian neocortex is capable of holding a number of such mnemonic representations coactive, and using them to make predictions by allowing them to spread their activation energy together, throughout the thalamocortical network. This activation energy converges on the inactive representations from LTM that are the most closely connected with the current group of active representations, making them active, and pulling them into short-term memory. Thus new representations join the representations that recruited them, becoming coactive with them.
The way that assemblies and ensembles are selected for activity in this model is consistent with spreading activation theory. In spreading activation theory, associative networks can be searched by labeling a set of source nodes, which spread their activation energy in a nonlinear manner to closely associated nodes (Collins & Loftus, 1975). Cortical assemblies work cooperatively by spreading the activation energy (both excitatory and inhibitory) necessary to recruit or converge upon the next set of ensembles that will be coactivated with the remaining ensembles from the previous cycle.
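A toy spreading-activation search in this spirit can be written directly. The associative graph and its weights below are illustrative assumptions, with node names borrowed from the text's later “pink,” “rabbit,” “drum” example:

```python
# Toy associative network; the node pairs and weights are assumptions.
weights = {
    ("pink", "rabbit"): 0.6, ("rabbit", "drum"): 0.4,
    ("pink", "batteries"): 0.3, ("rabbit", "batteries"): 0.5,
    ("drum", "batteries"): 0.7, ("drum", "cymbal"): 0.2,
}

def spread(sources):
    """Sum the activation each inactive node receives from the source set."""
    received = {}
    for (a, b), w in weights.items():
        for src, tgt in ((a, b), (b, a)):
            if src in sources and tgt not in sources:
                received[tgt] = received.get(tgt, 0.0) + w
    return received

# The coactive sources pool their energy; activity converges on the node
# receiving the most, which then joins the sources as a new coactivate.
sources = {"pink", "rabbit", "drum"}
received = spread(sources)
converged = max(received, key=received.get)   # "batteries"
```

Here "batteries" wins because it receives converging input from all three sources at once, which is the nonlinear, combination-sensitive search the spreading activation account describes.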
Together, they impose sustained information processing demands on the lower-order sensory and motor areas within the reach of their long-range connections. The longer the activity in these higher-order neurons is sustained, the longer they remain engaged in hierarchy-spanning, recurrent processing throughout the cortex and subcortex.
Table 1: The Characteristics of Polyassociativity:
Gradual additions to and subtractions from a pool of simultaneously coactivated ensembles occur as:
1. Assemblies that continue to receive sufficient activation energy from the network are maintained.
2. Assemblies that receive sufficiently reduced activation energy are released from activation.
3. New assemblies, which are tuned to receive sufficient activation energy from the current constellation of coactivates, are converged upon, and incorporated into the remaining pool of active assemblies from the previous cycle.
Outlining the process of polyassociativity in this way is meant to show that the computational algorithm used by the brain may be primarily directed at determining which inactive ensembles are the most closely statistically related to the currently active assemblies. From this perspective, the contents of the next state are chosen based on how the currently active assemblies interact with the existing, associative, neuro/nomological network.
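The three rules of Table 1 amount to a single update step. A minimal sketch follows, in which the assembly names echo Figure 4 but the energy values and both thresholds are assumptions made for illustration:

```python
# One polyassociativity cycle per Table 1; the energies and thresholds
# are illustrative assumptions, not values from the text.
def update_pool(pool, energy, keep_threshold, recruit_threshold):
    """energy maps each assembly to the activation it currently receives."""
    kept = {a for a in pool if energy[a] >= keep_threshold}            # rule 1
    # rule 2 is implicit: pool members below keep_threshold are released
    recruits = {a for a in energy
                if a not in pool and energy[a] >= recruit_threshold}   # rule 3
    return kept | recruits

pool = {"C", "D", "E", "F"}
energy = {"C": 0.9, "D": 0.2, "E": 0.8, "F": 0.7, "G": 1.1, "H": 0.3}
new_pool = update_pool(pool, energy, keep_threshold=0.5, recruit_threshold=1.0)
# D is released, C, E, and F are maintained, and G is converged upon:
# the transition from panel 2 to panel 3 of Figure 4.
```

Note that what drops out is determined by received activation energy, not by order of entry, matching the observation that D can be released while the older C persists.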
A diagram depicting “polyassociativity” and illustrating the ways in which high-level representations, or ensembles, are displaced, maintained, and newly activated in the brain. 1) Shows that representation A has already been deactivated and that B, C, D and E are now coactivated, mirroring the pattern of activity shown in Figure 1. When coactivated, these representations pool and spread their activation energy, resulting in the convergence of activity onto a new representation, F. Once F becomes active, it immediately becomes a coactivate, restarting the cycle. 2) Shows that B has been deactivated while C, D, E, and F are coactivated and G is newly activated. 3) Shows that D but not C has been deactivated. In other words, what is deactivated is not necessarily what entered first, but what has proven to receive the least converging activity. C, E, F, and G coactivate and converge on H.
In Figure 4, between time periods 1 and 2, C, D and E exhibit STC, whereas, between time periods 1 and 3, only C and E exhibit STC. C and E are active over all three time periods, meaning that these representations are being used as search function parameters for multiple cycles, they are the subject of attention, and they are demonstrating STC. Alternatively, we can imagine a scenario where B, C, D, and E from step one of Figure 4 were immediately replaced by F, G, H, and I. Such a processing system may still be using previous states to determine subsequent states; however, because no activity is sustained through time, there would be no continuity in such a system (this is generally how computing systems process information).
In Figure 4, ensembles C and E have fired together over three individual time intervals, and thus will show a propensity to wire together, increasing their propensity for firing together in the future. This link between them will allow one to recruit the other. However, it is probably much more likely that they will recruit each other if the other contextual ensembles are also present. The coincidental or rare associations between the ensembles of an experience are probably mostly lost from non-hippocampal dependent cortical memory. However the reoccurring associations are heavily encoded and persist as semantic knowledge.
In other words, the cortex constantly spreads activation energy from novel combinations of active ensembles that have never been coactive before and attempts to converge upon the statistically most relevant association without certain or exact precedent, resulting in a solution that is not guaranteed to be optimal. Optimality could be approached if a specific group of ensembles (say, C and E) has been thoroughly associated with many others and a type of expertise with these concepts has developed through extensive operant conditioning.
Because of their sustained activity neurons in the PFC can span a wider delay time or input lag between associated occurrences (Zanto, 2011) allowing elements of prior events to become coactive with subsequent ones (Fuster, 2009). Without sustained firing, the ability to make associations between temporally distant (noncontiguous) environmental stimuli is disrupted. Sustained activity allows neurons that would otherwise never fire together to both fire and wire together. Thus, it may be reasonable to assume that sustained firing underlies the brain’s ability to make subjective, internally-derived associations between representations that would never co-occur simultaneously in the environment.
This indicates that one way to quantify mental continuity is to determine the proportion of previously active neural nodes that have remained active during a resource demanding cognitive task. Uninterrupted activity augments associative searches by allowing specific features to serve as search function parameters for multiple cycles. Intelligence in this system can be expected to increase along with increases in: 1) the number of available nodes to select from, 2) the number of nodes that can be coactivated simultaneously, and 3) the length of time that individual nodes can remain active.
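The quantification suggested above is straightforward to state as code. A minimal sketch (the function name is assumed), using the Figure 1 sequence as input:

```python
# Continuity as the proportion of previously active nodes still active now.
def continuity(previous_active, current_active):
    if not previous_active:
        return 0.0
    return len(previous_active & current_active) / len(previous_active)

# The Figure 1 progression: B, C, D, E at time 1 -> C, D, E, F at time 2.
score = continuity({"B", "C", "D", "E"}, {"C", "D", "E", "F"})  # 0.75
```

A score of 1.0 would mean perfect perseveration and 0.0 would mean the discrete, isolated states attributed above to conventional computing; intermediate values correspond to the gradually shifting distribution the architecture targets.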
The mesocortical dopamine (DA) system plays an important role in sustained activity, suggesting that it may be heavily involved in mental continuity. Dopamine sent from the ventral tegmental area (VTA) modulates the activity and timing of neural firing in the PFC, association cortices, and elsewhere. Dopamine neurotransmission in the PFC is thought to underlie the ability to internally represent, maintain, and update contextual information (Braver & Cohen, 1999). This is necessary because information related to behavioral goals must be actively sustained such that these representations can bias behavior in favor of goal-directed activities over temporally extended periods (Miller & Cohen, 2001). It has become clear that the activity of the DA/PFC system fluctuates with environmental demand (Fuster & Alexander, 1971). Many studies have suggested that the system is engaged when reward or punishment contingencies change. Both appetitive and aversive events have been shown to increase dopamine release in the VTA, causing sustained firing of PFC neurons (Seamans & Robbins, 2010). Seamans and Robbins (2010) elaborated a functional explanation to support this case. They have stated that the DA system is phasically activated in response to novel rewards and punishments because it is adaptive for the animal to anchor upon and further process novel or unpredicted events.
It is important for mammals to identify and capture information about unexpected occurrences so that it can be further processed and systematic patterns can be identified. The novel experience is probably broken down into its component parts and the representations in memory for these parts are allowed to spread their activation energy in an attempt to converge on and activate historically associated representations that are not found in the experience itself. Because memory traces for the important features remain active and primed, they can be used repeatedly as specifications that guide the generation of apposite mental imagery in sensory areas (Reser, 2012). It is highly probable that sequences of lower-order topographic images depict and explore hypothetical relationships between the higher-order, top-down specifications. This amounts to a continual attempt to use the associative memory system to search sensory memory for a topographic image that can meaningfully incorporate the important features. It seems that reciprocating activity between the working memory updating system and the imagery generation system builds interrelated sequences of mental imagery that are used to form expectations and predictions.
The fact that newly active search terms are combined with search terms from the previous cycle gives this process qualities of “progressive iteration.” Perhaps reciprocating activity between the working memory updating system and the imagery generation systems generates sequences of interrelated mental images that build on themselves to form abductive expectations and predictions.
The Neocortex: Reciprocating Crosstalk between Association and Sensory Cortex
The higher-order features that are maintained over time by sustained neural firing are used to create and guide the construction of mental imagery (Reser, 2012). The brain’s connectivity allows reciprocating cross-talk between fleeting bottom-up imagery in early sensory cortex and lasting top-down priming in late association cortex and the PFC. This process allows humans to have progressive sequences of related thoughts, where thinking is based heavily on lower order sensory areas and the topographic mappings that they generate in order to best represent a set of higher-order features.
To a certain extent, perceptual sensory processing is thought to be accomplished hierarchically (Cohen, 2000). The cortical hierarchy observed from sensory to association cortex arises because simple patterns are arranged to converge upon second-order patterns, which in turn converge on third-order patterns and so on. This leads to a hierarchy of increasingly complex representations. Many pathways in the brain, such as the ventral visual pathway, appear to use a “structurally descriptive” architecture where neurons or neural populations that encode low-level, nonaccidental features are allowed to converge together onto those that encode more abstract, higher-order, generic, template-like features (Edelman, 1997). A structural description is defined as “a description of an object in terms of the nature of its constituent parts and the relationships between those parts (Wolfe et al., 2009).” Hierarchical processing that specifies structural descriptions is thought to allow perceptual invariance and robust postcategorical typology.
Internally-derived sensory imagery, such as that seen in the “mind’s eye,” probably appears topographically organized because it is created by the same lower-order networks responsible for perceiving external stimuli (Meyer & Damasio, 2009; Meyer, 2011). Thus it may be safe to assume that when we think and imagine, we construct and manipulate maps in early perceptual networks. During perception, the bottom-up activity may be driving and the top-down may be modulatory; however, during imagination the top-down activity may be driving and the bottom-up may be modulatory. These conceptions are consistent with the “consolidation hypothesis,” which states that memory is stored in the same areas that allow active, real-time perception and function (Moscovitch, et al., 2007).
It is thought that object recognition, decision making, associative recall, planning and other important cognitive processes, involve two-way traffic of signal activity among various neural maps that stretch transversely through the cortex from early sensory areas to late association areas (Klimesch, Freunberger, & Sauseng, 2010). Bottom-up sensory areas deliver fleeting sensory information and top-down association areas deliver lasting perceptual expectations in the form of templates or prototypes. These exchanges involve feedforward and feedback (recurrent) connections in the corticocortical and thalamocortical systems that bind topographic information from lower-order sensory maps about the perceived object with information from higher-order maps forming somewhat stable constellations of activity that can remain stable for tens or hundreds of milliseconds (Crick & Koch, 2003).
STC impacts this reciprocating cross-talk. These reciprocations may create progressive sequences of related thoughts, specifically because the topographic mappings generated by lower-order sensory areas are guided by the enduring representations that are held active in association areas (Reser 2011, 2012, 2013). The relationship between anterior and posterior cortex may be best characterized by two main principles: 1) association areas maintain representations from not one but several of the last few topographic maps made in sensory areas; 2) because they are drawing from a register with sustained contents, sequential images formed in sensory areas have similar content and thus should be symbolically or semiotically related to one another.
Feedback activation from top-down association areas hands down specifications to early sensory cortex for use in imagery building. Disparate chunks of information are integrated into a plausible map and transiently bound together. This integrative process may be very rapid and may use the structurally descriptive perceptual hierarchy in reverse to go from abstractions to specifics.
Sustained firing and recurrent processing make it possible for recent states to spill over into subsequent states, creating the context for them in a recursive fashion. In a sense, each new topographic map is embedded in the previous one. This creates a cyclical, nested flow of information processing marked by STC, which is depicted in Figure 6.
Consecutive topographic images about a specific scenario model the scenario by holding some of the contextual elements constant, while others are allowed to change. Thus prior maps set premises for and inform subsequent maps. Learned mental tasks probably have distinct predefined algorithmic sequences of topographic mappings that must be completed in sequence in order to achieve the solution. Each brain state would correspond to a different step in the algorithm, and its activity would recruit the next step. All logical and methodical cognition may require that a number of relevant features from the present scenario remain in STC so that they spread their activity within the network in order to influence the selection of the ensembles necessary for task satisfaction.
In reality, association areas have much more to converse with than simply a single retinotopic map as depicted in Figure 6. In fact, they feed their specifications to and receive specialized input from dozens of known topographic mapping areas (Kaas, 1997). These areas of different sensory modalities are constantly responding to incoming activity in an attempt to pull up the most context-appropriate map in their repertoire. Interestingly, the sensory modules that build these maps take specifications not only from association areas, but also from other sensory modules (Klimesch, Freunberger, & Sauseng, 2010). Further compounding the complexity, these sensory modules probably have their own limited form of STC where certain low-level features can exhibit sustained activity. Moreover, motor and premotor modules give specifications to and receive specifications from this common workspace while they are building their musculotopic imagery for movement. The same goes for language areas.
A diagram depicting the reciprocal transformations of information between lower-order sensory mappings and higher-order association area ensembles during internally generated thought. Sensory areas can only create one topographic mapping at a time, whereas association areas are capable of holding the salient or goal-relevant features of several sequential mappings at the same time.
In a sense, the higher and lower order areas are constantly interrogating each other, and providing one another with their expert knowledge. For instance, the higher-order areas have no capacity to foresee how the specifications that they hold will be integrated into metric imagery. Also, the lower-order nodes must introduce other, unspecified features into the imagery that they build, and this generally provides the new content for the stream of thought. For example, if higher-order nodes come to hold features supporting the representations for “pink,” “rabbit,” and “drum,” then the subsequent mappings in lower-order visual nodes may activate the representations for batteries, and the auditory nodes may activate the representation for the words “Energizer bunny.” The central executive (the PFC and other association areas) directs progressive sequences of mental imagery in a number of topographic sensory and motor modules, including the visuospatial sketchpad, the phonological (articulatory) loop and the motor cortex. This model frames consciousness as a polyconceptual, partially-conserved, progressive process that performs its high-level computations through “reciprocating transformations between buffers.” More specifically, it involves reciprocating transformations between a partially conserved store of multiple conceptual specifications and another nonconserved store that integrates these specifications into veridical, topographic representations.
A diagram depicting the behavior of representations that are held active in association areas. 1) Shows that the representations B, C, D, and E, which are held active in association areas, all spread their activation energy to lower-order sensory areas where a composite image is built that is based on prior experience with these representations. 2) Shows that features involved in the topographic imagery from time sequence 1 converge on the PFC neurons responsible for F. B drops out of activation, and C, D, E and F remain active and diverge back onto visual cortex. 3) Shows that the same process leads to G being activated and D being deactivated, mirroring the pattern of activity shown in Figure 4.
Uses the format of Figure 1 to illustrate how relevant features can be maintained through time using nodes with sustained firing. The figure compares the number of past nodes that remain active at the present time (t1), in a normal human, a human with PFC dysfunction, and the hypothetical A.I. agent. The A.I. agent is able to maintain a larger number of higher-order nodes through a longer time span, ensuring that its perceptions and actions in time 1 will be informed by a larger amount of recent information. Note how the lower-order sensory and motor features are the same in each graph with respect to their number and duration, yet those in association areas are the highest in both number and duration for agent C.
If this sustained firing were programmed to happen at even longer intervals, and to involve even larger numbers of nodes, the system would exhibit a superhuman capacity for continuity. This would increase the ability of the network to make associations between temporally distant stimuli and allow its actions to be informed by more temporally distant features and concerns. Aside perhaps from altering the level of arousal (adrenaline) or motivation (dopamine), it is currently not possible to engineer the human brain in a way that would increase the number and duration of active higher-order representations. However, in a biomimetic instantiation, it would be fairly easy to increase both the number and duration of simultaneously active higher-order nodes (see Figure 9 below). Accomplishing this would allow the imagery that is created to be informed by a larger number of concerns, and would ensure that important features were not omitted simply because their activity could not be sustained due to biological limitations. Of course, in order to operate meaningfully, and to reduce its propensity for recognizing “false patterns,” such an ultraintelligent system would require extensive supervised and unsupervised learning.
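In software the two quantities in question, how many higher-order nodes stay active and for how long, are simply parameters. The following sketch (the class name, eviction rule, and all values are assumptions) shows a buffer where both can be scaled well past any biological limit:

```python
# Buffer whose capacity and per-node dwell time are free parameters;
# the names and numbers here are illustrative, not from the text.
class SustainedBuffer:
    def __init__(self, capacity, max_age):
        self.capacity, self.max_age = capacity, max_age
        self.ages = {}                              # node -> cycles held active

    def update(self, new_features):
        # Nodes age each cycle and drop out once max_age is reached.
        self.ages = {n: a + 1 for n, a in self.ages.items()
                     if a + 1 < self.max_age}
        for n in new_features:
            self.ages.setdefault(n, 0)
        while len(self.ages) > self.capacity:       # evict the oldest overflow
            del self.ages[max(self.ages, key=self.ages.get)]
        return set(self.ages)

human_like = SustainedBuffer(capacity=4, max_age=30)
superhuman = SustainedBuffer(capacity=400, max_age=3000)  # trivially scaled up
```

With this oldest-first eviction rule, feeding the buffer the Figure 1 features reproduces the gradual B-to-F turnover described there; making the buffer "superhuman" amounts to nothing more than raising the two constructor arguments.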
It is highly probable that a succession of lower-order topographic images or maps created in sensory processing modules depicts and explores hypothetical, causal relationships between the higher-order, top-down specifications held in STC.
Again, System 1 is making automatic, intuitive, flash judgments, but because of the STC made possible by sustained firing, these rapid associations are able to support and buttress each other in a progressive and additive manner. System 2 cognition may be present when several nodes in association areas exhibit sustained firing and are used multiple times to build topographic or musculotopic maps, culminating in sensory imagery or motor output that could not be informed by any of the intermediate steps alone, or that solves a problem too difficult for any System 1 process by itself. For example, early processes may provide premises or propositional stances that can be used algorithmically (e.g., syllogistically) to derive or justify a conclusion in subsequent processes.
To accomplish overt behavior, higher inputs are fed not only to the lower sensory nodes, but also in a similar, top-down manner to a behavior module that will guide natural language output and other behaviors such as robotic control. The final layer of nodes in this behavior module will be nodes that directly control movement and verbalization and the higher nodes will be continuous with the higher-order PFC-like nodes. The software functions in an endless loop of reciprocating transformations between sensory nodes, motor nodes and PFC-like buffer.
Learning in the Network
The system will begin untrained, with random connection weights between nodes. Learning should be concentrated on the early sensory networks first. This follows the ontogenetic learning arc seen in mammals, where the earliest sensory areas myelinate in infancy and late association areas such as the PFC do not finish myelinating until young adulthood. Of course, this form of artificial intelligence would need a prolonged series of developmental experiences, similar to a childhood, to learn which representations to keep active in which scenarios. The network will act to consolidate or potentiate in memory the specific groupings of nodes that have produced favorable outcomes, in order to more rapidly inform future decision making.
Other Forms of Sustained Activity
Aside from having a PFC analogue, the network could also have analogues of cortical priming and of the hippocampus. Human thought carries continuity because changes in content are gradual: more recent activations and representations are given higher priority than older ones. Activity that transpired minutes ago is given moderate priority, activity from seconds ago is given high priority, and activity from mere milliseconds ago is given the highest priority. This dynamic is made possible by the PFC analogue, but could be accentuated by an analogue of cortical priming. To allow for such an analogue, all recently active nodes would retain a moderate amount of increased, but subthreshold, activity. The activity level of recently used nodes in both the higher and lower-order areas would not quite fall back to zero. This would ensure that recently used patterns and features are given a different form of priority, to a lesser and more general extent than that allowed by the PFC analogue. Regarding the network partitions depicted in Figure 5, the sensory, motor, and hippocampal neural networks would show the least priming, the association-area and premotor neural networks would show moderate priming, and the PFC would show the highest degree of priming. The functions and parameters of priming could be fine-tuned by genetic algorithms.
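A possible toy rendering of the priming analogue, assuming (for illustration only) region-specific retention rates and a small subthreshold floor so that recently used activity never falls quite back to zero:

```python
PRIMING_RATE = {          # fraction of activity retained per cycle (assumed)
    "sensory": 0.1,       # least priming (also motor and hippocampal nets)
    "association": 0.5,   # moderate priming (also premotor nets)
    "pfc": 0.9,           # highest degree of priming
}

def decay(activity, region, floor=0.05):
    """One cycle of subthreshold decay for a node that is no longer active."""
    residual = activity * PRIMING_RATE[region]
    # recently used nodes keep a small residual rather than returning to zero
    return max(residual, floor)
```

Under these assumed rates, a PFC-analogue node retains most of its activity across a cycle while a sensory node retains almost none, biasing the network toward recently used patterns in a graded, region-specific way.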
Furthermore, the network could have an analogue of the hippocampus, which would keep a record of contextual, or episodic, clusters of previous node activation. Instead of keeping a serial record of averaged activity, the hippocampal analogue would capture episodic constellations of node activity and save these to be reactivated later. An episodic memory constellation would be reactivated when a large subset of that constellation is present during processing. This means that when neural network activity closely approximates an activity constellation that was present in the past, the hippocampal analogue is capable of reactivating the original constellation. The activity of the hippocampal analogue should be informed by actual hippocampal anatomy and the “pattern completion” hypothesis of hippocampal function. To build such an analogue into a neural net, it would be necessary to have a form of episodic memory that can be cued by constellations of activity that closely resemble a past (autobiographical or episodic) occurrence. This memory system would then be responsible for “completing the pattern,” or passing activation energy to the entire set of nodes that were involved in the original experience, allowing the system a form of episodic recall. As in the actual brain (Amaral, 1987), the hippocampal analogue should be reciprocally connected with the PFC and association areas but not with primary sensory or motor areas.
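The pattern-completion behavior described above can be illustrated with a toy Hopfield-style autoassociative memory, which stores a constellation of node activity by a Hebbian outer-product rule and restores it from a partial cue. The pattern, network size, and update schedule are illustrative assumptions; a real hippocampal analogue would be far larger.

```python
def store(patterns, n):
    # Hebbian outer-product rule over +/-1 activity patterns
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

def complete(W, cue, steps=5):
    # synchronous updates until the stored constellation is restored
    s = list(cue)
    n = len(s)
    for _ in range(steps):
        s = [1 if sum(W[i][j] * s[j] for j in range(n)) >= 0 else -1
             for i in range(n)]
    return s

stored = [1, 1, -1, -1, 1, -1]               # an "episodic constellation"
W = store([stored], 6)
recalled = complete(W, [1, 1, -1, 1, 1, -1])  # cue with one corrupted node
```

Cueing with a close-but-imperfect subset restores the original constellation in full, which is the analogue of episodic recall from a partial reminder.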
Appropriate Neural Network Parameters for the Present Device
A network is “trained” to recognize a pattern by adjusting arc weights in the way that most efficiently leads to the desired results. Arcs contributing to the recognition of a pattern are strengthened, and those leading to inefficient or incorrect outcomes are weakened. The network “remembers” individual patterns and uses them when processing new data. Neural learning adjustments are driven by error, the deviation of the network's performance from some set goal. The network is provided with training examples, which consist of a pattern of activities for the input units along with the desired pattern of activities for the output units. The actual output of the network is contrasted with the desired output, resulting in a measure of error. Connection weights are altered so that the error is reduced and the network is better equipped to provide the correct output in the future. Each weight must be changed by an amount that is proportional to the rate at which the error changes as the weight is changed, an expression called the “error derivative for the weight.” In a network that features backpropagation, the weights in the hidden layers are changed beginning with the layers closest to the output layer, working backwards toward the input layer. Such backpropagating networks are commonly called multilayer perceptrons, descendants of Rosenblatt's (1958) single-layer perceptron. The present architecture involves a number of multilayered neural networks connected to each other, each using its own training criteria for backpropagated learning. For instance, the visual perception module would be trained to recognize visual patterns, the auditory perception module would be trained to recognize auditory patterns, and the PFC module would be trained to recognize multimodal, goal-related patterns.
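The “error derivative for the weight” can be made concrete with a single sigmoid unit trained by gradient descent; a full module would stack such units into the multilayer perceptrons just described. The input, target, and learning rate are arbitrary illustrative values.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(w, b, x, target, lr=0.5):
    y = sigmoid(w * x + b)        # forward pass: actual output
    error = y - target            # deviation from the desired output
    slope = y * (1.0 - y)         # derivative of the sigmoid
    grad_w = error * slope * x    # the "error derivative for the weight"
    grad_b = error * slope
    # each weight moves in proportion to its error derivative
    return w - lr * grad_w, b - lr * grad_b

w, b = 0.1, 0.0
for _ in range(2000):             # repeated adjustment reduces the error
    w, b = train_step(w, b, x=1.0, target=0.9)
```

After training, the unit's output for the training input sits very close to the desired 0.9; backpropagation applies exactly this computation layer by layer, output side first.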
An early hierarchical multilayered network, the cognitron (forerunner of the neocognitron), was developed by Kunihiko Fukushima (1975). This system and its descendants are based on the visual processing theories of Hubel and Wiesel, and they form a solid archetype for the present device because they feature multiple types of cells and a cascading structure. Popular neural network architectures with features that could be valuable in programming the present device include the adaptive resonance theory network (Carpenter & Grossberg, 2003), the Hopfield network, the Neural Representation Modeler, the restricted Coulomb energy network, and the Kohonen network. Teuvo Kohonen (2001) showed that matrix-like neural networks can create localized areas of firing for similar sensory features, resulting in a map-like network where similar features are localized in close proximity and discrepant ones are distant. This type of network uses a neighborhood function to preserve the topological properties of the input space, and has been called a “self-organizing map.” This kind of organization would be necessary for the present device to accomplish imagery generation, and would contribute to the ability of the lower-order nodes in the sensory modules to construct topographic maps.
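A minimal sketch of the Kohonen-style update assumed here for topographic ordering: the best-matching node and its grid neighbors are pulled toward the input by a neighborhood function, so nearby nodes come to code similar features. The grid layout, learning rate, and neighborhood width are illustrative choices.

```python
import math

def som_step(nodes, x, lr=0.3, sigma=1.0):
    """nodes: list of (grid_position, weight_vector) pairs; x: input vector."""
    # best-matching unit = node whose weights are closest to the input
    bmu_pos, _ = min(nodes, key=lambda n: sum((wi - xi) ** 2
                                              for wi, xi in zip(n[1], x)))
    for pos, w in nodes:
        d2 = sum((p - q) ** 2 for p, q in zip(pos, bmu_pos))
        h = math.exp(-d2 / (2 * sigma ** 2))    # neighborhood function
        for i in range(len(w)):
            w[i] += lr * h * (x[i] - w[i])      # pull node (and neighbors) toward x
    return bmu_pos
```

Repeated over many inputs, this update yields the map-like organization described above: the winner moves most, its grid neighbors move proportionally less, and distant nodes barely move at all.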
In principal-components learning, a subset of hidden units cooperates in representing the input pattern; the representation is distributed across many of them. In competitive learning, by contrast, a large number of hidden units compete so that a single hidden unit is used to represent a particular input pattern, namely the unit whose incoming weights most closely match the characteristics of the input. The optimal method for the present purposes lies somewhere between purely distributed and purely localized representations. Each node will code for a discrete, albeit abstract, pattern, and nodes will compete among each other for activation energy and the opportunity to contribute to the depiction of imagery. However, multiple nodes will also work together cooperatively to create composite imagery.
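The competitive half of this hybrid scheme can be sketched as a winner-take-all update, where only the unit whose incoming weights best match the input is allowed to learn; in the hybrid described above, several such winners would then cooperate to build composite imagery. The learning rate and vectors are illustrative.

```python
def competitive_step(units, x, lr=0.5):
    """units: list of weight vectors; returns index of the winning unit."""
    dist = lambda w: sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    winner = min(range(len(units)), key=lambda k: dist(units[k]))
    # only the unit whose incoming weights best match the input learns
    units[winner] = [wi + lr * (xi - wi) for wi, xi in zip(units[winner], x)]
    return winner
```

Each input thereby recruits and refines exactly one localized representative, the opposite pole from the fully distributed principal-components case.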
When active high-level nodes signal each of the low-level nodes that they connect with, they are, in effect, retroactivating them. They are activating nodes that recently contributed to their activity, and activating previously dormant ones as well. This retroactivation of previously dormant nodes constitutes a form of anticipation or prediction, indicating that there is a high likelihood that the pattern these nodes code for will become evident (prospective coding). This kind of prediction might best be achieved by a hierarchical hidden Markov model, and utilizing Markov models and their predictive properties will likely be necessary. Such a process is used in Ray Kurzweil’s Pattern Recognition Theory of Mind (PRTM), which uses a hidden Markov model and a plurality of pattern recognition nodes for its cognitive architecture (Kurzweil, 2012). Hierarchical temporal memory (HTM) is another cognitive architecture that models some of the structural and algorithmic properties of the neocortex (Hawkins & Blakeslee, 2005). The hope with PRTM and HTM is that a hierarchically structured neural network with enough nodes and sufficient training should be able to model high-order human abstractions. However, distilling such abstractions and utilizing them to make complex inferences may necessitate an imagery guidance mechanism with a working memory updating function.
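The prospective-coding idea can be illustrated with a first-order Markov predictor: given the pattern just recognized, any successor whose transition probability clears a threshold has its nodes pre-activated before bottom-up evidence arrives. A hierarchical hidden Markov model would stack such predictors level by level; the patterns and probabilities here are invented for illustration.

```python
TRANSITIONS = {                      # learned pattern-to-pattern statistics
    "dog": {"bark": 0.7, "run": 0.3},
    "bark": {"run": 0.6, "dog": 0.4},
}

def retroactivate(current, threshold=0.5):
    """Return successor patterns likely enough to be primed in advance."""
    successors = TRANSITIONS.get(current, {})
    # pre-activating these nodes is the "prospective coding" described above
    return [p for p, prob in successors.items() if prob >= threshold]
```

Pre-activated patterns need less bottom-up evidence to reach threshold, so expected continuations are recognized faster than surprising ones.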
Neural networks can propagate information in one direction only, or they can be bi-directional, where activity travels up and down the network until self-activation at a node occurs and the network settles on a final state. So-called recurrent networks are constructed with extensive feedback connections. Such recurrent organization and bi-directionality would be important for accomplishing the oscillating transformations performed by the present device. Hebbian learning is an updating rule stating that a connection weight should grow when the input to a neuron fires at the same time as the neuron itself (Hebb, 1949). This type of learning algorithm would be important for the present device as well.
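The Hebbian rule just stated reduces to a one-line update; the learning rate is arbitrary.

```python
def hebbian_update(w, pre, post, lr=0.1):
    """w: current weight; pre/post: activities of the input and output units."""
    # the weight grows only when pre- and postsynaptic activity coincide
    return w + lr * pre * post
```

When either unit is silent the weight is unchanged; coincident firing strengthens the connection, the "fire together, wire together" behavior the text describes.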
Each topographic map that is formed could be assessed for appetitive or aversive content. The architecture depicted in Fig. 5 could be copied onto two separate, yet nearly identical systems, one fine-tuned for approach behaviors and the other for withdrawal behaviors. This could simulate the right and left cortical hemispheres. The right hemisphere could be associated with withdrawal and have longer connectional distances between nodes on average.
In some neural networks, the activation values for certain nodes are made to undergo a relaxation process such that the network will evolve to a stable state where large scale changes are no longer necessary and most meaningful learning can be accomplished through small scale changes. The capability to do this, or to automatically prune connections below a certain connection weight would be beneficial for the present purposes. It is also important to preserve past training diversity so that the system does not become overtrained by narrow inputs that are poorly representative.
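The pruning step suggested here is straightforward to sketch: connections whose weights fall below a threshold are removed outright, pushing the network toward a stable state where only small-scale changes remain. The threshold value is an arbitrary assumption.

```python
def prune(weights, threshold=0.05):
    """weights: dict mapping (src, dst) node pairs to connection weights."""
    # drop every connection weaker than the threshold, keeping the rest intact
    return {edge: w for edge, w in weights.items() if abs(w) >= threshold}
```

Pruning after large-scale learning has settled removes near-zero connections cheaply while leaving the meaningful structure untouched.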
The present architecture could be significantly refined through the implementation of genetic algorithms that select the optimal ways to fine-tune the model and set the parameters governing connectivity, the learning algorithms, and the extent of sustained activity. It might also be beneficial to implement a rule-based approach, where a core set of reliable rules is coded and used to influence decision making and goal prioritization. Many theorists agree that combining neural network approaches with more traditional symbolic approaches will better capture the mechanisms of the human mind. In fact, implementing symbolic rules to instantiate processing priorities could help the higher-order nodes to account for goal relevance. Such rules might be necessary to simulate the contributions of emotional, subcortical modules.
Many researchers have suggested that AI does not need to simulate human thought, but rather should simulate the essence of abstract reasoning and problem solving. It has been suggested that human reasoning can be reduced to Turing-like symbol manipulation (Turing, 1950). The present article has suggested that modeling “mental continuity” and using it to guide successive images is an essential part of this simulation.
There are no forms of AI that use mental continuity as described here. There are existing computing architectures with limited forms of continuity where the current state is a function of the previous state, and where active data is entered into a limited capacity buffer to inform other processes. However, there are no AI systems where this buffer is multimodal, positioned at the top of a hierarchical system, and that informs and interacts with topographic imagery.
The agent discussed here could be capable of integrating multiple existing AI programs that are specialized for specific tasks into a larger composite of coordinated systems.
This architecture may be capable of replicating the recursive and progressive properties of mental continuity discussed earlier.
The current objective is to create an agent that, through supervised or unsupervised feedback, can progress to the point where it takes on emergent cognitive properties and becomes a general problem solver or inference program capable of goal-directed reasoning, backwards chaining, and performing means-end analyses. The present device should constitute a self-organizing cognitive architecture capable of dynamic knowledge acquisition, inductive reasoning, dealing with uncertainty, high predictive ability, and low generalization error. If implemented and trained properly, it should be able to find meaningful patterns in complex data and improve its performance by learning. It should be the goal of A.I. experts to fine-tune such a system to become capable of autoassociation (the ability to recognize a pattern even though the entire pattern is not present) and perceptual invariance (generalizing over the style of presentation, such as visual perspective or font).
The writing here amounts to a qualitative account, is exploratory, contains unverified assumptions, makes untested claims, and leaves important concerns out of the discussion. A more complete and refined version would focus on better integration of existing knowledge from functional neuroanatomy, multisensory integration, clinical neuropsychology, brain oscillations, short-term and long-term potentiation, binding, the sustained firing behavior of cortical columns, and the cognitive neuroscience of attention.
The present architecture is designed to simulate human intelligence by emulating the mammalian fashion for selecting priority stimuli, holding these stimuli in a working memory store and allowing them to temporarily direct imagery generation before their activity fades. Using interfaced neural networks the system would model a large set of programming constructs or nodes that work together to continually determine, in real time, which from their population should be newly activated, which should be deactivated and which should remain active over elapsing time to form a “stream” or “train” of thought. The network’s connectivity allows reciprocating cross-talk between fleeting bottom-up imagery in early sensory networks and lasting top-down priming in association and PFC networks. The features that are maintained over time by sustained neural firing are used to create and guide the construction of topographic maps (imagery). The PFC and other association area neural networks direct progressive sequences of mental imagery in the visual, auditory and somatosensory networks. The network contains nodes that are capable of “sustained firing,” allowing them to bias network activity, transmit their weights, or otherwise contribute to network processing for several seconds at a time (generally 1-30 seconds).
Cognitive control stems from the active maintenance of features/patterns in the PFC module that allow the orchestration of processing and the generation of imagery in accordance with internally selected priorities. The network is an information processing system that has the ability to maintain a large list of representations that is constantly in flux as new representations are constantly being added, some are being removed and still others are being maintained. This distinct pattern of activity, where some individual nodes persist during processing makes it so that particular features of the overall pattern will be uninterrupted or conserved over time. Because nodes in the PFC network are sustained, and do not fade away before the next instantiation of topographic imagery, there is a continuous and temporally overlapping pattern of features that mimics consciousness and the psychological juggling of information in working memory. This also allows consecutive topographic maps to have related and progressive content. If this sustained firing is programmed to happen at even longer intervals, in even larger numbers of nodes, the system will exhibit even more mental continuity over elapsing time. This would increase the ability of the network to make associations between temporally distant stimuli and allow its actions to be informed by more temporally distant features and occurrences.
Shows that information from motor and sensory cortices enters the focus of attention where it can then explicitly influence other sensory and motor cortices. As information leaves attention it can either be held temporarily in a less active form of STM (which can implicitly influence sensory and motor cortices) or it can deactivate and return to LTM. The arrow on the left indicates that in succeeding states, the letters will cycle downwards as their activity diminishes.
Fig. 11. The Process by which Short-Term Continuity Influences Global Processing
1) Information flows to early sensory cortex from the environment or from the association cortex.
2) Topographic sensory maps are constructed from this information within each low-order, sensory module. In order to integrate the disparate features into a meaningful image, the map-making neurons will be forced to introduce new features not found in their extrinsic inputs.
3) Information from the imagery travels bottom-up toward the association cortex. The salient or goal-relevant features from the mappings are used to update the group of sustained representations held active in the association cortex.
4) The least relevant, least converged-upon representations in the association cortex are dropped from sustained activation and “replaced” with new, salient representations. Thus, the important features of the last few maps are maintained in an active state.
5) The updated group of representations will then spread its activity backwards toward lower-order sensory nodes in order to activate a different set of low-order nodes culminating in a different topographic sensory map.
6) A. The process repeats.
B. Salient sensory information from the actual environment interrupts the process. The lower-order nodes and their imagery, as well as the higher-order nodes and their priorities, are refocused on the new incoming stimuli.
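The six steps above can be condensed into a runnable toy loop. Every helper below is a drastic stand-in for an entire module of the architecture: features are plain strings, a "topographic map" is merely the set of features depicted together, and salience is faked by a goal tag. All names are illustrative assumptions.

```python
def build_map(bottom_up, top_down):
    # steps 1-2: bind external input and sustained features into one composite
    return set(bottom_up) | set(top_down)

def salient(image):
    # step 3: stand-in salience filter -- keep only "goal-tagged" features
    return {f for f in image if f.startswith("goal:")}

def continuity_cycle(buffer, sensory_input, capacity=4):
    image = build_map(sensory_input, buffer)   # steps 1-2: construct imagery
    buffer = buffer | salient(image)           # steps 3-4: update the store
    while len(buffer) > capacity:              # step 4: drop excess features
        buffer.pop()
    return image, buffer                       # step 5: buffer seeds next map

buf = {"goal:find-food"}
image1, buf = continuity_cycle(buf, ["goal:water", "tree"])
image2, buf = continuity_cycle(buf, ["rock"])  # step 6A: the process repeats
```

Even in this crude form the loop shows the intended continuity: goals retained in the buffer from earlier cycles keep shaping later maps, while transient sensory features (the tree) pass through without persisting.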
It is an object of the present invention to simulate human intelligence by emulating the mammalian fashion for selecting priority stimuli, holding these stimuli in a working memory store and allowing them to temporarily direct imagery generation before their activity fades.
It is an object of the present invention to enhance AI data processing, decision making, and response to query.
Briefly, a known embodiment of the present invention is software that uses neural networks to model a large set of programming constructs or nodes that work together to continually determine, in real time, which from their population should be newly activated, which should be deactivated, and which should remain active over elapsing time to form the “stream” or “train” of thought.
An advantage of the present invention is that a computer can be caused to develop a simulated intelligence.
Another advantage of the present invention is that it will be easier and more natural to use a computer or computerized machine.
A third advantage of the present invention is that it will be readily implemented using available computer hardware and input/output devices.
In a general sense, the imagery generation protocol allows discrete features to be bound into composite maps. If the higher-order nodes for the features blue, wrinkled, and glove were made sufficiently active, they should be used to create a topographic map of a glove that is blue and wrinkled. Some of the features held by the higher-order nodes cannot be worked into a topographic map if the neural network does not have the previous experience to know how to corepresent them. At the beginning of their ontogenetic learning arc, higher-order nodes will be activated arbitrarily due to their random connections to lower-order nodes. In the program’s infancy, a specific object will activate all of the higher-order nodes that are connected to the features associated with that object. With time, only the higher-order nodes used the most will survive, and a much smaller subset of nodes that respond to all of the features at once will come to be the only nodes activated by the object.
Cell assemblies in the primate PFC hold tiny fragments of larger representations. Individual cell assemblies work cooperatively to represent larger psychological units known as chunks. George Miller has hypothesized that perhaps we can hold “7 plus or minus 2” chunks at a time. Cowan has demonstrated that 4 chunks may be a more realistic number. If these chunks can be imitated by a neural network, then it should be relatively simple to program the network to increase the number of chunks and the size of the network, effectively increasing processing resources in a way that is impossible in humans.
The program software would translate natural language queries and other user entries such as audio, video and still images, into instructions for the operating system to execute. This would involve transforming the input into the appropriate form for the system’s first layer of neural nodes. Simulating a simple neural network on von Neumann technology requires numerical database tables with many millions of rows to represent its connections which can require vast amounts of computer memory. It is important to select a computing platform or hardware architecture that will support this kind of software.
This system will eventually have to embrace a model of utility function. Generally, these models allow the agent to sense the current state of the world, predict the outcome of multiple potential actions available to it, determine the expected utility of these actions, and execute the action that maximizes expected utility. These decisions should be driven by probabilistic reasoning that chooses actions based on probability distributions of possible outcomes. Furthermore, the device should eventually assume a hybrid architecture between a reflex agent (which bypasses use of the association areas) and a decision-theoretic agent. Every time a problem is solved using explicit deliberation, a generalized version of the solution is saved for future use by the reflex component. This will allow the device to construct a large common-sense knowledge base of both implicit and explicit behaviors.
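A minimal sketch of the decision-theoretic component: each action is scored by its expected utility over a probability distribution of outcomes, and the maximizing action is executed. The actions, outcomes, and numbers are invented for illustration.

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def choose(actions):
    """actions: dict mapping an action name to its (probability, utility) pairs."""
    # execute the action that maximizes expected utility
    return max(actions, key=lambda a: expected_utility(actions[a]))

actions = {
    "approach": [(0.8, 10.0), (0.2, -50.0)],   # likely reward, but risky
    "withdraw": [(1.0, 1.0)],                  # safe but small payoff
}
```

Here the risky action's large possible loss drags its expectation below the safe option's, so the agent withdraws, exactly the probabilistic trade-off the model is meant to capture.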
Knowledge representation and knowledge engineering are central to AI research. Strong AI necessitates extensive knowledge of the world and must represent things such as: objects, properties, categories, relations between objects, events, states, time, causes, effects and many more (Moravec, 1988). Many problems in AI can be solved, in theory, by intelligently searching through many possible solutions. Logical proof can be attained by searching for a path that leads from premises to conclusions, where each step is the application of an inference rule. Planning algorithms search through trees of goals and subgoals, attempting to find a path to a target goal, a process called means-end analysis.
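The goal/subgoal search described here can be sketched as toy backward chaining over a rule base: a goal holds if it is a known fact or if all of its subgoals can in turn be achieved. The rules and facts are invented for illustration.

```python
RULES = {                      # goal -> subgoals that jointly achieve it
    "have_tea": ["have_water", "have_leaves"],
    "have_water": ["have_kettle"],
}
FACTS = {"have_kettle", "have_leaves"}

def achievable(goal):
    """Backward chaining: recurse from the target goal down to known facts."""
    if goal in FACTS:
        return True
    subgoals = RULES.get(goal)
    return subgoals is not None and all(achievable(g) for g in subgoals)
```

The recursion traces a path from premises to conclusion, each step being the application of an inference rule, which is the search process the paragraph describes.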
In order to solve problems, AI systems generally must have a number of attributes: 1) a way of representing knowledge with syntax and semantics, 2) an ability to search a problem set, 3) a capacity for propositional and first-order logic, and 4) an ability to use knowledge to perform searches, accomplish constraint satisfaction, plan, infer, perform probabilistic reasoning, maximize utility, and act under uncertainty. There are well-developed computational systems for each of these. The present device will not have any of these attributes before its training commences; these abilities will be emergent in its network, provided that it has the proper training examples. For instance, when it creates a topographic map from high-order specifications, it is searching its knowledge base for the most probable way to codepict or propositionalize the specifications in a logical, veridical fashion based on prior probability.
The imagery that is created is based either on external input or on internal, top-down specifications. Imagery is assessed using more imagery. Because each image will be assessed for appetitive or aversive content, the architecture depicted in Fig. 5 will be copied onto two separate, yet nearly identical systems, one fine-tuned for approach behaviors and the other for withdrawal behaviors.
In evolutionary algorithms an initial population of solutions/agents is created and evaluated. New members of the population are created from mutation and crossover. The updated population is then evaluated and agents are either deleted or naturally selected based on their fitness value (or performance).
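This loop can be condensed into a short sketch; crossover is omitted for brevity, and the fitness function and parameters are illustrative assumptions.

```python
import random

def evolve(population, fitness, generations=50, keep=2, mut=0.1):
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)   # evaluate and select
        survivors = population[:keep]                # the fittest are kept
        # refill the population with mutated offspring of the survivors
        population = survivors + [
            [g + random.gauss(0, mut) for g in random.choice(survivors)]
            for _ in range(len(population) - keep)
        ]
    return max(population, key=fitness)

# example: maximize fitness = -(x - 3)^2 over one-gene agents
best = evolve([[random.uniform(-10, 10)] for _ in range(20)],
              fitness=lambda a: -(a[0] - 3) ** 2)
```

Agents with poor fitness are deleted each generation while mutated copies of the fittest fill their places; over many generations the population converges on the optimum (here, a gene value near 3).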
Amaral DG. 1987. Memory: Anatomical organization of candidate brain regions. In: Handbook of Physiology; Nervous System, Vol V: Higher Function of the Brain, Part 1, Edited by Plum F. Bethesda: Amer. Physiol Soc. 211-294.
Baars, Bernard J. (2002) The conscious access hypothesis: Origins and recent evidence. Trends in Cognitive Sciences, 6 (1), 47-52.
Baddeley, A.D. (2007). Working memory, thought and action. Oxford: Oxford University Press.
Carpenter, G.A. & Grossberg, S. (2003), Adaptive Resonance Theory, In Michael A. Arbib (Ed.), The Handbook of Brain Theory and Neural Networks, Second Edition (pp. 87-90). Cambridge, MA: MIT Press
Chalmers, D.J. (2010). The Character of Consciousness. Oxford University Press.
Crick F, Koch C. 2003. A framework for consciousness. Nature Neuroscience. 6(2): 119-126.
Damasio AR. Time-locked multiregional retroactivation: A systems level proposal for the neural substrates of recall and recognition. Cognition, 33: 25–62, 1989.
Edelman, G. Neural Darwinism: The Theory of Neuronal Group Selection (Basic Books, New York 1987).
Fujii H, Ito H, Aihara K, Ichinose N, Tsukada M. (1998). Dynamical cell assembly hypothesis – Theoretical possibility of spatio-temporal coding in the cortex. Neural Networks. 9(8):1303-1350.
Fukushima, Kunihiko (1975). "Cognitron: A self-organizing multilayered neural network". Biological Cybernetics 20 (3–4): 121–136. doi:10.1007/BF00342633. PMID 1203338.
Fuster JM. 2009. Cortex and Memory: Emergence of a new paradigm. Journal of Cognitive Neuroscience. 21(11): 2047-2072.
Gurney, KN. 2009. Reverse engineering the vertebrate brain: Methodological principles for a biologically grounded programme of cognitive modeling. Cognitive Computation. 1(1) 29-41.
Hawkins, Jeff w/ Sandra Blakeslee (2005). On Intelligence, Times Books, Henry Holt and Co.
Hebb, Donald (1949). The Organization of Behavior. New York: Wiley.
Kohonen, Teuvo. 2001. Self-Organizing Maps. Berlin: Springer-Verlag.
Klimesch W, Freunberger R, Sauseng P. Oscillatory mechanisms of process binding in memory. Neuroscience and Biobehavioral Reviews. 34(7): 1002-1014.
Kurzweil, R. (2012). How to Create a Mind: The Secret of Human Thought Revealed. Viking Adult.
Kurzweil, Ray (2005). The Singularity is Near. Penguin Books. ISBN 0-670-03384-7.
Lansner A. 2009. Associative memory models: From the cell-assembly theory to biophysically detailed cortex simulations. Trends in Neurosciences. 32(3):179-186.
Luger, George; Stubblefield, William (2004). Artificial Intelligence: Structures and Strategies for Complex Problem Solving (5th ed.). The Benjamin/Cummings Publishing Company, Inc.. ISBN 0-8053-4780-1.
Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience.
McCarthy, John; Hayes, P. J. (1969). "Some philosophical problems from the standpoint of artificial intelligence". Machine Intelligence 4: 463–502.
McCulloch, Warren; Pitts, Walter, "A Logical Calculus of Ideas Immanent in Nervous Activity", 1943, Bulletin of Mathematical Biophysics 5:115-133.
Meyer K, Damasio A. Convergence and divergence in a neural architecture for recognition and memory.Trends in Neurosciences, vol. 32, no. 7, 376–382, 2009.
Minsky, Marvin (2006). The Emotion Machine. New York, NY: Simon & Schuster. ISBN 0-7432-7663-9.
Moravec, Hans (1988). Mind Children. Harvard University Press. ISBN 0-674-57616-0.
Moscovitch M. (1992). Memory and working-with-memory: A component process model based on modules and central systems. Journal of Cognitive Neuroscience. 4(3): 257-267.
Moscovitch M, Chein JM, Talmi D & Cohn M. Learning and memory. In Cognition, brain, and consciousness: Introduction to cognitive neuroscience. Edited by BJ Baars & NM Gage. London, UK: Academic Press; 2007, p. 234.
Nilsson, Nils (1998). Artificial Intelligence: A New Synthesis. Morgan Kaufmann Publishers. ISBN 978-1-55860-467-4.
Reser, J. E. (2011). What Determines Belief: The Philosophy, Psychology and Neuroscience of Belief Formation and Change. Saarbrucken, Germany: Verlag Dr. Muller.
Reser, J. E. (2012). Assessing the psychological correlates of belief strength: Contributing factors and role in behavior. (Doctoral Dissertation). Retrieved from University of Southern California. Usctheses-m2627.
Reser, J. E. The Neurological Process Responsible for Mental Continuity: Reciprocating Transformations between a Working Memory Updating Function and an Imagery Generation System. Association for the Scientific Study of Consciousness Conference. San Diego CA, 12-15th July 2013.
Rochester, N.; J.H. Holland, L.H. Haibt, and W.L. Duda (1956). "Tests on a cell assembly theory of the action of the brain, using a large digital computer". IRE Transactions on Information Theory 2 (3): 80–93.
Rosenblatt, F. (1958). "The Perceptron: A Probabilistic Model For Information Storage And Organization In The Brain". Psychological Review 65 (6): 386–408. doi:10.1037/h0042519. PMID 13602029.
Rumelhart, D.E; James McClelland (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge: MIT Press.
Russell, Stuart J.; Norvig, Peter (2003), Artificial Intelligence: A Modern Approach (2nd ed.), Upper Saddle River, New Jersey: Prentice Hall, ISBN 0-13-790395-2
Sherrington, C.S. (1942). Man on his nature. Cambridge University Press.
Turing, Alan (1950), "Computing Machinery and Intelligence", Mind LIX (236): 433–460,
J.C. Bezdek and S.K. Pal, Fuzzy Models for Pattern Recognition: Methods that Search for Structure in Data, IEEE Press, 1992.
J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, 1981.
Mathworks, Inc., Fuzzy C-Means Clustering, Fuzzy Logic Toolbox (Manual), 2013.
Z. Michalewicz, Genetic Algorithms + Data Structures = Evolutionary Programs, 2nd Ed., Springer-Verlag, 1994
Mathworks, Inc., Genetic Algorithm, Global Optimization Toolbox (Manual), 2013.
S. Haykin, Neural Networks, A Comprehensive Foundation, Macmillan, 1994.
Mathworks, Inc., Generalized Regression Networks, Neural Network Toolbox (Manual), 2013.
J.S.R. Jang, C.T. Sun, and E. Mitzutani, Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning And Machine Intelligence, Prentice-Hall, 1997.
C.T. Lin and C.S. George Lee, Neural Fuzzy Systems: A Neuro-Fuzzy Synergism to Intelligent Systems, Prentice Hall, 1996.
A.A. Hopgood, Intelligent Systems for Engineers and Scientists, second edition, CRC Press, 2001.
C. Harris, X. Hong, and Q. Gan, Adaptive Modeling, estimation and Fusion from Data, Springer-Verlag, 2002.
This invention addresses the problem of trying to create mental states with current computing methods, which use linear memory and discontinuous processing states. Contemporary artificial intelligence agents use either sequential symbolic processing, which dates back to Alan Turing's Turing machine, or parallel, connectionist processing with discrete functional states that have a beginning and an end. Both types of processing have many functional constraints and differ greatly from how the mammalian brain processes information.
Most known AI systems are capable of responding only in the ways their human programmers provided for when the program was written. It is recognized that it would be valuable to have a computer that does not respond in a preprogrammed manner.
The system must actively model what it is reading in order to understand and remember it.
The lowest layer must be reserved for environmental input, not for internally generated imagery.
The imagery generation module holds all of the system's internal knowledge. The AI system would also have a separate module for encyclopedic knowledge, which would need to be "reread" each time the imagery generation system was given more memory or nodes.
Many futurologists have warned that artificial intelligence could become aggressive. But aggression is not inherent to consciousness, working memory, or mental continuity. Aggression is inherent to our consciousness only because all animals contend with natural predators and because mammals live within dominance hierarchies. The solution is straightforward: just don't build an amygdala, insula, anterior cingulate cortex, or septal area.
Neural networks are black boxes that do not record their processes, and even if those processes were recorded, they would be uninterpretable and unintelligible because of their vast complexity. This has caused many researchers in AI to fear that we could never know what an AI is thinking. An AI system that generates topographic maps from its processing, however, would not be able to control those maps. Using the present approach, it would be easy to make the neural network's topographic maps visible on a computer screen. This would allow us to see the AI agent's mental imagery and infer what thoughts are going through its head. It would allow us to read and record the AI's thoughts and ensure that it is not homicidal or planning anything illegal or detrimental to humanity. Because this imagery is generated unconsciously and automatically, it could be made impossible for the AI to misrepresent or hide its thoughts, giving humans a front-row seat to the computer's stream of thinking.
Working memory elements 1-3 = A, B, C
Keep holding B and C
Encounter D; encode it into working memory
Working memory elements 1-3 = B, C, D
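The update trace above can be reproduced in a few lines; the capacity of three items and the first-in-first-out decay rule are assumptions made for this illustration, not a specification from the architecture:

```python
from collections import deque

CAPACITY = 3  # assumed working-memory span for this illustration

# Working memory elements 1-3 = A, B, C
memory = deque(["A", "B", "C"], maxlen=CAPACITY)

# Encounter D and encode it; the oldest item (A) decays while B and C are held
memory.append("D")

print(list(memory))  # ['B', 'C', 'D']
```

The bounded deque evicts the oldest item automatically once capacity is exceeded, mirroring how the least relevant item is dropped from the store of maintained features.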
Working memory performs continuous endogenous processing that is perpetuated by a specific pattern of search. It holds a number of ensembles within a limited-capacity FOA and STM, and allows these to spread activation to select the next set, while demonstrating icSSC and iterative updating. This search function is similar to regression in that it chooses a set of inputs and classifies them by selecting relevant coactivates for them.
Thus, old (partially executed) information held in working memory from a previous invocation is combined with the information that just entered working memory, and then the procedure is executed repeatedly.
We will refer to a group of neurons that acts as an engram for a symbolic, consciously perceptible pattern as an “ensemble.” Ensembles are the neural instantiation of the “items of working memory” discussed previously. When a new ensemble is activated sufficiently, it is the computational product of the previous state, and it ushers a new representation into the FOA. Ensembles encode invariant patterns, such as objects, people, places, rules, and concepts. An ensemble is composed of cortical assemblies that became strongly bound due to approximately simultaneous activity in the past, amounting to an abstract, gestalt template.
Assemblies are discrete and singular, whereas ensembles are "fuzzy," with boundaries that probably change each time they are activated. Assemblies correspond to specific, very primitive conjunctions and are required in great numbers to compose representations of complex, real-world objects and concepts. Ensembles are these composite representations and have variable, indefinite borders, as no two experiences of an object or concept are ever exactly the same. Both assemblies and ensembles can be expected to demonstrate recursion, but it is the recursive behavior of ensembles that allows each state of working memory to be a revised iteration of the previous state.
Each repetition of a process in an iterative function is called an iteration, and the results (output) of one iteration are used as the starting point (input) for the next. Working memory uses the output from the previous iteration, along with a subset of the inputs to that iteration, as the input for the current iteration. In information theory, feedback occurs when the outputs of a system are routed back as causal inputs. The product of an associative search can be considered output, and when this output shows sustained activity it is effectively "routed back as an input." Thus working memory exhibits aspects not only of recursion and iteration but of a feedback loop as well.
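This output-routed-back-as-input loop can be sketched as follows. The `associate` and `retain` functions here are placeholders invented for illustration; the toy rules (each letter's associate is the next letter, and the two most recent items are retained) stand in for the brain's associative search and sustained activity:

```python
def iterate_working_memory(state, steps, associate, retain):
    """Each iteration's output, plus a retained subset of its inputs,
    becomes the input for the next iteration (a feedback loop)."""
    history = [set(state)]
    for _ in range(steps):
        output = associate(state)       # product of the associative search
        state = retain(state) | output  # feedback: output routed back in
        history.append(set(state))
    return history

# Toy rules, purely illustrative:
assoc = lambda s: {chr(ord(max(s)) + 1)}   # "associate" = next letter
keep = lambda s: set(sorted(s)[-2:])       # retain the two newest items

trace = iterate_working_memory({"A", "B", "C"}, 2, assoc, keep)
print(trace)  # [{'A','B','C'}, {'B','C','D'}, {'C','D','E'}]
```

Each state is a revised iteration of the previous one: most items carry over, and the search output introduces the new member.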
The iterative updating architecture may also enable working memory to implement learned algorithms. All learned mental operations and behaviors have algorithmic steps that must be executed in sequence to reach completion. For example, foraging, tying shoes, and performing long division all involve following an algorithm. Each brain state corresponds to a different step in the algorithm, and after being trained through experience, the activity of each state utilizes polyassociativity to recruit the items necessary for the next step. An item of working memory that is inhibited or allowed to decay may correspond to an action or mental operation, within a series of steps, which has already been executed or is no longer needed. Iteration may be instrumental in implementing learned algorithms, because virtually every step of an algorithm refers to the preceding and subsequent steps in some way.
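The shoe-tying example can be sketched as a chain in which each executed step decays from working memory while recruiting its successor. The step names and the lookup table are invented for illustration; in the architecture this recruitment would be learned through experience rather than hard-coded:

```python
# Hypothetical learned algorithm: each state recruits the items for the
# next step (here reduced to a simple successor table for illustration).
NEXT_STEP = {"cross laces": "loop", "loop": "pull tight", "pull tight": None}

def run_algorithm(first_step):
    state, steps_taken = first_step, []
    while state is not None:
        steps_taken.append(state)   # current item occupies working memory
        state = NEXT_STEP[state]    # executed item decays; successor recruited
    return steps_taken

print(run_algorithm("cross laces"))  # ['cross laces', 'loop', 'pull tight']
```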
Strategic accumulation of complementary items in STM may be another form of progressive modification.
Relaxed time constraints permit planning and world modeling.
Dynamical systems theory is a branch of mathematics that deals with systems that evolve, from one state to the next, through time. The evolution rule of a dynamical system is a function that describes how the current state gives rise to a future state.
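As a minimal sketch of an evolution rule, the logistic map is a standard textbook example (it is not taken from this article): a single function maps each state to the next, and iterating it traces the system's evolution through time.

```python
def evolve(rule, x0, steps):
    """Apply an evolution rule repeatedly: each state yields the next."""
    states = [x0]
    for _ in range(steps):
        states.append(rule(states[-1]))
    return states

# Logistic map with growth parameter 3.2 (illustrative choice)
logistic = lambda x: 3.2 * x * (1 - x)

print(evolve(logistic, 0.5, 3))  # approximately [0.5, 0.8, 0.512, 0.7995]
```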
Many theorists seem to think that continued advances in brain mapping, combined with continued advances in processing power, will inevitably lead to artificial consciousness even if the foundational structure of consciousness is never ascertained by cognitive neuroscience.
Early AI research relied on step-by-step deduction, something neural networks cannot do. Humans, however, often rely on fast, intuitive judgments instead.
Recurrent neural networks provide feedback and short-term memories of previous input events.
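A single recurrent unit illustrates the point. The weights and the tanh nonlinearity here are illustrative assumptions, not a specification from the text; the hidden state carries a short-term memory of earlier inputs back into each new step:

```python
import math

def rnn_step(x, h, w_in=0.5, w_rec=0.9):
    # The new hidden state mixes the current input with the fed-back
    # previous state; w_rec routes the past back in as feedback.
    return math.tanh(w_in * x + w_rec * h)

hs, h = [], 0.0
for x in [1.0, 0.0, 0.0]:  # one pulse of input, then silence
    h = rnn_step(x, h)
    hs.append(h)

# The pulse leaves a decaying trace: the state stays positive after the
# input stops, shrinking at each step but never vanishing abruptly.
print(hs)
```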
A conditional sequence in philosophy is a connected series of statements.