FIG. 2 is a diagram depicting the reciprocal transformations of information between lower-order sensory nodes and higher-order PFC nodes. Sensory areas can only create one sensory image at a time, whereas the PFC is capable of holding the salient or goal-relevant features of several sequential images at the same time.
FIG. 3 is a diagram depicting the behavior of features that are held active in the PFC. 1) Shows that features B, C, D, and E, which are held active in the PFC, all spread their activation energy to lower-order sensory areas, where a composite image is built based on prior experience with these features. 2) Shows that features involved in the retinotopic imagery from time sequence 1 converge on the PFC neurons responsible for feature F. Feature B drops out of activation, and C, D, E, and F remain active and diverge back onto visual cortex. 3) Shows that this same process leads to G being activated and D being deactivated.
FIG. 4 is a list of processes involved in the central AI algorithm implemented by the present device.
1) Either sensory information from the environment, or top-down internally held specifications, or both, are sent to low-order sensory neural network layers that contain feature-extracting cells. This input arrives either as feedforward sensory information from sense receptors (experiential perception) or as downstream retroactivation from higher-level nodes (internally guided imagery).
2) A topographic sensory map is made by each low-order sensory neural network. These topographic maps represent the network's best attempt at integrating and reconciling the disparate stimulus and feature specifications into a single composite, topographic depiction. The map that is created is based on prior probability and training experience with these features.
3) In order to integrate the disparate features into a meaningful image, the map-making neurons will usually be forced to introduce new features. The salient or goal-relevant features that have been introduced are extracted through a perceptual process in which active, lower-order nodes spread their activity to higher-order nodes. As the new features pass through the neural networks, some are given priority and are used to update the limited-capacity working memory storage buffer that is composed of active high-level nodes.
4) Salient features that cohere with features already active in the higher-order nodes are added to the active features there. The least relevant, least fired-upon features in higher-order areas are dropped from activation. The sustained firing of a subset of higher-order nodes allows the important features of the last few maps to be maintained in an active state.
5) At this point it is necessary for the system to implement a program that allows it to decide whether it will continue operating on the previously held nodes or redirect its attention to the newly introduced nodes. Each time the new features garnered from the topographic maps are used to update the working memory store, the agent must decide what percentage of previously active higher-order nodes should be deactivated in order to reallocate processing resources to the newest set of salient features. Prior probability, established through saliency training, will determine the extent to which previously active nodes continue to remain active.
6) The updated subset of higher-order nodes will then spread its activity backwards toward lower-order sensory nodes in order to activate a different set of low-order nodes, culminating in a different topographic sensory map.
7)
A. The process repeats.
B.
Salient sensory information from the actual environment interrupts the process. The lower-order nodes and their imagery, as well as the higher-order nodes and their priorities, are refocused on the new incoming stimuli. (A code sketch of this loop is given below.)
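In software, the seven steps above reduce to a loop. The following Python sketch is purely illustrative: the buffer capacity, the retention fraction, and helper names such as build_map and salience are invented stand-ins for trained neural network modules, not the device's actual implementation.

    import random

    BUFFER_CAPACITY = 7        # assumed working memory capacity (illustrative)
    RETENTION_FRACTION = 0.6   # assumed fraction of old nodes retained (step 5)

    def build_map(features):
        # Steps 2-3: stand-in for a lower-order network that integrates the
        # specified features into one composite map, introducing a new feature.
        introduced = "feature_" + str(random.randint(0, 999))
        return set(features) | {introduced}

    def salience(feature):
        # Stand-in for learned saliency / goal-relevance scoring.
        return random.random()

    def update_buffer(buffer, candidates):
        # Steps 4-5: retain the most salient old nodes, admit new candidates,
        # and drop the least relevant to stay within capacity.
        kept = sorted(buffer, key=salience, reverse=True)
        kept = kept[:int(len(kept) * RETENTION_FRACTION)]
        pool = kept + [c for c in candidates if c not in kept]
        return sorted(pool, key=salience, reverse=True)[:BUFFER_CAPACITY]

    def run(initial_features, cycles=5):
        buffer = list(initial_features)               # step 1: top-down specifications
        for _ in range(cycles):
            topo_map = build_map(buffer)              # steps 2-3: imagery generation
            buffer = update_buffer(buffer, topo_map)  # steps 4-5: buffer update
            # Step 6: the updated buffer is fed back down on the next pass.
        return buffer                                 # step 7A: the process repeats

    print(run(["B", "C", "D", "E"]))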
FIG. 5 illustrates the architecture of the interfacing neural networks.
FIG. 6 illustrates how relevant features can be maintained through time using nodes with sustained firing. The figure compares the number of past nodes that remain active at the present time period (“now”) in a normal human, a human with PFC dysfunction, and the hypothetical AI agent. The AI agent is able to maintain a larger number of higher-order nodes through a longer time span, ensuring that its perceptions and actions now will be informed by a larger amount of recent information. Note how the lower-order sensory and motor features are the same in each graph with respect to their number and duration, yet those in association areas are highest in both number and duration for agent C.
FIG. 7 depicts an octopus within a brain in an attempt to communicate how continuity is made possible in the brain and in the present device. When an octopus exhibits seafloor walking, it places most of its arms on the sand and gradually repositions arms in the direction of its movement. Similarly, the mental continuity exhibited by the present device is made possible because, even though some representations are constantly being newly activated and others deactivated, a large number of representations remain active together. This process allows the persistence of “cognitive” content over elapsing time, and thus over machine processing states.
Read the article on working memory that this system is built upon here:
http://www.sciencedirect.com/science/article/pii/S0031938416308289
BACKGROUND OF THE ARCHITECTURE
The Artificial PFC: Continuity Through Sustained Activity
To create a strong form of AI it is necessary to have an understanding of what is taking place that allows intelligence, thought, cognition, consciousness, or working memory to move through space and time, or in a word, to “propagate.” Such an understanding must be grounded in physics because it must explain how the physical substrate of intelligence operates through space and time (Chalmers, 2010). The human brain is just such an intelligent physical system, one that AI researchers have attempted to understand and replicate using a biomimetic approach (Gurney, 2009). Features of the biological brain have been key in the evolution of neural networks, but the brain may hold information processing principles that have not yet been harnessed by AI efforts (Reser, 2011, 2012, 2013).
The mammalian PFC and other association
cortices have neurons that are specialized for “sustained firing,” allowing
them to generate action potentials at elevated rates for several seconds at a
time (generally 1-30 seconds) (Fuster, 2009). In contrast, neurons in other
brain areas, including cortical sensory areas, remain active only for milliseconds
unless sustained input from association areas makes their continued activity
possible (Fuster, 2009). In the mammalian brain, prolonged activity of neurons
in association areas, especially prefrontal and parietal areas, allows for the
maintenance of specific features, patterns, and goals (Baddeley, 2007). Working
memory, executive processing and cognitive control are widely thought to stem
from the active maintenance of patterns of activity in the PFC that represent
goal-relevant features (Goldman-Rakic, 1995). The temporary persistence of these patterns ensures that they continue to exert their effects on network weights for as long as they remain active, biasing other processing and affecting the interpretation of subsequent stimuli that occur during their episode of continual firing.
The pattern of activity in the brain is
constantly changing, but because some individual neurons persist during these
changes, particular features of the overall pattern will be continuous,
uninterrupted, or conserved over time. In other words, the distribution of
active neurons in the brain transfigures gradually and incrementally from one
configuration to another, instead of changing all at once. If it were not for the phenomena of sustained
firing and cortical priming, instantaneous mental states would be discrete and
isolated rather than continuous with the states before and after them. Thus the human brain is an information processing system with the ability to maintain a large list of representations that is constantly in flux: new representations are continually being added, some are being removed, and still others are being maintained. The present device will be constructed to mimic this biological phenomenon.
Although
its limits are presently being debated, the human neocortex is clearly capable
of holding numerous neural representations active over numerous points in time.
The quantity of mental continuity is directly proportional to the number of
such sustained representations and the length of time of their activity (Reser,
2011, 2012, 2013).
Fig. 1. Graphical depiction of short-term continuity (STC). Each bracket represents the active time span of a neural representation. The x-axis represents time and the y-axis demarcates the cortical
area where the representation is active. Red brackets denote representations
that have exhibited uninterrupted activity from the point when they became
active, whereas blue brackets denote representations that have not been
sustained. In time sequence 1, representations B, C, D, and E have remained active until t1. In time sequence 2, B has deactivated; C, D, and E have remained active; and F is newly active. The figure depicts a system with STC because
more than one representation (C, D, and E) has been maintained over more than
one point in time (t1 and t2). Sensory and association areas do not exhibit
continuity between the two time sequences shown although they would on shorter
time intervals.
In Figure 1 above, representations B, C, D, and E are active during time
1, and C, D, E and F are active during time 2. Thus representations C, D, and E
demonstrate STC because they exhibit continuous and uninterrupted activity from
time 1 through time 2. The brain state at time 1 and the brain state at time 2 share
C, D, and E in common and, because of this, can probably be expected to share
other commonalities including: similar information processing operations,
similar memory search parameters, similar mental imagery, similar cognitive and
declarative aspects, and similar experiential and phenomenal characteristics.
Fig. 2.
A
simplified graphical representation of STC depicting it as a gradually
shifting, stream-like distribution. Figure 2 extends Figure 1 over multiple time intervals, revealing a repeating pattern: remnants from the preceding state
are consistently carried over to the next state. If this distributional plot
were modeling neurons rather than representations there might be thousands of
units per time period rather than four, but the repeating pattern should be
conserved.
Computational operations that take place as a computer implements lines of code (rule-based, if-then operations) to transform input into output have discrete, predetermined starting and stopping points. For this reason computers do not exhibit continuity in their information processing. There are no forms of artificial intelligence that use mental continuity as described here. There are existing computing architectures with limited forms of continuity, where the current state is a function of the previous state and where data is entered into a limited-capacity buffer to inform other processes. However, the memory buffer is not multimodal, is not positioned at the top of a hierarchical system, and does not inform and interact with topographic imagery.
Search
The
mammalian neocortex is capable of holding a number of such mnemonic
representations coactive, and using them to make predictions by allowing them
to spread their activation energy together, throughout the thalamocortical
network. This activation energy converges on the inactive representations from
LTM that are the most closely connected with the current group of active
representations, making them active, and pulling them into short-term memory.
Thus new representations join the representations that recruited them, becoming
coactive with them.
The
way that assemblies and ensembles are selected for activity in this model is
consistent with spreading activation theory. In spreading activation theory,
associative networks can be searched by labeling a set of source nodes, which
spread their activation energy in a nonlinear manner to closely associated
nodes (Collins & Loftus, 1975). Cortical assemblies work cooperatively by
spreading the activation energy (both excitatory and inhibitory) necessary to
recruit or converge upon the next set of ensembles that will be coactivated
with the remaining ensembles from the previous cycle.
Together, these coactive assemblies impose sustained information processing demands on the lower-order sensory and motor areas within the reach of their long-range connections. The longer
the activity in these higher-order neurons is sustained, the longer they remain
engaged in hierarchy-spanning, recurrent processing throughout the cortex and
subcortex.
Table 1: The Characteristics of Polyassociativity:
Gradual additions to and subtractions from a pool of simultaneously coactivated ensembles occur as follows (a code sketch is given after this list):
1. Assemblies that continue to receive
sufficient activation energy from the network are maintained.
2. Assemblies that receive sufficiently
reduced activation energy are released from activation.
3. New assemblies, which are tuned to
receive sufficient activation energy from the current constellation of
coactivates, are converged upon, and incorporated into the remaining pool of
active assemblies from the previous cycle.
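A minimal sketch of these three rules, assuming the assemblies are linked by a simple weighted association graph; all weights and thresholds below are invented for illustration.

    # Hypothetical, symmetric association weights between assemblies.
    weights = {
        ("B", "F"): 0.2, ("C", "F"): 0.5, ("D", "F"): 0.4, ("E", "F"): 0.6,
        ("B", "C"): 0.3, ("C", "D"): 0.4, ("D", "E"): 0.3, ("C", "E"): 0.5,
    }

    def w(a, b):
        return weights.get((a, b), weights.get((b, a), 0.0))

    def polyassociative_step(active, assemblies, keep_threshold=0.5,
                             recruit_threshold=1.0):
        # Energy each assembly receives from the currently active pool.
        energy = {a: sum(w(a, b) for b in active if b != a) for a in assemblies}
        maintained = {a for a in active if energy[a] >= keep_threshold}    # rule 1
        # Rule 2 is implicit: active assemblies below the threshold are released.
        recruited = {a for a in assemblies - active
                     if energy[a] >= recruit_threshold}                    # rule 3
        return maintained | recruited

    active = {"B", "C", "D", "E"}
    print(polyassociative_step(active, {"B", "C", "D", "E", "F"}))
    # -> {'C', 'D', 'E', 'F'}: B is released while F is converged upon.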
Outlining
the process of polyassociativity in this way is meant to show that the
computational algorithm used by the brain may be primarily directed at
determining which inactive ensembles are the most closely statistically related
to the currently active assemblies. From this perspective, the contents of the
next state are chosen based on how the currently active assemblies interact
with the existing, associative, neuro/nomological network.
Fig. 4.
A diagram
depicting “polyassociativity” and illustrating the ways in which high-level
representations, or ensembles, are displaced, maintained, and newly activated
in the brain. 1) Shows that representation A has already been deactivated and
that B, C, D and E are now coactivated, mirroring the pattern of activity shown
in Figure 1. When coactivated, these representations pool and spread their
activation energy, resulting in the convergence of activity onto a new
representation, F. Once F becomes active, it immediately becomes a coactivate,
restarting the cycle. 2) Shows that B has been deactivated while C, D, E, and F
are coactivated and G is newly activated. 3) Shows that D but not C has been
deactivated. In other words, what is deactivated is not necessarily what
entered first, but what has proven to receive the least converging activity. C,
E, F, and G coactivate and converge on H.
In Figure 4, between time periods 1 and 2, C, D, and E exhibit STC, whereas, between time periods 1 and 3, only C and E exhibit STC. C and E are active over all three time periods, meaning that these representations are being used as search function parameters for multiple cycles, are the subject of attention, and are demonstrating STC. Alternatively, we can imagine a scenario where B, C, D, and E from step one of Figure 4 were immediately replaced by F, G, H, and I. Such a processing system may still be using previous states to determine subsequent states; however, because no activity is sustained through time, there would be no continuity in such a system (this is generally how computing systems process information).
In Figure 4, ensembles C and E have fired together over three individual time intervals, and thus will tend to wire together, increasing their propensity for firing together in the future. This link between them will allow one to recruit the other. However, it is probably much more likely that they will recruit each other if the other contextual ensembles are also present. The coincidental or rare associations between the ensembles of an experience are probably mostly lost from hippocampus-independent cortical memory. However, the recurring associations are heavily encoded and persist as semantic knowledge.
In other words, the cortex constantly spreads activation energy from novel combinations of active ensembles that have never been coactive before and attempts to converge upon the statistically most relevant association without certain or exact precedent, resulting in a solution that is not guaranteed to be optimal. Optimality could be approached if a specific group of ensembles (say, C and E) has been thoroughly associated with many others and a type of expertise with these concepts has developed through extensive operant conditioning.
Because of their sustained activity, neurons in the PFC can span a wider delay time or input lag between associated occurrences (Zanto, 2011), allowing elements of prior events to become coactive with subsequent ones (Fuster, 2009). Without
sustained firing, the ability to make associations between temporally distant
(noncontiguous) environmental stimuli is disrupted. Sustained activity allows
neurons that would otherwise never fire together to both fire and wire
together. Thus, it may be reasonable to assume that sustained firing underlies
the brain’s ability to make subjective, internally-derived associations between
representations that would never co-occur simultaneously in the environment.
This indicates that one way to quantify
mental continuity is to determine the proportion of previously active neural
nodes that have remained active during a resource demanding cognitive task.
Uninterrupted activity augments associative searches by allowing specific
features to serve as search function parameters for multiple cycles. Intelligence
in this system can be expected to increase along with increases in: 1) the
number of available nodes to select from, 2) the number of nodes that can be coactivated
simultaneously, and 3) the length of time that individual nodes can remain
active.
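Under these assumptions, the measure can be written as the proportion of nodes active at the previous time step that remain active now. The sketch below is a toy formulation, not a physiological measurement procedure.

    def continuity(previous_active, current_active):
        # Proportion of previously active nodes that remain active now:
        # one candidate quantification of mental continuity.
        if not previous_active:
            return 0.0
        return len(previous_active & current_active) / len(previous_active)

    # Example from Figure 1: B, C, D, E at t1; C, D, E, F at t2.
    print(continuity({"B", "C", "D", "E"}, {"C", "D", "E", "F"}))  # 0.75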
The mesocortical dopamine (DA) system plays
an important role in sustained activity, suggesting that it may be heavily
involved in mental continuity. Dopamine sent from the ventral tegmental area
(VTA) modulates the activity and timing of neural firing in the PFC,
association cortices, and elsewhere. Dopamine neurotransmission in the PFC is
thought to underlie the ability to internally represent, maintain, and update
contextual information (Braver & Cohen, 1999). This is necessary because information
related to behavioral goals must be actively sustained such that these
representations can bias behavior in favor of goal-directed activities over
temporally extended periods (Miller & Cohen, 2001). It has become clear
that the activity of the DA/PFC system fluctuates with environmental demand
(Fuster & Alexander, 1971). Many studies have suggested that the system is
engaged when reward or punishment contingencies change. Both appetitive and
aversive events have been shown to increase dopamine release in the VTA,
causing sustained firing of PFC neurons (Seamans & Robbins, 2010). Seamans and Robbins (2010) elaborated a functional explanation for this effect: the DA system is phasically activated in response to novel rewards and punishments because it is adaptive for the animal to anchor upon and further process novel or unpredicted events.
It is important for mammals to identify
and capture information about unexpected occurrences so that it can be further
processed and systematic patterns can be identified. The novel experience is
probably broken down into its component parts and the representations in memory
for these parts are allowed to spread their activation energy in an attempt to
converge on and activate historically associated representations that are not
found in the experience itself. Because memory traces for the important
features remain active and primed, they can be used repeatedly as
specifications that guide the generation of apposite mental imagery in sensory
areas (Reser, 2012). It is highly probable that sequences of lower-order
topographic images depict and explore hypothetical relationships between the
higher-order, top-down specifications. This amounts to a continual attempt to
use the associative memory system to search sensory memory for a topographic
image that can meaningfully incorporate the important features. It seems that
reciprocating activity between the working memory updating system and the
imagery generation system builds interrelated sequences of mental imagery that
are used to form expectations and predictions.
The fact that newly active search terms are combined with search terms from the previous cycle gives this process the qualities of “progressive iteration”: reciprocating activity between the working memory updating system and the imagery generation system generates sequences of interrelated mental images that build on themselves to form abductive expectations and predictions.
The Neocortex: Reciprocating Crosstalk between Association and Sensory Cortex
The
higher-order features that are maintained over time by sustained neural firing are
used to create and guide the construction of mental imagery (Reser, 2012). The
brain’s connectivity allows reciprocating cross-talk between fleeting bottom-up
imagery in early sensory cortex and lasting top-down priming in late association
cortex and the PFC. This process allows humans to have progressive sequences of
related thoughts, where thinking is based heavily on lower-order sensory areas and the topographic mappings that they generate in order to best represent a set of higher-order features.
To
a certain extent, perceptual sensory processing is thought to be accomplished
hierarchically (Cohen, 2000). The cortical hierarchy observed from sensory to
association cortex arises because simple patterns are arranged to converge upon
second-order patterns, which in turn converge on third-order patterns and so
on. This leads to a hierarchy of increasingly complex representations. Many
pathways in the brain, such as the ventral visual pathway, appear to use a
“structurally descriptive” architecture where neurons or neural populations that
encode low-level, nonaccidental features are allowed to converge together onto
those that encode more abstract, higher-order, generic, template-like features
(Edelman, 1997). A structural description is defined as “a description of an
object in terms of the nature of its constituent parts and the relationships
between those parts (Wolfe et al., 2009).” Hierarchical processing that
specifies structural descriptions is thought to allow perceptual invariance and
robust postcategorical typology. Internally-derived sensory imagery, such as that seen in the “mind’s eye,” probably appears topographically organized because it is created by the same lower-order networks responsible for perceiving external stimuli (Meyer & Damasio, 2009; Meyer, 2011). Thus it may
be safe to assume that when we think and imagine, we construct and manipulate
maps in early perceptual networks. During perception, the bottom-up activity
may be driving and the top-down may be modulatory; however, during imagination
the top-down activity may be driving and the bottom-up may be modulatory. These
conceptions are consistent with the “consolidation hypothesis,” which states
that memory is stored in the same areas that allow active, real-time perception
and function (Moscovitch et al., 2007).
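The driving/modulatory asymmetry described above can be illustrated with a toy mixing function in which the roles of the two signals swap between perception and imagination; the modulatory gain and the array sizes are invented for illustration.

    import numpy as np

    def sensory_map_activity(bottom_up, top_down, mode):
        # In perception the bottom-up signal drives and the top-down signal
        # modulates; in imagination the roles reverse.
        if mode == "perception":
            drive, modulate = bottom_up, top_down
        else:  # "imagination"
            drive, modulate = top_down, bottom_up
        return drive * (1.0 + 0.3 * modulate)  # modulation scales the drive

    stimulus = np.array([1.0, 0.2, 0.0])
    expectation = np.array([0.5, 0.5, 0.5])
    print(sensory_map_activity(stimulus, expectation, "perception"))
    print(sensory_map_activity(stimulus, expectation, "imagination"))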
It
is thought that object recognition, decision making, associative recall,
planning and other important cognitive processes, involve two-way traffic of
signal activity among various neural maps that stretch transversely through the
cortex from early sensory areas to late association areas (Klimesch,
Freunberger, & Sauseng, 2010). Bottom-up sensory areas deliver fleeting
sensory information and top-down association areas deliver lasting perceptual
expectations in the form of templates or prototypes. These exchanges involve
feedforward and feedback (recurrent) connections in the corticocortical and
thalamocortical systems that bind topographic information from lower-order
sensory maps about the perceived object with information from higher-order maps, forming constellations of activity that can remain stable for tens to hundreds of milliseconds (Crick & Koch, 2003).
STC
impacts this reciprocating cross-talk. These reciprocations may create
progressive sequences of related thoughts, specifically because the topographic
mappings generated by lower-order sensory areas are guided by the enduring
representations that are held active in association areas (Reser 2011, 2012,
2013). The relationship between anterior and posterior cortex may be best characterized by two main principles: 1) association areas maintain representations from not one but several of the last few topographic maps made in sensory areas; 2) because they are drawing from a register with sustained contents, sequential images formed in sensory areas have similar content and thus should be symbolically or semiotically related to one another.
Feedback
activation from top-down association areas hands down specifications to early
sensory cortex for use in imagery building. Disparate chunks of information are
integrated into a plausible map and transiently bound together. This
integrative process may be very rapid and may use the structurally descriptive
perceptual hierarchy in reverse to go from abstractions to specifics.
Sustained
firing and recurrent processing make it possible for recent states to spill
over into subsequent states, creating the context for them in a recursive
fashion. In a sense, each new topographic map is embedded in the previous one.
This creates a cyclical, nested flow of information processing marked by STC,
which is depicted in Figure 6.
Consecutive
topographic images about a specific scenario model the scenario by holding some
of the contextual elements constant, while others are allowed to change. Thus
prior maps set premises for and inform subsequent maps. Learned mental tasks
probably have distinct predefined algorithmic sequences of topographic mappings
that must be completed in sequence in order to achieve the solution. Each brain
state would correspond to a different step in the algorithm, and its activity
would recruit the next step. All logical and methodical cognition may require
that a number of relevant features from the present scenario remain in STC so
that they spread their activity within the network in order to influence the
selection of the ensembles necessary for task satisfaction.
In
reality, association areas have much more to converse with than simply a single
retinotopic map as depicted in Figure 6. In fact, they feed their specifications
to and receive specialized input from dozens of known topographic mapping areas
(Kaas, 1997). These areas of different sensory modalities are constantly
responding to incoming activity in an attempt to pull up the most
context-appropriate map in their repertoire. Interestingly, the sensory modules
that build these maps take specifications not only from association areas, but
also from other sensory modules (Klimesch, Freunberger, & Sauseng, 2010).
Further compounding the complexity, these sensory modules probably have their
own limited form of STC where certain low-level features can exhibit sustained
activity. Moreover, motor and premotor modules give specifications to and
receive specifications from this common workspace while they are building their
musculotopic imagery for movement. The same goes for language areas.
Fig. 5.
A diagram
depicting the reciprocal transformations of information between lower-order
sensory mappings and higher-order association area ensembles during internally
generated thought. Sensory areas can only create one topographic mapping at a
time, whereas association areas are capable of holding the salient or
goal-relevant features of several sequential mappings at the same time.
In a sense, the higher- and lower-order areas are constantly interrogating each other and providing one another with their expert knowledge. For instance, the higher-order areas have no capacity to foresee how the specifications that they hold will be integrated into metric imagery. Also, the lower-order nodes must introduce other, unspecified features into the imagery that they build, and this generally provides the new content for the stream of thought. For example, if higher-order nodes come to hold features supporting the representations for “pink,” “rabbit,” and “drum,” then the subsequent mappings in lower-order visual nodes may activate the representations for batteries, and the auditory nodes may activate the representation for the words “Energizer bunny.” The central executive (the PFC and other association areas) directs progressive sequences of mental imagery in a number of topographic sensory and motor modules, including the visuospatial sketchpad, the phonological (articulatory) loop, and the motor cortex. This model frames consciousness as a polyconceptual, partially conserved, progressive process that performs its high-level computations through “reciprocating transformations between buffers.” More specifically, it involves reciprocating transformations between a partially conserved store of multiple conceptual specifications and another, nonconserved store that integrates these specifications into veridical, topographic representations.
Fig. 6.
A diagram
depicting the behavior of representations that are held active in association
areas. 1) Shows that the representations B, C, D, and E, which are held active
in association areas, all spread their activation energy to lower-order sensory
areas where a composite image is built that is based on prior experience with these
representations. 2) Shows that features involved in the topographic imagery
from time sequence 1 converge on the PFC neurons responsible for F. B drops out
of activation, and C, D, E and F remain active and diverge back onto visual
cortex. 3) Shows that the same process leads to G being activated and D being
deactivated, mirroring the pattern of activity shown in Figure 4.
Fig. 9.
Uses the
format of Figure 1 to illustrate how relevant features can be maintained
through time using nodes with sustained firing. The figure compares the number
of past nodes that remain active at the present time (t1), in a normal human, a
human with PFC dysfunction, and the hypothetical AI agent. The AI agent is
able to maintain a larger number of higher-order nodes through a longer time
span, ensuring that its perceptions and actions in time 1 will be informed by a
larger amount of recent information. Note how the lower-order sensory and motor
features are the same in each graph with respect to their number and duration,
yet those in association areas are the highest in both number and duration for
agent C.
If this sustained firing were programmed to happen at even longer intervals and to involve even larger numbers of nodes, the system would exhibit a superhuman capacity for continuity. This would increase
the ability of the network to make associations between temporally distant
stimuli and allow its actions to be informed by more temporally distant
features and concerns. Aside perhaps from altering the level of arousal
(adrenaline) or motivation (dopamine), it is currently not possible to engineer
the human brain in a way that would increase the number and duration of active
higher-order representations. However, in a biomimetic instantiation, it would
be fairly easy to increase both the number and duration of simultaneously
active higher-order nodes (see Figure 9). Accomplishing this would allow
the imagery that is created to be informed by a larger number of concerns, and
would ensure that important features were not omitted simply because their
activity could not be sustained due to biological limitations. Of course, in
order to operate meaningfully, and reduce its propensity for recognizing “false
patterns,” such an ultraintelligent system would require extensive supervised
and unsupervised learning.
Again, System 1 is making automatic, intuitive, flash judgments, but because of the
STC made possible by sustained firing, these rapid associations are able to
support and buttress each other in a progressive and additive manner. System 2
cognition may be present when several nodes in association areas exhibit
sustained firing and are used multiple times to build topographic or
musculotopic maps, culminating in sensory imagery or motor output that could
not be informed by any of the intermediate steps alone, or that is capable of
solving a problem too difficult for any System 1 process itself. For example,
early processes may provide premises or propositional stances that can be used
algorithmically (e.g. syllogistically) to induce or justify a conclusion in
subsequent processes.
Learning in the Network
The
system will begin untrained with random connection weights between nodes.
Learning should be concentrated on the early sensory networks first. This will
follow the ontogenetic learning arc seen in mammals where the earliest sensory
areas myelinate in infancy and the late association areas such as the PFC do
not finish myelinating until young adulthood. Of course, this form of
artificial intelligence would need a prolonged series of developmental
experiences, similar to a childhood, to learn which representations to keep
active in which scenarios. The network will act to consolidate or potentiate in
memory the specific groupings of nodes that have produced favorable outcomes,
in order to more rapidly inform future decision making.
Other Forms of Sustained Activity
Aside
from having a PFC analogue, the network could also have an analogue of cortical
priming and an analogue of the hippocampus. Humans have thoughts that carry
continuity because changes in content are gradual as more recent
activations/representations are given a higher priority than older ones.
Activity that transpired minutes ago is given moderate priority, activity from
seconds ago is given high priority and activity from mere milliseconds ago is
given the highest priority. This dynamic is made possible by the PFC analogue,
but could be accentuated by analogues of cortical priming. To allow for an
analogue of cortical priming, all recently active neurons would retain a
moderate amount of increased, but subthreshold activity. The activity level of
recently used nodes in both the higher and lower-order areas would not quite
fall back to zero. This would ensure that recently used patterns and features
would be given a different form of priority, yet to a lesser and more general
extent than that allowed by the PFC analogue. Regarding the network partitions
depicted in Figure 5, the sensory, motor and hippocampal neural networks would
show the least priming, the association-area and premotor neural networks
would show moderate priming, and the PFC would show the highest degree of
priming. Functions for the parameters of priming could be fine-tuned by genetic
algorithms.
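A sketch of such tiered priming, assuming each module's recently active nodes decay toward a nonzero subthreshold floor at a module-specific rate; all constants are invented.

    # Assumed per-module priming parameters: (decay rate, subthreshold floor).
    PRIMING = {
        "sensory":     (0.50, 0.00),  # least priming: decays quickly to zero
        "association": (0.20, 0.05),  # moderate priming
        "pfc":         (0.05, 0.10),  # highest priming: slow decay, high floor
    }

    def decay(activity, module):
        # One time step of decay toward the module's subthreshold floor.
        rate, floor = PRIMING[module]
        return max(floor, activity * (1.0 - rate))

    a = 1.0
    for _ in range(30):
        a = decay(a, "association")
    print(a)  # settles at 0.05: recently used nodes never fall back to zero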
Furthermore,
the network could have an analogue of the hippocampus. A hippocampal analogue
would keep a record of contextual, or episodic clusters of previous node
activation. Instead of keeping a serial record of averaged activity, the
hippocampus analogue would capture episodic constellations of node activity and
save these to be reactivated later. These episodic memory constellations would
be activated when a large subset of the constellation is present during
processing. This means that when neural network activity closely approximates
an activity constellation that was present in the past, the hippocampal
analogue is capable of reactivating the original constellation. The activity of the hippocampal analogue
should be informed by actual hippocampal anatomy and the “pattern completion”
hypothesis of hippocampal function. To build an analogue into a neural net it
would be necessary to have a form of episodic memory that can be cued by
constellations of activity that closely resemble a past (autobiographical or
episodic) occurrence. This memory system would then be responsible for
“completing the pattern,” or passing activation energy to the entire set of
nodes that were initially involved in the original experience, allowing the
system a form of episodic recall. As with the actual brain (Amaral, 1987), in
the present device, the hippocampus should be reciprocally connected with the
PFC and association areas but not with primary sensory or motor areas.
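A minimal sketch of this pattern-completion behavior, assuming stored episodes are sets of node identifiers and a cue reactivates the best-matching stored constellation once the overlap exceeds a threshold (the threshold value is an assumption).

    def complete_pattern(cue, episodes, threshold=0.6):
        # Reactivate the stored constellation that the cue overlaps most,
        # provided the overlap is large enough (pattern completion).
        best, best_score = None, 0.0
        for episode in episodes:
            score = len(cue & episode) / len(episode)  # fraction of episode cued
            if score >= threshold and score > best_score:
                best, best_score = episode, score
        return best  # None if no stored episode is sufficiently cued

    episodes = [frozenset("ABCDE"), frozenset("CDEFG")]
    print(complete_pattern(frozenset("ABD"), episodes))
    # -> the full ABCDE constellation is reactivated from the partial cue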
Appropriate Neural Network Parameters for the Present Device
A
network is “trained” to recognize a pattern by adjusting arc weights in a way
that most efficiently leads to the desired results. Arcs contributing to the
recognition of a pattern are strengthened and those leading to inefficient or
incorrect outcomes are weakened. The network “remembers” individual patterns
and uses them when processing new data. Neural learning adjustments are driven
by error or deviation in the performance of the neuron from some set goal. The
network is provided with training examples, which consist of a pattern of
activities for the input units, along with the desired pattern of activities
for the output units. The actual output of the network is contrasted with the
desired output resulting in a measure of error. Connection weights are altered
so that the error is reduced and the network is better equipped to provide the
correct output in the future. Each weight must be changed by an amount that is
proportional to the rate at which the error changes as the weight is changed,
an expression called the “error derivative for the weight.” In a network that
features back propagation the weights in the hidden layers are changed
beginning with the layers closest to the output layer, working backwards toward
the input layer. Such backpropagating networks are commonly called multilayer
perceptrons (Rosenblatt, 1958). The present architecture involves a number of
multilayered neural networks connected to each other, each using its own
training criteria for backpropagated learning. For instance the visual
perception module would be trained to recognize visual patterns, the auditory
perception module would be trained to recognize auditory patterns, and the PFC
module would be trained to recognize multimodal, goal-related patterns.
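The update rule described above can be written compactly. The sketch below performs gradient steps for a single linear layer under a squared-error measure; it illustrates the “error derivative for the weight,” and is not the device's actual training code.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(size=(3, 2))       # weights from 3 input units to 2 outputs

    def backprop_step(W, x, target, lr=0.1):
        # Change each weight in proportion to the rate at which the error
        # changes as the weight changes (the error derivative for the weight).
        y = x @ W                      # linear output, for simplicity
        error = y - target             # deviation from the desired output
        dE_dW = np.outer(x, error)     # error derivative for each weight
        return W - lr * dE_dW          # adjust weights to reduce the error

    x = np.array([0.5, -1.0, 0.2])
    target = np.array([1.0, 0.0])
    for _ in range(50):
        W = backprop_step(W, x, target)
    print(x @ W)  # approaches the desired output pattern [1.0, 0.0]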
The
hierarchical multilayered network, the neocognitron, was first developed by K.
Fukushima (1975). This system and its descendants are based on the visual
processing theories of Hubel and Wiesel and form a solid archetype for the
present device because they feature multiple types of cells and a cascading
structure. Popular neural network architectures with features that could be
valuable in programming the present device include the adaptive resonance
theory network (Carpenter & Grossberg), the Hopfield network, the Neural
Representation Modeler, the restricted Coulomb energy network, and the Kohonen network. Teuvo Kohonen (2001) showed that matrix-like neural networks can create localized areas of firing for similar sensory features, resulting in a map-like network where similar features are localized in close proximity and discrepant ones are distant. This type of network uses a neighborhood function
to preserve the topological properties of the input space, and has been called
a “self-organizing map.” This kind of organization would be necessary for the
present device to accomplish imagery generation, and would contribute to the
ability of the lower-order nodes in the sensory modules to construct
topographic maps.
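A condensed sketch of the self-organizing map update, assuming a small one-dimensional grid and a Gaussian neighborhood function; the grid size, learning rate, and neighborhood width are illustrative.

    import numpy as np

    rng = np.random.default_rng(1)
    grid = rng.random((10, 2))  # 10 map units in a 1-D grid, 2-D feature inputs

    def som_update(grid, x, lr=0.3, sigma=2.0):
        # Move the best-matching unit and its grid neighbors toward the input,
        # so nearby units come to encode similar features (topology preserving).
        bmu = np.argmin(np.linalg.norm(grid - x, axis=1))  # best-matching unit
        dist = np.abs(np.arange(len(grid)) - bmu)          # distance along grid
        h = np.exp(-(dist ** 2) / (2 * sigma ** 2))        # neighborhood function
        return grid + lr * h[:, None] * (x - grid)

    for _ in range(200):
        grid = som_update(grid, rng.random(2))
    print(grid)  # neighboring rows now hold similar feature vectors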
A
neural network that uses principal-components learning uses a subset of hidden
units that cooperate in representing the input pattern. Here, the hidden units
work cooperatively and the representation of an input pattern is distributed
across many of them. In competitive learning, in contrast, a large number of hidden
units compete so that a single hidden unit is used to represent a particular
input pattern. The hidden unit that is selected is the one whose incoming
weights most closely match the characteristics of the input pattern. The
optimal method for the present purposes lies somewhere between purely
distributed and purely localized representations. Each neural network node will code for a discrete, albeit abstract, pattern, and nodes will compete with one another for activation energy and the opportunity to contribute to the depiction of imagery. However, multiple nodes will also work together cooperatively to create composite imagery.
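One way to sit between these two extremes is a k-winners-take-all rule: more than one hidden unit represents each pattern (cooperation), but only the k best-matching units do so (competition). A sketch under these assumptions:

    import numpy as np

    def k_winners(weights, x, k=3):
        # Only the k hidden units whose incoming weights best match the input
        # stay active: distributed across k units, localized relative to the rest.
        match = weights @ x                 # similarity of each unit to the input
        winners = np.argsort(match)[-k:]    # indices of the k best matches
        activity = np.zeros(len(weights))
        activity[winners] = match[winners]  # cooperating winners remain active
        return activity

    rng = np.random.default_rng(2)
    W = rng.random((8, 4))                  # 8 hidden units, 4 input features
    print(k_winners(W, rng.random(4)))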
When active, high-level nodes signal each of the low-level nodes that they connect with, in effect retroactivating them. They are activating those that recently contributed to their activity, and activating previously dormant ones as well. This retroactivation of previously dormant nodes constitutes a form of anticipation or prediction, indicating that there is a high likelihood that the pattern that these nodes code for will become evident (prospective coding). This kind of prediction is best achieved by a hierarchical hidden Markov model, and utilizing Markov models and their predictive properties will be necessary. This process is
used in Ray Kurzweil’s Pattern Recognition Theory of Mind (PRTM) model, which
uses a hidden Markov model and a plurality of pattern recognition nodes for its
cognitive architecture (Kurzweil, 2012). Hierarchical temporal
memory (HTM) is another cognitive architecture that models some of the
structural and algorithmic properties of the neocortex (Hawkins &
Blakeslee, 2005). The hope with PRTM and HTM is that a hierarchically structured neural network with enough nodes and sufficient training should be able to model high-order human abstractions. However, distilling such abstractions and utilizing them to make
complex inferences may necessitate an imagery guidance mechanism with a working
memory updating function.
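A full hierarchical hidden Markov model is beyond a short example, but the underlying predictive idea can be shown with a first-order transition table in which the current pattern pre-activates the patterns most likely to follow; the probabilities are invented.

    # Illustrative first-order transition probabilities between patterns.
    transitions = {
        "A": {"B": 0.7, "C": 0.3},
        "B": {"C": 0.6, "A": 0.4},
        "C": {"A": 0.5, "B": 0.5},
    }

    def prospective_coding(current, threshold=0.5):
        # Pre-activate the patterns likely to follow the current one: a
        # stand-in for the retroactivation of previously dormant nodes.
        return [nxt for nxt, p in transitions[current].items() if p >= threshold]

    print(prospective_coding("A"))  # ['B']: B is primed before it is observed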
Neural networks can propagate information
in one direction only, or they can be bi-directional where activity travels up
and down the network until self-activation at a node occurs and the network
settles on a final state. So-called recurrent networks are constructed with
extensive feedback connections. Such recurrent organization and
bi-directionality would be important to accomplish the oscillating
transformations performed by the present device. Hebbian learning is an updating rule stating that the connection weights for a neuron should grow when the input to the neuron fires at the same time the neuron itself fires (Hebb, 1949). This type of learning algorithm would be important for the present device as well.
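The rule itself is nearly a one-liner. The sketch below applies it to the coactive ensembles of the earlier figures, assuming binary activity and an invented learning rate.

    import numpy as np

    def hebbian_update(W, activity, lr=0.1):
        # Hebb (1949): grow the weight between two nodes when both fire in the
        # same cycle (delta_w = learning_rate * pre * post).
        return W + lr * np.outer(activity, activity)

    # Nodes 1 and 2 (say, C and E) are coactive for three cycles.
    W = np.zeros((4, 4))
    coactive = np.array([0.0, 1.0, 1.0, 0.0])
    for _ in range(3):
        W = hebbian_update(W, coactive)
    print(W[1, 2])  # about 0.3: the strengthened link lets one recruit the other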
Each
topographic map that is formed could be assessed for appetitive or aversive
content. The architecture depicted in Fig. 5 could be copied onto two separate,
yet nearly identical systems, one fine-tuned for approach behaviors and the
other for withdrawal behaviors. This could simulate the right and left cortical
hemispheres. The right hemisphere could be associated with withdrawal and have
longer connectional distances between nodes on average.
In
some neural networks, the activation values for certain nodes are made to
undergo a relaxation process such that the network will evolve to a stable
state where large scale changes are no longer necessary and most meaningful
learning can be accomplished through small scale changes. The capability to do
this, or to automatically prune connections below a certain connection weight
would be beneficial for the present purposes. It is also important to preserve
past training diversity so that the system does not become overtrained by
narrow inputs that are poorly representative.
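A sketch of such pruning, assuming connections are stored in a weight matrix and weights whose magnitude falls below an invented threshold are removed outright.

    import numpy as np

    def prune(W, threshold=0.05):
        # Zero out connections whose magnitude has fallen below the threshold,
        # so later learning proceeds through small-scale changes.
        pruned = W.copy()
        pruned[np.abs(pruned) < threshold] = 0.0
        return pruned

    rng = np.random.default_rng(3)
    W = rng.normal(scale=0.1, size=(5, 5))
    print(np.count_nonzero(prune(W)), "of", W.size, "connections survive")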
The present architecture could be
significantly refined through the implementation of genetic algorithms that
could help to select the optimal ways to fine-tune the model and set the
parameters controlling the mathematics of things such as the connectivity, the
learning algorithms, and the extent of sustained activity. It might also be
beneficial to implement a rule-based approach, where a core set of reliable
rules are coded and used to influence decision making and goal prioritization. Many theorists
agree that combining neural network and more traditional symbolic approaches
will better capture the mechanisms of the human mind. In fact, implementing symbolic rules to instantiate
processing priorities could help the higher-order nodes to account for
goal-relevance. These might be necessary to simulate the rules of emotional,
subcortical modules.
Conclusions
Many
researchers have suggested that AI does not need to simulate human thought, but
rather should simulate the essence of abstract reasoning and problem solving.
It has been suggested that human reasoning can be reduced to Turing-like symbol
manipulation (Turing, 1950). The present article has suggested that modeling
“mental continuity” and using it to guide successive images is an essential
part of this simulation.
There
are no forms of AI that use mental continuity as described here. There are
existing computing architectures with limited forms of continuity where the
current state is a function of the previous state, and where active data is
entered into a limited capacity buffer to inform other processes. However,
there are no AI systems where this buffer is multimodal, positioned at the top
of a hierarchical system, and that informs and interacts with topographic
imagery.
The
agent discussed here could be capable of integrating multiple existing AI programs
that are specialized for specific tasks into a larger composite of coordinated
systems.
This
architecture may be capable of replicating the recursive and progressive
properties of mental continuity discussed earlier.
The
current objective is to create an agent that through supervised or unsupervised
feedback can progress to the point where it takes on emergent cognitive
properties and becomes a general problem solver or inference program capable of
goal-directed reasoning, backwards chaining, and performing means-end analyses.
The present device should constitute a self-organizing cognitive architecture
capable of dynamic knowledge acquisition, inductive reasoning, dealing with
uncertainty, high predictive ability and low generalization error. If implemented
and trained properly it should be able to find meaningful patterns in complex
data and improve its performance by learning. It should be the goal of AI experts to fine-tune such a system to become capable of autoassociation (the
ability to recognize a pattern even though the entire pattern is not present)
and perceptual invariance (generalizing over the style of presentation such as
visual perspective or font).
The
writing here amounts to a qualitative account, is exploratory, contains
unverified assumptions, makes untested claims, and leaves important concerns
out of the discussion. A more complete and refined version would focus on
better integration of existing knowledge from functional neuroanatomy,
multisensory integration, clinical neuropsychology, brain oscillations,
short-term and long-term potentiation, binding, the sustained firing behavior
of cortical columns, and the cognitive neuroscience of attention.
The
present architecture is designed to simulate human intelligence by emulating
the mammalian fashion for selecting priority stimuli, holding these stimuli in
a working memory store and allowing them to temporarily direct imagery
generation before their activity fades. Using interfaced neural networks the
system would model a large set of programming constructs or nodes that work
together to continually determine, in real time, which from their population
should be newly activated, which should be deactivated and which should remain
active over elapsing time to form a “stream” or “train” of thought. The
network’s connectivity allows reciprocating cross-talk between fleeting
bottom-up imagery in early sensory networks and lasting top-down priming in
association and PFC networks. The features that are maintained over time by
sustained neural firing are used to create and guide the construction of
topographic maps (imagery). The PFC and other association area neural networks
direct progressive sequences of mental imagery in the visual, auditory and
somatosensory networks. The network contains nodes that are capable of
“sustained firing,” allowing them to bias network activity, transmit their
weights, or otherwise contribute to network processing for several seconds at a
time (generally 1-30 seconds).
Cognitive
control stems from the active maintenance of features/patterns in the PFC
module that allow the orchestration of processing and the generation of imagery
in accordance with internally selected priorities. The network is an
information processing system that has the ability to maintain a large list of
representations that is constantly in flux as new representations are
constantly being added, some are being removed and still others are being
maintained. This distinct pattern of activity, where some individual nodes
persist during processing, ensures that particular features of the overall
pattern will be uninterrupted or conserved over time. Because nodes in the PFC
network are sustained, and do not fade away before the next instantiation of
topographic imagery, there is a continuous and temporally overlapping pattern
of features that mimics consciousness and the psychological juggling of
information in working memory. This also allows consecutive topographic maps to
have related and progressive content. If this sustained firing is programmed to
happen at even longer intervals, in even larger numbers of nodes, the system
will exhibit even more mental continuity over elapsing time. This would
increase the ability of the network to make associations between temporally
distant stimuli and allow its actions to be informed by more temporally distant
features and occurrences.
Fig. 10.
Shows that
information from motor and sensory cortices enters the focus of attention where
it can then explicitly influence other sensory and motor cortices. As
information leaves attention it can either be held temporarily in a less active
form of STM (which can implicitly influence sensory and motor cortices) or it
can deactivate and return to LTM. The
arrow on the left indicates that in succeeding states, the letters will cycle
downwards as their activity diminishes.
Fig. 11. The Process by which Short-Term
Continuity Influences Global Processing
1) Information flows to early sensory
cortex from the environment or from the association cortex.
2) Topographic sensory maps are
constructed from this information within each low-order, sensory module. In
order to integrate the disparate features into a meaningful image, the
map-making neurons will be forced to introduce new features not found in their
extrinsic inputs.
3) Information from the imagery travels
bottom-up toward the association cortex. The salient or goal-relevant features
from the mappings are used to update the group of sustained representations
held active in the association cortex.
4) The least relevant, least
converged-upon representations in the association cortex are dropped from
sustained activation and “replaced” with new, salient representations. Thus,
the important features of the last few maps are maintained in an active state.
5) The updated group of representations
will then spread its activity backwards toward lower-order sensory nodes in
order to activate a different set of low-order nodes culminating in a different
topographic sensory map.
6) A. The process repeats.
B. Salient sensory information from the actual environment interrupts
the process. The lower-order nodes and their imagery, as well as the
higher-order nodes and their priorities, are refocused on the new incoming
stimuli.
Other Features
It
is an object of the present invention to simulate human intelligence by
emulating the mammalian fashion for selecting priority stimuli, holding these
stimuli in a working memory store and allowing them to temporarily direct
imagery generation before their activity fades.
It
is an object of the present invention to enhance AI data processing, decision
making, and response to query.
Briefly,
a known embodiment of the present invention is software using neural networks
that models a large set of programming constructs or nodes that work together
to continually determine, in real time, which from their population should be
newly activated, which should be deactivated and which should remain active
over elapsing time to form the “stream” or “train” of thought.
An
advantage of the present invention is that a computer can be caused to develop
a simulated intelligence.
Another
advantage of the present invention is that it will be easier and more natural
to use a computer or computerized machine.
A
third advantage of the present invention is that it will be readily implemented
using available computer hardware and input/output devices.
In
a general sense the imagery generation protocol allows discrete features to be bound into composite maps. If the higher-order nodes for the features
blue, wrinkled and glove were made sufficiently active they should be used to
create a topographic map of a glove that is blue and wrinkled. Some of the
features held by the higher-order nodes will not be able to be worked into a
topographic map if the neural network does not have the previous experience to
know how to corepresent them. At the beginning of their ontogenetic learning
arc, higher-order nodes will be activated arbitrarily due to their random
connections to lower-order nodes. In the program’s infancy, a specific object
will activate all of the higher-order nodes that are connected to the features
associated with that object. With time, only the higher-order nodes used the
most will survive and a much smaller subset of neurons that respond to all of
the features at once will come to be the only nodes activated by the
object.
Cell
assemblies in the primate PFC hold tiny fragments of larger representations.
Individual cell assemblies work cooperatively to represent larger psychological
units known as chunks. George Miller (1956) hypothesized that perhaps we can hold “7 plus or minus 2” chunks at a time. Cowan (2001) has demonstrated that 4 chunks may be a more realistic number. If these chunks can be imitated by a neural
network, then it should be relatively simple to program the network to increase
the number of chunks and the size of the network, effectively increasing
processing resources in a way that is impossible in humans.
The
program software would translate natural language queries and other user
entries such as audio, video and still images, into instructions for the
operating system to execute. This would involve transforming the input into the
appropriate form for the system’s first layer of neural nodes. Simulating a
simple neural network on von Neumann technology requires numerical database
tables with many millions of rows to represent its connections, which can
require vast amounts of computer memory. It is important to select a computing
platform or hardware architecture that will support this kind of software.
This system will eventually have to embrace a utility-function model. Generally, these models allow the agent to sense the current state of the world, predict the outcomes of the multiple potential actions available to it, determine the expected utility of these actions, and execute the action that maximizes expected utility. These decisions should be driven by probabilistic reasoning that chooses actions based on probability distributions over possible outcomes. Furthermore, the device should eventually assume a hybrid architecture between a reflex agent (which bypasses use of the association areas) and a decision-theoretic agent. Every time a problem is solved using explicit deliberation, a generalized version of the solution is saved for future use by the reflex component. This will allow the device to construct a large common sense knowledge base of both implicit and explicit behaviors.
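A hedged sketch of this hybrid scheme follows; the function names and the toy outcome distributions are assumptions made for illustration, not the device's actual decision procedure.

```python
# Hypothetical sketch of the hybrid agent: a decision-theoretic step that
# maximizes expected utility, plus a reflex cache that stores deliberated
# solutions for later reuse without deliberation.

def expected_utility(action, outcome_probs, utility):
    """Sum utility over the probability distribution of outcomes."""
    return sum(p * utility(outcome)
               for outcome, p in outcome_probs[action].items())

reflex_cache = {}  # state -> previously deliberated action

def choose(state, actions, outcome_probs, utility):
    if state in reflex_cache:            # implicit, reflex route
        return reflex_cache[state]
    best = max(actions,                  # explicit deliberation
               key=lambda a: expected_utility(a, outcome_probs, utility))
    reflex_cache[state] = best           # generalize for future reuse
    return best

# Toy usage: EU(go) = 0.7*10 + 0.3*(-5) = 5.5 beats EU(wait) = 1.0.
probs = {"go": {"win": 0.7, "lose": 0.3}, "wait": {"win": 0.4, "lose": 0.6}}
util = {"win": 10, "lose": -5}.get
print(choose("s0", ["go", "wait"], probs, util))  # -> "go"
```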
Behavioral Output
To accomplish overt behavior, higher-order activity is fed not only to the lower sensory nodes but also, in a similar top-down manner, to a behavior module that will guide natural language output and other behaviors such as robotic control. The final layer of nodes in this behavior module will directly control movement and verbalization, and its higher nodes will be continuous with the higher-order, PFC-like nodes. The software functions in an endless loop of reciprocating transformations between sensory nodes, motor nodes, and the PFC-like buffer.
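The loop might be sketched as follows; every module interface here (sensory, pfc_buffer, motor, and their methods) is a hypothetical placeholder for the components described in this disclosure.

```python
def run(agent):
    """Hypothetical main loop: reciprocating transformations between
    lower-order sensory nodes, the PFC-like buffer, and motor nodes."""
    buffer = agent.pfc_buffer  # limited-capacity store of higher-order nodes
    while True:  # the software functions in an endless loop
        # Build a composite topographic map from environmental input
        # plus the top-down specifications held in the buffer.
        sensory_map = agent.sensory.build_map(agent.environment.input(),
                                              buffer.top_down_specs())
        # Extract salient features and iteratively update the buffer:
        # some features are added, the least relevant are dropped.
        buffer.iterative_update(agent.sensory.extract_salient(sensory_map))
        # The same top-down activity drives the behavior module
        # (natural language output, robotic control).
        agent.motor.act(buffer.top_down_specs())
```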
Knowledge
representation and knowledge engineering are central to AI research. Strong AI
necessitates extensive knowledge of the world and must represent things such
as: objects, properties, categories, relations between objects, events, states,
time, causes, effects, and many others (Moravec, 1988). Many problems in AI can be
solved, in theory, by intelligently searching through many possible solutions.
Logical proof can be attained by searching for a path that leads from premises
to conclusions, where each step is the application of an inference rule.
Planning algorithms search through trees of goals and subgoals, attempting to
find a path to a target goal, a process called means-end analysis.
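A minimal sketch of means-end analysis over a goal-subgoal tree, assuming each goal either is directly achievable or decomposes into ordered subgoals (the goal names and data structures are illustrative only):

```python
def means_end(goal, subgoals, achievable, plan=None):
    """Depth-first search from a target goal down to achievable actions.
    `subgoals` maps a goal to the ordered subgoals that accomplish it."""
    plan = [] if plan is None else plan
    if goal in achievable:
        plan.append(goal)      # a primitive action: add it to the plan
        return plan
    for sub in subgoals.get(goal, []):
        means_end(sub, subgoals, achievable, plan)  # decompose further
    return plan

subgoals = {"make_tea": ["boil_water", "steep"],
            "boil_water": ["fill_kettle", "heat"]}
achievable = {"fill_kettle", "heat", "steep"}
print(means_end("make_tea", subgoals, achievable))
# -> ['fill_kettle', 'heat', 'steep']
```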
In order to solve problems, AI systems generally must have a number of attributes: 1) a way of representing knowledge with syntax and semantics; 2) an ability to search a problem set; 3) a capacity for propositional and first-order logic; and 4) an ability to use knowledge to perform searches, accomplish constraint satisfaction, plan, infer, perform probabilistic reasoning, maximize utility, and act under uncertainty. Developed computational systems exist for each of these capabilities. The present device will not have any of these attributes before its training commences; these abilities will be emergent in its network, provided that it has the proper training examples. For instance, when it creates a topographic map from high-order specifications, it is searching its knowledge base for the most probable way to codepict or propositionalize the specifications in a logical, veridical fashion based on prior probability.
The imagery that is created is based either on external input or on internal, top-down specifications. Imagery is assessed using more imagery. Because each image will be assessed for appetitive or aversive content, the architecture depicted in FIG. 5 will be copied onto two separate, yet nearly identical, systems: one fine-tuned for approach behaviors and the other for withdrawal behaviors.
In evolutionary algorithms, an initial population of solutions/agents is created and evaluated. New members of the population are created through mutation and crossover. The updated population is then evaluated, and agents are either deleted or selected based on their fitness value (or performance).
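A minimal sketch of this loop, with an arbitrary toy fitness function:

```python
import random

def evolve(fitness, pop_size=20, genome_len=8, generations=50):
    """Minimal evolutionary loop: evaluate, select, crossover, mutate."""
    pop = [[random.random() for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)       # evaluate and select
        survivors = pop[: pop_size // 2]          # the fittest half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(genome_len)    # one-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(genome_len)      # point mutation
            child[i] = random.random()
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve(fitness=sum)  # toy fitness: maximize the sum of the genes
print(round(sum(best), 2))
```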
From distributed sensors, data is sent to the AISYS for processing, as shown in Figure 2.1. The AISYS performs advanced data mining and pattern recognition for detection, tracking, processing, prediction, and control, which allows the system to recognize and process a large class of input data in multi-dimensional space. Related capabilities include adaptive control optimization, multi-agent systems, and problem solving in a dynamic environment.
References
Amaral
DG. 1987. Memory: Anatomical organization of candidate brain regions. In:
Handbook of Physiology; Nervous System, Vol V: Higher Function of the Brain,
Part 1, Edited by Plum F. Bethesda: Amer. Physiol Soc. 211-294.
Baars,
Bernard J. (2002) The conscious access hypothesis: Origins and recent evidence.
Trends in Cognitive Sciences, 6 (1), 47-52.
Baddeley,
A.D. (2007). Working memory, thought and action. Oxford: Oxford
University Press.
Chalmers,
D.J. 2010. The Character of Consciousness. Oxford University Press.
Crick F, Koch C. 2003. A framework for consciousness. Nature Neuroscience. 6(2): 119-126.
Damasio
AR. Time-locked multiregional retroactivation: A systems level proposal for the
neural substrates of recall and recognition. Cognition, 33: 25–62, 1989.
Edelman, G. Neural Darwinism: The Theory of
Neuronal Group Selection
(Basic Books, New York 1987).
Fujii H, Ito H, Aihara K, Ichinose N, Tsukada M. 1996. Dynamical cell assembly hypothesis – Theoretical possibility of spatio-temporal coding in the cortex. Neural Networks. 9(8): 1303-1350.
Fukushima, Kunihiko (1975).
"Cognitron: A self-organizing multilayered neural network".
Biological Cybernetics 20 (3–4): 121–136. doi:10.1007/BF00342633. PMID 1203338.
Fuster JM. 2009. Cortex and Memory: Emergence
of a new paradigm. Journal of Cognitive Neuroscience. 21(11): 2047-2072.
Gurney, KN. 2009. Reverse engineering the
vertebrate brain: Methodological principles for a biologically grounded
programme of cognitive modeling. Cognitive Computation. 1(1) 29-41.
Hawkins, Jeff w/ Sandra Blakeslee
(2005). On Intelligence, Times Books, Henry Holt and Co.
Hebb, Donald (1949). The Organization of
Behavior. New York: Wiley.
Kohonen, Teuvo. 2001. Self-Organizing Maps. Springer-Verlag Berlin Heidelberg: New York.
Klimesch W, Freunberger R, Sauseng P. 2010. Oscillatory mechanisms of process binding in memory. Neuroscience and Biobehavioral Reviews. 34(7): 1002-1014.
Kurzweil, R. (2012). How to Create a Mind:
The Secret of Human Thought Revealed. Viking Adult.
Kurzweil, Ray (2005). The Singularity is
Near. Penguin Books. ISBN 0-670-03384-7.
Lansner
A. 2009. Associative memory models: From the cell-assembly theory to biophysically
detailed cortex simulations. Trends in Neurosciences. 32(3):179-186.
Luger, George; Stubblefield, William
(2004). Artificial Intelligence: Structures and Strategies for Complex Problem
Solving (5th ed.). The Benjamin/Cummings Publishing Company, Inc. ISBN
0-8053-4780-1.
Riesenhuber M, Poggio T. 1999. Hierarchical models of object recognition in cortex. Nature Neuroscience. 2(11): 1019-1025.
McCarthy, John; Hayes, P. J. (1969).
"Some philosophical problems from the standpoint of artificial
intelligence". Machine Intelligence 4: 463–502.
McCulloch, Warren; Pitts, Walter, "A
Logical Calculus of Ideas Immanent in Nervous Activity", 1943, Bulletin of
Mathematical Biophysics 5:115-133.
Meyer
K, Damasio A. Convergence and divergence in a neural architecture for
recognition and memory. Trends in
Neurosciences, vol. 32, no. 7, 376–382, 2009.
Minsky, Marvin (2006). The Emotion
Machine. New York, NY: Simon & Schuster. ISBN 0-7432-7663-9.
Moravec, Hans (1988). Mind Children.
Harvard University Press. ISBN 0-674-57616-0.
Moscovitch M. 1992. Memory and Working-with-memory: A component process model based on modules and central systems. Journal of Cognitive Neuroscience. 4(3): 257-267.
Moscovitch M, Chein JM, Talmi D & Cohn M.
Learning and memory. In Cognition, brain, and consciousness: Introduction to cognitive neuroscience.
Edited by BJ Baars & NM Gage. London, UK: Academic Press; 2007, p. 234.
Nilsson, Nils (1998). Artificial
Intelligence: A New Synthesis. Morgan Kaufmann Publishers. ISBN
978-1-55860-467-4.
Reser, J. E. (2011). What Determines
Belief: The Philosophy, Psychology and Neuroscience of Belief Formation and
Change. Saarbrucken, Germany: Verlag Dr. Muller.
Reser, J. E. (2012). Assessing the
psychological correlates of belief strength: Contributing factors and role in
behavior. (Doctoral Dissertation). Retrieved from University of Southern
California. Usctheses-m2627.
Reser, J. E. The Neurological
Process Responsible for Mental Continuity: Reciprocating Transformations
between a Working Memory Updating Function and an Imagery Generation System.
Association for the Scientific Study of Consciousness Conference. San Diego CA,
12-15th July 2013.
Rochester, N.; J.H. Holland, L.H. Haibt,
and W.L. Duda (1956). "Tests on a cell assembly theory of the action of
the brain, using a large digital computer". IRE Transactions on
Information Theory 2 (3): 80–93.
Rosenblatt, F. (1958). "The Perceptron: A Probabilistic Model For Information Storage And Organization In The Brain". Psychological Review 65 (6): 386–408. doi:10.1037/h0042519. PMID 13602029.
Rumelhart, D.E; James McClelland (1986).
Parallel Distributed Processing: Explorations in the Microstructure of
Cognition. Cambridge: MIT Press.
Russell, Stuart J.; Norvig, Peter (2003),
Artificial Intelligence: A Modern Approach (2nd ed.), Upper Saddle River, New
Jersey: Prentice Hall, ISBN 0-13-790395-2
Sherrington, C.S. (1942). Man on his
nature. Cambridge University Press.
Turing, Alan (1950), "Computing
Machinery and Intelligence", Mind LIX (236): 433–460.
J.C.
Bezdek and S.K. Pal, Fuzzy Models for Pattern Recognition: Methods that Search
for Structure in Data, IEEE Press, 1992.
J.C.
Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum
Press, 1981.
Mathworks, Inc., Fuzzy C-Means Clustering, Fuzzy
Logic Toolbox (Manual), 2013.
Z.
Michalewicz, Genetic Algorithms + Data Structures = Evolutionary Programs, 2nd
Ed., Springer-Verlag, 1994.
Mathworks,
Inc., Genetic Algorithm, Global Optimization Toolbox (Manual), 2013.
S.
Haykin, Neural Networks, A Comprehensive Foundation, Macmillan, 1994.
Mathworks,
Inc., Generalized Regression Networks, Neural Network Toolbox (Manual), 2013.
J.S.R.
Jang, C.T. Sun, and E. Mitzutani, Neuro-Fuzzy and Soft Computing: A
Computational Approach to Learning And Machine Intelligence, Prentice-Hall,
1997.
C.T.
Lin and C.S. George Lee, Neural Fuzzy Systems: A Neuro-Fuzzy Synergism to
Intelligent Systems, Prentice Hall, 1996.
A.A.
Hopgood, Intelligent Systems for Engineers and Scientists, second edition, CRC
Press, 2001.
C.
Harris, X. Hong, and Q. Gan, Adaptive Modeling, estimation and Fusion from
Data, Springer-Verlag, 2002.
This invention solves the problem of creating mental states with current computing methods, which use linear memory and discontinuous processing states. Contemporary artificial intelligence agents either use sequential symbolic processing, which dates back to Alan Turing's Turing machine, or parallel, connectionistic processing with discrete functional states that have a beginning and an end. Both types of processing have many functional constraints and are very different from how the mammalian brain processes information.
Most known AI systems are only capable of responding in the manner that their human programmers provided for when the program was written. It is recognized that it would be valuable to have a computer that does not respond in a preprogrammed manner.
The system must actively model what it is reading in order to understand and remember it.
The lowest layer must be for environmental
input and not for internally generated imagery.
The imagery generation module holds all of the system's internal knowledge. The AI system would have a separate module for encyclopedic knowledge, which would need to be “reread” every time the imagery generation system was given more memory or nodes.
Many futurologists have warned us that… But aggression is not inherent to consciousness, working memory, or mental continuity. These traits are inherent to our consciousness only because of the natural predators that all animals experience and the dominance hierarchies found in mammals. The remedy is straightforward: do not build an amygdala, insula, anterior cingulate cortex, or septal area. Neural networks are black boxes that do not record their processes, and even if those processes were recorded, they would be uninterpretable and unintelligible because of their vast complexity. This has caused many researchers in AI to fear that we could never know what an AI is thinking. However, an AI system that generates topographic maps in the course of its processing would not be able to control or conceal these maps, and using the present approach it would be easy to make the neural network's topographic maps visible on a computer screen. This would allow us to see the AI agent's mental imagery and to infer what thoughts are going through its head. We could then read and record the AI's thoughts and ensure that it is not homicidal or planning anything illegal or detrimental to humanity. Because this imagery is generated unconsciously and automatically, it could be made impossible for the AI to misrepresent or hide its thoughts, giving humans a front-row seat to the computer's stream of thinking.
Working memory elements 1-3 = A, B, C
Keep holding (B, C)
Encounter D, encode into working memory
Working memory elements 1-3 = B, C, D
The system performs continuous endogenous processing that is perpetuated by a specific pattern of search. It holds a number of ensembles within a limited-capacity FOA and STM, and allows these to spread activation to select the next set, demonstrating icSSC and iterative updating. This search function is similar to regression in that it chooses a set of inputs and classifies them by selecting relevant coactivates for them.
Thus, old (partially executed) information held in working memory from a previous invocation is combined with the information that just entered working memory, and then the procedure is executed repeatedly.
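The trace above can be reproduced with a short sketch; the drop-oldest policy shown is one assumption among the many update policies the text contemplates.

```python
def iterative_update(buffer, new_item, capacity=3):
    """Drop the least recently refreshed item, keep the rest,
    and encode the newly encountered item."""
    if len(buffer) >= capacity:
        buffer = buffer[1:]          # oldest item drops out of activation
    return buffer + [new_item]

state = ["A", "B", "C"]              # working memory elements 1-3
state = iterative_update(state, "D") # encounter D, encode it
print(state)                         # -> ['B', 'C', 'D']
```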
We will refer to a group of neurons that acts as an engram for a symbolic, consciously perceptible pattern as an “ensemble.” Ensembles are the neural instantiation of the “items of working memory” discussed previously. When a new ensemble is activated sufficiently, it is the computational product of the previous state, and it ushers a new representation into the FOA. Ensembles encode invariant patterns, such as objects, people, places, rules, and concepts. An ensemble is composed of cortical assemblies that became strongly bound due to approximately simultaneous activity in the past, amounting to an abstract, gestalt template.
Assemblies are discrete and singular, whereas ensembles are “fuzzy,” with boundaries that probably change each time they are activated. Assemblies correspond to specific, very primitive conjunctions and are required in great numbers to compose composite representations of complex, real-world objects and concepts. Ensembles are these composite representations and have variable, indefinite borders, as the experience of no two objects or concepts are exactly the same. Both assemblies and ensembles can be expected to demonstrate recursion, but it is the recursive behavior of ensembles that allows each state of working memory to be a revised iteration of the previous state.
Each repetition of a process in an iterative function is called an iteration, and the results (or output) of one iteration are used as the starting point (input) for the next iteration. Working memory uses the output from the previous iteration along with a subset of the inputs from the previous iteration together as the input for the current iteration. In information theory, feedback occurs when outputs of a system are routed back as causal inputs. The product of an associative search can be considered output. When this output shows sustained activity it can be considered “routed back as an input.” Thus not only does working memory exhibit aspects of recursion and iteration but of a feedback loop as well.
The iterative updating architecture may also enable working memory to implement learned algorithms. All learned mental operations and behaviors have algorithmic steps that must be executed in sequence to reach completion. For example, foraging, tying shoes, and performing long division all involve following an algorithm. Each brain state corresponds to a different step in the algorithm, and after being trained through experience, the activity of each state utilizes polyassociativity to recruit the items necessary for the next step. An item of working memory that is inhibited or allowed to decay may correspond to an action or mental operation, within a series of steps, which has already been executed or is no longer needed. Iteration may be instrumental in implementing learned algorithms, because virtually every step of an algorithm refers to the preceding and subsequent steps in some way.
Strategic accumulation of complementary items in STM may be another form of progressive modification.
Relaxed time constraints permit planning and world modeling.
Dynamical systems theory is a branch of mathematics that deals with systems that evolve, from one state to the next, through time. The evolution rule of a dynamical system is a function that describes how a current state gives rise to a future state.
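In the simplest discrete case, the evolution rule is a map applied repeatedly, so that the state at time t+1 is the rule applied to the state at time t. A brief sketch, with the logistic map as an arbitrary example rule:

```python
def trajectory(f, x0, steps):
    """Iterate the evolution rule f from the initial state x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1]))  # next state is a function of the current one
    return xs

print(trajectory(lambda x: 3.7 * x * (1 - x), 0.5, 5))  # logistic map
```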
Many theorists seem to think that continued advances in brain mapping, combined with continued advances in processing power, will inevitably lead to artificial consciousness even if the foundational structure of consciousness is never ascertained by cognitive neuroscience.
Early AI research was able to use step-by-step deduction, whereas neural networks cannot; humans, however, often rely on fast, intuitive judgments.
Recurrent neural networks provide feedback and short-term memories of previous input events.
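A minimal Elman-style recurrent step (random weights, purely illustrative) shows how the feedback connection gives the network a short-term memory of previous inputs:

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden = 4, 8
W_in  = rng.normal(size=(n_hidden, n_in))      # input -> hidden
W_rec = rng.normal(size=(n_hidden, n_hidden))  # hidden -> hidden (feedback)

h = np.zeros(n_hidden)  # the "short-term memory" of past inputs
for x in rng.normal(size=(10, n_in)):          # a stream of 10 inputs
    h = np.tanh(W_in @ x + W_rec @ h)          # state depends on history
print(h.round(2))
```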
A conditional sequence in philosophy is a connected series of statements.
If you found this
interesting, please visit aithought.com. The site delves into my model of
working memory and its application to AI, illustrating how human thought
patterns can be emulated to achieve machine consciousness and
superintelligence. Featuring over 50 detailed figures, the article provides a
visually engaging exploration of how bridging the gap between psychology and
neuroscience can unlock the future of intelligent machines.