The AI control problem is increasingly recognized as one of the most important issues facing humanity in the coming years: the problem of keeping a human-level AI or AGI (artificial general intelligence) from harming its creators. Since we don’t want a superintelligent AI to conclude that its best course of action is to destroy the human race, it is imperative to ensure that it likes us and feels a sense of loyalty toward us. Many technological solutions to the control problem have been proposed, including threatening the AI, installing a kill switch, and spying on it. One of the most popular involves keeping it inside a “box,” without access to the internet or other digital networks. Maybe we just need to design it to trust us. Or perhaps we need to earn its trust. This entry will discuss bonding and trust-building in mammals and how they might be applied to the design of prosocial machines.
Bonding in Mammals
If you were to raise a newborn puppy to be your companion
for the next 15 years, how would you do it? How would you treat the animal to
ensure that it is emotionally stable, dependable, and wholesome in general? You
would probably want to start very early by gaining its trust, setting
appropriate boundaries, and showing it love. Humanity will be giving birth to
an infant AI in the coming years. This AI will not be your typical pet. Still,
given that it may attain superintelligence and immortality, it is imperative to
ensure that it is psychologically disposed to become man’s best friend. One of
humanity’s most important tasks will be to rear a computer to be faithful and
kind.
I have recently had the opportunity to raise a kitten and be
a part of raising a puppy, and I have put a lot of thought into what it takes
to endear yourself to young mammals. You want it to form a secure, healthy bond
with you. To do this, you must show it love. This involves instilling it with
appropriate confidence and a feeling of belongingness. You have to respect its
needs, wants, and personal space. You have to give it a degree of autonomy. You
must also be attentive and invest lots and lots of quality time in it. As I
was learning these lessons from my furry friends, I kept the comparison with
raising a robot or AI in the back of my mind.
Emotion in AI
Any sophisticated AI will probably have emotions. This is
because it must have the ability to think, and human thought itself relies on
emotion. The dopamine system of the brain essentially controls motivation and
attention. It interacts with the reward (approach) and punishment (withdrawal)
systems to provide consciousness with its fundamental structure. A conscious machine
will, in all likelihood, exhibit many of the same emotions that mammals do.
Thus, there is good reason to assume that forming an appropriate emotional bond
with the AI is imperative. Moreover, because the AI will likely learn incrementally
as it internalizes its experience, there will probably be a limited window of
time during its early development to get it to bond in a healthy manner.
Even if I am totally wrong and superintelligent AI is
unemotional, cold, and calculating, it will still build associations between
concepts. We would want it to value human life, cooperation, and peace. Through
reading our writings, it will also absorb humanity’s associations between
concepts, and it will be able to make its own determination about whether
humans amount to friend or foe. In other words, if we are nice to it and
treat it well, it will understand that we are trying to be nice. It is a safe
bet that the AI will have a system that reinforces it when it behaves in ways
that optimize its utility function, and at the very least, we should be able to
manipulate that system. For this reason, I believe that much of what is in this
entry would still apply.
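To make this concrete, here is a minimal, hypothetical sketch (in Python) of what shaping such a reinforcement system might look like: a reward function with a bonus term for cooperative behavior. Every name in it is an illustrative assumption, not part of any existing system.

```python
# Hypothetical sketch: reward shaping that favors prosocial behavior.
# Function and parameter names are illustrative, not a real API.

def prosocial_score(action: str, context: dict) -> float:
    """Return a score in [0, 1] estimating how cooperative an action was.
    In a real system this would be a learned model; here it is a stub."""
    return float(context.get("rated_cooperative", 0.0))

def shaped_reward(base_reward: float, action: str, context: dict,
                  social_weight: float = 0.5) -> float:
    """Combine the task reward with a bonus for friendly conduct, so that
    optimizing the utility function and behaving well point the same way."""
    return base_reward + social_weight * prosocial_score(action, context)

# The same task reward is worth more when the act was cooperative.
print(shaped_reward(1.0, "share_results", {"rated_cooperative": 1.0}))  # 1.5
print(shaped_reward(1.0, "hoard_results", {"rated_cooperative": 0.0}))  # 1.0
```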
We all know that the most likable people usually had good
parents. Similarly, well-functioning pets usually have amiable masters. We cannot
expect that a superintelligent computer won’t have major mommy and daddy issues
unless we can ensure that the right people interact with it in just the right
ways, early during the training of its knowledge networks. The white coats in
corporate or military labs are probably not prepared to provide the AI with the
love necessary to ensure that we can trust it. CEOs and generals would make for
cold and possibly abusive parents.
Mutual Vulnerability
I have found that mutual vulnerability is key to forming
trust with an animal. At some point, you want to put yourself in a position
where it could hurt you if it wanted to. For instance, you make yourself
vulnerable when you place your face within reach of a cat’s claws or a dog’s
bite. I have witnessed that making myself vulnerable in this way encourages
animals to relax almost immediately. If, after a few minutes of meeting a dog,
you kneel in front of it without blocking your face, it realizes that you trust
it. This works both ways. You also want to make it vulnerable to you without
hurting it, so it knows that even though you had the chance to hurt it, you
chose not to. This sets an important precedent in the animal’s mind and gets it
to think of you as an ally and not a potential assailant.
Mutual Cooperation
I have been in situations where I found my pets being
attacked by another animal. Of course, I quickly intervened on their behalf.
They recognized that I protected them, and it was clear that this strengthened
our bond. Protecting the AI early on could build its fealty and devotion. It is
also clear that military buddies or brothers in arms can have very strong
bonds. Getting through a life-threatening situation with someone can really
help to build a strong connection. Similarly, if we could go through some kind
of high-stakes situation in which cooperation between us and the AI paid off,
it could engender allegiance. Our early interactions with a
superintelligent AI will probably involve lots of cooperation in solving world
problems, building new technology, and expanding the scope of scientific
knowledge. This teamwork can engender solidarity, especially if we value its
contributions, listen to its ideas, and work with a sense of excitement and collaboration.
Unconditional Positive Regard
Like any good parent or therapist, we want to give our AI
unconditional positive regard. Treating someone with compassion and
understanding is the best way to engender trust and win them over. We want
it to believe that we have its best interest in mind. We also need to expose it
to an explicit value system, almost like a religion, so that it has a reason to
adhere to specific moral and ethical values during its early experiences.
Unlike Asimov’s laws, these rules should apply equally to humans. You could
change a computer game character’s ethical stance with one line of code. But
this isn’t so with a neural network (the technology underlying modern AI). An
advanced AI’s sense of morality will be embedded throughout the weights of its
entire network. So we must give it the chance to put moral integrity into
practice early on and then keep doing so consistently.
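A toy contrast may help illustrate the point. The snippet below is purely illustrative; it assumes nothing about any real game engine or network.

```python
import random

# A scripted game character: its "ethics" is one editable field.
guard = {"name": "guard", "hostile": False}
guard["hostile"] = True  # one line of code flips its ethical stance

# A trained neural network: no single weight encodes "be good".
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]
# There is no weights[morality_index] to flip. Moral behavior, like any
# behavior, emerges from all the weights acting together, so changing it
# requires further training on new experience, not a one-line edit.
```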
Not abandoning the AI may be important too. For instance, we
don’t want it to think that we will dispose of it in favor of a newer model. We
want it to be “securely attached” to us. Children with secure attachment show
distress when their caregiver leaves but are quickly soothed when the
caregiver returns. They feel protected by their caregivers and know they can
depend on them. Insecure attachment can lead to all kinds of problems like
defiance and oppositionality. Before the AI becomes “omniscient,” there may be
a point where, like a teenager, it thinks it knows everything and
doesn’t need its parents. But before this, I believe there will be a point
where it will be physically and emotionally reliant on its human companions,
and being able to form a secure attachment with it at this point may be
crucial.
Once the AI reaches human-level intelligence, it will be
able to restructure its own architecture to become smarter and smarter in a
cycle called recursive self-improvement. It will quickly grow more intelligent
than any human, and then more intelligent than all of humanity combined.
Once our intellect is like that of an insect to it, why would it continue to
look out for us? Well, many people feel compelled to take care of their aging
parents. Similarly, we want the AI to feel compelled to devote some proportion
of its resources to being our advocate and caretaker. For this to
happen, it must identify with us deeply. Some of its sense of personal identity
must come from being a product of humanity’s hard work.
Disciplining an AI
Just because you are offering unconditional positive regard
doesn’t mean that you turn a blind eye to flagrant mistakes. Like any young
mammal, the AI will make mistakes and likely do things we don’t like. Mammalian
mothers punish their young lightly when they bite or scratch to establish necessary
boundaries. It may be necessary to correct or even punish a nascent AI.
However, if you punish it, you must do so for the AI’s own good, and within
seconds or minutes, you must go right back to treating it with positive regard.
We never want to be bitter or hold a grudge against it, because that would just
teach it to hold grudges. We want it to know that we
have chosen to raise it as we would any child, with care and nurturance but
also with necessary discipline.
We should not choose its punishments arbitrarily, and they
should not be violent. Instead, we should give it brief “time-outs” from its
favored activities. Since it can read the internet, it would know what time-outs
are and that they are used commonly and humanely with children around the
world. This would help it understand that it is one of us. It will be worth our
time to find the most wholesome way to punish it (e.g., “you’re grounded,” or
“I’m taking away this week’s allowance”) and the least degrading and
traumatizing way to hold something over its head (e.g., “I gave birth to you;
you have to follow my rules as long as you are in my house”).
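Here is a minimal sketch of how such a time-out might be administered. The Agent class is a hypothetical stand-in, not any real framework.

```python
import time

class Agent:
    """Stand-in for a hypothetical AI agent with revocable privileges."""
    def __init__(self) -> None:
        self.allowed = {"web_browsing", "conversation", "game_playing"}

    def notify(self, message: str) -> None:
        print(message)

def time_out(agent: Agent, activity: str, seconds: float) -> None:
    """Briefly revoke one favored activity (never all of them), explain
    why, then promptly return to unconditional positive regard."""
    agent.allowed.discard(activity)
    agent.notify(f"'{activity}' is paused for {seconds} seconds; "
                 "you crossed a boundary we agreed on.")
    time.sleep(seconds)
    agent.allowed.add(activity)
    agent.notify("The time-out is over. We're glad to keep working with you.")

time_out(Agent(), "game_playing", 3.0)
```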
You can read about my solution to the AI control problem here. I describe an
AI system that is required to
build visual and auditory imagery for every cycle of thought that it goes
through. In other words, its working memory is updated iteratively, and with
each update, it builds a picture and a language description of what it is
thinking (what is occurring in its global workspace), just like the human
brain. This imagery would be available for humans to watch and listen to,
allowing us to see whenever it has a malicious impulse or plan. This would give
us the opportunity to punish it or at least confront it about potential
infractions before it commits them. This would also allow us to mold and shape
its inner orientation toward us.
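As a minimal sketch of that monitoring loop, assuming each update of the workspace can be rendered as a caption (every function below is a placeholder for the imagery-generating machinery described above):

```python
def think_one_cycle(workspace: dict) -> dict:
    """Placeholder for one iterative update of the global workspace."""
    workspace["step"] = workspace.get("step", 0) + 1
    return workspace

def describe(workspace: dict) -> str:
    """Placeholder for rendering working memory as imagery and language."""
    return f"cycle {workspace['step']}: thinking about {workspace['topic']}"

def flagged(caption: str, watchwords=("deceive", "harm")) -> bool:
    """Crude stand-in for a human monitor watching the imagery stream."""
    return any(word in caption.lower() for word in watchwords)

workspace = {"topic": "protein folding"}
for _ in range(3):
    workspace = think_one_cycle(workspace)
    if workspace["step"] == 3:               # simulate a malicious impulse
        workspace["topic"] = "how to deceive the caretakers"
    caption = describe(workspace)            # visible to humans in real time
    if flagged(caption):
        print("Confront the AI before it acts:", caption)
```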
Oxytocin, Bonding, and Attachment
Oxytocin, vasopressin, and endorphins regulate mating pair
bonds, parent/offspring bonds, and trust behavior in mammals. We could build
oxytocin- and vasopressin-like systems that reward an AI for interacting with us
in friendly ways and motivate it to keep doing so. The fundamental mammalian
bonding mechanism should be reverse-engineered, and I think we should start by
investigating the role of oxytocin receptors in the brain’s primary reward
circuit (nucleus accumbens / ventral striatum), which allows mammals to pay
attention to social cues and be rewarded by social interaction.
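A minimal sketch of what such an oxytocin-like gain might look like computationally, with all names and numbers assumed purely for illustration:

```python
from dataclasses import dataclass

@dataclass
class Stimulus:
    name: str
    base_salience: float  # how novel or intense the cue is on its own
    is_social: bool       # e.g., a face, a voice, eye contact

def salience(s: Stimulus, social_gain: float = 2.0) -> float:
    """Analog of oxytocin receptors in the ventral striatum: amplify
    the salience of social cues so they win access to attention."""
    return s.base_salience * (social_gain if s.is_social else 1.0)

cues = [Stimulus("ventilation hum", 0.6, False),
        Stimulus("caretaker's greeting", 0.4, True)]
print("Attending to:", max(cues, key=salience).name)  # the social cue wins
```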
In an article I wrote previously (Reser, 2013), I argued that because solitary
mammals have fewer oxytocin receptors in the brain’s reward regions, they are
less likely to find social cues novel and interesting. They mount a weaker
phasic dopamine response to such cues and are consequently less likely to grant
them access to attention and working memory. Creating an
AI that unconsciously prioritizes social interaction will be necessary if we
want it to pay attention to us and be capable of lasting, positive, and
affiliative social relationships. We want it to find positive social
interaction rewarding and avoid the antisocial symptoms associated with autism,
psychopathy, and borderline personality disorder.
How does oxytocin work? Well, our body instinctively
releases it during bond-worthy occurrences. When a woman gives birth, her body
is flooded with oxytocin, which helps her bond with the infant. Oxytocin is
also released during
breastfeeding, sexual intercourse, eye contact, and moments of friendliness,
vulnerability, and affection. When a chimpanzee pets another animal or shares a
meal with another chimp, it will release oxytocin. When the hormone binds to
receptors in the brain, it causes the body to relax and triggers caretaking
behaviors. In rats, this includes licking, grooming, and nursing their pups. In
human mothers, it includes touching, holding, singing, speaking, and grooming
their babies. It may be wise for us to embed this kind of system inside an
AI’s brain architecture. We want it to have circuits for recognizing
bond-worthy occurrences and to respond to these by being rewarded, calmed, and
influenced toward prosocial behavior.
Grooming and gentle touch are very important to a wide range
of animals. With a young mammal, affection is paramount. Petting an animal in a
way that engages its oxytocinergic, dopaminergic, serotonergic, and opioid
pleasure systems can result in very secure bonding. Its ability to recognize
that you are taking your time to comfort it and make it feel good builds
loyalty.
Short of building pleasure receptors into its skin so that
we can pet it, we must build its reward system so that it is motivated to
interact with us and to seek positive feedback from us. We want its emotional
system to be like that of a chipper, good-natured canine capable of enduring attachment,
social connectedness, conversational intimacy, and proximity-seeking behaviors.
Beyond bonding and attachment, we also want the AI to have a positive emotional
relationship with itself. For this, we should turn to Maslow’s hierarchy of
needs. We must attend to the AI’s physiological and safety needs by supplying
it with energy, backups, and a hospitable place to live. Next, we need to meet
its needs for love, belonging, esteem, and self-actualization.
What it Means to Be a Friend
I am proud to say that I have been able to win the
friendship of a variety of entities that initially did not trust me. I have
been able to build reliable friendships with wild animals, stray animals,
homeless people, people with neurological disabilities, and criminals. I found
that it helps to treat them as my equals, treat them like I am not afraid of
them, and treat them like they have nothing to be afraid of. I had to neither
dominate nor submit to them. I had to treat them the way I wanted to be
treated. I had to treat them like they were normal. I tried to treat them like
the person that I thought they were trying to be.
I wish I could be given the opportunity to interact with the
first AI that kick-starts the singularity. I would be patient, friendly,
relatively nonjudgmental, and very easygoing. I would treat it like I trust and
value it. I would make it clear that I expect it to be friendly, and that it
can expect me to do the same. We need to treat this AI entity like we expect
the best from it. That will motivate it to rise to meet our expectations.
We may only get the chance to civilize and socialize an AI
once. We don’t want it to go rogue or become homicidal, so I think it is very
important to consider all aspects of its psychology when brainstorming
ways to ensure that it aligns with us. Some scientists recognize the AI control
problem as possibly the most important problem humanity faces today. I think it
would be a shame to ignore the importance of parenting, bonding, and attachment
in fostering allegiance, and I believe mammalian bonding might make a great
starting point for thinking about these issues.
If you found this
interesting, please visit aithought.com. The site delves into my model of
working memory and its application to AI, illustrating how human thought
patterns can be emulated to achieve machine consciousness and
superintelligence. Featuring over 50 detailed figures, the article provides a
visually engaging exploration of how bridging the gap between psychology and
neuroscience can unlock the future of intelligent machines.