Thursday, July 3, 2025

How Hemispheric Valence Could Help Solve the Brain’s Credit-Assignment Problem


I just finished listening to an interview with the “Godfather of AI,” Geoffrey Hinton, and he was saying that the main thing we don’t understand about the brain is how it determines when to learn and when to unlearn. He emphasized that we don’t know how real brains determine whether to increase or decrease a connection weight at the level of the cell, dendrite, and synapse. Currently, artificial neural networks do this through a mathematical process called “backpropagation,” but if we knew how the brain accomplished it, that could lead to the construction of truly intelligent AI. Well, I’ve spent a lot of time reading about neuroscience, and this is something I’ve never understood either; I can see why the rest of the field hasn’t been able to figure it out. But for the last few years I have had a sneaking suspicion that it may relate to hemispheric laterality.

What if this decision to learn or unlearn actually starts with the decision of which side of the brain to do the learning on? Our brains are fundamentally split down the middle into two roughly equal cortical hemispheres. One side is associated with approach and the other with withdrawal. When you reach for that cookie, the left side is more active in guiding your arm; when you step back from a loud noise, the right side presides over that movement. Maybe modern neuroscience has everything right, and researchers just have not yet appreciated how all interactions with the world are funneled through one of two filters: the left hemisphere for approach (learning) or the right hemisphere for withdrawal (unlearning). This split, laterality, is almost as old as animals themselves; jellyfish don’t have it, but the vast majority of animals do, and I believe that’s because it plays a role so elementary and functional that it has gone mostly unrecognized. More specifically, maybe the brain doesn’t unlearn by decreasing connection weights. Maybe instead it learns in one of two functionally distinct hemispheres, and the fact that these two hemispheres are constantly working together obscures this.



How plasticity is regulated is a core unresolved problem in neuroscience. Neuroscientists usually call this the credit-assignment problem (or synaptic credit-assignment). The phrase covers a family of questions: Which neurons and synapses “deserve credit or blame” for a given behavioral success or failure? How are those specific sites located and tagged so that plasticity mechanisms (LTP/LTD) know whether to strengthen or weaken them? How is the error signal delivered with the right timing and specificity—especially when the reward or punishment can be distal in time from the activity that caused it?

Hinton is right that artificial neural networks use a clean mathematical mechanism, backpropagation, to determine when to strengthen or weaken a weight, but the biological brain has no known analog. Despite identifying mechanisms like long-term potentiation (LTP) and long-term depression (LTD), neuroscience hasn’t answered the control question: What tells a neuron or synapse to potentiate or depress in a given moment?
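For contrast, here is the artificial version of that decision in its simplest form: the sign of the loss gradient tells each weight whether to grow or shrink, and backpropagation is just the chain-rule bookkeeping that delivers that signed gradient to every weight in a deep network. A minimal, self-contained sketch (a single linear neuron with a squared-error loss; the variable names are mine, not from any framework):

```python
# One linear "neuron": prediction = w * x, trained by gradient descent.
w = 0.5                      # connection weight
x, target = 1.0, 2.0         # input and desired output
lr = 0.1                     # learning rate

for step in range(5):
    prediction = w * x
    error = prediction - target      # signed error
    grad = error * x                 # dLoss/dw for squared-error loss
    w -= lr * grad                   # the SIGN of grad decides the direction:
                                     #   grad < 0 -> weight grows  ("potentiate")
                                     #   grad > 0 -> weight shrinks ("depress")
    print(f"step {step}: w = {w:.3f}")
```

The whole decision of strengthen-versus-weaken collapses into one signed number per weight, which is exactly the quantity nobody has found a biological courier for.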

I am proposing that the brain doesn’t primarily control plasticity by explicitly deciding to increase or decrease a connection. Instead, it routes learning experiences through either the left or right hemisphere. The left hemisphere encodes learning in an “approach” frame, while the right hemisphere encodes in a “withdrawal” frame. What appears as “unlearning” may not start with a synaptic weakening, but with a counterbalancing via opposite-valence encoding in the other hemisphere. This creates a hidden dialectic of encoding, in which two types of learning, approach and avoidance, compete and interact over time, giving the appearance of weight modulation when it may actually be a net representational shift due to hemispheric negotiation.
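To make the intuition concrete, here is a deliberately tiny sketch of the idea, not a model of any real circuit; the stores, names, and numbers are invented for illustration. Both “hemispheres” only ever strengthen, yet the net disposition toward a stimulus can still fall:

```python
# Toy illustration: two potentiation-only stores, one per "hemisphere".
# Neither weight ever decreases; apparent "unlearning" is a counterweight
# accumulating on the other side.

approach = {}    # left-hemisphere-style encoding (reward-framed)
withdraw = {}    # right-hemisphere-style encoding (aversion-framed)

def encode(stimulus, outcome_is_good, strength=1.0):
    """Route the experience by valence; only ever add (LTP-like)."""
    store = approach if outcome_is_good else withdraw
    store[stimulus] = store.get(stimulus, 0.0) + strength

def net_disposition(stimulus):
    """Behavior reflects the difference between the two encodings."""
    return approach.get(stimulus, 0.0) - withdraw.get(stimulus, 0.0)

encode("cookie", outcome_is_good=True)     # tasty: approach trace grows
encode("cookie", outcome_is_good=True)
print(net_disposition("cookie"))           # 2.0 -> reach for it

encode("cookie", outcome_is_good=False, strength=3.0)   # stomach ache
print(net_disposition("cookie"))           # -1.0 -> now avoided, yet no
                                           # weight was ever lowered
```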

The left prefrontal cortex is reliably linked to positive affect, goal pursuit, and approach motivation. The right prefrontal cortex is linked to negative affect, inhibition, vigilance, and withdrawal. These are not just perceptual biases—they influence attention, motor readiness, and learning strategies. Some research suggests left-dominant individuals learn better from positive reinforcement, while right-dominant individuals are more sensitive to punishment or negative feedback. The two hemispheres often inhibit each other reciprocally, suggesting that information routed through one suppresses the other’s processing. This would create a seesaw of plasticity—one hemisphere encoding and the other being downregulated in the same domain.
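One way to picture that seesaw numerically: if each side’s activity partly suppresses the other’s, the side more engaged by an event ends up dominating and, in this picture, receiving most of the encoding. A rough sketch with arbitrary constants, not fit to any data:

```python
# Seesaw via mutual inhibition: each side's drive is partly cancelled by
# the other side's activity, so one side dominates the encoding.

def settle(drive_left, drive_right, inhibition=0.8, steps=50):
    a_left, a_right = drive_left, drive_right
    for _ in range(steps):
        a_left  = max(0.0, drive_left  - inhibition * a_right)
        a_right = max(0.0, drive_right - inhibition * a_left)
    return a_left, a_right

# A mildly positive event with a hint of threat:
left, right = settle(drive_left=1.0, drive_right=0.6)
print(left, right)            # settles at (1.0, 0.0): left wins, right is silenced

# If plasticity is gated by activity, most encoding lands on the dominant side.
lr = 0.1
delta_left, delta_right = lr * left, lr * right
```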

In this model, “unlearning” isn’t always a literal synaptic weakening, but instead:
- A switch in encoding locus, from the approach hemisphere to the withdrawal hemisphere (or vice versa).
- The integration or resolution of competing hemisphere-encoded representations, which creates behavioral flexibility or change.
- The experience of “forgetting” may be inhibitory masking, not destructive erasure (a small sketch of this follows below).
- Learning always occurs, but where and how it occurs depends on whether the experience is framed as rewarding or aversive.
- Plasticity control is not primarily local (per-synapse) but is topologically gated by motivational framing that routes learning to one hemisphere or the other.
- What appears as synaptic weakening or forgetting is actually the emergence of a counterweight in the opposite hemisphere.
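The “masking, not erasure” point in the list above can be made concrete in the same toy terms as before. The old approach trace sits there untouched; what changes is how strongly the opposing trace is allowed to weigh in, which I sketch here with a made-up contextual gate:

```python
# "Forgetting" as inhibitory masking: the original trace is intact, and a
# contextual gate determines how much the opposing encoding is expressed.

approach_trace = 2.0    # old, reward-framed learning (never weakened)
withdraw_trace = 3.0    # newer, aversion-framed counter-encoding

def expressed_behavior(context_gate):
    """context_gate in [0, 1]: how fully the withdrawal encoding is expressed."""
    return approach_trace - context_gate * withdraw_trace

print(expressed_behavior(context_gate=1.0))   # -1.0: looks "unlearned"
print(expressed_behavior(context_gate=0.2))   #  1.4: the old behavior re-emerges,
                                              #  showing nothing was erased
```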

Lateralization is seen in fish, amphibians, birds, mammals, and invertebrates. Why is it so persistent? Because lateralization allows for parallel, opponent-processing systems: one to engage with novelty (left = approach) and one to monitor and inhibit threat (right = withdrawal). Routing plasticity through these systems allows for redundant encoding: there is no need to destroy old memories, just encode an opposing one.

In this model:
- “Learning” = encoding a stimulus–response–valence relationship in one hemisphere.
- “Unlearning” = new encoding in the opposite hemisphere, with a contradictory valence.
- The behavior that results reflects the net influence of these competing encodings, mediated by interhemispheric inhibition and contextual signals.

This is analogous to opponent-process theory in motivation and emotion—but now applied to physical learning loci. This hypothesis reframes learning control not as a question of when to update weights, but where to place the updates in a brain that encodes motivational context spatially. It suggests the “control problem” is distributed and evolved, not centrally computed. It also addresses the paradox of stability vs. flexibility: instead of editing the past, the brain layers new motivational perspectives onto it.

A name like Lateralized Valence Learning (LVL) may strike the right balance. I wouldn’t say that anything is routed to one specific hemisphere over another. The experience is already there in both hemispheres, but the environmental feedback triggers perceptual and emotional systems that tag the experience as confirmatory or disconfirmatory for the prediction in question.

So I think the issue that Hinton grapples with, and one I have grappled with a lot too, is this: when the brain makes a wrong prediction, how does it remember that the prediction was wrong? How does it, right there and then, tag that thought as negated or unjustified so that we can learn from our mistakes? It can’t all happen in the amygdala, individual cells can’t know whether something is wrong or right, and hormones and neuromodulators are too slow. Brains capture sequences of decisions and their environmental feedback, and those sequences contain multiple instances of being right or wrong. How can the brain remember which was right and which was wrong for every instance, and go back and make all of the slow molecular changes appropriately, even as the clock continues to run? Well, each decision and its environmental feedback is registered in both hemispheres, but if the brain made the wrong prediction, that activation pattern is coupled with long-term potentiation (LTP) in the hemisphere associated with withdrawal. Thus, wrong predictions are not “erased” but re-encoded, via LTP, in the withdrawal hemisphere. There is no need for synapse-specific error signaling.
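In the same toy terms used earlier, that bookkeeping could look roughly like this (again, purely illustrative): a run of predictions and their delayed outcomes gets replayed, and the only “error signal” each experience needs is which store receives the potentiation.

```python
# A run of predictions and their (possibly delayed) outcomes. The only
# per-experience signal needed is a valence tag that picks the store;
# no synapse ever receives a signed, graded error of its own.

episode = [
    ("dark alley, walked in",   "got startled",     False),
    ("saw cookie, grabbed it",  "tasted great",     True),
    ("heard noise, ignored it", "nothing happened", True),
    ("saw cookie, grabbed it",  "stomach ache",     False),
]

approach, withdraw = {}, {}

for prediction, feedback, confirmed in episode:
    store = approach if confirmed else withdraw            # valence routing
    store[prediction] = store.get(prediction, 0.0) + 1.0   # LTP only

def net(prediction):
    return approach.get(prediction, 0.0) - withdraw.get(prediction, 0.0)

print(net("saw cookie, grabbed it"))   # 0.0: the success and the mistake are
                                       # both kept, as opposing counterweights
```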

If this framing is even partly correct, then the brain’s solution to Hinton’s “credit-assignment” riddle isn’t some elusive synaptic bookkeeping algorithm at all; it’s the ancient, lateral tug-of-war our nervous systems have been running for half a billion years. Under Lateralized Valence Learning, success is amplified in the approach hemisphere while failure is logged in the withdrawal hemisphere, and behavior emerges from their ongoing negotiation. Instead of overwriting yesterday’s models, the brain layers new, context-stamped perspectives beside them, preserving both flexibility and hard-won stability. Seen this way, every mistake we make is not deleted but archived as a cautionary echo, ready to temper future optimism. Testing this idea, by tracking how positive and negative feedback shift hemispheric plasticity, could finally connect the dots between molecular neuroscience and intelligent machines, and show that the brain’s most elegant learning trick has been hiding in plain left-and-right sight all along.