The Compounding Frontier of Artificial Intelligence and Why Interconnected Progress Shortens AGI Timelines
It’s truly incredible how many different facets of artificial intelligence are making progress simultaneously. Here I want to explain a few of those facets to give you an idea of what to expect in the future. I expect that your timelines for AGI and superintelligence will shrink after reading this. I don’t think many people understand that these capabilities are working together, multiplicatively and synergistically, creating a runaway feedback system.
Imagine that your brain was not only growing in size at an exponential rate, but that dozens of your mental attributes were also expanding and becoming more refined on a daily basis. I believe that the synergistic compounding of AI’s improvement vectors is one of the most underappreciated aspects of its modern evolution. These multipliers aren’t just parallel upgrades; they feed back into one another, each accelerating the rest.
Everyone knows that hallucinations are decreasing, benchmarks are being saturated, cost is coming down, and speed is going up. However, many people forecast that this will end soon and that artificial intelligence is an economic bubble whose progress is already slowing. I know where they’re coming from, because I also believe that language models are only an early incarnation of true artificial intelligence. But I also believe that the synergy between many different vectors of improvement is now pushing us toward a much more advanced stage. It truly looks like, well before artificial consciousness is reached, large language models and all of their new bells and whistles will deliver an intelligence explosion and the technological singularity.
Synergy with Robotics. As the simplest example of merging complementary technologies, consider the simultaneous progress in language models and in robotics. Today, robots routinely run language models, which makes them much more useful. These two previously separate technologies have recently combined to create something much more than the sum of their parts. In fact, major robot manufacturers now use language models as the intelligent engine guiding not only how the robot speaks to humans, but how it interacts with its world.
Now let’s focus specifically on language models and their various components so we can get a wide perspective on how several different areas of research are all converging to accelerate progress towards artificial general intelligence. Keep in mind that there are thousands of researchers around the globe working on making each of these components more effective.
Pre-Training Scaling. The developers of frontier LLMs are using more energy, more computing resources, and larger models (more parameters) with each new training run. They are also providing those models with more text to read (data). Year after year, these companies have been able to scale all of these inputs, and they will keep doing so, producing more performant models with greater emergent capabilities.
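To make the scaling idea concrete, here is a minimal sketch of a Chinchilla-style scaling law, in which loss falls as a power law in parameter count and training tokens. The constants and the 20-tokens-per-parameter heuristic below are illustrative placeholders, not published fits.

```python
# Illustrative sketch of a Chinchilla-style scaling law: loss falls as a
# power law in parameters (N) and training tokens (D). The constants below
# are placeholders for this sketch, not the actual fitted values.

def estimated_loss(n_params: float, n_tokens: float,
                   E: float = 1.7, A: float = 400.0, B: float = 410.0,
                   alpha: float = 0.34, beta: float = 0.28) -> float:
    return E + A / n_params**alpha + B / n_tokens**beta

for scale in (1e9, 1e10, 1e11):          # 1B, 10B, 100B parameters
    tokens = 20 * scale                   # rough "tokens ~ 20x params" heuristic
    print(f"{scale:.0e} params -> est. loss {estimated_loss(scale, tokens):.3f}")
```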
Test Time Compute. In the last year we’ve seen language models become much more capable as they have been trained to reason step by step before answering. This reasoning happens after training, during inference (user interaction), which is why it’s called "test time" compute. Today’s models are quickly learning how to reason more effectively, and it’s clear that spending more compute on reasoning, especially on difficult problems, will continue to yield better results.
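As a minimal sketch of one way test-time compute gets spent, here is a self-consistency loop: sample several independent reasoning chains and take the majority answer. The sample_chain_of_thought function is a hypothetical stand-in for a real model call.

```python
import collections
import random

# Minimal sketch of one test-time-compute strategy (self-consistency):
# sample several independent reasoning chains and keep the majority answer.

def sample_chain_of_thought(question: str) -> str:
    # Placeholder: a real implementation would call an LLM with a
    # "think step by step" prompt at nonzero temperature.
    return random.choice(["42", "42", "41"])

def answer_with_more_compute(question: str, n_samples: int = 16) -> str:
    votes = collections.Counter(sample_chain_of_thought(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

print(answer_with_more_compute("What is 6 * 7?"))
```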
Reinforcement Scaling. Language model developers are finding that they can use positive feedback in many different ways to improve model performance on a variety of tasks. Now they are even using reinforcement learning to improve reasoning itself. Moreover, AIs are moving from human feedback to model-generated feedback loops.
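Here is a minimal sketch of a model-generated feedback loop in the rejection-sampling style: generate several candidate answers, score them with a reward model, and keep the best pair as new training data. Both generate_candidate and reward_model_score are hypothetical stand-ins for real model calls.

```python
import random

# Minimal sketch of model-generated feedback via rejection sampling:
# generate candidates, score them with a reward model, keep the best.

def generate_candidate(prompt: str) -> str:
    return f"candidate answer {random.randint(1, 100)} to: {prompt}"

def reward_model_score(prompt: str, answer: str) -> float:
    return random.random()   # a real reward model would judge quality here

def collect_training_pair(prompt: str, n: int = 8) -> tuple[str, str]:
    candidates = [generate_candidate(prompt) for _ in range(n)]
    best = max(candidates, key=lambda a: reward_model_score(prompt, a))
    return prompt, best      # feed (prompt, best) into the next training run

print(collect_training_pair("Explain test-time compute in one sentence."))
```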
Tool Calling. Language models generally just produce text, but these text outputs can be used to orchestrate actions within software, on the internet, and by robots. This allows them to outsource uncertainty to deterministic, verifiable systems. For example, a language model will often resort to using a calculator to make sure it doesn’t make any mistakes. Today they can invoke Python, web search, or domain-specific APIs. Language models are steadily becoming integrated with a wider variety of tools, and they’re getting better at selecting tools as well as incorporating the output from those tools.
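To show the basic shape of tool calling, here is a minimal sketch of a dispatcher: the model emits a structured request, and the surrounding code executes the named tool and returns the deterministic result to the model. The JSON format here is an illustrative convention for the sketch, not any particular vendor’s API.

```python
import json

# Minimal sketch of a tool-calling loop: the model's text output names a tool
# and its arguments; the host program runs the tool and returns the result.

TOOLS = {
    # Restricted eval is used only to keep this sketch self-contained.
    "calculator": lambda args: str(eval(args["expression"], {"__builtins__": {}})),
    "web_search": lambda args: f"(search results for '{args['query']}' would go here)",
}

def handle_model_output(model_output: str) -> str:
    call = json.loads(model_output)
    tool = TOOLS[call["tool"]]
    return tool(call["arguments"])   # deterministic result goes back into the context

# The model decides it needs arithmetic and emits a structured call:
print(handle_model_output('{"tool": "calculator", "arguments": {"expression": "19 * 23"}}'))
```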
Coding. As language models' mastery of English (and every other language) has improved, their coding abilities have improved as well. In 2021 OpenAI found that simply continuing to train the GPT-3 model on examples of written code not only gave it the ability to become an elementary programmer, but also improved its abilities with language. Today, when you ask a language model a difficult question, it will often write computer code to help it work through the problem. It’s abundantly clear that AI’s faculties with language and code complement each other, and we can expect that complementarity to deepen as both continue to grow. Beyond code, AI is learning to work with the command line and is being incorporated into browsers, shells, operating systems, and productivity software such as Word, PowerPoint, and Excel.
Context Size. Context size is extremely important for language models because it dictates how many specifics related to the problem at hand the model can keep in mind. In 2023 a common context window for a large language model was around 8,000 tokens. Today you can use 1 million tokens with Google Gemini. That allows you to reason over an entire book, and this size is only going to go up from here.
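Some back-of-the-envelope arithmetic shows what that growth means in practice, assuming a rough average of three-quarters of a word per token and about 300 words per printed page (both figures are approximations).

```python
# Rough arithmetic for how much text a context window holds.
# ~0.75 words per token and ~300 words per page are approximations.

WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 300

for label, tokens in [("2023-era window", 8_000), ("1M-token window", 1_000_000)]:
    words = int(tokens * WORDS_PER_TOKEN)
    pages = words // WORDS_PER_PAGE
    print(f"{label}: ~{words:,} words, roughly {pages:,} pages")
```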
Retrieval Augmented Generation. RAG is another important addition to the LLM arsenal, allowing a model to reference documents or data that it wasn’t trained on and that exist outside of its context window. Researchers and engineers are pushing improvements across every stage of the RAG pipeline: retrieval, ranking, integration, generation, caching, and dynamic strategies.
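Here is a minimal sketch of the core retrieve-then-generate loop, with simple word-overlap scoring standing in for a real embedding-based retriever and llm as a hypothetical model call.

```python
# Minimal RAG sketch: retrieve the most relevant documents, then generate an
# answer conditioned on them. Word overlap stands in for real embeddings.

DOCUMENTS = [
    "The 2024 expense policy caps travel meals at $60 per day.",
    "Quarterly reports are due on the first Friday after quarter close.",
    "New hires receive laptops within five business days of their start date.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    q = set(query.lower().split())
    scored = sorted(DOCUMENTS, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def llm(prompt: str) -> str:
    # Placeholder for a real model call.
    return f"(model answer grounded in the retrieved context)\n{prompt[:120]}..."

def rag_answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    return llm(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")

print(rag_answer("What is the daily cap on travel meals?"))
```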
Memory Ability. New models are gaining persistent, stateful memory, allowing them to recall facts and preferences across sessions; this is a step towards agentic continuity. There are many new methods for compressing context into smaller, more manageable forms, and memory retrieval is being optimized in countless ways.
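As a minimal sketch of persistent, cross-session memory, the snippet below writes stated facts to disk and reloads them into the next session’s prompt. The file name and schema are illustrative choices, not a standard.

```python
import json
from pathlib import Path

# Minimal sketch of cross-session memory: facts are stored on disk and
# injected back into the prompt when a new session starts.

MEMORY_FILE = Path("agent_memory.json")   # illustrative file name

def remember(key: str, value: str) -> None:
    memory = json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else {}
    memory[key] = value
    MEMORY_FILE.write_text(json.dumps(memory, indent=2))

def recall_all() -> str:
    if not MEMORY_FILE.exists():
        return ""
    memory = json.loads(MEMORY_FILE.read_text())
    return "\n".join(f"- {k}: {v}" for k, v in memory.items())

remember("preferred_language", "Python")
print("Injected into the next session's system prompt:\n" + recall_all())
```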
Computer Use. Every couple of months we are seeing major advances in computer-use software. This goes beyond giving AI access to internal tools and actually allows it to control a mouse and keyboard, which gives it the ability to perform actions with software and on the internet. It is still somewhat clumsy today, but it’s getting better every week. If you Google the term "AI computer use," you can see it in action, and it will be clear to you that this is yet another rapidly improving faculty that gives AI more autonomy and control.
Long Time Horizon Work. Every major AI company is now working to improve the ability of large language models to stay on track and accomplish tasks and projects that take hours to complete. The first chatbots, such as GPT-3, would simply spit out a couple of paragraphs, but researchers are teaching newer models to keep producing meaningful text that climbs toward the achievement of a goal. Chatbots from a couple of years ago could only work for a few seconds at a time; now they’re being designed for autonomous goal pursuit, doing meaningful multi-step work for tens of hours without stopping.
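The basic shape of long-horizon work is a loop that keeps proposing, executing, and checking steps until the goal is met or a budget runs out. Here is a minimal sketch; propose_next_step, execute, and goal_satisfied are hypothetical stand-ins for model calls and tool executions.

```python
# Minimal sketch of a long-horizon agent loop: propose a step, execute it,
# check progress, and repeat until the goal is met or the step budget is spent.

def propose_next_step(goal: str, history: list[str]) -> str:
    return f"step {len(history) + 1} toward: {goal}"

def execute(step: str) -> str:
    return f"result of {step}"

def goal_satisfied(goal: str, history: list[str]) -> bool:
    return len(history) >= 3     # a real agent would ask the model to verify

def run_agent(goal: str, max_steps: int = 50) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):
        if goal_satisfied(goal, history):
            break
        step = propose_next_step(goal, history)
        history.append(execute(step))
    return history

print(run_agent("draft, test, and document a small library"))
```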
Specific Task Training. AI researchers are training language models on very specific tasks, including economically meaningful tasks. This is greatly improving their utility in the workplace and increasing their value to companies and organizations.
Agentic Collaboration. Within individual models, mixture-of-experts architectures route tokens to specialized expert sub-networks, increasing efficiency. Beyond this, separate language models are being trained to actively collaborate with each other, critique each other, and supervise each other, leading to improvements in performance across the board.
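A minimal sketch of two models collaborating looks like a draft-critique-revise loop; both drafter and critic below are hypothetical stand-ins for separate model calls.

```python
# Minimal sketch of model-to-model collaboration: one model drafts, another
# critiques, and the draft is revised until the critic approves.

def drafter(task: str, feedback: str = "") -> str:
    return f"draft for '{task}'" + (f" (revised per: {feedback})" if feedback else "")

def critic(draft: str) -> str:
    return "APPROVE" if "revised" in draft else "Please address edge cases."

def collaborate(task: str, max_rounds: int = 3) -> str:
    draft = drafter(task)
    for _ in range(max_rounds):
        feedback = critic(draft)
        if feedback == "APPROVE":
            break
        draft = drafter(task, feedback)
    return draft

print(collaborate("write a unit-tested parser"))
```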
Multimodal Integration. Models are learning cross-modal embeddings that let them reason about vision, sound, motion, and language jointly. This allows better 3-D representations, the simulation of physics, and advanced reality modeling. I believe that eventually, combining language with image generation will provide AI with a mind’s eye it can use to imagine and create. The more modalities that are richly integrated together, the better.
Synthetic Data. AIs are now generating their own synthetic data, and if it is of high enough quality, that data can be used to train the next generation of models. As you can imagine, when the best and newest model is used to generate synthetic data, that data will be first-class and will go a long way when used to train the next model.
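Here is a minimal sketch of a synthetic-data pipeline: the current best model generates candidate examples, a verifier filters them, and the survivors become training data for the next model. All three functions are hypothetical stand-ins.

```python
import random

# Minimal sketch of a synthetic-data pipeline: generate candidates with the
# current best model, filter by quality, and keep survivors for training.

def generate_example(topic: str) -> dict:
    return {"prompt": f"Question about {topic}", "answer": f"Answer #{random.randint(1, 999)}"}

def passes_quality_check(example: dict) -> bool:
    return random.random() > 0.5      # a real filter would verify correctness

def build_synthetic_dataset(topics: list[str], per_topic: int = 100) -> list[dict]:
    dataset = []
    for topic in topics:
        candidates = (generate_example(topic) for _ in range(per_topic))
        dataset.extend(ex for ex in candidates if passes_quality_check(ex))
    return dataset

print(len(build_synthetic_dataset(["algebra", "chemistry"])), "examples kept")
```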
Deep Research. Language models are consistently getting better at performing online searches to gather detailed information about a subject and then refining the results into professional, consultant-quality reports. Again, this is not a feature but a multiplier, because the ability to do research has far-reaching consequences for an AI’s ability to do intellectually challenging work.
Hardware Improvements. Aside from Moore’s law and Huang’s law, there are thousands of different S-curves working together to make computers faster, more powerful, more cost-efficient, and more energy-efficient. There are constant breakthroughs in the development of specialized AI chips for hardware acceleration and increased throughput.
Software Optimization. AI software engineers report that there’s still a lot of low-hanging fruit when it comes to making language models run faster on existing computing equipment.
Research Agents. Software developers are turning large language models into research agents that can accomplish real scientific work. They come up with hypotheses, find ways to research and test them, and generate results and conclusions that inform real-world science.
Algorithmic Innovations. Since the transformer architecture was introduced in 2017, it has changed significantly but has not been replaced. Meanwhile, thousands of excellent papers and experiments propose new algorithms to enhance the fundamental AI pipeline.
Recursive Self Improvement. There have already been several examples of AI being used to improve itself. Chipmakers use AI to design chips. Large language model developers now use AI to write the majority of their code. Artificial intelligence has been finding key optimizations to improve processes at various levels. It’s only a matter of time before artificial intelligence is redesigning its entire stack. In the meantime, humans publish thousands of articles every year on how to improve AI, both utilitarian and speculative. In the next training run, AI will be trained on all of this text, giving it cutting-edge knowledge of how to improve itself.
There’s a lot of talk currently about when we can expect artificial general intelligence to arrive. The answer to this question has many ramifications. It will determine when people start to invest their money in this future. It will determine when the world decides to take AI safety seriously. My answer is that the compounding effects of all of these simultaneously progressing faculties are ushering in advanced intelligence. Even though many people are aware that current AI can still make brain-dead mistakes, very few are aware of these interlocked faculties, how they interact, and how they are producing lesser-known but significant new capabilities.
There are many other interacting faculties acting as multipliers, including planning depth, interpretability, causal reasoning, scalability, personalization, alignment and safety, compression methods, parameter optimization, dynamic computation, long-term consistency, coherence across turns, commonsense reasoning, user modeling, meta-learning, sample efficiency, and many others.
There are limits to what you can do with today’s AI that are not going to be overcome without a paradigm shift. Bells and whistles will not get us to machine consciousness. It will be several years still before artificial intelligence research understands and replicates all of the processing advantages that the human brain has. However, before that time, the variables discussed here will have amplified each other past the point of automating scientific discovery, reshaping the entire economy, and making humans obsolete in almost every way. Then AI will create the paradigm shift itself. In other words, we may be missing the secret sauce, but what we have built already will find that sauce's recipe.