There Would Have Been No LLMs Without This

Game-changing 1990s: the bloom of NN, success of RNNLM and backpropagation, introduction of GPUs and advent of Transformers

Today, we immerse ourselves in the whirlpool of incredible proposals and developments that emerged in Artificial Intelligence (AI) and Machine Learning (ML) from 1990 to the mid-2000s. We explore how this earlier research, despite the lack of computational power, paved the way for modern Large Language Models (LLMs).

Luckily, since the decline of expert systems in the 1990s, there have been no more AI winters. The period truly became a renaissance of AI, and it was during this time that machine learning as we know it today was born. Though it’s impossible to cover everything that happened in ML and AI in one article, we will highlight the most significant events that shaped the field and made the seemingly impossible possible. You might argue with what we chose! Please feel free to add your views in the comment section.

Before embarking on our journey, let’s describe what LLMs are. In a nutshell, LLMs are deep neural networks that use specialized architectures and are trained on vast amounts of data. The first crucial keyword to consider is “neural networks*,” which serves as an ideal entry point for our discussion. Neural networks didn’t just “happen” in the 90s, so we will need to cover a bit of the period from the 1960s to the late 1980s. As our story unfolds, you will see the concepts grow gradually more sophisticated until we ultimately arrive at the era of LLMs.

*A neural network is a computational model inspired by the human brain, composed of interconnected artificial neurons that process and transmit information in order to learn and make predictions.

1960–1985: Progress and Setbacks of Neural Networks

The foundation for LLMs was laid long before the 1990s. In the 1940s, Warren McCulloch and Walter Pitts pioneered artificial neural networks inspired by the human mind and biological neural networks. In the 1950s, Marvin Minsky’s SNARC system became what is often described as the first artificial neural network machine.