Will AI model sizes increase or decrease over time?

A thriving ecosystem of small AI models is emerging alongside giants like ChatGPT. Stanford's $600 Alpaca model can rival GPT-3.5, which cost several million dollars to train. The open question is whether large models will hit performance, data, or cost limits before small models hit the ceiling of what optimisation can achieve.


Foundation AI models from Microsoft, Google, Meta, and others have captivated businesses and casual observers alike, because their creative and communicative abilities are versatile enough to satisfy both general curiosity and business requirements.

This versatility has not come cheaply, however. The cost of acquiring high-quality data, hardware, electricity, and talent has posed formidable obstacles to training and deploying large AI models.

Despite this cost barrier, innovation in the AI space has continued unabated, because a confluence of factors has allowed an ecosystem of small AI models to flourish. This article examines a few of these factors and, on that basis, makes predictions about the future of this AI subsector.

Why AI’s Catching On

The first thing to realise is that AI adoption will only accelerate from here, thanks to the many observable advantages it provides, especially in knowledge management and productivity.

By incorporating language models into their workflows, businesses can collect and analyse data from a wide variety of sources: online conversations, meetings, operational transactions, policy documents, and more.

This is analogous to a mining company realising the value locked in the slag (a byproduct of ore processing) it generates and working out how to extract it. Any company worth its salt would be eager to seize such an opportunity; its only concerns would be whether it could pull it off and how much it would cost.
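
As a rough illustration of that kind of workflow, the sketch below uses an off-the-shelf summarisation model to condense internal text. The model choice and the sample document are assumptions for demonstration only, not any particular vendor's setup.

```python
# A minimal sketch of the knowledge-management workflow described above.
# The model name and the sample document are illustrative assumptions.
from transformers import pipeline

# Load a general-purpose summarisation model (downloaded on first use).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# Stand-in for text pulled from meetings, chats, policy documents, etc.
internal_document = (
    "The weekly operations meeting covered delayed shipments in the northern "
    "region, a proposal to consolidate two warehouses, and a reminder that the "
    "updated expense policy takes effect next month."
)

summary = summarizer(internal_document, max_length=60, min_length=15, do_sample=False)
print(summary[0]["summary_text"])
```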

And in the case of AI, a number of factors are combining to lower the entry price.

Fine-Tuning Off-The-Shelf Models

The first development driving prices down is the fine-tuning of off-the-shelf AI models. It is not the cheapest option, but it is far cheaper than building an in-house language model from scratch. In this approach, a company accesses a pre-existing base model, such as GPT-4, via an application programming interface (API), then adapts it with its own data to better suit its purposes.

Costs vary widely: Stanford's Alpaca model was trained for only $600 by fine-tuning Meta's LLaMA model, whereas Morgan Stanley's solution cost millions of dollars because it used one million of the firm's research reports to fine-tune GPT-4.

The cost-effectiveness of Stanford's model is clear when its low price is set against its competence: on many benchmarks it can compete with GPT-3.5, which cost several million dollars to train.
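
For a sense of what Alpaca-style fine-tuning looks like in practice, the sketch below adapts an open base model to a handful of instruction-response pairs using LoRA adapters (a parameter-efficient variant, not Stanford's exact recipe). The model name, the two examples, and the hyperparameters are placeholders; a real run would use tens of thousands of examples and a GPU.

```python
# An illustrative sketch of Alpaca-style fine-tuning of an open base model.
# Model name, examples, and hyperparameters are placeholders; real runs need
# a GPU and a much larger instruction dataset (Alpaca used ~52k examples).
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "meta-llama/Llama-2-7b-hf"  # placeholder; any causal LM works for the sketch
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Freeze the base model and train only small LoRA adapter matrices.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# Two instruction/response pairs standing in for a full instruction dataset.
examples = [
    {"text": "### Instruction: Summarise our refund policy.\n### Response: ..."},
    {"text": "### Instruction: Draft a polite reminder email.\n### Response: ..."},
]
train_ds = Dataset.from_list(examples).map(
    lambda e: tokenizer(e["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="alpaca-style-lora",
                           num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```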

Edge AI

Edge AI is also advancing the field and fostering the development of small models. Until now, you would have needed the resources of a centralised cloud computing facility to train and run your own model.

Firms like Qualcomm (whose chips power most of the world's Android phones) are now researching and shipping greater on-device AI processing capacity, which means that models with up to ten billion parameters may soon run on the phone itself. Running models on the device rather than in the cloud is what "edge AI" means.

With competent small AI models available and unprecedented processing power being democratised, the conditions are ripe for widespread adoption of AI. In the near future, ordinary users may be able to run models as capable as GPT-3.5 locally on their phones, even without an internet connection. The implications of that are enormous.
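
To make the idea concrete, the sketch below runs a small quantised model entirely on local hardware using the open-source llama.cpp Python bindings. The model file path is a placeholder for any small open checkpoint downloaded in advance, and the settings are assumptions rather than recommendations.

```python
# A minimal sketch of running a small, quantised language model entirely
# on-device, with no cloud connection. The model path is a placeholder for
# any locally downloaded GGUF checkpoint of a small open model.
from llama_cpp import Llama  # pip install llama-cpp-python

llm = Llama(
    model_path="models/small-chat-model.Q4_K_M.gguf",  # placeholder local file
    n_ctx=2048,    # context window
    n_threads=4,   # runs on ordinary CPU cores
)

output = llm(
    "Q: In one sentence, what is edge AI?\nA:",
    max_tokens=64,
    stop=["Q:"],
)
print(output["choices"][0]["text"].strip())
```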

Training Data Shrinks

Shrinking data requirements for training AI models have also aided innovation and adoption. This is the result of transfer learning, in which a model trained on one dataset is reused and adapted to a different one.

For instance, a clinic could build a kidney-disease classifier by taking a publicly available model pre-trained on millions of images (such as those in the ImageNet database) and further training it on just a handful of labelled kidney images. A number of medical papers, including some from India, describe exactly this approach.
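
A sketch of that transfer-learning pattern is shown below: an ImageNet-pre-trained backbone is frozen and only a small new classification head is trained on a tiny labelled image folder. The dataset path, class count, and training settings are placeholders, not a clinically validated recipe.

```python
# A minimal sketch of transfer learning: reuse an ImageNet-pre-trained
# backbone and train only a new classification head on a small image set.
# The folder path and the two-class setup are illustrative placeholders.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Standard ImageNet preprocessing so the pre-trained weights see familiar inputs.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# e.g. kidney_images/healthy/*.jpg and kidney_images/disease/*.jpg (placeholder)
train_data = datasets.ImageFolder("kidney_images", transform=preprocess)
loader = torch.utils.data.DataLoader(train_data, batch_size=8, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for param in model.parameters():               # freeze the pre-trained backbone
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)  # new 2-class head, trained from scratch

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):                         # a few epochs often suffice here
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```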


OpenAI's Codex is another tool that exemplifies the effectiveness of this method. Built by fine-tuning GPT-3 on code, it powers GitHub's Copilot, which generates code in response to user prompts and had already amassed over a million users by January 2023.

Thanks to transfer learning, Codex quickly surpassed GPT-3's coding ability despite being trained on a dataset hundreds of times smaller: Codex needed only 159 GB of code, whereas GPT-3's training corpus ran to 45 TB, roughly 280 times as much. The need for data is therefore greatly reduced wherever large pre-trained models can be reused.

The Free Large Models

The fourth factor contributing to the industry's expansion is the availability of numerous high-quality AI models at no cost. Developers can now build on powerful free models such as Microsoft's Orca, trained to reason like GPT-4; Meta's LLaMA models, which come in varying sizes and have outperformed much larger offerings from OpenAI and Google on some benchmarks; and Hugging Face's BLOOM model, now available via Amazon Web Services.

By offering a strong foundation at no cost, these models have enabled massive savings and helped spread a wide variety of AI technologies and software. All these changes mean that cutting-edge AI is no longer the exclusive domain of the world's largest technology companies.
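
As a brief illustration of how little is needed to build on one of these free models, the sketch below loads a small publicly hosted BLOOM variant and generates text from a prompt; the specific checkpoint and prompt are assumptions chosen only for demonstration.

```python
# A minimal sketch of building directly on a freely available model; a small
# BLOOM variant is used here purely as an example of the open models above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"  # small sibling of the full BLOOM model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Three ways a small business could use a language model are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```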


To Be Or Not To Be

There is clearly a fork in the road ahead for AI development. On one side, the factors described above are enabling increasingly capable yet smaller AI systems; on the other, large corporations continue to pursue ever bigger datasets and parameter counts to improve their models.

Which constraints bind first will determine the trajectory. The drive towards ever greater size may be tempered if larger models run into performance, data, or cost limits they cannot overcome.

Conversely, the field may head down a more capital-intensive route if ongoing optimisations cannot push the performance of small AI systems beyond a certain threshold. For now, we are in an unsettled equilibrium, and the future remains open.