Monday, December 23, 2024

Google: New AI Training Method 13x Faster, 10x More Efficient

Researchers at Google DeepMind have developed a faster, more efficient method for training artificial intelligence (AI) models, claiming that the new technique needs up to 13 times fewer training iterations and 10 times less computation than existing approaches.

Google's AI research lab calls the method JEST, short for joint example selection. It is designed to speed up training while cutting the computational resources and time required.

The new approach arrives at a time when concerns over the environmental effects of AI data centers and their power consumption continue to grow.

AI versus Mother Nature

The AI industry relies on enormous processing power, which in turn requires a lot of energy. In 2023, AI operations drew a record of roughly 4.3 GW in electricity, almost equaling Cyprus’s power consumption in 2011.

Today, a single ChatGPT request uses about 10 times more power than a simple Google search. Experts also estimate that AI will account for as much as 25% of the load on the United States power grid by 2030, up from only 4% today.

Heavy energy use also demands an equally large volume of water to dissipate the heat these systems produce. Microsoft's water consumption, for example, rose 34% from 2021 to 2022 as its AI workloads increased, adding to the strain on water supplies. ChatGPT has likewise been accused of using half a liter of water for every 5 to 50 prompts.

But Not All Hope Is Lost

Nevertheless, DeepMind’s JEST method allows Google to significantly reduce the number of iterations and the computational power required to train AI models, which may lower overall energy consumption.

According to the technical paper published by the researchers, the new method differs from existing training techniques by selecting complementary batches of data, rather than individual data points, for the model to learn from.

“We demonstrate that jointly selecting batches of data is more effective for learning than selecting examples independently,” the paper stated.

JEST works by first building a smaller AI model on curated, high-quality datasets and using it to grade data quality. That small model then ranks batches drawn from larger, lower-quality datasets by quality. In this way, the small model identifies the most suitable batches for training, and the large model is then trained on the batches it selects.
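The ranking step described above can be illustrated with a minimal Python sketch. The article does not spell out the scoring rule, so the sketch assumes one: a batch is scored by how much loss the model being trained still has on it, minus the loss of the small reference model. The function names (`learnability`, `rank_batches`) and the toy loss values are hypothetical and chosen only for illustration; real batch losses would come from the models themselves and need not be simple sums over examples.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a function returning one loss value for a whole batch.
BatchLoss = Callable[[Sequence[int]], float]


def learnability(batch: Sequence[int],
                 learner_loss: BatchLoss,
                 reference_loss: BatchLoss) -> float:
    """Assumed score: high when the learner still struggles on the batch
    but the small reference model handles it well."""
    return learner_loss(batch) - reference_loss(batch)


def rank_batches(candidates: Sequence[Sequence[int]],
                 learner_loss: BatchLoss,
                 reference_loss: BatchLoss) -> List[Sequence[int]]:
    """Return candidate batches sorted from highest to lowest score."""
    return sorted(
        candidates,
        key=lambda b: learnability(b, learner_loss, reference_loss),
        reverse=True,
    )


# Toy per-example losses (made-up numbers, not from the paper).
# Example 3 is hard for BOTH models, which suggests noisy data.
LEARNER = {0: 2.0, 1: 0.3, 2: 1.8, 3: 2.5}
REFERENCE = {0: 0.4, 1: 0.2, 2: 0.5, 3: 2.4}


def toy_learner_loss(batch: Sequence[int]) -> float:
    # Stand-in for the large model's loss on a batch.
    return sum(LEARNER[i] for i in batch)


def toy_reference_loss(batch: Sequence[int]) -> float:
    # Stand-in for the small reference model's loss on a batch.
    return sum(REFERENCE[i] for i in batch)


candidates = [(0, 1), (0, 2), (1, 3), (2, 3)]
ranked = rank_batches(candidates, toy_learner_loss, toy_reference_loss)
best = ranked[0]
print("Batch chosen for training:", best)
```

With these toy numbers the batch `(0, 2)` ranks first, because the learner has high loss on both examples while the reference model finds them easy. The batch `(1, 3)` ranks last: example 1 is already learned, and example 3 is hard even for the reference model.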

For this approach to succeed, the curated starting datasets must be of the highest quality. This requirement makes the method difficult to replicate, since professional research expertise is needed to create the starting training data on which the whole technique rests.

According to Google DeepMind’s study, JEST “surpasses state-of-the-art models with up to 13× fewer iterations and 10× less computation.”

“A reference model trained on a small curated dataset can effectively guide the curation of a much larger dataset, allowing the training of a model which strongly surpasses the quality of the reference model on many downstream tasks,” the paper added.

The paper backs its claims with experiments, which showed marked improvements in computational efficiency and learning speed when JEST was used to train an AI model on the WebLI (web language-image) dataset.
