Since the first paper examining the impact of this technology on the environment was published three years ago, a movement has grown among researchers to self-report the energy use and emissions produced by their work. Having accurate numbers is an important step in making changes, but actually collecting those numbers can be challenging.
“You can’t improve what you can’t measure,” says Jesse Dodge, a researcher at the Allen Institute for AI in Seattle. “The first step for us if we want to make progress in reducing emissions is that we get a good measurement.”
To that end, the Allen Institute recently partnered with Microsoft, the AI firm Hugging Face, and three universities to develop a tool that measures the power consumption of machine learning programs running on Azure, Microsoft’s cloud service. With it, Azure users building new models can see the total electricity consumed by graphics processing units (GPUs) — computer chips specialized for running computations in parallel — at every stage of their project, from selecting a model, to training it, to putting it into use. Azure is the first major cloud service to offer users access to information about the energy impact of their machine learning programs.
While tools already exist that measure the energy consumption and emissions of machine learning algorithms running on local servers, these tools don’t work when researchers use cloud services from companies like Microsoft, Amazon, and Google. These services don’t give users direct insight into the GPU, CPU, and memory resources their activities consume — and existing tools like Carbontracker, experiment-impact-tracker, EnergyVis, and CodeCarbon need those values to produce accurate estimates.
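At heart, such tools estimate energy use by sampling a machine’s power draw over time and integrating it. Here is a minimal sketch of that idea, assuming periodic power readings are available (the function name and the sample values are hypothetical — real tools read hardware counters the cloud does not expose):

```python
def energy_kwh(power_samples_w, interval_s):
    """Integrate periodic power readings (watts) over time into kilowatt-hours.

    power_samples_w: list of power-draw samples, one per interval
    interval_s: seconds between samples
    """
    joules = sum(power_samples_w) * interval_s  # watts x seconds = joules
    return joules / 3.6e6  # 1 kWh = 3.6 million joules

# A GPU drawing a steady 300 W, sampled once per second for an hour:
samples = [300.0] * 3600
print(energy_kwh(samples, 1.0))  # 0.3 (kWh)
```

Without access to those per-second power readings — which is the situation on shared cloud hardware — the integration above simply cannot be done from the user’s side.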
The new Azure tool, which debuted in October, currently reports energy consumption, not emissions. So Dodge and other researchers worked out how to map energy use to emissions, and presented that work in an accompanying paper at FAccT, a major computer science conference, in late June. Using a service called WattTime, the researchers estimated emissions based on the zip codes of the cloud servers running 11 machine learning models.
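The mapping itself is a multiplication: energy consumed times the carbon intensity of the local grid at that time. A minimal sketch, with made-up regional intensity figures standing in for the real-time values a service like WattTime provides:

```python
def emissions_kg(energy_kwh, intensity_g_per_kwh):
    """Convert energy use (kWh) into CO2 emissions (kg) using the
    carbon intensity of the grid where the server is located."""
    return energy_kwh * intensity_g_per_kwh / 1000.0  # grams -> kilograms

# Hypothetical grid intensities in gCO2/kWh; real values vary by region
# and by hour, which is what makes location and timing matter.
intensity = {"region-a": 250.0, "region-b": 650.0}
for region, g_per_kwh in intensity.items():
    print(region, emissions_kg(100.0, g_per_kwh))  # same job, different footprint
```

The same 100 kWh training job emits 25 kg of CO2 in the cleaner hypothetical region and 65 kg in the dirtier one, which is why the server’s zip code is the key input.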
They found that emissions can be significantly reduced when researchers use servers in particular geographic locations and at particular times of day. Emissions from training small machine learning models can be cut by up to 80% if training starts when more renewable electricity is available on the grid, while emissions from large models can be cut by over 20% if the training job is paused when renewable electricity is scarce and resumed when it is plentiful.
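The pause-and-resume idea can be illustrated as a simple scheduling policy: run the job only during hours when the grid’s forecast carbon intensity is below a threshold. This is an illustrative sketch, not the paper’s implementation — the forecast numbers and threshold are invented:

```python
def training_schedule(hourly_intensity, hours_needed, threshold):
    """Pick the earliest hours to run a job, skipping (pausing through)
    any hour whose forecast carbon intensity exceeds the threshold.

    hourly_intensity: forecast grid intensity (gCO2/kWh) per hour
    hours_needed: how many hours of compute the job requires
    threshold: run only when intensity is at or below this value
    """
    run_hours = []
    for hour, intensity in enumerate(hourly_intensity):
        if intensity <= threshold:
            run_hours.append(hour)
        if len(run_hours) == hours_needed:
            break
    return run_hours

# Hypothetical 8-hour forecast: dirty overnight, cleaner midday, clean again later.
forecast = [500, 480, 300, 250, 260, 400, 550, 240]
print(training_schedule(forecast, 3, 300))  # [2, 3, 4]
```

A job needing three hours would wait out the first two high-intensity hours and run during hours 2–4; the trade-off, as the researchers note, is that the work finishes later in exchange for lower emissions.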