Actual interesting question:
How much energy and resources would we save by simply slowing down AI response time? A lot of the time you get an instant response from an LLM, and sure, it looks impressive, but most of the time you don’t need it that urgently.
The majority of energy consumed is for training the AI models, not providing output from those models.
This means the resource consumption is not tied to usage and prompts. It also means the resource consumption to train a model is temporary: it ends once that model is trained.
Oh ok. So they’ll put the water back once the models are trained?
That’s irrelevant to what I was responding to - the question being asked had an incorrect context and I was correcting it.
That is how water use works, yes. The water goes back into the environment and is later reused.
Also, there’s a good chance the AIs are not being trained in the same facilities that they’re later being run in. Different sorts of work are being done.
Not necessarily. Some groundwater is ancient “fossil water” and won’t replenish, at least not for a very long time.
I disagree. I think the biggest consumers of AI currently use it for work, and depending on the type of work I think very fast AI == more customers.
Another interesting question:
How much energy and resources would we save by simply slowing down AI usage? A lot of the time people make unnecessary prompts or receive unhelpful generated text, and sure, it looks impressive, but most of the time you don’t need it at all.
At scale? None. If we assume that (a) the number of queries is constant (i.e. that the slow response doesn’t drive away users) and (b) that the efficiency is the same whether it’s fast or slow, then having computers that take longer to calculate each response just means you need to have more of them working in parallel to service the demand.
Now, for a home user running AI locally, you could maybe save some energy by using more efficient silicon since you only need it to process one query at a time (assuming lower-spec parts actually are more efficient, which may or may not be the case), but that’s not really what we’re talking about here.
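To put rough numbers on the "at scale" argument, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption made up for illustration: the function name `fleet_for_demand`, the load of 1000 queries per second, the 500 joules per query, and the simplification that each machine handles one query at a time. It only encodes assumptions (a) and (b) from the comment above, not how any real provider sizes its hardware.

```python
import math

def fleet_for_demand(queries_per_second, seconds_per_query, joules_per_query):
    """Return (machines needed, total power in watts) for a steady query load."""
    # Average number of queries in flight = arrival rate x time per query.
    in_flight = queries_per_second * seconds_per_query
    # Simplifying assumption: each machine works on one query at a time.
    machines = math.ceil(in_flight)
    # Assumption (b): energy per query does not depend on response speed,
    # so total power depends only on how many queries arrive per second.
    total_watts = queries_per_second * joules_per_query
    return machines, total_watts

# Hypothetical numbers: same demand, same energy per query, 10x slower responses.
fast = fleet_for_demand(1000, 2, 500)
slow = fleet_for_demand(1000, 20, 500)
print("fast:", fast)
print("slow:", slow)
```

Under those assumptions the slow setup needs ten times as many machines busy at once, while the total power draw is unchanged. Any real saving would have to come from assumption (b) being false, i.e. from slower hardware actually using less energy per query.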
Maybe you meant to reply to the original comment? I was mostly being ironic about it and suggesting a reduction in overall usage.