Power Capping
1 source - 4 claims
In a batch-size-1 decode measurement, varying the configured power cap by 2.5x barely changed actual power draw or clocks. Decode power draw on the tested H200 stayed far below its 700 W TDP across attention paradigms. A power cap therefore has little effect on memory-bound LLM decode when actual draw remains below the configured ceiling, because the cap never binds. Facility-level power capping should not be assumed to reduce decode energy in production LLM serving.
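The reasoning can be sketched with an idealized model in which a cap only clips draw that would otherwise exceed it. This is a minimal illustration, not the measurement method behind the claims: the function names are hypothetical, the 250 W decode draw is a placeholder rather than a measured value, and the cap values are chosen only so that they span a 2.5x range ending at the 700 W TDP.

```python
def capped_power(uncapped_draw_w: float, cap_w: float) -> float:
    """Idealized model: a power cap clips draw only when draw would exceed it."""
    return min(uncapped_draw_w, cap_w)


def cap_binds(uncapped_draw_w: float, cap_w: float) -> bool:
    """A cap is binding only if the workload would otherwise draw more than the cap."""
    return uncapped_draw_w > cap_w


TDP_W = 700.0                       # H200 TDP, as stated in the text
HYPOTHETICAL_DECODE_DRAW_W = 250.0  # placeholder, NOT a measured value
CAPS_W = [280.0, 400.0, TDP_W]      # illustrative caps spanning 2.5x

if __name__ == "__main__":
    for cap in CAPS_W:
        power = capped_power(HYPOTHETICAL_DECODE_DRAW_W, cap)
        print(f"cap={cap:.0f} W  binds={cap_binds(HYPOTHETICAL_DECODE_DRAW_W, cap)}  "
              f"modeled draw={power:.0f} W")
```

Under these assumptions every cap in the range sits above the draw, so the modeled power is identical at each setting. A cap would start to matter only for a workload whose uncapped draw exceeds it.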