Overall FLOP Utilization
Overall FLOP Utilization (OFU) is introduced as a hardware-counter-based metric for fleet-wide GPU efficiency monitoring that requires no application instrumentation or software-stack changes. It is computed as Tensor Pipe Activity multiplied by the ratio of the instantaneous SM clock to the architecture-specific maximum Tensor Core clock frequency. The article frames OFU as a complement to application-level MFU rather than a replacement: at the level of an individual job, OFU is a coarse signal and should not be interpreted as a precise substitute for application MFU.
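The OFU calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's implementation: the function name `compute_ofu`, its parameter names, the assumption that Tensor Pipe Activity arrives as a fraction in [0, 1], and the clock values in the usage example are all hypothetical. The real maximum Tensor Core clock depends on the GPU architecture and is not given in the source.

```python
def compute_ofu(
    tensor_pipe_activity: float,
    sm_clock_mhz: float,
    tensor_core_max_clock_mhz: float,
) -> float:
    """Return OFU = Tensor Pipe Activity * (SM clock / max Tensor Core clock).

    Assumptions (not stated in the source):
    - tensor_pipe_activity is a fraction in [0, 1] sampled from a hardware counter.
    - both clocks are expressed in the same unit (MHz here).
    """
    if not 0.0 <= tensor_pipe_activity <= 1.0:
        raise ValueError("tensor_pipe_activity must be a fraction in [0, 1]")
    if sm_clock_mhz < 0.0:
        raise ValueError("sm_clock_mhz must be non-negative")
    if tensor_core_max_clock_mhz <= 0.0:
        raise ValueError("tensor_core_max_clock_mhz must be positive")

    # Scale counter activity by how far the SM clock sits below the
    # architecture-specific maximum Tensor Core clock.
    clock_ratio = sm_clock_mhz / tensor_core_max_clock_mhz
    return tensor_pipe_activity * clock_ratio


if __name__ == "__main__":
    # Illustrative numbers only: 50% tensor pipe activity with the SM clock
    # running at 1200 MHz against a hypothetical 1600 MHz maximum.
    ofu = compute_ofu(0.5, 1200.0, 1600.0)
    print(f"OFU = {ofu:.3f}")  # prints "OFU = 0.375"
```

The clock ratio shows why identical Tensor Pipe Activity can yield different OFU values: a GPU that is throttled below its maximum Tensor Core clock delivers fewer FLOPs per unit of counter activity.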