Enabling Extreme Energy Efficiency Via Timing Speculation for Deep Neural Network Accelerators


Jeff (Jun) Zhang, Zahra Ghodsi, Kartheek Rangineni and Siddharth Garg

Due to the success of deep neural networks (DNNs) in achieving and surpassing state-of-the-art results across a range of machine learning applications, there is growing interest in the design of high-performance hardware accelerators for DNN execution. Further, as DNN hardware accelerators are increasingly deployed in datacenters, accelerator power and energy efficiency have become key design metrics. In this paper, we seek to enhance the energy efficiency of high-performance systolic-array-based DNN accelerators, such as the recently released Google TPU, using voltage-underscaling-based timing speculation, a powerful energy-reduction technique that allows digital logic to execute below its nominal supply voltage.
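To make the energy argument concrete, the sketch below is a back-of-the-envelope Python model (not the paper's method) of how dynamic switching energy falls roughly quadratically with supply voltage (E ≈ αCV²), and how a hypothetical per-error recovery cost erodes those savings as underscaling becomes more aggressive. The nominal voltage, timing-error rates, and recovery overhead are illustrative assumptions, not values from the paper.

```python
# Minimal sketch (illustrative assumptions only): estimate dynamic-energy savings
# from supply-voltage underscaling, with a hypothetical penalty for recovering
# from speculative timing errors.

def dynamic_energy_ratio(v_scaled: float, v_nominal: float) -> float:
    """Dynamic switching energy scales roughly with V^2 (E ~ alpha * C * V^2)."""
    return (v_scaled / v_nominal) ** 2


def effective_energy_ratio(v_scaled: float, v_nominal: float,
                           timing_error_rate: float,
                           recovery_overhead: float) -> float:
    """Fold in a hypothetical per-error recovery cost (in units of one operation's energy)."""
    return dynamic_energy_ratio(v_scaled, v_nominal) * (1.0 + timing_error_rate * recovery_overhead)


if __name__ == "__main__":
    V_NOM = 0.9  # nominal supply voltage (V); illustrative, not from the paper
    # Assumed timing-error rates that grow as voltage drops; numbers are made up.
    assumed_error_rate = {0.85: 1e-6, 0.80: 1e-4, 0.75: 1e-2, 0.70: 5e-2}
    for v, err in assumed_error_rate.items():
        raw = dynamic_energy_ratio(v, V_NOM)
        eff = effective_energy_ratio(v, V_NOM, err, recovery_overhead=10.0)
        print(f"V = {v:.2f} V: raw energy {raw:.2f}x of nominal, with recovery {eff:.2f}x")
```

Under these toy numbers, underscaling from 0.9 V to 0.7 V cuts raw dynamic energy to about 60% of nominal, which is why the trade-off between savings and timing-error recovery cost is the central design question.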