100x improvement in throughput and energy efficiency on Xilinx™ FPGAs

Xilinx™, an AMD company, is a technology and semiconductor company and a primary supplier of programmable logic devices. Known for inventing the field-programmable gate array (FPGA), Xilinx provides adaptable, accelerated computing that can be deployed at global scale and respond to dynamic needs.

CHALLENGE

Overcoming performance problems in deep learning without increasing energy consumption

Deep learning networks have accomplished a great deal, but they are hitting bottlenecks as they scale to more complex tasks and bigger models. Attempts to break through these bottlenecks typically require adding more compute power and more data. The result is enormous models that consume vast amounts of power, which limits scalability and harms the environment.

We need a new approach to achieve significant breakthroughs in performance and scalability while reducing power consumption on today’s hardware.


SOLUTION

Brain-inspired, optimized networks on FPGAs yield multiplicative throughput improvements

In contrast to the standard dense representations used in most deep learning networks, we created networks that borrow several aspects of the brain’s efficient structure. These brain-inspired, optimized networks not only deliver accuracy equivalent to that of their standard counterparts but also drastically reduce computational requirements, and they can run on today’s hardware.
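To make the contrast with dense representations concrete, here is a minimal sketch that counts multiply-accumulate (MAC) operations for a single layer. It assumes the non-dense structure takes the form of sparse weights and sparse activations, which this case study does not spell out. The layer size, the 10% and 20% densities, and all function names are hypothetical values chosen for illustration, not figures from the FPGA implementation.

```python
import random


def make_sparse_weights(n_out, n_in, density, rng):
    """Build an n_out x n_in weight matrix with a fixed number of nonzeros per row."""
    nonzeros_per_row = max(1, round(n_in * density))
    weights = []
    for _ in range(n_out):
        row = [0.0] * n_in
        for j in rng.sample(range(n_in), nonzeros_per_row):
            row[j] = rng.uniform(0.1, 1.0)  # strictly nonzero placeholder value
        weights.append(row)
    return weights


def make_sparse_activations(n_in, density, rng):
    """Build an input vector in which only a fraction of the entries are nonzero."""
    nonzeros = max(1, round(n_in * density))
    activations = [0.0] * n_in
    for j in rng.sample(range(n_in), nonzeros):
        activations[j] = rng.uniform(0.1, 1.0)
    return activations


def dense_macs(n_out, n_in):
    """A dense layer performs one MAC per weight, regardless of the values."""
    return n_out * n_in


def useful_macs(weights, activations):
    """Count only the MACs where both the weight and the activation are nonzero."""
    return sum(
        1
        for row in weights
        for w, a in zip(row, activations)
        if w != 0.0 and a != 0.0
    )


# Hypothetical layer: 1000 inputs, 100 outputs (illustrative sizes only).
rng = random.Random(0)
N_OUT, N_IN = 100, 1000
weights = make_sparse_weights(N_OUT, N_IN, 0.1, rng)     # assumed 10% weight density
dense_input = [1.0] * N_IN
sparse_input = make_sparse_activations(N_IN, 0.2, rng)   # assumed 20% activation density

print("dense layer MACs:            ", dense_macs(N_OUT, N_IN))
print("sparse weights, dense input: ", useful_macs(weights, dense_input))
print("sparse weights, sparse input:", useful_macs(weights, sparse_input))
```

The sketch only counts operations. Whether a reduced count becomes a real throughput or power gain depends on how well the target hardware can skip the zero-valued work.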

We demonstrated these performance improvements on inference tasks using the Google Speech Commands (GSC) dataset. We created optimized networks on two off-the-shelf Xilinx products:

  • Alveo™ U250 – a powerful platform designed for datacenters
  • Zynq™ UltraScale+ ZU3EG – a smaller platform designed for embedded applications

RESULTS

100x throughput speed-up and power improvement, and new possibilities for deep learning at the edge

Our optimized networks delivered over 100x throughput speed-up and power improvement over their traditional counterparts on the larger FPGA platform. Additionally, our optimized network ran efficiently on the smaller platform, where the standard network could not fit, opening new possibilities for Edge AI.

BENEFITS

Better resource utilization, untapped edge opportunities and critical energy savings

This dramatic speed improvement provides great benefits, enabling:

  • Implementation of much larger networks using the same resources
  • Implementation of more copies of networks on the same resources
  • Ability to run networks on edge platforms where traditional networks don’t fit
  • Massive energy savings and lower costs due to scaling efficiencies

Additional Resources

Whitepaper: FPGA 100x Acceleration in Deep Learning Networks  
Press: Xilinx and Numenta claim dramatic speed-up of neural nets versus Nvidia GPUs
