Automating the Mapping of Deep Neural Network Models to Cloud FPGAs

Abstract/Technology Overview

The efficacy of Convolutional Neural Networks (CNNs) has been proven in a wide range of machine learning applications. However, the high computational complexity of CNNs often hinders their adoption in real-time and low-power applications. FPGAs are poised to play a significant role in high-performance, low-energy computation of CNNs in both mobile (e.g., UAVs or self-driving cars) and cloud computing domains. However, it is challenging to implement an effective and efficient CNN system on FPGAs. To address these challenges, we propose a novel automated toolchain called Open-DNN. Our toolchain takes trained CNN models specified in either Caffe or annotated TensorFlow as input, performs a set of transformations, and maps the model to a cloud-based FPGA. Open-DNN can significantly improve the overall design productivity of neural networks on FPGAs while also satisfying emerging computational and energy-efficiency requirements. Our design presents an alternative to other cloud-based options (e.g., CPUs or GPUs) while offering flexibility, low power and energy consumption, and high performance. Open-DNN also provides additional features, such as support for quantized network models with fixed-point representation and balancing of on-chip resource usage during implementation.
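
To illustrate what a fixed-point representation of a quantized model involves, the following is a minimal sketch, not Open-DNN's actual quantizer. The function names, the 8-bit word length, and the 4 fractional bits are assumptions chosen for illustration only.

```python
def quantize_fixed_point(values, total_bits=8, frac_bits=4):
    """Convert floats to signed fixed-point integer codes (hypothetical helper).

    Each value is scaled by 2**frac_bits, rounded to the nearest integer
    (Python's round() resolves exact ties to the even integer), and then
    saturated to the range of a signed total_bits-wide word.
    """
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    codes = []
    for v in values:
        code = int(round(v * scale))
        code = max(q_min, min(q_max, code))  # saturate instead of wrapping
        codes.append(code)
    return codes


def dequantize(codes, frac_bits=4):
    """Map integer codes back to the real values they represent."""
    scale = 1 << frac_bits
    return [c / scale for c in codes]


weights = [0.5, -1.25, 100.0]
codes = quantize_fixed_point(weights)   # 100.0 saturates at the largest code
restored = dequantize(codes)
```

Narrower words reduce on-chip memory and arithmetic cost, at the price of rounding and saturation error such as the clipping of 100.0 in this example.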

Technology Features, Specifications and Advantages

Specifically, the toolchain integrates numerical techniques into an automated framework for analyzing, generating, and implementing trained neural network models on cloud-based FPGA platforms, taking advantage of the High-Level Synthesis (HLS) design methodology.

Our core features include:
• A fully hardware-friendly, high-level-language template library containing all fundamental CNN functionalities.
• An automated generation flow for CNN implementation on cloud-based FPGAs with Caffe and TensorFlow input.
• An accurate model of the accelerator template, along with model-based system optimization to generate an optimal system configuration.
• Full software stack generation, together with the accelerator IP system construction, which provides a system level solution for the input network models.
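
As a rough illustration of model-based system optimization, the sketch below searches accelerator configurations using a simple analytical model. It is not Open-DNN's actual model: the cycle estimate, the assumption of one DSP block per multiply-accumulate unit, and all names (ConvLayer, estimate_cycles, best_config, tm, tn) are hypothetical.

```python
from math import ceil
from typing import NamedTuple


class ConvLayer(NamedTuple):
    out_channels: int
    in_channels: int
    out_rows: int
    out_cols: int
    kernel: int


def estimate_cycles(layer, tm, tn):
    """Assumed model: tm x tn multiply-accumulate units work in parallel
    across output and input channels; all other loops run sequentially."""
    return (ceil(layer.out_channels / tm) * ceil(layer.in_channels / tn)
            * layer.out_rows * layer.out_cols * layer.kernel * layer.kernel)


def best_config(layers, dsp_budget, dsp_per_mac=1):
    """Exhaustively search (tm, tn) pairs that fit the DSP budget and
    return (total_cycles, tm, tn) with the lowest estimated cycle count."""
    best = None
    for tm in range(1, dsp_budget + 1):
        for tn in range(1, dsp_budget // tm + 1):
            if tm * tn * dsp_per_mac > dsp_budget:
                continue  # configuration exceeds the resource budget
            cycles = sum(estimate_cycles(layer, tm, tn) for layer in layers)
            if best is None or cycles < best[0]:
                best = (cycles, tm, tn)
    return best


layers = [ConvLayer(out_channels=4, in_channels=2, out_rows=2, out_cols=2, kernel=1)]
cycles, tm, tn = best_config(layers, dsp_budget=8)
```

A production flow would also need to model on-chip memory, off-chip bandwidth, and clock frequency; this sketch only shows the general shape of searching a design space against a resource constraint.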

Potential Applications

High-throughput, low-energy image and video analysis tasks with neural networks as the core functionality, such as high-accuracy, large-scale facial recognition and surveillance data processing.

Most current artificial intelligence applications with CNNs as the major functionality, such as intelligent traffic control and video-based anomaly detection systems.

Customer Benefit

The toolchain is designed to provide high computational capability and usability, and to fully exploit the computational potential of FPGA devices. Experimental results demonstrate comparable usability and flexibility, and strong quality, when compared to CPU and GPU implementations.

Technology Owner

Yao Chen


ADSC (Illinois at Singapore Pte Ltd)

Technology Category
  • Electronics
  • Artificial Intelligence
  • Cloud Computing
Technology Status
  • Available for Licensing
Technology Readiness Level
  • TRL 5

Convolutional Neural Networks, CNN, FPGA platforms, cloud computing, artificial intelligence, AI, machine learning, ML