Accelerated Neural Networks on OpenCL Devices Using SYCL-DNN

8 Apr 2019 · Rod Burns, John Lawson, Duncan McBain, Daniel Soutar

Over the past few years machine learning has seen a renewed explosion of interest, following a number of studies showing the effectiveness of neural networks in a range of tasks which had previously been considered incredibly hard. Neural networks' effectiveness in the fields of image recognition and natural language processing stems primarily from the vast amounts of data available to companies and researchers, coupled with the huge amounts of compute power available in modern accelerators such as GPUs, FPGAs and ASICs. There are a number of approaches available to developers for utilizing GPGPU technologies such as SYCL, OpenCL and CUDA; however, many applications require the same low-level mathematical routines. Libraries dedicated to accelerating these common routines allow developers to make full use of the available hardware without requiring low-level knowledge of the hardware themselves. However, such libraries are often provided by hardware manufacturers for specific hardware, such as cuDNN for Nvidia hardware or MIOpen for AMD hardware. SYCL-DNN is a new open-source library dedicated to providing accelerated routines for neural network operations which are hardware and vendor agnostic. Built on top of the SYCL open standard and written entirely in standard C++, SYCL-DNN allows a user to easily accelerate neural network code for a wide range of hardware using a modern C++ interface. The library is tested on AMD's OpenCL for GPU, Intel's OpenCL for CPU and GPU, ARM's OpenCL for Mali GPUs, as well as ComputeAorta's OpenCL for the R-Car CV engine and host CPU. In this talk we will present performance figures for SYCL-DNN on this range of hardware, and discuss how high performance was achieved on such a varied set of accelerators with such different hardware features.
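The abstract itself contains no code, but the portability claim is easier to see with a concrete kernel. The following is a minimal sketch in standard SYCL (not the SYCL-DNN API itself) showing the kind of single-source C++ kernel, here a pointwise ReLU, that a library like SYCL-DNN builds its neural network routines on: the same source can be dispatched to any device exposed by the underlying SYCL/OpenCL implementation. The SYCL 2020 header and selector names are assumptions about the reader's toolchain.

    // Pointwise ReLU written once in standard C++ and run on whichever
    // OpenCL device the SYCL runtime selects (CPU, GPU, or other accelerator).
    #include <sycl/sycl.hpp>
    #include <vector>

    int main() {
      constexpr size_t n = 1024;
      std::vector<float> data(n, -1.0f);

      // The queue targets whichever device the default selector picks;
      // choosing a different selector is the only change needed to run on
      // other hardware.
      sycl::queue q{sycl::default_selector_v};

      {
        // A buffer manages data movement between host and device.
        sycl::buffer<float, 1> buf{data.data(), sycl::range<1>{n}};

        q.submit([&](sycl::handler& cgh) {
          sycl::accessor acc{buf, cgh, sycl::read_write};
          // One work-item per element: data[i] = max(data[i], 0).
          cgh.parallel_for(sycl::range<1>{n}, [=](sycl::id<1> i) {
            acc[i] = sycl::fmax(acc[i], 0.0f);
          });
        });
      }  // Buffer destruction waits for the kernel and copies results back.

      return 0;
    }

SYCL-DNN wraps operations such as convolution and pooling behind a queue-based C++ interface in a similar spirit; the exact API is documented in the project's open-source repository.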
