Deep Learning Processor: Kortiq AIScale

Deep learning with deep neural networks (DNNs) is currently one of the most intensively and widely used predictive modeling approaches in machine learning. DNNs are not a new concept, but following recent breakthrough applications in speech and image recognition, and thanks to the availability of large training data sets and extensive compute and memory capacity in the cloud, they have returned to the focus of both the academic and industrial communities. Today, different types of DNNs are employed in areas ranging from autonomous driving and medical and industrial applications to playing complex games. In many of these applications, DNNs, and in particular convolutional neural networks (CNNs), are now able to outperform humans. This stems from their ability to automatically extract high-level features from large amounts of raw sensory data during training and thereby obtain an effective representation of the input space, an approach that differs from earlier machine learning systems built on manually crafted features or expert-designed rules.

We recognize a strong and increasing demand for object recognition and image classification applications,

says Michaël Uyttersprot, Technical Marketing Manager for Avnet Silica.

CNNs are widely implemented in many embedded vision applications for different markets, including the industrial, medical, IoT, and automotive markets.

A CNN is a type of feed-forward artificial neural network in which the connectivity pattern between neurons is inspired by the neural connectivity found in the visual cortex of animals and humans. Individual neurons in the visual cortex respond to stimuli only from a restricted region of space, known as the receptive field. The receptive fields of neighboring neurons partially overlap, spanning the entire visual field. However, the superior accuracy of CNNs comes at the cost of high computational complexity. All CNNs are extremely computationally demanding, requiring billions of computations to process a single input instance; the largest CNNs (such as the VGG network models) require more than 30 bn computations to process one input image. This significantly limits the use of CNNs in embedded/edge devices. CNNs are also extremely memory demanding, requiring megabytes of memory space for storing their parameters.
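To see where those billions of operations come from, here is a minimal back-of-the-envelope sketch in Python; the layer dimensions are an assumed VGG-16-style example (3×3 kernels, 128 input and 128 output channels, 112×112 output), not figures taken from the article:

```python
# Rough multiply-accumulate (MAC) count of one convolutional layer:
# every output element is a dot product over kernel_h * kernel_w * in_channels inputs.
def conv_macs(out_h, out_w, out_channels, kernel_h, kernel_w, in_channels):
    return out_h * out_w * out_channels * kernel_h * kernel_w * in_channels

# Assumed VGG-16-style layer: 3x3 kernels, 128 -> 128 channels, 112x112 output.
macs = conv_macs(112, 112, 128, 3, 3, 128)
print(f"{macs:,}")  # 1,849,688,064 -> roughly 1.85 bn MACs for this single layer
```

Summed over all the layers of a deep network, such per-layer counts quickly reach the tens of billions of operations quoted above.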

For example, the VGG-16 CNN has more than 138 m network parameters. With a 16-bit fixed-point representation, more than 276 Mbytes of memory must be allocated just for storing all network parameters. Kortiq GmbH, a Munich-based company, has recently developed a novel CNN hardware accelerator called AIScale. Distributed as an IP core implemented using FPGA technology, AIScale provides high processing speed and low power consumption. Kortiq’s AIScale accelerator is designed to process pruned/compressed CNNs and compressed feature maps: this increases processing speed by skipping all unnecessary computations and reduces the memory needed for storing CNN parameters as well as feature maps. All of these features help to reduce power consumption. The accelerator is also designed to support all layer types found in today’s state-of-the-art CNNs, such as convolution, pooling, adding, concatenation, and fully connected layers. This yields a highly flexible and universal system that can support different CNN architectures without modifying the underlying hardware architecture. It is also designed to be highly scalable, simply by providing more or fewer compute cores (CCs), the core processing blocks of the AIScale architecture. By using an appropriate number of CCs, different processing power requirements can be easily met.
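Both the memory figure and the benefit of pruning can be checked with a short Python sketch. The 80 % pruning ratio mentioned in the comments is purely an assumed example, and the zero-skipping loop only illustrates the general idea behind processing compressed networks, not the AIScale hardware itself:

```python
# (1) Memory needed to store network parameters at a given word width.
def param_memory_mbytes(num_params, bits_per_param):
    return num_params * bits_per_param / 8 / 1e6

# VGG-16: ~138 m parameters at 16-bit fixed point -> ~276 Mbytes.
print(param_memory_mbytes(138e6, 16))  # 276.0

# (2) A dot product that skips multiplications whose weight is zero,
# as a pruned/compressed layer allows. If, say, 80% of the weights were
# pruned to zero, ~80% of the multiplications (and the storage for those
# weights) could be skipped.
def sparse_dot(weights, activations):
    return sum(w * a for w, a in zip(weights, activations) if w != 0.0)

print(sparse_dot([0.5, 0.0, 0.0, -1.25], [1.0, 3.0, 2.0, 0.5]))  # -0.125, only two multiplications performed
```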

Image Credit: Kortiq

 



This post appeared first in issue 2/2018 of "Smart Industry".
