Edge AI applications need specialized ASICs and System-on-Chip accelerators to combine the necessary computing power with greatly reduced power dissipation. In the short term, these applications will target inference tasks only: the neural network is trained offline and classification tasks are executed locally. Depending on the application class, the emphasis is placed either on energy efficiency or on classification accuracy. On endpoint devices, where only small networks are involved, the key parameters of interest are the energy dissipation and the memory footprint. Both can be addressed through extreme weight quantization, down to binary synapses, which also eases analogue in-memory computing based on non-volatile memory technology. The challenge in this case lies in the learning algorithm: several techniques have to be employed during training to keep the impact of quantization on classification accuracy low. On the Edge, much bigger networks can be used, for instance for autonomous driving applications, and safety of operation and classification accuracy become the chief parameters. The challenge here lies on the architecture side: it must be scalable and flexible enough to accommodate ever larger networks and to implement new types of layers.
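The memory-footprint argument for binary synapses can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the training or storage scheme of any particular accelerator: the function names (`binarize`, `pack_bits`, `unpack_bits`), the sign-based binarization rule and the 64 x 128 layer size are all hypothetical choices made for the example.

```python
import numpy as np

def binarize(weights):
    """Map real-valued weights to {-1, +1} by sign (zero maps to +1)."""
    return np.where(weights >= 0, 1, -1).astype(np.int8)

def pack_bits(binary_weights):
    """Store each +/-1 weight as a single bit (1 for +1, 0 for -1)."""
    return np.packbits(binary_weights.ravel() > 0)

def unpack_bits(packed, shape):
    """Recover the +/-1 weight array from its bit-packed form."""
    count = int(np.prod(shape))
    bits = np.unpackbits(packed)[:count]
    return (bits.astype(np.int8) * 2 - 1).reshape(shape)

# Hypothetical layer: 64 x 128 weights stored as 32-bit floats.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 128)).astype(np.float32)

w_bin = binarize(w)
packed = pack_bits(w_bin)

float_bytes = w.nbytes        # 4 bytes per weight
packed_bytes = packed.nbytes  # 1 bit per weight

print(f"float32 storage: {float_bytes} bytes")
print(f"binary storage:  {packed_bytes} bytes")
print(f"reduction:       {float_bytes // packed_bytes}x")
```

In this sketch the storage drops by a factor of 32, the ratio between a 32-bit float and a single bit. The sketch says nothing about accuracy: keeping the accuracy loss low is the job of the training procedure, which is not modelled here.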
In the longer term, Edge AI applications should ideally exhibit lifelong learning abilities as well, so that autonomous agents can adapt to their environment. Weight precision must therefore be higher for the learning algorithm to converge, and the on-chip memory must be larger to store all the intermediate results. The challenge is to design very dense local memory with a low access energy cost.
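One reason learning needs higher weight precision than inference is that small weight updates can be rounded away entirely. The sketch below illustrates this with NumPy floating-point types; using `float16` as a stand-in for a low-precision on-chip weight, as well as the `accumulate` helper, the step size and the step count, are assumptions made purely for illustration.

```python
import numpy as np

def accumulate(dtype, steps=1000, step_size=1e-4):
    """Add a small update to a weight `steps` times, rounding to `dtype` each time."""
    w = dtype(1.0)
    u = dtype(step_size)
    for _ in range(steps):
        w = dtype(w + u)  # each update is rounded to the storage precision
    return float(w)

w16 = accumulate(np.float16)
w32 = accumulate(np.float32)

print(f"float16 weight after updates: {w16}")
print(f"float32 weight after updates: {w32:.4f}")
```

Near 1.0 the spacing between adjacent `float16` values is about 0.001, so each update of 0.0001 is rounded away and the weight never moves, whereas the `float32` weight accumulates the updates as expected.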
As described, there is no obvious “one-size-fits-all” solution for Edge AI applications. Rather, an adequate solution for a given application may lie in choosing the right combination of technology, design and tools.