Chauvaux, Nicolas [UCL]
Legat, Jean-Didier [UCL]
With the current trend of exponential growth in IoT (Internet of Things) devices, there is a need to offload computation from large data centers to near-sensor decentralized devices (i.e. edge computing), which requires ultra-low power consumption and always-on processing. Brain-inspired spiking neural networks (SNNs) promise high power efficiency, but have not yet demonstrated it when compared to conventional artificial neural networks (ANNs). Analog in-memory computing (IMC), while well matched to the dot-product operation of ANNs, has encountered many difficulties when applied to SNNs. Its digital IMC counterpart, however, appears promising for low-power SNN inference and is well suited to sparse inputs, a natural direction of research to leverage the low-power potential of SNNs. A new generation of digital IMC-based SNNs would thus ideally exploit sparsity and event-based temporal processing for the low-power always-on edge computing applications targeted in this project. In parallel, the wide range of existing neural network topologies, each with different neuron characteristics (e.g. weight resolution), combined with the parameter-dependent hardware accelerators proposed in the literature (i.e. each suited to one particular weight resolution), makes it essential to design a flexible chip. Digital IMC, with its bit-by-bit computation, is a good candidate for this aim. This MSc thesis therefore analyzes the in-memory computing challenges and solutions for the flexible processing of a single layer, either fully-connected or convolutional, of a large SNN. Based on a parallel architecture analysis and hierarchical control blocks, as key points of the proposed chip called SpikIMC, a full implementation and in-memory data mapping for both layer types are given, supported by an analysis and comparison with other existing hardware.
It has been shown that the suggested hardware is competitive with state-of-the-art chips. When comparing the execution of a convolutional layer on SpikIMC and INXS, an analog IMC-based chip, an 8% gain in the number of clock cycles is obtained, while keeping the same power consumption and reducing the required memory bandwidth by 98%.
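As a rough, hypothetical illustration of the bit-by-bit computation the abstract refers to (this is not code from the thesis, and the function name and interface are invented for the sketch), a digital IMC dot product with binary spike inputs can be modeled in software. Because spikes are binary, each weight bit-plane multiply reduces to an AND followed by a popcount, and any weight resolution can be supported by varying the number of bit-plane passes:

```python
# Illustrative sketch (not from the thesis): a bit-serial dot product
# between a binary spike vector and signed weights, as a digital IMC
# array might compute it, one weight bit-plane per pass (LSB first).

def bit_serial_dot(spikes, weights, n_bits):
    """Dot product of 0/1 spikes with signed integer weights stored in
    n_bits two's complement, accumulated one bit-plane at a time."""
    acc = 0
    for b in range(n_bits):
        # Extract bit-plane b of every weight (two's complement).
        plane = [(w >> b) & 1 for w in weights]
        # Binary spikes make the per-cell "multiply" a simple AND;
        # the column sum is then a popcount of the AND results.
        partial = sum(s & p for s, p in zip(spikes, plane))
        # The MSB plane carries negative weight in two's complement.
        sign = -1 if b == n_bits - 1 else 1
        acc += sign * (partial << b)
    return acc
```

Flexibility here comes for free: the same loop handles any weight resolution by changing `n_bits`, which mirrors why bit-serial digital IMC suits networks with heterogeneous weight precisions.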


Bibliographic reference: Chauvaux, Nicolas. Toward a reconciliation of digital in-memory architectures and spiking neural networks for flexible low-power inference at the edge. Ecole polytechnique de Louvain, Université catholique de Louvain, 2021. Prom. : Legat, Jean-Didier.
Permanent URL: http://hdl.handle.net/2078.1/thesis:33059