Gheysen, Guillaume [UCL]
Legat, Jean-Didier [UCL]
De Vleeschouwer, Christophe [UCL]
In the domain of Artificial Intelligence and Deep Learning, Convolutional Neural Networks (CNNs) are powerful models used in image classification, speech recognition, medical image analysis, and other applications. However, these models require a large number of parameters and operations to be executed. GPUs are therefore the dominant platform for implementing CNNs, thanks to their vast computational resources and memory. Such platforms are power-hungry, however, which limits the deployment of CNNs on mobile and embedded devices. A solution is to use FPGAs instead of GPUs: they are more energy-efficient, but they do not possess enough resources to implement the best-performing models. An analysis of the literature revealed a series of optimizations that could make an efficient implementation of CNNs on FPGA possible. From this analysis, two optimizations appeared most promising. This thesis therefore aims at combining the advantages of these two optimizations, pruning and depthwise separable convolution, to reduce the number of parameters and operations required by a CNN and enable an efficient implementation of CNNs on FPGA. To demonstrate the performance of the pruning scheme, an FPGA-based accelerator combining both optimizations was designed. This thesis shows that combining pruning and depthwise separable convolution provides an efficient way to reduce both the number of weights and the computational complexity. Moreover, the proposed pruning scheme can be handled efficiently on FPGA without any loss of performance as sparsity increases.
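
To illustrate why these two optimizations shrink a CNN, the short Python sketch below compares the weight count of a standard convolution with that of a depthwise separable convolution, and then applies a pruning ratio. The layer sizes (3x3 kernel, 128 input channels, 256 output channels) and the 50% sparsity figure are arbitrary assumptions for illustration, not values taken from the thesis.

# Illustrative weight-count comparison (hypothetical layer sizes):
# standard convolution vs. depthwise separable convolution, then pruning.

def standard_conv_params(k, c_in, c_out):
    # Every output channel has its own KxK filter spanning all input channels.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise step: one KxK filter per input channel.
    # Pointwise step: a 1x1 convolution that mixes channels.
    return k * k * c_in + c_in * c_out

def pruned_params(params, sparsity):
    # Pruning zeroes out a fraction of the weights; only the rest are stored.
    return int(params * (1.0 - sparsity))

if __name__ == "__main__":
    k, c_in, c_out = 3, 128, 256   # assumed layer dimensions
    sparsity = 0.5                 # assumed pruning ratio

    std = standard_conv_params(k, c_in, c_out)
    dws = depthwise_separable_params(k, c_in, c_out)
    dws_pruned = pruned_params(dws, sparsity)

    print(f"standard convolution:          {std:8d} weights")
    print(f"depthwise separable:           {dws:8d} weights")
    print(f"depthwise separable + pruning: {dws_pruned:8d} weights")

For these assumed sizes, the depthwise separable layer needs roughly 9x fewer weights than the standard convolution, and pruning halves that count again, which is the kind of reduction that makes an FPGA implementation feasible.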


Bibliographic reference: Gheysen, Guillaume. Accelerating Convolutional Neural Networks for FPGA using depthwise separable convolution and pruning. Ecole polytechnique de Louvain, Université catholique de Louvain, 2020. Prom.: Legat, Jean-Didier; De Vleeschouwer, Christophe.
Permanent URL: http://hdl.handle.net/2078.1/thesis:26731