An Energy-Efficient Fine-Grained Deep Neural Network Partitioning Scheme for Wireless Collaborative Fog Computing

Reference Type:

Journal Article

Kilcioglu, Emre, Hamed Mirghasemi, Ivan Stupia, and Luc Vandendorpe. 2021. “An Energy-Efficient Fine-Grained Deep Neural Network Partitioning Scheme for Wireless Collaborative Fog Computing.” IEEE Access 9:79611–27. https://doi.org/10.1109/ACCESS.2021.3084689.

Fog computing is a potential solution that lets heterogeneous, resource-constrained mobile devices collaboratively run deep learning-driven applications at the edge of the network instead of offloading their computations to powerful cloud servers, motivated by latency reduction, a decentralized structure, and privacy concerns. Compared to mobile cloud computing, where computation-intensive deep learning operations are offloaded to the cloud, a collaborative fog computing scenario with deep neural network (DNN) partitioning uses the computing capabilities of the resource-constrained devices themselves, which can improve delay performance and lessen the need for powerful servers. In this paper, we propose an energy-efficient fine-grained DNN partitioning scheme for wireless collaborative fog computing systems. The proposed scheme combines layer-based partitioning, where the DNN model is divided layer by layer, with horizontal partitioning, where the input data of each layer operation is partitioned among multiple devices to enable parallel computing. A convex optimization problem is formulated to minimize the energy consumption of the collaborative part of the system by optimizing the communication and computation parameters as well as the workload of each participating device, and it is solved using primal-dual decomposition and Lagrange duality theory. In the simulation results, the proposed optimized scheme yields a notable reduction in energy consumption compared to a non-optimized scenario in which the workload is distributed equally among all participating devices but the communication and computation parameters are still optimized, which makes that scenario a challenging baseline.
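
The two partitioning ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy dense-plus-ReLU layers, and the fixed workload shares are all assumptions made for illustration, and the energy optimization that the paper uses to choose the workloads is not reproduced. The model is processed layer by layer, and within each layer the input rows are split among devices in proportion to given shares, computed independently, and concatenated.

```python
from itertools import accumulate


def split_by_workload(n_rows, shares):
    """Map workload shares to contiguous (start, end) row ranges covering n_rows.

    `shares` is a hypothetical fixed input here; the paper optimizes the workloads.
    """
    total = sum(shares)
    cuts = [0] + [round(n_rows * c / total) for c in accumulate(shares)]
    cuts[-1] = n_rows  # guard against rounding leaving rows unassigned
    return [(cuts[i], cuts[i + 1]) for i in range(len(shares))]


def dense_relu(rows, weights):
    """Toy layer: each output unit is a ReLU of a dot product with one input row."""
    return [
        [max(0.0, sum(x * w for x, w in zip(row, unit))) for unit in weights]
        for row in rows
    ]


def run_single_device(rows, layers):
    """Reference: one device evaluates every layer on the full input."""
    for weights in layers:
        rows = dense_relu(rows, weights)
    return rows


def run_partitioned(rows, layers, shares):
    """Layer-by-layer execution with the rows of each layer split among devices."""
    for weights in layers:
        ranges = split_by_workload(len(rows), shares)
        # Each slice stands in for the work done by one device.
        parts = [dense_relu(rows[a:b], weights) for a, b in ranges]
        # Aggregate the partial outputs before the next layer starts.
        rows = [r for part in parts for r in part]
    return rows


inputs = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [-2.0, 4.0]]
layers = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]],  # 2 inputs -> 3 units
    [[1.0, 1.0, 1.0]],                      # 3 inputs -> 1 unit
]
shares = [5, 3, 2]  # assumed relative workloads of three devices

partitioned = run_partitioned(inputs, layers, shares)
reference = run_single_device(inputs, layers)
```

Because this toy layer acts on each input row independently, the partitioned result matches the single-device result. Layers whose outputs depend on neighboring inputs, such as convolutions, would need overlapping slices, which this sketch does not handle.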