SPHINX: Search Space-Pruning Heterogeneous Task Scheduling for Deep Neural Networks

Published in ACM ICPP 2024, 2024

Recommended citation: Bowen Yuchi, Heng Shi, and Guoqing Bao, "SPHINX: Search Space-Pruning Heterogeneous Task Scheduling for Deep Neural Networks", In Proceedings of the 53rd International Conference on Parallel Processing. Association for Computing Machinery, New York, NY, USA, 524-533. https://doi.org/10.1145/3673038.3673155 https://doi.org/10.1145/3673038.3673155

Given the tendency of increasingly heterogeneous AI systems and the large workload scale of deep neural networks (DNNs), there is an urgent demand for model scheduling to improve execution performance in heterogeneous computational systems. However, this is very challenging because the task scheduling under the high-dimensional search space is an NP-hard problem. Existing works either schedule under naive search spaces without simplifications or oversimplifies the optimisation, which is hard to strike a balance between efficiency and optimality.

To address the aforementioned challenges, we propose SPHINX, an efficient and novel search space-pruning heterogeneous task scheduling engine to improve the DNN execution performance. The search space can be reduced by taking the prior knowledge of the model structures into account, and a cost model has been developed to guide the search process. In combination with our novel search method, named Critical Path Genetic Algorithm (CPGA), the search efficiency and stability can be significantly improved resulting in higher performance of DNN execution. The SPHINX engine is implemented based on the Multi-Level Intermediate Representation (MLIR) dialect, making it scalable and reusable within existing AI compilers. We examine the SPHINX engine with six popular deep neural networks across different heterogeneous scenarios, and the results prove the SPHINX engine can outperform existing state-of-the-art works with up to 1.44 × speedup. Compared to other search-based methods, our method converges faster and can boost the search speed by 23.6-35.5 ×.