Volume 22, Issue 1 (5-2025)                   JSDP 2025, 22(1): 25-38




Mohajjel kafshdooz M, Shamsi M, Rastari S. Resource-Aware Neural Architecture Search for Multicore Embedded Real-Time Systems. JSDP 2025; 22(1): 25-38
URL: http://jsdp.rcisp.ac.ir/article-1-1418-en.html
Qom University of Technology
Abstract:
Designing neural networks manually is a slow process based on trial and error. As the number of network parameters or layers grows, the manual approach becomes very expensive and the final result may be suboptimal. Automatic neural architecture search (NAS) algorithms are used to solve this problem. Recently, these algorithms have achieved high accuracy on various datasets such as CIFAR-10, ImageNet, and Penn Treebank. They can search a wide space of architectures with different characteristics, such as network depth, width, connection pattern, and operations, in order to discover architectures with satisfactory accuracy. However, a traditional challenge of these algorithms is their high search time (approximately tens of thousands of GPU hours), which recent research has reduced to tens of hours. Another common challenge is that they focus on improving network accuracy, while other criteria such as network speed and resource consumption are not taken into account. As a result, these methods cannot be used directly to find the optimal architecture for embedded systems, which have limited resources such as processing power, memory, and energy. Search methods should therefore be devised that are aware of these limitations. Research has been done in this area in recent years, but existing methods do not focus specifically on coarse-grained multi-core architectures without a GPU.
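As a rough illustration of what "resource-aware" means in practice (this sketch is not from the paper; the penalty form, the latency and memory estimators, and the weights `alpha` and `beta` are all hypothetical), a search objective can fold the target platform's budgets into the loss:

```python
import torch

def resource_aware_loss(accuracy_loss: torch.Tensor,
                        expected_latency: torch.Tensor,
                        expected_memory: torch.Tensor,
                        latency_budget: float,
                        memory_budget: float,
                        alpha: float = 0.1,
                        beta: float = 0.1) -> torch.Tensor:
    """Penalize candidate architectures whose expected latency or memory
    exceeds the platform budget; below budget the penalty is zero."""
    latency_penalty = torch.relu(expected_latency - latency_budget)
    memory_penalty = torch.relu(expected_memory - memory_budget)
    return accuracy_loss + alpha * latency_penalty + beta * memory_penalty
```

A soft penalty of this form keeps the objective differentiable, so the same gradient-descent machinery that trains the network weights can also steer the search away from over-budget architectures.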
In this article, we present a method for the automatic design of networks suitable for running on multi-core processors. In this gradient-descent-based method, a SuperNet with parallel paths of computational blocks is created, where the number of parallel paths is at most the number of cores. A set of decision variables selects the appropriate operation in each block of a path. In addition to deciding on the operation performed in each block, decisions are also made about synchronization points, which let parallel paths exchange intermediate results and improve the network's accuracy. The decision variables (block type and synchronization points) are then trained simultaneously with the main network weights to select an appropriate subnetwork. Because the approach uses gradient descent, the training process is performed only twice to obtain the final network structure, so its execution time is much lower than that of methods based on evolutionary search and reinforcement learning. Additionally, taking the constraints of the target system into account, such as the number of cores and memory consumption, leads to a more suitable architecture than other methods produce. Experiments on the CIFAR-10 dataset demonstrate that the proposed method achieves satisfactory accuracy with very little search time.
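The paper does not include code; the following PyTorch sketch only illustrates the general idea described above, i.e. DARTS-style differentiable operation selection plus a learnable synchronization gate between parallel paths. The class names, the candidate operation set, and the mean-based synchronization scheme are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedBlock(nn.Module):
    """One SuperNet block: a softmax over learnable decision variables
    (alpha) weights the candidate operations, making the choice of
    operation differentiable and trainable by gradient descent."""
    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # 3x3 conv
            nn.Conv2d(channels, channels, 5, padding=2),  # 5x5 conv
            nn.Identity(),                                # skip connection
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

class ParallelSuperNet(nn.Module):
    """Parallel paths of MixedBlocks (one path per core, here a
    hypothetical n_paths) with one learnable gate per stage deciding
    whether the paths synchronize and share intermediate results."""
    def __init__(self, channels: int, n_paths: int = 4, n_stages: int = 3):
        super().__init__()
        self.paths = nn.ModuleList([
            nn.ModuleList([MixedBlock(channels) for _ in range(n_stages)])
            for _ in range(n_paths)
        ])
        # One synchronization decision variable per stage (sigmoid -> [0, 1]).
        self.sync_gate = nn.Parameter(torch.zeros(n_stages))

    def forward(self, x):
        outs = [x for _ in self.paths]
        for stage in range(len(self.paths[0])):
            outs = [path[stage](o) for path, o in zip(self.paths, outs)]
            g = torch.sigmoid(self.sync_gate[stage])
            mean = torch.stack(outs).mean(dim=0)
            # Blend each path's output with the cross-path mean; after the
            # search, g is thresholded to a hard sync / no-sync decision.
            outs = [(1 - g) * o + g * mean for o in outs]
        return torch.stack(outs).mean(dim=0)

# Example: a CIFAR-10-sized feature map through a 4-path, 3-stage SuperNet.
# net = ParallelSuperNet(channels=16); y = net(torch.randn(2, 16, 32, 32))
```

During the search, the decision variables (`alpha`, `sync_gate`) are trained jointly with the ordinary weights; afterwards each block keeps only its highest-weighted operation, each gate is rounded to a hard decision, and the selected subnetwork is retrained, which matches the abstract's point that only two training runs are needed.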
Full-Text [PDF 1200 kb]
Type of Study: Research | Subject: Paper
Received: 2024/01/23 | Accepted: 2025/03/08 | Published: 2025/06/21 | ePublished: 2025/06/21

Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
