Volume 18, Issue 4 (3-2022) | JSDP 2022, 18(4): 37-48



Pourasghar B, Izadkhah H, Lotfi S, Salehi K. A partition-based algorithm for clustering large-scale software systems. JSDP 2022; 18 (4) : 3
URL: http://jsdp.rcisp.ac.ir/article-1-1028-en.html
Abstract:
Clustering techniques are used to extract the structure of a software system to support understanding, maintenance, and refactoring. Most software clustering approaches proposed in the literature fall into two groups: hierarchical algorithms and search-based techniques. In hierarchical algorithms, clustering proceeds by merging similar clusters or splitting dissimilar ones. These techniques suffer from drawbacks such as the finiteness (stopping) criterion and the arbitrary decisions made during the process. Because clustering software systems is NP-hard, evolutionary and search-based algorithms are used more commonly than hierarchical ones. Evolutionary algorithms treat software clustering as a search over a set of candidate clusterings. Although these algorithms can often recover an appropriate structure of the software, they are not applicable to large-scale software. Furthermore, they cannot take into account the knowledge in the artifact dependency graph, which is extracted from the source code of the software. In a software system, an artifact can be any entity such as a class, a function, or a file. In this paper, a new partition-based clustering algorithm is presented. The algorithm partitions the artifact dependency graph while taking the knowledge it contains into account. Moreover, a new distance criterion is presented to measure the similarity and dissimilarity of artifacts. The proposed algorithm starts from the artifact dependency graph and builds the similarity matrices of the artifacts. It then refines the candidate partition until a fixed point is reached. We expect the proposed method, compared with other methods, to produce a high-quality clustering that is close to the expert's clustering according to the MoJo-FM measure. To demonstrate the applicability and validity of the proposed algorithm, a large-scale case study, Mozilla Firefox, is employed. The results demonstrate that the proposed algorithm outperforms the evolutionary methods commonly used in the literature.
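The workflow outlined in the abstract (dependency graph, then similarity matrix, then refinement to a fixed point) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not define the new distance criterion, so the sketch substitutes Jaccard similarity of neighbourhoods as an assumption, and the names `neighbours`, `similarity_matrix`, `refine`, and `cluster`, as well as the seed-based initialisation, are hypothetical choices made only for illustration.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # artifact -> artifacts it depends on


def neighbours(graph: Graph, node: str) -> Set[str]:
    """Artifacts linked to `node` in either direction, plus `node` itself."""
    outgoing = set(graph.get(node, set()))
    incoming = {src for src, dsts in graph.items() if node in dsts}
    return outgoing | incoming | {node}


def similarity_matrix(graph: Graph, nodes: List[str]) -> Dict[str, Dict[str, float]]:
    """Jaccard similarity of neighbourhoods (ASSUMPTION: a stand-in for the
    paper's distance criterion, which the abstract does not specify)."""
    nbrs = {n: neighbours(graph, n) for n in nodes}
    return {
        a: {b: len(nbrs[a] & nbrs[b]) / len(nbrs[a] | nbrs[b]) for b in nodes}
        for a in nodes
    }


def refine(nodes: List[str], sim: Dict[str, Dict[str, float]],
           seeds: List[str], max_iter: int = 100) -> Dict[str, int]:
    """Reassign artifacts to the most similar cluster until nothing changes."""
    k_range = range(len(seeds))
    # Initial candidate partition: attach each artifact to its most similar seed.
    assignment = {n: max(k_range, key=lambda k: sim[n][seeds[k]]) for n in nodes}
    for _ in range(max_iter):
        clusters = [[n for n in nodes if assignment[n] == k] for k in k_range]
        new_assignment = {}
        for n in nodes:
            def score(k: int) -> float:
                # Mean similarity of n to the other members of cluster k.
                members = [m for m in clusters[k] if m != n]
                if not members:
                    return 0.0
                return sum(sim[n][m] for m in members) / len(members)
            new_assignment[n] = max(k_range, key=score)
        if new_assignment == assignment:  # fixed point reached
            break
        assignment = new_assignment
    return assignment


def cluster(graph: Graph, seeds: List[str]) -> Dict[str, int]:
    """Run the whole sketch: graph -> similarity matrix -> refined partition."""
    nodes = sorted(set(graph) | {d for dsts in graph.values() for d in dsts})
    return refine(nodes, similarity_matrix(graph, nodes), seeds)
```

For example, calling `cluster(graph, seeds=["a", "x"])` on a graph made of two tightly connected groups joined by a single dependency returns a dictionary mapping each artifact to a cluster index. The `max_iter` cap is a safety measure in the sketch, since this simple reassignment rule is not guaranteed to converge in general.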
Article number: 3
Type of Study: Research | Subject: Paper
Received: 2019/07/10 | Accepted: 2020/08/18 | Published: 2022/03/21 | ePublished: 2022/03/21



Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2015 All Rights Reserved | Signal and Data Processing