Volume 17, Issue 4 (2-2021) | JSDP 2021, 17(4): 3-14


Eskandari S. Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features. JSDP 2021; 17(4): 3-14.
URL: http://jsdp.rcisp.ac.ir/article-1-942-en.html
University of Guilan
Abstract:
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. Traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In the OSF setting, the number of training examples is fixed while the number of features grows over time as new features stream in. For instance, in semantic segmentation of images using texture-based features, the number of candidate features can grow without bound.
In these dynamically growing scenarios, a naive approach is to wait until all features become available and then run a traditional feature selection method. However, because optimal decisions are needed at every time step, a more rational approach is to design an online streaming feature selection (OSFS) method that selects the best feature subset from the information seen so far and updates the subset on the fly as new features stream in. Any OSFS method must satisfy three critical conditions: first, it must not require any domain knowledge about the feature space, because the full feature space is unknown or inaccessible; second, it must allow efficient incremental updates of the selected features; third, it must be as accurate as possible at each time instance, so that classification and learning at that instance are reliable.
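As a rough illustration of this protocol (a sketch only, not the paper's algorithm), the following Python fragment shows the generic OSFS loop; feature_stream and evaluate are hypothetical placeholders for the arriving feature sequence and for any subset-quality criterion:

def online_streaming_feature_selection(feature_stream, evaluate):
    """Keep a best-so-far subset while features arrive one at a time."""
    selected = []
    for feature in feature_stream:        # full feature space never needed
        candidate = selected + [feature]
        # Incremental update: admit the new feature only if it improves
        # the subset built from the information seen so far.
        if evaluate(candidate) > evaluate(selected):
            selected = candidate
    return selected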
In this paper, OSFS is approached via the geometric series of the feature adjacency matrix, and a new OSFS algorithm called OSFS-GS is proposed. The algorithm ranks features using path integrals and the concept of centrality on an online feature adjacency graph. Its most appealing characteristics are: (1) all possible subsets of features are considered when evaluating the rank of a given feature; (2) it is extremely efficient, as it reduces feature ranking to computing the geometric series of an adjacency matrix; and (3) besides the selected feature subset, it maintains a redundant feature subset that allows good features to be reconsidered at later time instances.
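The ranking step can be sketched as follows, assuming (as in the infinite feature selection formulation this builds on) that A is a nonnegative feature adjacency matrix whose entry A[i, j] scores the pairwise relation between features i and j; the closed form (I - aA)^{-1} - I of the geometric series holds whenever a is smaller than the reciprocal of the spectral radius of A. The damping constant and the row-sum centrality used here are illustrative assumptions, not the paper's exact choices:

import numpy as np

def geometric_series_rank(A, a=0.9):
    """Rank features by path-based centrality on the adjacency graph A."""
    n = A.shape[0]
    # Scale the damping factor below 1/spectral_radius(A) so that the
    # series sum_{k>=1} (a*A)^k converges.
    a = a / max(np.max(np.abs(np.linalg.eigvals(A))), 1e-12)
    # Closed form of the convergent series: S = (I - a*A)^{-1} - I.
    # S[i, j] aggregates the weights of paths of every length from i to j,
    # which implicitly accounts for all possible feature subsets.
    S = np.linalg.inv(np.eye(n) - a * A) - np.eye(n)
    # A feature's score is the total weight of all paths leaving it.
    return S.sum(axis=1)

Higher scores correspond to more central features; in an online setting, A grows by one row and column as each new feature streams in.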
The proposed algorithm is compared with three state-of-the-art OSFS algorithms: information-investing, fast-OSFS, and OSFSMI. Information-investing is an embedded online feature selection method that treats feature selection as part of the learning process; it selects a new incoming feature if the feature reduces the model entropy by more than the cost of coding the feature. Fast-OSFS is a filter method that gradually builds a Markov blanket of the feature space using causality-based measures; for each new incoming feature, it executes two processes: an online relevance analysis followed by an online redundancy analysis. OSFSMI is similar to fast-OSFS but uses information-theoretic measures for feature analysis.
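A minimal sketch of that two-stage test, with hypothetical predicates standing in for the conditional-independence checks fast-OSFS performs (an outline of the idea, not the published implementation):

def fast_osfs_step(f, selected, is_relevant, is_redundant):
    """Process one arriving feature f against the currently selected set."""
    # Stage 1: online relevance analysis -- discard f outright if it
    # carries no information about the class label given the current set.
    if not is_relevant(f, selected):
        return selected
    selected = selected + [f]
    # Stage 2: online redundancy analysis -- re-examine the enlarged set
    # and drop any feature that the remaining ones render redundant.
    return [g for g in selected if not is_redundant(g, selected)]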
The algorithms are evaluated extensively on eight high-dimensional datasets in terms of compactness, classification accuracy, and run-time. To simulate the OSF scenario, features are presented one at a time. Moreover, to strengthen the comparison, the results are averaged over 30 random streaming orders. Experimental results demonstrate that the OSFS-GS algorithm achieves higher accuracy than the three existing OSFS algorithms.
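A brief sketch of this averaging protocol (run_osfs and accuracy_of are hypothetical stand-ins for an OSFS algorithm and its downstream evaluation):

import random

def averaged_accuracy(features, run_osfs, accuracy_of, trials=30):
    """Average accuracy over several random feature streaming orders."""
    scores = []
    for seed in range(trials):
        order = features[:]                 # fresh copy of the feature list
        random.Random(seed).shuffle(order)  # one random streaming order
        scores.append(accuracy_of(run_osfs(order)))
    return sum(scores) / len(scores)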
Full-Text [PDF 3320 kb]
Type of Study: Research | Subject: Paper
Received: 2018/12/30 | Accepted: 2019/09/02 | Published: 2021/02/22 | ePublished: 2021/02/22



Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2015 All Rights Reserved | Signal and Data Processing