Search published articles


Showing 96 results for reza

Froozan Rashidi, Samad Nejatian, Hamid Parvin, Vahideh Rezaei, Karamolah Bagheri Fard,
Volume 20, Issue 1 (6-2023)
Abstract

Data clustering is one of the main tasks of data mining and is responsible for exploring hidden patterns in unlabeled data. Because of the complexity of the problem and the weakness of basic clustering methods, most current studies are directed towards clustering ensemble methods. Although individual clustering algorithms provide acceptable results for most datasets, the ability of a single clustering algorithm is limited. The main purpose of a clustering ensemble is to find better and more stable results by combining the information and results obtained from several initial clusterings. In this paper, a clustering ensemble-based method is proposed which, like most evidence accumulation methods, has two steps: (1) building a simultaneous participation matrix and (2) determining the final output from that matrix. In the proposed method, additional information is used alongside the clustering of the samples to construct the simultaneous participation matrix. This information can relate to the degree of similarity of the samples, the size of the initial clusters, the degree of stability of the initial clusters, and so on. The clustering problem is defined as an explicit optimization problem by the mixed Gaussian model and is solved using the simulated annealing algorithm. An evolutionary method based on simulated annealing is also presented to determine the final output from the proposed simultaneous participation matrix. The most important part of the evolutionary method is choosing an objective function that guarantees a high-quality final output. The experimental results show that the proposed method outperforms other similar methods on several clustering quality evaluation criteria.
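The first step described above can be illustrated with a minimal sketch. It builds the plain, unweighted form of the matrix (usually called a co-association matrix in the evidence accumulation literature), where each entry is the fraction of base clusterings that place two samples in the same cluster. The abstract says the authors add further information such as cluster size and stability; that weighting is not specified there and is not reproduced here. The function name and the three example partitions are hypothetical.

```python
def co_association(partitions):
    """Return the n x n matrix whose (i, j) entry is the fraction of
    base partitions that put samples i and j in the same cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    # Normalise the co-occurrence counts by the number of partitions.
    return [[c / len(partitions) for c in row] for row in counts]


# Three hypothetical base clusterings of five samples (labels are arbitrary).
partitions = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
C = co_association(partitions)
```

Samples 0 and 1 share a cluster in every partition, so their entry is 1, while samples 0 and 4 never do, so their entry is 0. A consensus step would then treat the matrix as a similarity measure.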
 
Mr. Mohammad Reza Ghaderi, Dr. Vahid Tabataba Vakili, Dr. Mansour Sheikhan,
Volume 20, Issue 2 (9-2023)
Abstract

Wireless sensor networks (WSNs) are now used in a wide variety of applications. The main purpose of these networks is to measure environmental phenomena and to send the readings over multi-hop paths to the sink, where users can exploit them. The most important challenge in WSNs is to minimize the energy drawn from sensor batteries and so increase network lifetime. One of the most important techniques for reducing energy consumption in WSNs is compressive sensing (CS). By reducing data transmission in the network, CS lowers energy consumption and increases network lifetime. Using CS in a WSN produces different models of CS signals, based on spatial, temporal and spatio-temporal sensor readings. To overcome the challenge of energy consumption, it is also essential to identify exactly where energy is consumed in the network.
Energy consumption in a sensor node can be divided into two parts: (a) the energy used for computing; and (b) the energy consumed by the communication. The energy used for the computing consists of three components: 1. sensor energy consumption (data reading), 2. background energy consumption, and 3. energy consumption for processing. The power consumption of the communication includes the following: 1. energy consumption for data transmission; 2. energy consumption for data receiving; 3. energy consumption for sending messages; and 4. energy consumption for receiving messages. Hence, the existence of a model for analyzing energy consumption in a CS-based WSN is necessary. Several models have been developed to analyze energy consumption in a WSN, but there is not a complete model for analyzing energy consumption in a CS-based WSN.
In this paper, we study all of the energy consumption components mentioned above in a CS-based WSN and present a complete model for energy consumption analysis. To evaluate the proposed model, we use it to analyze energy consumption in the compressive data gathering technique, which is a CS-based data aggregation method. Using this model can optimize the design of energy-efficiency improvement approaches for CS-based WSNs.
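A small sketch can show why compressive data gathering reduces the transmission component of the energy budget. It assumes a simple chain of nodes relaying towards the sink, which is a topology chosen here for illustration and not taken from the abstract. Without CS, the i-th node forwards its own reading plus everything received upstream. With compressive data gathering, every node sends a fixed number of weighted sums. The sketch counts transmissions only and ignores the sensing, background, processing and reception components that the authors' model also covers.

```python
def baseline_transmissions(n_nodes):
    """Chain without CS: the i-th node relays i readings (its own plus upstream)."""
    return sum(range(1, n_nodes + 1))


def cdg_transmissions(n_nodes, n_measurements):
    """Chain with compressive data gathering: every node sends n_measurements sums."""
    return n_nodes * n_measurements


# Hypothetical network: 100 nodes, 20 CS measurements per gathering round.
plain = baseline_transmissions(100)
compressed = cdg_transmissions(100, 20)
saving = 1 - compressed / plain
```

Under these assumed numbers, the chain needs 5050 transmissions without CS and 2000 with it. Nodes far from the sink transmit more than they would without CS, which is one reason a full energy model is needed.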
Ms. Elham Hamedi, Dr. Mitra Mirzarezaee,
Volume 20, Issue 2 (9-2023)
Abstract

Nowadays, we are witnessing financial markets becoming more competitive, and banks are facing many challenges to attract more deposits from depositors and increase their fee income. Meanwhile, many banks use performance-based incentive plans to encourage their employees to achieve their short-term goals. In the meantime, fairness in the payment of bonuses is one of the important challenges of banks, because not paying attention to this issue can become a factor that destroys the motivation among employees and prevents the bank from achieving its short-term and mid-term goals. This article is trying to tackle the problem of optimizing the coefficients of branch performance evaluation indicators based on their business environment in one of the state banks of Iran. In this article, a two-objective genetic algorithm is proposed to solve the problem.
This article comprises four main sections. The first section defines the problem, that is, what we mean by optimizing the importance coefficients of branches based on their business environment. The second section presents our proposed solution to this problem. The third section compares the performance of the proposed two-objective genetic algorithm on the defined problem with that of four well-known multi-objective algorithms: NSGAII, SPEAII, PESAII, and MOEA/D. Finally, the ZDT problems, a standard set of multi-objective test problems, are used to evaluate the general performance of the proposed algorithm against the same four algorithms.
Our proposed solution for solving the problem of optimizing branch performance coefficients includes two main steps. First, identifying the business environment of the branches and second, optimizing the coefficients with the proposed two-objective genetic algorithm. In the first step, the k-means clustering algorithm is applied to cluster branches with similar business environments. In the second step, to optimize the coefficients, it is necessary to specify the fitness functions. The defined problem is a two-objective problem, the first objective is to minimize the deviation of the real performance of the branches from the expected performance of them, and the second objective is to minimize the deviation of the coefficients from the coefficients determined by the experts. To solve this two-objective problem, a two-objective genetic algorithm is proposed.
In this article, two approaches are adopted to compare the proposed solution performance. In the first stage, the results of applying the proposed two-objective genetic algorithm have been compared with the results of applying four well-known multi-objective genetic algorithms on the problem of optimizing the coefficients. The results of this comparison show that the proposed algorithm has outperformed the other compared methods based on the S indicator and run time, and it is also ranked second after the NSGAII algorithm in terms of the HV indicator.
Finally, to compare the proposed algorithm with other well-known methods, the ZDT problem set (ZDT1, ZDT2, ZDT3, ZDT4, and ZDT6) has also been considered. At this stage, the proposed algorithm has been compared with the four algorithms mentioned above on four key indicators: GD, S, H, and run time. The results show that the proposed algorithm is significantly faster than the others on all five ZDT problems. In terms of the GD indicator, the proposed algorithm ranks first or second among all considered algorithms. In terms of the S and H indicators, the proposed algorithm outperforms the other well-known algorithms in many cases.
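All of the algorithms compared above rely on Pareto dominance to rank candidate solutions. The sketch below illustrates that general idea for a two-objective minimization problem like the one described (deviation from expected performance, deviation from expert coefficients). It is a generic non-dominated filter, not the authors' genetic algorithm, and the candidate values are invented.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (performance deviation, expert-coefficient deviation) pairs.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(candidates)
```

Here (3.0, 4.0) and (2.5, 3.5) are both dominated by (2.0, 3.0), so the front keeps the remaining three trade-off solutions.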
Miss Seyedeh Zohreh Hosseini, Phd Reza Radfar, Phd Amirashkan Nasiripour, Phd Ali Rajabzadeh Ghatary,
Volume 20, Issue 3 (12-2023)
Abstract

Although the development of information technology and its use in the health system have enabled many measures to protect and promote human health, the world still faces long-term threats from, and the recurrence of, infectious diseases.
Understanding the dynamics of infectious diseases is important for controlling them, because the network and mode of impact of infectious diseases are very complex. The management of infectious diseases can also be considered a complex social system because of its many complexities (such as dimensions, parameters, interactions, behaviors and rules). For this reason, the present study takes a multifaceted approach to understanding the spread of infectious diseases. To design the model, an intelligent system combining mathematical, machine learning and epidemiological dimensions is proposed.
The disease studied in this study, due to its importance and prevalence, is Covid 19.
In this study, with the approach of complex systems and using the Internet of Things and machine learning methods, an algorithm was presented that uses environmental and individual variables to predict the probability of disease in an individual. Therefore, this research can improve the prevention of infectious diseases by filling some of the gaps in 3 sections: 1- Re-emergence of infectious diseases and the potential of IoT and AI, 2- Speed of dissemination and importance of real-time tracking, and 3- Budget and cost.
The evaluation of the algorithm in this study was determined by two criteria of sensitivity and specificity.
The proposed algorithm predicted Covid-19 with an accuracy of more than 98%. A sensitivity above 98% was also obtained, which is very important for the diagnosis of Covid-19 because it indicates a low number of false negatives in the test results.
Therefore, the proposed model, combined with the Internet of Things and machine learning, can cause early diagnosis and prevent the spread of the Covid-19 disease with high specificity and sensitivity.
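The two evaluation criteria named above have standard definitions, which the following sketch computes from binary labels. The label vectors are made up for illustration and are unrelated to the study's data.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = diseased)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


# Hypothetical ground truth and predictions for ten individuals.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)   # share of diseased cases that are detected
specificity = tn / (tn + fp)   # share of healthy cases correctly cleared
accuracy = (tp + tn) / len(y_true)
```

High sensitivity corresponds to few false negatives (`fn`), which is why the abstract emphasizes it for diagnosis.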
Miss Soheila Rezaei, Dr Hossein Ghayoumi Zadeh, Dr Mohammad Hossein Gholizadeh, Dr Ali Fayazi,
Volume 20, Issue 3 (12-2023)
Abstract

Predicting the time it takes for an event of interest to occur, based on available information, helps in deciding how to deal with the event or how to prevent it. In medicine, predicting the time to an event from information recorded from patients yields valuable information for evaluating treatments, estimating prognosis, and planning how to handle the event. Many statistical solutions have been proposed for predicting the time to an event, and the most established is survival analysis. The purpose of survival analysis is to predict the time at which an event occurs and to model the parameters that affect this time, so that problematic factors can be controlled or eliminated. Because of the importance and prevalence of breast cancer as the second leading cause of death among cancer patients in the world, access to models that can accurately predict the survival of breast cancer patients is very important. The present study is an analytical study. The data are taken from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) database, which concerns the molecular classification of breast cancer patients. The total number of patients studied was 1981. Of these, 888 patients were followed until the time of death, and the rest left the study before its end. The database records 21 clinical features per patient: 6 quantitative and 15 qualitative. To predict survival, a deep neural network model called the optimized DeepHit is used. The optimized model achieves a concordance index (c-index) of 0.73, a standard measure of the capability of survival analysis models.
Comparisons with previous models based on real and synthetic datasets show that the optimized DeepHit has achieved great performance and statistically significant improvements over previous advanced methods.
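For readers unfamiliar with the concordance index, the sketch below computes Harrell's basic version: among all comparable patient pairs, it measures the fraction in which the patient who experienced the event earlier was assigned the higher risk. This is the textbook definition and may differ from the exact variant the authors report (DeepHit is often evaluated with a time-dependent form). The four records are invented.

```python
def concordance_index(times, events, risks):
    """Harrell's c-index. events[i] is 1 if the event was observed, 0 if censored.
    Assumes at least one comparable pair exists."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i had the event before j's observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    return concordant / comparable


# Hypothetical follow-up times, event flags, and model risk scores.
times = [2, 4, 6, 8]
events = [1, 0, 1, 1]
risks = [0.9, 0.3, 0.5, 0.6]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.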
Mr Ali Derogarmoghadam, Dr Mohammad Reza Karami Molaei, Dr Mohammadreza Hassanzadeh,
Volume 20, Issue 3 (12-2023)
Abstract

Background: In recent years, convolutional neural networks (CNNs) have been increasingly used in various applications of machine vision. CNNs simulate the function of the brain's visual cortex and have a powerful structure for analyzing visual images. However, the diversity of digital images, their content, and their features necessitate that CNN networks are specially designed, and their parameters are carefully adjusted to achieve higher efficiency in any classification problem. In this regard, in many previous studies, researchers have attempted to increase the efficiency of the CNNs by setting their adjustable parameters more accurately.
New method: In this study, we presented a novel initialization method for the kernels of the first convolutional layer of CNNs. We designed a filter bank with specialized kernels and used them in the first convolution layer of the proposed models. Compared to the random kernels in traditional CNNs, these kernels extract more effective features from the input images without increasing the computational cost of the network, and improve the classification accuracy by covering all the important characteristics.
Results: The dataset used in this paper was the MNIST database of handwritten digits. We examined the performance of CNN networks when three different types of kernels were used in their first convolution layer. The first group of kernels had constant coefficients; the second group had random coefficients, and finally, the kernels of the third group were specially designed to extract a wide range of image features. Our experiments on a single-layer CNN network with three types of kernels (constant numbers, random numbers, and filter-bank) showed the average classification accuracy of MNIST images in 50 times of network training to be 74.94%, 86.47%, and 91.89%, respectively, and for a three-layer CNN network, 88.82%, 96.16%, and 99.14%, respectively.
Comparison with existing methods: Compared to the kernels with randomized coefficients, the use of specialized kernels in the first convolution layer of the CNN networks has several important advantages: 1) They can be designed to extract all important features of the input images, 2) They can be designed more effectively based on the problem in hand, 3) They cause the training to start from a more appropriate point, and in this way, the speed of training and the classification accuracy of the network increase.
Conclusion: This study provides a novel method for initializing kernels in convolution layers of CNN networks to enhance their performance in image classification works. Our results show that compared to random kernels, the kernels used in the proposed models extract more effective features from the images at different frequencies and increase the classification accuracy by starting the training algorithm from a more appropriate point, without increasing the computational cost. Therefore, it can be concluded that the initial coefficients of the convolution layer kernels are effective on the classification accuracy of CNN networks, and by using more effective kernels in the convolution layers, these networks can be made specific to the problem and, in this way, increase the efficiency of the network.
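The idea of starting a first convolution layer from designed kernels can be illustrated with a toy example. The abstract does not list the authors' filter bank, so two Sobel edge kernels stand in for it here purely as an assumption. The helper applies a kernel with "valid" cross-correlation, which is the operation a convolution layer computes before any training.

```python
# Stand-in filter bank (assumption: the paper's actual kernels are not given).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges


def correlate2d(image, kernel):
    """'Valid' 2-D cross-correlation of a nested-list image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


# A 4x4 toy image containing a single vertical edge.
image = [[0, 0, 1, 1]] * 4
response_x = correlate2d(image, SOBEL_X)
response_y = correlate2d(image, SOBEL_Y)
```

The vertical-edge kernel responds strongly to this image while the horizontal-edge kernel returns zeros, which is the sense in which a designed kernel extracts a specific feature from the first training step onward.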
 
Mrs Mahdieh Vahedipoor, Dr Mahboubeh Shamsi, Dr Abdolreza Rasouli Kenari,
Volume 20, Issue 4 (3-2024)
Abstract

In recent years, the massive growth of user-generated content on social networks and online marketing sites has allowed people to share their feelings and opinions about different products and services. Sentiment analysis is an important aid to better decision-making; it uses natural language processing (NLP), computational methods, and text analysis to extract the polarity of unstructured documents. The complexity of human languages and of sentiment analysis has created a challenging research context in computer science and computational linguistics. Many researchers have used supervised machine learning algorithms such as Naïve Bayes (NB), Stochastic Gradient Descent (SGD), Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF), and deep learning algorithms such as Convolution Neural Network (CNN) and Long Short-Term Memory (LSTM). Some researchers have used dictionary-based methods. Despite the existence of effective techniques in text mining, there are still unresolved challenges. User comments are unstructured texts. To structure the textual inputs, parsing is therefore usually done together with adding some features and linguistic interpretations, removing unnecessary items, and inserting the resulting terms in the database; patterns are then extracted from the structured data, and finally the outputs are evaluated and interpreted. Data imbalance, where the number of samples differs between the classes of a dataset, is an important challenge in the learning phase. It degrades classifier performance because the machine does not learn the features of the under-represented classes well. In this paper, words are weighted according to a prescribed dictionary so that the most important words have more influence on the result of opinion mining. In addition, combining adjacent words using n-gram methods improves the outcome.
The dictionaries are highly dependent on the application domain. Some words that are important in one application carry little weight in mobile phone comments. Another challenge is unbalanced training data, in which the number of positive sentences is not equal to the number of negative sentences. In this paper, two ideas are applied to build an efficient opinion mining algorithm. First, we build a precise dictionary for Persian mobile phone comments; second, we balance the positive and negative comments in the training data. In summary, the main achievements of this research are: creating a comprehensive weighted dictionary in the field of mobile phone opinions to increase the accuracy of opinion analysis; balancing positive and negative opinions to improve accuracy and eliminate the negative effect of overfitting; and providing a precise approach to determining the polarity of users' opinions about mobile phones using machine learning and recurrent deep learning algorithms. The method is applied to mobile phone products from the Digikala site and to the Senti-Pers data. It is evaluated with Naive Bayes, Support Vector Machine, Stochastic Gradient Descent, Logistic Regression, Random Forest, and deep learning methods such as Convolutional Neural Network and Long Short-Term Memory, using Accuracy, Precision, Recall, and F-Measure. The proposed method increases accuracy on Digikala by 10% to 34% with NB, 5% to 24% with SVM, 7% to 38% with SGD, 5% to 38% with LR, 4% to 22% with RF, and 4% with CNN. On Senti-Pers, accuracy increases by 12% to 46% with NB, 5% to 46% with SVM, 5% to 35% with SGD, 6% to 46% with LR, and 4% to 46% with RF.
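The abstract does not say how the positive and negative comments are balanced. One common option, shown here only as an assumption, is random oversampling: comments from the smaller class are duplicated at random until every class has as many examples as the largest one. The function name and the toy comments are hypothetical.

```python
import random


def oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes
    match the size of the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical training comments: four positive, one negative.
comments = ["great screen", "fast phone", "love it", "good battery", "poor camera"]
polarity = ["pos", "pos", "pos", "pos", "neg"]
balanced_comments, balanced_polarity = oversample(comments, polarity)
```

Oversampling should be applied to the training split only, so that duplicated comments do not leak into the evaluation data.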
 

Dr Samad Mohammadi, Dr Vahe Aghazarian, Dr Alireza Hedayati,
Volume 20, Issue 4 (3-2024)
Abstract

Movie recommendation systems are efficient tools that help users find relevant movies by analyzing their previous interests. These systems use the ratings that users gave to movies in the past to predict their future interests. However, users usually provide too few ratings, which leads to a problem called data sparsity and reduces the effectiveness of movie recommendation systems. On the other hand, other available data, such as movie genres and user demographic information, can help recommenders produce better recommendations. This paper proposes a movie recommendation method that uses the movies’ genres and users’ demographic information. In particular, we propose an effective model to evaluate a user’s rating profile and determine the minimum number of ratings required to produce an accurate prediction. Appropriate virtual ratings are then incorporated into profiles with insufficient ratings to expand them. These virtual ratings are calculated using similarity values between users, obtained from movie genres and user demographic information. Furthermore, an effective measure is introduced to determine how reliable an item is; this measure guarantees the reliability of the virtual ratings. Finally, unknown ratings for the target user are predicted based on the expanded rating profiles. Experiments performed on two well-known movie recommendation datasets demonstrate that the proposed approach is more efficient than the other recommenders compared.
We propose a movie recommender system in this paper by employing the genres of movies and demographic information of users to address the above-mentioned challenges. To this end, first of all, a model is developed in order to determine whether the target user’s rating profile is appropriate to produce accurate recommendations or not. In other words, the developed model determines how many ratings are required for each user to generate an accurate prediction with a high probability. This criterion is used to demonstrate that a rating profile contains sufficient ratings for producing reliable recommendations or not. Then, the quality of rating profiles containing insufficient ratings is boosted using an effective profile expansion technique which incorporates some virtual ratings to these profiles. These virtual ratings are calculated using the similarity values between users which are computed according to the genres of movies and demographic information of users. Moreover, the reliability values of users and items are calculated using appropriate reliability measurements to guarantee that the incorporated virtual ratings are reliable. Experimental results on two movie recommendation datasets indicate the superiority of the proposed approach in respect to other models. In the following, we provide a list of the main contributions of this paper:
  • We develop a model in order to evaluate the users’ rating profiles and determine how many ratings are required for generating an accurate prediction.
  • We propose a powerful profile expansion technique which incorporates some virtual ratings to user-item ratings matrix for improving its quality.
  • Movies’ genres and users’ demographic information are used as additional data in the proposed movie recommender system.
  • The reliability measures of users and items are used in the proposed method to guarantee the reliability of calculated virtual ratings.
  • The proposed method generates a denser user-item ratings matrix than the original matrix which results in alleviating data sparsity problem significantly.   
The remaining parts of this paper are structured as follows: in section 2, related works are investigated, section 3 includes the details of the proposed method, section 4 refers to the discussion of experimental results, and section 5 provides some conclusions about the paper.

Mr Amir Khoshkbarchi, Dr Hamid Reza Shahriari,
Volume 20, Issue 4 (3-2024)
Abstract

Trust management systems are used in interactive environments, where an agent needs to make a decision about using a service. Due to the preponderance of these systems, malicious entities have strong incentives to influence trust management systems and divert their decisions. In spite of approaches presented in previous trust models to mitigate the malicious activities, many of them could not cope with the problem efficiently. For example, tackling the variable behavior of agents is a common failure point for many trust models. Moreover, no rigid, flexible and adaptive general approach has been presented and the problem somehow remains.
This paper presents a novel approach to prevent malicious actions and identify anomalies using an entropy-based trust management system. The system is capable of recognizing the intrinsic characteristics of actions, determining whether they are malicious or not. To achieve this, the information environment is divided into four main parts based on entropy changes. Trust calculation in this system relies on an entropy structure derived from information ethics theory. To enhance the system’s resistance and resilience against malicious behavior, it is important to understand the nature of the actions performed by the agents. To accomplish this, we define the patterns of entropy changes for the four parts and use these patterns to identify and refine the nature of actions as good, bad, or insignificant. The simulation-based experimental results indicate that the proposed system shows promising performance in terms of accurately calculating trust and detecting malicious behavior. Specifically, the proposed system exhibits a 10 percent advantage over well-known trust systems with regards to swiftly adapting to environmental changes and diverse agent behaviors. Moreover, the observed experiments have displayed a notable trend in the fluctuation of good, bad, and insignificant actions. The results indicate a consistent increase in the number of good actions and a corresponding decrease in bad actions. Put simply, the method demonstrates improvement over time through repeated system implementations. This improvement can be attributed to the agents’ heightened honesty as they gain a better understanding of the nature of their actions. Additionally, the provision of feedback on their behavior plays a pivotal role in reinforcing more accurate decision-making within the system.

Dr. Elham Parvinnia, Mr. Mohammad Safari, Mr Seyedalireza Khayami,
Volume 21, Issue 1 (6-2024)
Abstract

To protect rotating machines and prevent them from operating in abnormal conditions, protective control systems and process data are traditionally used. In this article, a method is proposed to detect the indirect effects of abnormal operating modes using data mining methods. One of the dangerous abnormal operating conditions in compressors, which are among the important rotating machines in industry, is surge. This article uses real data recorded over three years from a three-stage refrigerant compressor in a gas refinery. The relationship between the surge state of the compressor and the amount of vibration in its different parts is investigated. Data mining methods show that there is a direct relationship between surge and the amount of vibration. The points that are most sensitive to vibration during surges are also identified, and it is shown that surges can be detected by measuring these points. Therefore, in addition to the traditional methods that use process data, the vibration at these points can be used as a supplementary protection system for surge detection. In this way, the compressor can be better protected against surge. In this study, various data mining methods are evaluated; the nearest neighbor method with two neighbors performs best. The effect of the number of records in the dataset on the quality and accuracy of the results is also investigated.
Mahdi Ahmadnia, Mojtaba Maghrebi, Reza Ghanbari,
Volume 21, Issue 2 (10-2024)
Abstract

Low-light images often suffer from low brightness and contrast, which makes some scene details hard to see. This can affect the performance of many computer vision tasks, such as object recognition, tracking, scene understanding, and occlusion detection. Therefore, it is important and useful to enhance low-light images. One technique to enhance low-light images is based on the Retinex theory, which decomposes images into two components: reflection and illumination. Several mathematical models have been recently developed to estimate the illumination map using this theory. These methods first compute an initial illumination map and then refine it by solving a mathematical model.
This paper introduces a novel method based on the Retinex theory to estimate the illumination map. The proposed method employs a new mathematical model with a differentiable objective function, unlike other similar models. This allows us to use more diverse methods to solve the proposed model, as classical optimization methods such as Newton, Gradient, and Trust-Region methods need the objective function to be differentiable. The proposed model also has linear constraints and is convex, which are desirable properties for optimization. We use the CPLEX solver to solve the proposed model, as it performs well and exploits the features of the model. Finally, we improve the illumination map obtained from the mathematical model using a simple linear transformation.
This paper introduces a new method based on the Retinex theory for enhancing low-light images. The proposed method improves the illumination and the visibility of the scene details. We compare the performance of our method with six existing methods: AMSR, NPE, SRIE, DONG, MF, and LIME. We use four common metrics to evaluate the visual quality of the enhanced images: AMBE, LOE, SSIM, and NIQE. The results demonstrate that our method is competitive with many of the state-of-the-art methods for low-light image enhancement.
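The Retinex decomposition mentioned above treats each observed pixel as reflectance multiplied by illumination. The sketch below shows only the generic pipeline for one pixel: take the maximum colour channel as the initial illumination estimate (the choice used by LIME, one of the compared methods), then divide the pixel by a gamma-adjusted illumination. The authors' convex refinement model solved with CPLEX is not reproduced, and `gamma` and `eps` are illustrative values.

```python
def initial_illumination(pixel):
    """Initial illumination estimate: the largest of the R, G, B values."""
    return max(pixel)


def enhance(pixel, gamma=0.8, eps=1e-3):
    """Recover a brighter pixel by dividing by illumination ** gamma.
    Channel values are assumed to lie in [0, 1]."""
    illumination = max(initial_illumination(pixel), eps)  # avoid division by zero
    return tuple(min(1.0, channel / illumination ** gamma) for channel in pixel)


# A hypothetical dark pixel in normalised RGB.
dark = (0.2, 0.1, 0.05)
bright = enhance(dark)
```

Because all three channels are divided by the same value, the ratios between them are preserved, so the pixel becomes brighter without a colour shift unless clipping at 1.0 occurs.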

Navid Samimi, Samad Nejatian, Hamid Parvin, Karamolah Bagheri Fard, Vahideh Rezaei,
Volume 21, Issue 3 (12-2024)
Abstract

Clustering is one of the fundamental tools in data analysis and data mining, enabling the extraction of hidden and meaningful structures from large datasets by grouping data based on intrinsic similarities. However, selecting optimal clusters in conventional clustering algorithms poses challenges, especially when clusters are dense or heterogeneous. In this study, a novel genetic algorithm-based method is proposed to identify the most stable clusters in ensemble clustering. By leveraging cluster stability criteria and a correlation matrix, the proposed approach improves the accuracy and stability of the final clustering results. The proposed method involves generating initial partitions of the data using six different clustering algorithms. Next, the Fisher criterion is applied to identify more stable clusters. These selected clusters are then evaluated and optimized using a genetic algorithm to construct an optimized correlation matrix. This matrix is subsequently fed into a hierarchical clustering algorithm, which produces the final consensus clustering. The proposed method was tested on standard datasets. Results demonstrated improvements of 12% and 5% in NMI and ARI metrics, respectively, compared to previous methods. The use of a genetic algorithm enabled the identification of clusters with higher stability and diversity, reducing the impact of noise and increasing the accuracy of the final clustering. Moreover, the method outperformed individual base clustering algorithms in providing more precise clustering results. Due to its ability to enhance the accuracy and stability of clustering, the proposed method holds potential for applications in domains such as big data analysis, machine learning, and information retrieval. The use of the Fisher criterion for selecting stable clusters and genetic algorithms for optimization are among the strengths of this research. 
This method not only preserves diversity among clusters but also significantly enhances clustering accuracy. Future studies could explore the combination of this approach with more advanced algorithms to assess its applicability to more complex datasets.
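The matrix that feeds the final hierarchical clustering step can be illustrated with a plain co-association matrix, in which each entry is the fraction of base partitions that place two samples in the same cluster. This is only a sketch of the general evidence-accumulation idea: the Fisher-criterion cluster selection and the genetic-algorithm optimisation described above are omitted, and the function name is hypothetical.

```python
def co_association(partitions):
    """Fraction of base partitions in which samples i and j share a cluster.

    `partitions` is a list of label lists, one per base clustering,
    all of the same length (one label per sample).
    """
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    k = len(partitions)
    return [[value / k for value in row] for row in counts]
```

With the three partitions [0, 0, 1, 1], [0, 0, 0, 1], and [1, 1, 0, 0], samples 0 and 1 always co-occur (entry 1.0), while samples 2 and 3 co-occur in two of the three partitions (entry 2/3).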

Mehdi Naghavi, Mohmmad Reza Hassani Ahangar, Ali Amiri Jezeh,
Volume 21, Issue 3 (12-2024)
Abstract

Named Entity Recognition (NER) has emerged as a critical and highly applicable task in the field of Natural Language Processing (NLP). Its significance stems from its essential role in numerous NLP applications, such as machine translation, question answering, text summarization, and information extraction. Recent studies highlight the substantial impact of advancements in Artificial Intelligence (AI), particularly Deep Neural Networks (DNNs), on improving the performance of NER systems.
Deep Neural Networks, with their ability to learn complex patterns and extract rich features, have opened new horizons in addressing NLP challenges. These methods leverage advanced language models like BERT and GPT to enable deeper comprehension of linguistic structures and semantic relationships. One of their prominent capabilities is to capture long-term dependencies in complex sentences while reducing the reliance on manually engineered features.
This research introduces a novel hybrid approach for Named Entity Recognition in both Persian and English languages, based on deep neural networks and semantic language models. To address the dependency on large datasets, the proposed method employs an iterative logic mechanism that facilitates effective learning with limited data. The proposed system was evaluated on three datasets: the CoNLL 2003 dataset for English and two Persian datasets, Arman and Peyma.
Experimental results demonstrate that the proposed method achieves F1-scores of 95.3, 96.32, and 94.72 on the CoNLL, Arman, and Peyma datasets, respectively. These scores reflect significant improvements over previous methods.
The findings of this study suggest that combining advanced language models with deep neural networks can significantly enhance the accuracy and efficiency of NER systems. These achievements pave the way for developing effective NLP tools for low-resource languages, particularly Persian, and enable the application of this technology in both industrial and research contexts.

Mr Mohammad Rastgoo, Phd Hamid Reza Ghaffari,
Volume 21, Issue 4 (3-2025)
Abstract

The potential of social networks to extract valuable insights into user behavior has become a focal point of research. With the proliferation of social media platforms, people are increasingly sharing their experiences online. This wealth of user-generated data provides unique opportunities to understand movement patterns and predict future behavior. Location-based social networks like Foursquare exemplify this, allowing users to check in at various locations and enabling researchers to analyze these data points. By analyzing the data collected from these platforms, we can uncover patterns in user behavior, such as frequently visited locations and the factors influencing these choices. This information can be invaluable for businesses and urban planners. To improve the accuracy of predicting a user's next location, this study focuses on identifying the most influential friends or individuals in a user's social network. Factors such as the strength of these relationships, historical visit data, and temporal-spatial characteristics are considered. Additionally, the study emphasizes the importance of data quality, focusing on locations that have been visited more than 100 times to ensure reliability.
A key aspect of this research is understanding the influence of social connections on individual behavior. By analyzing the overlap in visited locations between friends, the study aims to identify the most influential friends for each user. These influential friends are then used to predict the user's next location.
The proposed method employs machine learning techniques, specifically RandomForest and recurrent neural networks (LSTM, RNN, and GRU), to predict user behavior. RandomForest is used to analyze the data and identify the most significant features, while recurrent neural networks are employed to model the sequential nature of user behavior. Among these, LSTM achieved the highest accuracy of 71% in predicting users' next locations. This research demonstrates that combining artificial intelligence with spatial-temporal data can provide profound insights into human behavior in urban and digital environments. By understanding these patterns, businesses can tailor their offerings to individual customers, and urban planners can design more efficient and user-friendly cities.
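The idea of ranking friends by the overlap in visited locations can be sketched as follows. The Jaccard index is used here as one plausible overlap measure, which is an assumption: the abstract does not state which measure the study uses, and the function names and sample data are hypothetical.

```python
def overlap(visits_a, visits_b):
    """Jaccard overlap between two collections of visited location IDs."""
    a, b = set(visits_a), set(visits_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_influential(user_visits, friends_visits, top_k=1):
    """Return the `top_k` friend names with the highest location overlap.

    `friends_visits` maps a friend's name to that friend's visited locations.
    """
    ranked = sorted(
        friends_visits.items(),
        key=lambda item: overlap(user_visits, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]
```

For a user who visited a cafe, a gym, and a park, a friend who visited all three plus a mall (overlap 0.75) ranks above a friend who visited only the cafe and the gym (overlap about 0.67).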


Zahra Firuz Mahjanabadi, Pouria Jafari, Mehdi Rezaei,
Volume 22, Issue 1 (5-2025)
Abstract

The properties of steels are intrinsically dependent on their microstructural components, known as phases, which form during the manufacturing process. Different steel phases can be observed in microscopic images of steel surfaces. Automatic detection and classification of these phases from images can significantly enhance the understanding of steel properties with improved speed and accuracy. This paper introduces, for the first time, an intelligent and automated method for classifying steel phases from microscopic images. This process requires defining and extracting suitable texture features unique to these images and segmenting the images into highly irregular regions based on the extracted features. To achieve this, the input image is initially divided into blocks, and texture features are extracted independently for each block. The dimensionality of these features is then reduced using Principal Component Analysis, and the refined features are subsequently fed into a Softmax neural network for classification.
The implementation results indicate that the proposed method achieves an accuracy of over 99% in distinguishing between two phases: acicular ferrite and granular ferrite. Furthermore, it attains an accuracy exceeding 86% when classifying three phases: granular ferrite, acicular ferrite, and Widmanstätten ferrite. By contrast, the widely used k-means clustering method, a traditional machine learning approach, could not effectively distinguish microscopic steel phase blocks using the extracted texture features. Notably, as of the writing of this paper, no prior research has been conducted on the automatic classification of different ferrite phases, making this study a novel contribution to the field.
In this research, an automated classification algorithm for ferrite phase structures in SEM images of steel is proposed using texture feature extraction methods and machine learning models. The dataset comprises images of 1024×768 resolution, which were divided into 128×128 blocks, with classification performed independently for each block. Due to the limited number of blocks available for training machine learning models, data augmentation techniques such as rotation and scaling were applied to increase the dataset size. Various image processing methods were used to extract 128 texture features. These extracted features were then used to classify different ferrite phases using two machine learning models: k-means clustering and the Softmax neural network. Additionally, PCA was employed to reduce feature dimensionality, which positively impacted the classification of granular and acicular ferrite. While k-means clustering, as a conventional and widely used machine learning method, failed to achieve satisfactory classification accuracy, the proposed approach using a Softmax neural network demonstrated exceptional performance. Despite the complex and irregular nature of ferrite shapes, the selected features and the proposed algorithm successfully achieved over 99% accuracy for two-phase classification and over 86% accuracy for three-phase classification.
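The block division and the Softmax output can be illustrated with a short sketch. It assumes non-overlapping blocks, which the abstract does not state explicitly, and the function names are hypothetical. Texture feature extraction and PCA are omitted.

```python
import math


def block_origins(width, height, block=128):
    """Top-left corners of non-overlapping block x block tiles in an image."""
    return [
        (x, y)
        for y in range(0, height - block + 1, block)
        for x in range(0, width - block + 1, block)
    ]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Under the non-overlapping assumption, a 1024×768 image yields 8 × 6 = 48 blocks of 128×128 pixels, each of which would be classified independently.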

Pardis Moradbeiki, Alireza Basiri,
Volume 22, Issue 1 (5-2025)
Abstract

The aim of the request recognition task in social networks is to understand the intent behind the posts, comments, or messages shared by users. Many businesses are actively present on various social networks, making it crucial to identify user needs for marketers in this space to foster the growth of online businesses and e-commerce. Detecting request messages automatically and filtering them is essential. However, social network messages often contain slang and numerous spelling errors, posing challenges for research in this domain. While extensive research has been conducted in English, studies on this task in Persian are limited. Telegram stands out as the most popular social network in Iran, with a large Persian-speaking user base. This study utilized a standard labeled Persian dataset from Telegram for training and testing purposes, comprising 85741 messages from the platform, evenly split between request and non-request categories. To tackle the significant challenges posed by sarcastic messages and spelling mistakes on social media platforms, we devised a multi-step hybrid strategy.
The initial step involves preprocessing. Social media data typically consists of unstructured and slang-ridden user messages, necessitating preprocessing to enhance Persian text processing and reduce slang usage. The preprocessing phase is crucial when dealing with social media platforms. Because Telegram differs from other platforms, its data cleaning process also differs. This study's contributions include developing a unique dataset and filtering out Telegram-specific noise, which improves the preprocessing phase. Preprocessing also involves normalizing different word forms, such as "beautiful" and "beauty," to maintain the integrity of word meanings.
The subsequent step focuses on feature extraction. Each approach to feature extraction has its own advantages and drawbacks, so we employed hybrid feature extraction methods. While TF-IDF methods assess word importance without considering meaning, FastText retains semantic similarity. By combining the bag-of-words and FastText methods, our research aims to enhance accuracy. The final step involves classification, where deep learning networks are utilized to evaluate these features.
Experimental findings indicate that our final model achieves precision, recall, and f-score rates of nearly 90%, representing a 5% improvement on average compared to previous methodologies.

Ahmadreza Zarei, Dr. Payman Moallem,
Volume 22, Issue 3 (12-2025)
Abstract

Matching remote sensing images is a fundamental step in many image processing applications. Unlike regular images, remote sensing images often undergo complex and nonlinear background changes, making them difficult to match. They also pose challenges such as scale variations, rotation, and different viewing angles. One commonly used method for finding corresponding points between images is the Scale-Invariant Feature Transform (SIFT) algorithm; however, it often produces many incorrect matches when applied to such data. In contrast, deep learning-based approaches can extract and compare medium and high-level features for more accurate matching. Inspired by these advances, this work introduces a method that combines the SIFT algorithm with a Siamese deep neural network to improve the matching of remote sensing images.
The proposed method modifies the conventional SIFT by adjusting its parameters to increase the proportion of correct to incorrect correspondences. After keypoints are extracted and described, initial correspondences are established. Then, for each matched point, a local patch is extracted based on the keypoint’s position, scale, and orientation. These patch pairs are input to a trained Siamese network that estimates the probability of a correct match. Matches with confidence below a threshold are rejected. This hybrid approach leverages the strengths of both traditional and deep learning-based techniques to enhance accuracy. The proposed approach introduces several key innovations, including optimized keypoint extraction to maximize true matches, patch-based feature representation aligned with local image geometry, and a neural network-based verification step to suppress incorrect matches. Based on experiments conducted on a dataset of 35 pairs of remote sensing images, and comparing the results with the SIFT algorithm and deep learning-based methods, the proposed approach achieved an accuracy of 0.849 by reducing false matches and increasing correct ones.

Mr Omid Eslamifar, Dr Mohammadreza Soltani, Dr Seyed Mohamadjalal Rastegar Fatemi,
Volume 22, Issue 3 (12-2025)
Abstract

Understanding the structural and morphological characteristics of blood cells plays a crucial role in the early diagnosis and treatment of hematological disorders. Manual inspection of blood smears under a microscope is still the standard approach in many laboratories; however, this process is subjective, time-consuming, and highly dependent on the expertise of the hematologist. To overcome these limitations, the present study introduces an intelligent hybrid framework for multiclass classification of heterogeneous blood cells based on the integration of deep learning and metaheuristic optimization techniques.
In the proposed approach, the wavelet coefficients of microscopic images are first extracted to capture discriminative frequency-domain features. These coefficients are then fed into a YOLO-based convolutional neural network to detect candidate cell regions and identify spatial characteristics. A customized CNN architecture is subsequently employed for hierarchical feature learning, while a Golden Eagle Optimization (GEO) algorithm is utilized to perform feature selection and dimensionality reduction by eliminating redundant and less informative attributes.
To achieve robust decision-making, three classical classifiers, Decision Tree (DT), Naïve Bayes (NB), and K-Nearest Neighbors (KNN), are combined through a weighted voting ensemble strategy. The model was trained and validated on a dataset consisting of microscopic images of five major white blood cell types: lymphocytes, monocytes, eosinophils, basophils, and neutrophils. Quantitative evaluation was performed using precision, recall, F1-score, and accuracy metrics.
Experimental results demonstrate that the proposed CNN-GEO ensemble model achieves an overall accuracy of 95.7% and an average F1-score of 94.9%, outperforming comparable state-of-the-art methods such as CNN+SVM, PSO+KNN, and VGG-16 in both accuracy and computational efficiency. The findings highlight the capability of the proposed system to accurately distinguish among multiple blood cell categories, thereby providing a reliable and automated decision-support tool for early hematological diagnosis. Future work will focus on expanding the dataset and integrating domain adaptation mechanisms to further enhance cross-laboratory generalization.
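The weighted voting step that combines the three classifiers can be sketched as below. The weights shown are hypothetical placeholders: the abstract does not report how the weights are chosen, and the function name is an assumption.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total classifier weight.

    `predictions` maps a classifier name to its predicted label;
    `weights` maps the same names to non-negative voting weights.
    """
    totals = defaultdict(float)
    for name, label in predictions.items():
        totals[label] += weights[name]
    return max(totals, key=totals.get)


# Hypothetical example: NB and KNN together outweigh DT.
example = weighted_vote(
    {"DT": "lymphocyte", "NB": "monocyte", "KNN": "monocyte"},
    {"DT": 0.5, "NB": 0.2, "KNN": 0.4},
)
```

In this example the two weaker classifiers agree and their combined weight (0.6) exceeds that of the Decision Tree (0.5), so the ensemble returns "monocyte".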

Mr. Elyas Mosayebi, Dr. Reza Ebrahimi Atani,
Volume 22, Issue 3 (12-2025)
Abstract

In the era of digital transformation, government agencies and corporations increasingly rely on electronic services, generating vast volumes of sensitive data stored in distributed databases. While these records hold immense potential for knowledge discovery through data mining, their publication or sharing raises critical privacy concerns, particularly when sensitive individual information is at risk. Traditional Privacy-Preserving Distributed Data Publishing (PPDDP) methods rely heavily on Trusted Third-Party (TTP) intermediaries and Secure Multi-Party Computation (SMC), which introduce systemic vulnerabilities such as communication bottlenecks, synchronization failures, insider attacks, and inherent distrust in centralized entities. In healthcare analytics, hospitals leverage patient data to enhance diagnostic precision, optimize clinical workflows, and advance preventive and precision medicine. Yet, reliance on siloed datasets from individual institutions often restricts model generalizability and impedes comprehensive insights into health outcomes. Patient health is a multidimensional construct influenced not only by genetic and biological factors but also by behavioral patterns and socio-environmental determinants. Cross-institutional collaboration integrating diverse datasets from geographically distributed sources is essential to develop robust analytical models. However, such collaboration raises critical privacy concerns, as centralized aggregation of sensitive data risks exposure to breaches or misuse. Our probabilistic framework for privacy-preserving distributed data publishing directly addresses this challenge. By eliminating dependencies on trusted third parties and secure multi-party computation, our approach enables secure, decentralized integration of heterogeneous healthcare data. 
Through uncertainty-aware probabilistic anonymization and adaptive noise injection, the framework ensures compliance with stringent privacy regulations (e.g., GDPR, CPRA, HIPAA) while preserving the analytical utility required for accurate, actionable health outcome predictions. This balance of utility and privacy empowers researchers to harness the full potential of distributed datasets without compromising individual confidentiality, ultimately fostering innovation in precision medicine and population health management. This paper introduces a novel probabilistic framework for privacy preservation in distributed environments, eliminating dependencies on TTP and SMC. Unlike existing approaches, this method leverages uncertainty-aware probabilistic models to dynamically anonymize and perturb data across distributed nodes while preserving global data utility. First, a survey of privacy-preserving data publishing methods is presented, followed by a discussion of the pros and cons of these techniques. We then present the model and its implementation details. The results of the security evaluations show that the presented method achieves a better balance between privacy and the accuracy of distributed data, using the probabilistic model without requiring a Trusted Third Party or Secure Multi-Party Computation.


Niloufar Khosravirad, Reza Ahsan, Ahmad Sharif, Ali Karimi,
Volume 22, Issue 4 (3-2026)
Abstract

With the swift advancement of the Internet of Things (IoT), Vehicular Ad Hoc Networks (VANETs) have become a crucial component in enabling smart transportation systems by supporting real-time communication between vehicles and roadside units (RSUs). In these networks, vehicles function as mobile nodes that generate and transmit data across the system. A major challenge in VANETs is ensuring the integrity and trustworthiness of shared messages, as any malicious or inaccurate information could severely impact safety and system performance. This research introduces a trust management framework that integrates VANET with blockchain technology and fuzzy logic to improve the reliability of vehicle-to-vehicle communication. When an event is detected, a vehicle instantly broadcasts a corresponding message. RSUs then evaluate the sender’s trust level and verify the message before validation. To minimize communication overhead and avoid duplication, repeated messages are filtered prior to distribution. Unlike conventional trust models that depend on computationally heavy consensus mechanisms such as Proof of Work (PoW), the proposed system adopts a Chord-based distributed architecture. This approach significantly lowers processing times and boosts scalability. The framework utilizes a multi-phase trust evaluation process involving message scoring, dynamic trust calculation, and formation of evaluator groups by RSUs. Simulations reveal notable gains in message credibility: a 6% increase compared to the Score-Based Trust Management System (SBTMS) and an 11% improvement over PoW-based approaches. These results underline the effectiveness of the proposed model in achieving a balance between security, scalability, and low latency in VANET environments. 
By merging VANET architecture with decentralized trust mechanisms and soft computing techniques, this study presents an innovative and pragmatic solution to one of the key challenges in vehicular communications—facilitating secure, efficient, and trustworthy message exchange in highly dynamic, distributed networks.




© 2015 All Rights Reserved | Signal and Data Processing