Showing 7 results for Weight

Mahboubeh Mahdizadeh, Mahdi Eftekhari
Volume 11, Issue 2 (3-2015)
Abstract

In classification problems, we often encounter datasets in which the classes contain very different proportions of patterns (i.e., some classes have a high percentage of patterns and others a low percentage). These are called "classification problems with imbalanced datasets". Fuzzy rule-based classification systems (FRBCSs) are the most popular fuzzy modeling systems used in pattern classification problems. Rule weights are commonly used to improve classification accuracy, and fuzzy versions of the confidence and support merits have been widely used to weight the rules in fuzzy rule-based classifiers. In this paper, we propose an evolutionary approach based on genetic programming to generate weighting expressions. To produce these expressions, the confidence, support, lift, and recall merits are used as the terminals of the genetic program. Experiments are performed on 20 imbalanced datasets from the KEEL repository, and the results are analyzed using statistical tests. The results show that the proposed method improves the classification accuracy of FRBCSs.
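A minimal sketch of the weighting idea, under the usual definitions of the four merits; the expression tree shown is one hypothetical GP individual, not the evolved result reported in the paper:

```python
# The four merits (confidence, support, lift, recall) serve as terminals
# that a GP-evolved expression combines into a rule weight.
import random

def rule_merits(match_class, match_all, n_class, n_total):
    """Fuzzy-style merits of a rule for its consequent class.
    match_class: sum of matching degrees over patterns of the class,
    match_all:   sum of matching degrees over all patterns,
    n_class:     number of patterns of the class,
    n_total:     total number of patterns."""
    confidence = match_class / match_all if match_all else 0.0
    support = match_class / n_total
    recall = match_class / n_class if n_class else 0.0
    class_ratio = n_class / n_total
    lift = confidence / class_ratio if class_ratio else 0.0
    return {"conf": confidence, "supp": support,
            "lift": lift, "recall": recall}

def random_expression(depth=2):
    """Grow a random GP expression over the merit terminals."""
    ops = ["+", "*", "max", "min"]
    terminals = ["conf", "supp", "lift", "recall"]
    if depth == 0 or random.random() < 0.3:
        return random.choice(terminals)
    op = random.choice(ops)
    return (op, random_expression(depth - 1), random_expression(depth - 1))

def evaluate(expr, merits):
    """Recursively evaluate an expression tree on the merit values."""
    if isinstance(expr, str):
        return merits[expr]
    op, a, b = expr
    x, y = evaluate(a, merits), evaluate(b, merits)
    return {"+": x + y, "*": x * y, "max": max(x, y), "min": min(x, y)}[op]

merits = rule_merits(match_class=18.5, match_all=22.0,
                     n_class=40, n_total=400)   # a minority-class rule
expr = ("*", "conf", ("+", "lift", "recall"))   # hypothetical individual
print(evaluate(expr, merits))                   # the rule's weight
```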
Vahideh Rezaie, Mahid Mohammadpour, Hamid Parvin, Samad Nejatian
Volume 14, Issue 4 (3-2018)
Abstract

Due to the ever-increasing expansion of information and the huge amount of unstructured documents, keywords play a very important role in information retrieval. Because manual extraction of keywords faces various challenges, automated extraction seems inevitable. In this research, a thesaurus (a structured word net) is used to extract keywords automatically. The authors claim that more meaningful keywords can be extracted from documents by employing a thesaurus, and that the keywords extracted this way improve document classification. To increase the comprehensiveness of the search, the stop words are first removed and the remaining words are stemmed. Then, with the help of the thesaurus, equivalent, hierarchical, and related words are found. Next, to determine the relative importance of the words, a numerical weight is assigned to each word; it represents the effect of the word on the subject matter in comparison with the other words used in the text. Following these steps, and with the help of the thesaurus, an accurate text classification is performed. The KNN algorithm is used for the classification; thanks to its simplicity and effectiveness, KNN is widely used in text classification. The cornerstone of KNN is to compare the training and test texts to determine the similarity between them. The empirical results show that the quality and accuracy of the extracted keywords are satisfactory for users, and they confirm that document classification is enhanced when the thesaurus is used rather than not.
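A minimal sketch of the described pipeline using scikit-learn and NLTK; the tiny stop-word list and thesaurus are made-up stand-ins for the real resources:

```python
# Pipeline: stop-word removal, stemming, thesaurus-based expansion,
# numeric term weighting, and KNN text classification.
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

STOP = {"the", "a", "is", "of", "and", "in"}
THESAURUS = {"car": ["automobile", "vehicle"],   # equivalent terms
             "vehicle": ["transport"]}           # broader (hierarchical) term
stemmer = PorterStemmer()

def preprocess(text):
    words = [w for w in text.lower().split() if w not in STOP]
    # expand each word with its thesaurus relatives, then stem everything
    expanded = []
    for w in words:
        expanded.append(w)
        expanded.extend(THESAURUS.get(w, []))
    return " ".join(stemmer.stem(w) for w in expanded)

train = ["the car is fast", "the dog is loud"]
labels = ["transport", "animals"]
test = "a quick vehicle"

# TF-IDF assigns each term a numeric weight reflecting its importance
# relative to the other words in the collection
vec = TfidfVectorizer()
X = vec.fit_transform(preprocess(t) for t in train)
knn = KNeighborsClassifier(n_neighbors=1).fit(X, labels)
print(knn.predict(vec.transform([preprocess(test)])))
```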
 


MSc Shadi Janbabaei, Prof. Hossein Gharaee, Prof. Naser Mohammadzadeh
Volume 15, Issue 4 (3-2019)
Abstract

The Internet of Things (IoT) is a new concept whose emergence has made sensors ubiquitous in human life. All data are collected, processed, and transmitted by these sensors. As the number of sensors increases, the first challenge in establishing a secure connection is authentication between the sensors. Anonymity, light weight, and trust between entities are other main issues that must be considered for the authentication to be done properly. In this study, we evaluate existing authentication protocols for the Internet of Things and analyze their limitations and security vulnerabilities. We also present a new authentication protocol whose main target is anonymity. The hash function and logical operators are used both to make the protocol lightweight and to keep it within the computational resources of the sensors, which are computationally limited entities. In designing the protocol, we took into account three main concerns: concealing the true identifier, generating the session key, and updating credentials after authentication. As with most authentication protocols, this protocol consists of two phases, registration and authentication: entities first register with a trusted entity, which later evaluates and authenticates them. The proposed protocol assumes two types of entities: weak entities, the sensor nodes (SNs), which have low computing power; and strong entities, the cluster heads (CHs) and HIoTS, which can withstand high computational overhead and carry out heavy processing.
We also treat the strong entities in the proposed protocol as trusted, since the main focus of this research is the relationship between SNs. Because the sensors are authenticated and keys are transferred between them through these trusted entities, the authenticity of the sensors is confirmed and the communication between them is also reliable. The protocol supports two types of communication, intra-cluster and inter-cluster. The analysis of the proposed protocol shows that security requirements such as untraceability, scalability, and availability are met and that the protocol is resistant against various attacks, such as replay and eavesdropping attacks. Finally, the confidentiality and authentication properties of the protocol are proved using the AVISPA tool, and its correctness is verified using BAN logic.
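A minimal sketch of the hash-and-XOR style of lightweight anonymous authentication the abstract describes; the message layout and key-derivation choices are illustrative assumptions, not the paper's exact protocol:

```python
# Hash-and-XOR style anonymous authentication: the sensor never sends
# its real identifier in clear, and a fresh session key is derived
# from nonces exchanged during authentication.
import hashlib, os

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# registration phase: the trusted entity (e.g. a cluster head) shares
# a secret with the sensor node
real_id = b"SN-0042"
shared_k = os.urandom(32)

# --- sensor side: build an anonymous authentication request ---
nonce_s = os.urandom(32)
pseudo_id = xor(h(real_id), h(shared_k, nonce_s))   # masks the identity
auth_tag = h(shared_k, pseudo_id, nonce_s)
request = (pseudo_id, nonce_s, auth_tag)

# --- verifier side: recompute and check the tag, derive session key ---
pid, n_s, tag = request
assert tag == h(shared_k, pid, n_s), "authentication failed"
nonce_v = os.urandom(32)
session_key = h(shared_k, n_s, nonce_v)             # fresh per session
print("session key established:", session_key.hex()[:16], "...")
```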

Mehdi Kamandar, Mr Yaser Maghsoudi
Volume 17, Issue 1 (6-2020)
Abstract

Filters are a particularly important class of LTI systems. Digital filters have a great impact on modern signal processing thanks to their programmability, reusability, and capacity to reduce noise to a satisfactory level. Over the past few decades, IIR digital filter design has been an important research field. Designing an IIR digital filter with desired specifications leads to a non-convex optimization problem. Designing IIR digital filters with meta-heuristic algorithms, by minimizing the error between the frequency responses of the desired and designed filters subject to constraints such as stability, linear phase, and minimum phase, has therefore gained increasing attention. The aim of this paper is to develop an IIR digital filter design method that provides relatively good time-response characteristics alongside good frequency-response ones. One of the most important time characteristics required of digital filters in real-time applications is low latency. To design a low-latency digital filter, this paper minimizes the weighted partial energy of the filter's impulse response. This concentrates the energy of the impulse response at its beginning, so the filter responds to inputs with low latency. This property, together with the minimum-phase property of the designed filter, leads to good time specifications. In the proposed cost function, the maximum pole radius term ensures the stability margin, the number of zeros outside the unit circle enforces the minimum-phase condition, and a constant group delay is required to achieve linear phase. Because the proposed cost function is non-convex, three meta-heuristic algorithms, GA, PSO, and GSA, are used for the optimization. The reported results confirm the efficiency and flexibility of the proposed method for designing various types of digital filters (frequency-selective, differentiator, integrator, Hilbert, equalizers, etc.) with lower latency than traditional methods. A low-pass filter designed by the proposed method has only 1.79 samples of delay, which is ideal for most applications.
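A minimal sketch of the cost terms described above for one candidate filter; the weight profile and penalty factor are illustrative assumptions:

```python
# Weighted partial energy of the impulse response (pushes energy toward
# the start, hence low latency) plus a stability-margin penalty on the
# maximum pole radius.
import numpy as np
from scipy.signal import lfilter

def cost(b, a, n_samples=128, max_radius=0.95, penalty=1e3):
    # impulse response of the candidate filter H(z) = B(z)/A(z)
    impulse = np.zeros(n_samples); impulse[0] = 1.0
    h = lfilter(b, a, impulse)
    energy = np.sum(h**2) + 1e-12
    # increasing weights w[n] = n penalize energy that arrives late
    weighted_partial_energy = np.sum(np.arange(n_samples) * h**2) / energy
    # stability margin: keep the largest pole radius below max_radius
    pole_radius = np.max(np.abs(np.roots(a))) if len(a) > 1 else 0.0
    stability_pen = penalty * max(0.0, pole_radius - max_radius)
    return weighted_partial_energy + stability_pen

# evaluate one hypothetical second-order candidate (a GA/PSO/GSA
# optimizer would search over the coefficient vectors b and a)
print(cost(b=[0.2, 0.4, 0.2], a=[1.0, -0.5, 0.3]))
```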

Sedigheh Vahidi Ferdosi, Hossein Amirkhani
Volume 17, Issue 2 (9-2020)
Abstract

Clustering algorithms are highly dependent on factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reducing the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in this paper we propose a method that uses weighting in the ensemble clustering problem. The accuracies of the base clusterings are estimated using an algorithm from the crowdsourcing literature called the agreement/disagreement (AD) method. This method exploits the agreements and disagreements between different labelers to estimate their accuracies. It assumes different labelers have labeled a set of samples, so each pair of labelers has an agreement ratio over the samples they labeled. Under some independence assumptions, there is a closed-form formula for the agreement ratio between two labelers in terms of their accuracies. The AD method estimates the labelers' accuracies by minimizing the difference between the parametric agreement ratio given by the closed-form formula and the agreement ratio observed in the labels the labelers provided. To adapt the AD method to the clustering problem, an agreement between two clusterings is defined as having the same opinion about a pair of samples, namely whether the two samples are in the same cluster or in different clusters. In other words, if two clusterings agree that two samples should be in the same cluster, or agree that they should be in different clusters, this is counted as an agreement. An optimization problem is then solved to obtain the base clusterings' accuracies, such that the difference between their observed agreement ratios and the expected agreements implied by their accuracies is minimized. To generate the base clusterings, we use four different settings: different clustering algorithms, different distance measures, distributed features, and different numbers of clusters. The clustering algorithms used are mean shift, k-means, mini-batch k-means, affinity propagation, DBSCAN, spectral, BIRCH, and agglomerative clustering with the average and Ward metrics. For distance measures, we use the correlation, city block, cosine, and Euclidean measures. In the distributed-features setting, the k-means algorithm is run on 40%, 50%, ..., and 100% of randomly selected features. Finally, for different numbers of clusters, we run the k-means algorithm with k equal to 2 and also to 50%, 75%, 100%, 150%, and 200% of the true number of clusters. We add the weights estimated by the AD algorithm to two well-known ensemble clustering methods, the Cluster-based Similarity Partitioning Algorithm (CSPA) and the Hyper Graph Partitioning Algorithm (HGPA). In CSPA, the similarity matrix is computed as a weighted average of the opinions of the different clusterings. In HGPA, we propose to weight the hyperedges by different values, such as the estimated clustering accuracies, the cluster sizes, and the silhouette of the clusterings. The experiments are performed on 13 real and artificial datasets. The reported evaluation measures include the adjusted Rand index, Fowlkes-Mallows, mutual information, adjusted mutual information, normalized mutual information, homogeneity, completeness, V-measure, and purity. The results show that in the majority of cases, the proposed weighting-based method outperforms unweighted ensemble clustering. In addition, the weighting is more effective in improving the HGPA algorithm than CSPA.
Among the weighting methods proposed for the HGPA algorithm, the best average results are obtained when the accuracies estimated by the AD method are used to weight the hyperedges, and the worst results when the normalized silhouette measure is used. Finally, among the different methods for generating base clusterings, the best results for weighted HGPA are obtained when different clustering algorithms are used to produce the base clusterings.
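A minimal sketch of the AD adaptation described above, assuming independent errors in the pairwise same/different-cluster decisions; the closed-form expected agreement for accuracies p_i and p_j is p_i p_j + (1 - p_i)(1 - p_j):

```python
# For each pair of base clusterings, measure the observed agreement
# ratio over sample pairs, then fit per-clustering accuracies so the
# closed-form expected agreement matches the observations.
import numpy as np
from itertools import combinations
from scipy.optimize import minimize

def pairwise_decisions(labels):
    """Binary 'same cluster?' decision for every pair of samples."""
    n = len(labels)
    return np.array([labels[i] == labels[j]
                     for i, j in combinations(range(n), 2)])

def estimate_accuracies(clusterings):
    decisions = [pairwise_decisions(c) for c in clusterings]
    m = len(decisions)
    observed = {(i, j): np.mean(decisions[i] == decisions[j])
                for i, j in combinations(range(m), 2)}

    def loss(p):
        return sum((p[i]*p[j] + (1-p[i])*(1-p[j]) - observed[i, j])**2
                   for i, j in observed)

    res = minimize(loss, x0=np.full(m, 0.7),
                   bounds=[(0.5, 1.0)] * m)   # assume better than chance
    return res.x

# three hypothetical base clusterings of six samples
base = [np.array([0, 0, 0, 1, 1, 1]),
        np.array([0, 0, 1, 1, 1, 1]),
        np.array([0, 1, 0, 1, 0, 1])]
weights = estimate_accuracies(base)
print(weights)   # higher weight -> more reliable base clustering
```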

Dr Mohammad Momeny, Dr Mehdi Agha Sarram, Dr Alimohammad Latif, Razieh Sheikhpour
Volume 17, Issue 4 (2-2021)
Abstract

The convolutional neural network is one of the effective methods for classifying images; it performs learning using convolutional, pooling, and fully-connected layers. All kinds of noise disrupt the operation of this network: noisy images reduce classification accuracy and increase the training time of the convolutional neural network. Noise is an unwanted signal that corrupts the original signal, changing the output values of a system so that the value recorded at the output differs from the actual value. In the process of image encoding and transmission, when the image passes through a noisy transmission channel, impulse noise with positive and negative pulses corrupts the image: positive pulses appear as white pixels and negative pulses as black pixels. The purpose of this paper is to introduce a dynamic pooling operation that makes the convolutional neural network more robust to noisy images. The proposed method classifies noisy images by weighting the values in the dynamic pooling region. In this research, a new way of modifying the pooling operator is presented in order to increase the accuracy of the convolutional neural network in noisy-image classification. To suppress noise in the dynamic pooling layer, it suffices to prevent noisy pixels from being processed by the dynamic pooling operator. Keeping noisy pixels out of the dynamic pooling layer prevents noisy values from being selected and passed on to subsequent CNN layers, which increases the classification accuracy. All the pixels of an entire window in the image may be corrupted; since the dynamic pooling operator is applied several times across the layers of the convolutional neural network, the proposed handling of noisy pixels can likewise be applied many times. In the proposed dynamic pooling layer, a pixel that has been corrupted by noise with probability p is excluded from the dynamic pooling operation with the same probability. In other words, the participation of a pixel in the dynamic pooling layer depends on how healthy that pixel's value is: if a pixel is likely to be noisy, it is, with the same probability, not processed in the proposed dynamic pooling layer. For comparison, trained VGG-Net models with the medium and slow architectures are used; five convolutional layers and three fully-connected layers are the components of the proposed model. With a 26% error rate on images corrupted by impulse noise with a density of 5%, the proposed method outperforms the compared methods. The simulation results show the increased accuracy and speed of a convolutional neural network based on the modified dynamic pooling layer for noisy-image classification.
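A minimal sketch of the core idea in NumPy; the simple salt-and-pepper test (values 0 or 255) stands in for the paper's probabilistic treatment of noisy pixels:

```python
# Pixels suspected of carrying impulse noise are excluded from the
# pooling window so that noisy values never propagate to later layers.
import numpy as np

def noise_aware_max_pool(img, k=2):
    h, w = img.shape
    suspect = (img == 0) | (img == 255)        # likely impulse noise
    out = np.zeros((h // k, w // k), dtype=img.dtype)
    for i in range(0, h - h % k, k):
        for j in range(0, w - w % k, k):
            window = img[i:i+k, j:j+k]
            clean = window[~suspect[i:i+k, j:j+k]]
            # pool only over pixels judged healthy; if the whole
            # window is suspect, fall back to the window median
            out[i//k, j//k] = clean.max() if clean.size else np.median(window)
    return out

rng = np.random.default_rng(0)
img = rng.integers(50, 200, size=(8, 8)).astype(np.uint8)
img[1, 1], img[4, 5] = 255, 0                  # inject impulse noise
print(noise_aware_max_pool(img))
```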

Mohammad Reza Hasni Ahangar, Ali Amiri Jezeh
Volume 18, Issue 1 (5-2021)
Abstract

With an appropriate model, keywords can present the main concepts of a text without human intervention. Keywords are important vocabulary words that describe the text and play a very important role in understanding its content quickly and accurately. The purpose of keyword extraction is to identify the subject and main content of the text in the shortest time. Keyword extraction plays an important role in text summarization, document labeling, information retrieval, and extracting the subject of a text. For example, summarizing large texts into smaller ones is difficult, but having the keywords of a text reveals its topics. Identifying keywords in a text with conventional manual methods is time-consuming and costly. Keyword extraction methods can be classified into two types, supervised and unsupervised. In general, the extraction process works as follows: the text is first broken into smaller units called words, the redundant words are then removed and the remaining words are weighted, and finally the keywords are selected from among these words. The method we propose in this paper for identifying keywords is supervised. We first compute a word-correlation matrix per document using a feed-forward neural network and the Word2Vec algorithm. Then, using the correlation matrix and a limited initial list of keywords, we extract the most similar words in the form of a list of nearest neighbors. Next, we sort this list in descending order, select different percentages of words from the beginning of the list, and for each percentage repeat the process of training the neural network, building the correlation matrix, and extracting the nearest-neighbor list 10 times. Finally, we calculate the average precision, recall, and F-measure. We continue until the best evaluation results are obtained; the results show that selecting at most 40% of the words from the beginning of the nearest-neighbor list yields acceptable results. The algorithm has been tested on a corpus of 800 news items whose keywords had been extracted manually, and the experimental results show that the precision of the proposed method reaches 78%.
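A minimal sketch of the expansion step using gensim's Word2Vec; the toy corpus, seed list, and hyperparameters are illustrative assumptions:

```python
# Train Word2Vec on the corpus, then grow a small seed keyword list
# with its nearest neighbors in the embedding space.
from gensim.models import Word2Vec

corpus = [["parliament", "passed", "budget", "law"],
          ["court", "reviewed", "budget", "law"],
          ["team", "won", "football", "match"],
          ["fans", "watched", "football", "match"]]
seeds = ["budget"]                       # limited initial keyword list

model = Word2Vec(corpus, vector_size=32, window=2,
                 min_count=1, epochs=200, seed=1)

neighbors = []
for kw in seeds:
    for word, sim in model.wv.most_similar(kw, topn=3):
        neighbors.append((word, sim))

# sort the candidate list in descending order of similarity and keep
# the top share (the paper reports 40% working best on its corpus)
neighbors.sort(key=lambda x: x[1], reverse=True)
keep = neighbors[:max(1, int(0.4 * len(neighbors)))]
print(keep)
```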

