Document images produced by scanners or digital cameras usually suffer from photometric and geometric distortions. If either kind of distortion is present, recognizing words in the document image with OCR becomes error-prone. In this paper we propose a novel approach that substantially reduces geometric distortion in document images. The method first extracts the text lines of the document using morphological operators. The extracted lines are then divided into a number of equal-width column strips.
This makes it reasonable to assume that each line segment is not curved. Each extracted line segment is then aligned horizontally. For this purpose, the segment is rotated through a range of angles, and a horizontal projection is computed for each rotation. The rotation angle whose projection signal has the highest peak is selected to align the line segment horizontally. To estimate the geometric distortion, a reference point is extracted from each segment of each document line. These points indicate the position of the document line at the starting column of each line segment. A polynomial function is then fitted to the reference points of each document line. Finally, the geometric distortion in each part of the document is eliminated using a perspective transformation.
This transformation is estimated from the fitted polynomial functions. To make the method more stable for short text lines, the curves of adjacent, longer text lines are used. A post-processing stage is required after the perspective transformation is applied to the document patches, because the transformation is a continuous mapping but is applied to digital (discrete) images, which introduces artifacts. To remove these artifacts from the result, the consistency of each pixel value with the values of its neighboring pixels is checked, and inconsistent pixels are corrected.
The proposed method was evaluated on Persian and English databases and compared with existing methods. The results indicate the efficiency and accuracy of the proposed method in eliminating geometric distortions.
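As a rough illustration of the angle search described in this abstract, the sketch below rotates the foreground pixels of a line segment through candidate angles and keeps the angle whose horizontal projection has the highest peak. The point-list representation, the function names and the candidate angle range are assumptions made for illustration; the abstract does not specify how the authors implement the rotation or the projection.

```python
import math

def projection_peak(points, angle_deg):
    """Highest bin of the horizontal projection after rotating the points.

    points: iterable of (x, y) foreground pixel coordinates (assumed input format).
    """
    theta = math.radians(angle_deg)
    s, c = math.sin(theta), math.cos(theta)
    counts = {}
    for x, y in points:
        # Rotated vertical coordinate, binned to the nearest pixel row.
        row = round(x * s + y * c)
        counts[row] = counts.get(row, 0) + 1
    return max(counts.values())

def best_alignment_angle(points, candidate_angles):
    """Return the candidate angle whose projection profile has the largest peak."""
    return max(candidate_angles, key=lambda a: projection_peak(points, a))

# A synthetic "text line" tilted by 10 degrees.
tilted = [(x, round(x * math.tan(math.radians(10)))) for x in range(100)]
angle = best_alignment_angle(tilted, range(-15, 16))
```

A sharper peak means more foreground pixels fall on the same row, which is why the peak height serves as an alignment score in this sketch.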
The Imperialist Competitive Algorithm (ICA) is regarded as a leading meta-heuristic for finding the global optimum in optimization problems. This paper presents an application of ICA to automatic clustering of large unlabeled data sets. By using a suitable structure for each chromosome together with ICA, the proposed method (ACICA) finds the optimal number of clusters at run time while simultaneously clustering the data optimally. To increase accuracy and convergence speed, the structure of ICA is modified. Many applications require clustering data whose number of clusters is not known in advance, so methods are needed that can cluster data without a correct prior estimate of the number of clusters. In other words, the proposed algorithm requires no background knowledge to cluster the data. In addition, the proposed method is more accurate than other clustering methods based on evolutionary algorithms. In ICA, the search steps should be large at first, to raise the search rate and explore possible solutions; as the algorithm approaches the global optimum, the steps should shrink so that the algorithm neither loses its way nor becomes trapped in a local optimum. To this end, and to improve ICA, the mutation rate and the rate at which the revolution operator is applied are determined dynamically. DB and CS are cluster validity indices, and in this paper they are used as the objective functions. To demonstrate the superiority of the proposed method, its average fitness value and the number of clusters it finds are compared with those of three automatic clustering algorithms based on evolutionary algorithms. These partitional clustering algorithms are built on three powerful, well-known optimization algorithms: the genetic algorithm, particle swarm optimization and differential evolution.
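To make the role of the DB index as an objective function more concrete, the following minimal sketch computes the Davies-Bouldin index of a given partition. It assumes the common Euclidean form of the index (average distance to the centroid as the scatter); the abstract does not state which parameterization the authors use, and the function names are illustrative.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def davies_bouldin(clusters):
    """Davies-Bouldin index of a partition (lower is better).

    clusters: list of at least two non-empty clusters, each a list of tuples.
    Assumes distinct centroids, otherwise the ratio below is undefined.
    """
    cents = [centroid(c) for c in clusters]
    # Scatter: mean Euclidean distance of the members to their centroid.
    scatter = [sum(math.dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, cents)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                     for j in range(k) if j != i)
    return total / k

good = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]   # compact, well separated
bad = [[(0, 0), (10, 0)], [(0, 1), (10, 1)]]    # stretched, overlapping
```

An evolutionary search such as the one described above would evaluate each candidate partition with a function of this kind and prefer lower values.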
In robotic applications, and especially in 3D map generation of indoor environments, analyzing RGB-D images has become a key problem. Mapping is one of the most important problems in building autonomous mobile robots. Autonomous mobile robots are used in mine excavation, rescue missions in collapsed buildings and even planetary exploration. Indoor mapping is also beneficial in search and rescue missions. With recent advances, mobile robots are used in hazardous missions such as those in radioactive areas or collapsing buildings. Having a map of the environment beforehand can boost the efficiency and effectiveness of the mission. To digitize the environment, several 3D scans are needed. These scans must be merged in a global coordinate system to create a correct, consistent model. This process is called image registration. If the robot carrying the 3D scanner can localize itself accurately, registration can be done directly from the robot's pose. However, because robot sensors are imprecise, self-localization is error-prone. Therefore, the geometric structure of overlapping 3D scans is considered. To register the different point sets, the Iterative Closest Point (ICP) algorithm is used. ICP is the most common approach for aligning the point clouds of two consecutive image frames. The algorithm uses a point-to-point approach. RGB and depth images captured by a Kinect are used in this study. To reduce the number of data points and speed up 3D map creation, depth images are converted to point clouds and then segmented according to the planes in the image. For this purpose
Due to the ever-increasing volume of information and the huge number of unstructured documents, keywords play a very important role in information retrieval. Because manual keyword extraction faces various challenges, automated extraction is necessary. This research attempts to extract keywords automatically using a thesaurus (a structured word-net). The authors claim that more meaningful keywords can be extracted from documents by employing a thesaurus, and that keywords extracted in this way can improve document classification. To increase the comprehensiveness of search, the stop words are first removed and the remaining words are stemmed. Then, with the help of the thesaurus, equivalent, hierarchical and dependent words are found. Next, to determine the relative importance of the words, each word is assigned a numerical weight that represents its effect on the subject matter in comparison with the other words used in the text. Based on these steps and with the help of the thesaurus, an accurate text classification is performed. The KNN algorithm is used for the classification. Because of its simplicity and effectiveness, KNN is widely used in text classification. The core of KNN is to compare the test text with the training texts to determine their similarity. The empirical results show that the quality and accuracy of the extracted keywords are satisfactory for users. They also confirm that document classification is improved. Overall, using a thesaurus yields more meaningful keywords than extraction without one.
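The KNN step can be sketched as follows, assuming each document has already been reduced to a dictionary of weighted keywords. Cosine similarity, the value of k and the toy documents are illustrative assumptions; the abstract does not state which similarity measure or weighting scheme the authors use.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_classify(query, training, k=3):
    """Label a query by majority vote among its k most similar training texts.

    training: list of (term_weights, label) pairs (hypothetical data layout).
    """
    ranked = sorted(training, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

training = [
    ({"goal": 2, "team": 1}, "sport"),
    ({"match": 1, "team": 2}, "sport"),
    ({"bank": 2, "loan": 1}, "finance"),
    ({"bank": 1, "rate": 1}, "finance"),
]
label = knn_classify({"team": 1, "goal": 1}, training, k=3)
```

In a thesaurus-based pipeline, equivalent terms would be mapped to a shared key before the vectors are compared, so that synonyms contribute to the same dimension.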
Pose estimation is a process to identify how a human body and/or individual limbs are configured in a given scene. Hand pose estimation is an important research topic which has a variety of applications in human-computer interaction (HCI) scenarios, such as gesture recognition, animation synthesis and robot control. However, capturing the hand motion is quite a challenging task due to its high flexibility. Many sensor-based and vision-based methods have been proposed to fulfill the task.
In sensor-based systems, specialized hardware is used for hand motion capture. Vision-based hand pose estimation methods can generally be divided into two categories: appearance-based methods and model-based methods. In appearance-based approaches, various features are extracted from the input images to estimate the hand pose. Usually a large number of training samples are used to learn, in advance, a mapping function from the features to the hand poses. Given the learned mapping function, the hand pose can be estimated efficiently. In model-based approaches, the hand pose is estimated by aligning a projected 3D hand model to the hand features extracted from the inputs. The information to be provided therefore includes the state of the model at every time instant. These methods require a large amount of computation, which makes real-time implementation impractical.
Hand pose estimation using color or depth images consists of three steps:
Depending on the model used and the intended application of hand gesture analysis, the features extracted for pose estimation include fingertip positions, the number of fingers, the palm position and joint angles.
In this paper a model-based, markerless dynamic hand pose estimation scheme is presented. Motion capture is the process of recording a live motion event and translating it into usable mathematical terms by tracking a number of key points in space over time and combining them to obtain a single 3D representation of the performance. The inputs of the scheme are the sequences of depth images, color images and skeleton data obtained from a Kinect (a new tool for markerless motion capture) at 30 frames per second. The proposed scheme exploits both temporal and spatial features of the input sequences, and focuses on localizing the index and thumb fingertips and on the joint angles of the robot arm, so that the robot mimics the user's arm movements in 3D space in an uncontrolled environment. The RoboTECH II ST240 is used as the real robot arm model. Depth and skeleton data are used to determine the angles of the robot joints. Three approaches for identifying the tips of the thumb and index finger from the available data are presented, each with its own limitations. These approaches use concepts such as thresholding, edge detection, convex hull construction, skin modeling and background subtraction. Finally, comparing the tracked trajectories of the user's wrist and the robot end effector shows an average error of about 0.43 degrees, which is acceptable performance for this research.
The key contribution of this work is estimating the hand pose for every input frame and updating the robot arm according to the estimated pose. The thumb and index fingertips, detected using the presented approaches, form part of the feature vector. User movements are translated into the corresponding Move instruction for the robot. The features required by the Move instruction are the rotation values around the joints in different directions and the opening between the index finger and the thumb.
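One simple way to derive a joint rotation value from skeleton data is to measure the angle between the two limb segments that meet at the joint. The sketch below does this for three 3D points; the choice of shoulder, elbow and wrist as inputs and the function name are assumptions for illustration, not the authors' documented procedure.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at point b formed by the segments b->a and b->c.

    a, b, c: 3D points, e.g. shoulder, elbow and wrist positions
    (hypothetical skeleton joints). The points must be distinct from b.
    """
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_angle))

elbow_deg = joint_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

Computing such an angle per frame would give one of the rotation values that a Move instruction needs.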
Blood pressure is one of the vital signs. It is especially crucial in cases such as hypertensive patients, and it should be monitored continuously in the ICU/CCU. Current systems for measuring blood pressure often require trained operators. For example, in post-hospital care, controlling blood pressure is difficult without the presence of a nurse or a device that minimizes the patient's involvement in the measurement. Photoplethysmography (PPG), a noninvasive method for recording the pulse wave, therefore seems ideal for building simple blood pressure measurement tools for home care. In other words, it is helpful, or rather necessary, to design a noninvasive, cuff-less, subject-independent system for blood pressure measurement.
In this study, two optical sensors were placed on the finger and the wrist. PPG signals were recorded from twenty healthy volunteers in different situations. Blood pressure was also measured on the left arm with a cuff-based noninvasive blood pressure system as the reference value. The recorded signals were filtered and processed in MATLAB R2014a. To improve estimation accuracy and subject independence, 16 temporal features in addition to the pulse transit time (PTT) were extracted from the wrist PPG signal. Three neural networks were evaluated as estimators of the blood pressure values: a Feedforward Neural Network (FFN), a Radial Basis Function Neural Network (RBFN) and a General Regression Neural Network (GRNN). After comparing their results, the General Regression Neural Network was selected for blood pressure estimation. The MSE errors of the best estimator were 0.11±1.18 mmHg and 0.15±2.3 mmHg for systolic and diastolic pressure, respectively.
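A General Regression Neural Network predicts a value as a Gaussian-kernel-weighted average of the training targets. The sketch below shows that core computation in its textbook form; the smoothing parameter `sigma`, the single-feature inputs and the numeric values are made up for illustration and are not taken from the study, which was carried out in MATLAB.

```python
import math

def grnn_predict(x, train_x, train_y, sigma=1.0):
    """GRNN estimate: kernel-weighted mean of the training targets.

    x: feature tuple of the query; train_x: list of feature tuples;
    train_y: list of targets. sigma is the kernel width (assumed value).
    """
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2 * sigma ** 2))
        for xi in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed: the query is far from every training sample.
        raise ValueError("query too far from training data for this sigma")
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Toy data: one hypothetical feature per sample and a pressure-like target.
train_x = [(0.0,), (1.0,)]
train_y = [100.0, 140.0]
midpoint = grnn_predict((0.5,), train_x, train_y, sigma=0.5)
```

Because the only free parameter is the kernel width, a GRNN needs no iterative training, which may be one reason it is attractive for small data sets such as this one.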
Business process management systems (BPMS) are complex information systems that are vital for competing in the global market and increasing economic productivity. Workload balancing of resources in BPMS is a challenge that researchers have long studied. Balancing the workload of resources increases system stability, improves the efficiency of the resources and enhances the quality of their products. It is therefore considered an important factor in the performance and stability of such systems. Keeping the workload of each resource at a certain level increases the efficiency of the resources.
The main objectives of this research are balancing the workload across resources and keeping the workload of each resource uniform at a specified level. To optimize workload balance and the uniformity of each resource's workload, tuning of multi-process concurrency is proposed and studied, and is formulated as an optimization problem. In this paper, tuning the concurrency of business processes is introduced as a problem in BPMS whose solution improves the workload balance of resources and the uniformity of the workload of each resource.
To solve this problem, a delay vector is defined, each element of which introduces a synthetic delay at the start of one business process. A dynamic optimization algorithm is then presented to compute the delay vector, and the speed of the proposed algorithm is compared with that of a state-space search algorithm and the PSO evolutionary algorithm. The comparison shows that the speed of the proposed algorithm is 37 hours to 5.8 years compared to the state-space search algorithm, while the PSO algorithm solves the same problem in just 3 minutes. The experimental results on a real dataset show a 21.64 percent improvement in the performance of the proposed algorithm.
Data collection and storage have been facilitated by the growth of electronic services, which has led to vast amounts of personal information being recorded in the databases of public and private organizations. These records often include sensitive personal information (such as income and diseases) and must be protected from access by others. In some cases, however, mining the data and extracting knowledge from these valuable sources creates a need to share them with other organizations. This raises security challenges for users' privacy. Privacy is described as the sharing of information in a controlled way: it determines what type of personal information may be shared and which group or person can access and use it. “Privacy preserving data publishing” is a solution for ensuring the secrecy of sensitive information in a data set after it is published in a hostile environment. This process aims to hide sensitive information while keeping the published data suitable for knowledge discovery techniques. Grouping data set records is a common approach to data anonymization. This technique prevents access to the sensitive attributes of a specific record by eliminating the distinction between a number of data set records. A large number of data publishing models and techniques have been proposed so far, but their utility is a concern when a high level of privacy is required. The main goal of this paper is to present a technique that improves the privacy and performance of data publishing techniques. We first review previous privacy preserving data publishing techniques and then present an efficient anonymization method whose goal is to preserve classification accuracy on the anonymized data. The attack model assumes an adversary who tries to infer a sensitive value in the published data set; the confidence of such an inference is limited to at most that of an inference based on public knowledge.
Our privacy model and technique use a decision tree to suppress information whose removal provides privacy while having little effect on the utility of the output data. The idea presented in this paper extends the work presented in [20]. Experimental results show that classifiers trained on the transformed data set achieve accuracy similar to that of classifiers trained on the original data set.
Today, stress affects many areas of human life. Stress can harm human health to such a degree that it either causes death or is a major contributor to it. Therefore, in recent years, researchers have focused on developing systems that detect stress and on presenting viable solutions to manage it.
Generally, stress can be identified through three different methods: (1) psychological evaluation, (2) behavioral responses and (3) physiological signals. Physiological signals are internal indicators of how the body is functioning, and are therefore commonly used in various medical and non-medical applications. Since these signals are correlated with stress, they have been widely used to detect stress in humans. Photoplethysmography (PPG) and Galvanic Skin Response (GSR) are two of the most common signals in stress-related studies. PPG is a noninvasive method for measuring blood volume changes in blood vessels, and GSR refers to changes in sweat gland activity that reflect the intensity of a person's emotional state.
The main aim of this paper is the design and fabrication of a real-time handheld system that detects and displays the stress level. The fabricated stress monitoring device is fully compatible with both wired and wireless sensor devices. The developed system uses the GSR and PPG signals, which are acquired with appropriate sensors and displayed to the user after initial signal processing. The main processor of the system is an ARM Cortex-A8, and its graphical user interface (GUI) is written in C++. Artificial neural networks such as the MLP and the Adaptive Neuro-Fuzzy Inference System (ANFIS) are used to model and estimate the stress index. The results show that the ANFIS model has good accuracy, with a coefficient of determination of 0.9291 and an average relative error of 0.007.
Clustering is the process of dividing a dataset into subsets, called clusters, so that objects within a cluster are similar to each other and different from objects in other clusters. Many clustering algorithms based on different approaches have been developed. An effective option is to combine two or more of these algorithms to solve the clustering problem. Ensemble clustering combines the results of existing clusterings to achieve better performance and higher accuracy. Research over the last decade has shown that, instead of combining all existing clusterings, selecting only a subset of them based on quality and diversity makes the ensemble result more accurate. This paper proposes a new method for ensemble clustering based on quality and diversity. First, a large number of different base clusterings are needed for combination. They are generated by running the k-means algorithm with a random k in each execution. The base clusterings are then placed into groups according to their similarity using a new grouping method, so that clusterings that are similar to each other fall into the same group. In this step, normalized mutual information (NMI) or the adjusted Rand index (ARI) is used to compute similarities and dissimilarities between the base clusterings. Then, from each group, the best-qualified clustering is selected by a voting-based method. In this method, cluster validity indices are used to measure clustering quality: all members of a group are evaluated by the cluster validity indices, and the clustering that optimizes the largest number of indices is selected. Finally, consensus functions combine all selected clusterings. A consensus function is an algorithm that combines existing clusterings to produce the final clusters. In this paper, three consensus functions, CSPA, MCLA and HGPA, are used to combine the clusterings.
To evaluate the proposed method, real datasets from the UCI repository are used. In the experiments, the proposed method is compared with well-known and powerful existing methods. Experimental results demonstrate that the proposed algorithm has better performance and higher accuracy than previous work.
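The similarity between two base clusterings can be measured with normalized mutual information. The sketch below uses the geometric-mean normalization, which is one common definition; the abstract does not say which normalization the authors apply, and the handling of the degenerate single-cluster case is a choice made here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def nmi(a, b):
    """Normalized mutual information of two labelings of the same objects.

    Uses I(A;B) / sqrt(H(A) * H(B)). Returns 0.0 when either labeling has a
    single cluster, a convention chosen for this sketch.
    """
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    mutual = sum((nij / n) * math.log(nij * n / (count_a[i] * count_b[j]))
                 for (i, j), nij in joint.items())
    denom = math.sqrt(entropy(a) * entropy(b))
    return mutual / denom if denom > 0 else 0.0

same_partition = nmi([0, 0, 1, 1], [1, 1, 0, 0])   # relabeled but identical
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])        # independent partitions
```

Because the measure ignores label names, two clusterings that differ only in how clusters are numbered are treated as identical, which is what a grouping step over base clusterings needs.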
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, the Penn Treebank, was published. Although tree banks originated in computational linguistics, their value is becoming more widely appreciated in linguistics research as a whole. For example, annotated tree bank data has been crucial in syntactic research for testing linguistic theories of sentence structure against large quantities of naturally occurring examples.
A natural language parser consists of two basic parts: a POS tagger and a syntax parser. A part-of-speech tagger (POS tagger) is a piece of software that reads text in some language and assigns a part of speech, such as noun, verb or adjective, to each word (and other token), although computational applications generally use more fine-grained POS tags such as 'noun-plural'. A natural language parser is a program that works out the grammatical structure of sentences, for instance which groups of words go together (as "phrases") and which words are the subject or object of a verb.
Probabilistic parsers use knowledge of language gained from hand-parsed sentences to try to produce the most likely analysis of new sentences. These statistical parsers still make some mistakes, but commonly work rather well. Inaccurate design of the context-free grammar and the use of unsuitable structures such as Chomsky normal form can reduce the accuracy of a probabilistic context-free grammar (PCFG) parser.
The weak independence assumptions of context-free grammars are one of their known problems. We address this problem with parent and child annotation, in which the label of a parent node is copied onto the labels of its children; this can improve the performance of a PCFG.
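Parent annotation can be sketched as a simple recursive tree transformation. The nested-tuple tree format, the `^` separator and the English example words are illustrative assumptions; the sketch annotates every non-terminal, including pre-terminals, and does not cover the child annotation also mentioned above.

```python
def parent_annotate(tree, parent=None):
    """Append the parent's label to every non-terminal label.

    tree: a word (str) or a tuple (label, child1, child2, ...).
    """
    if isinstance(tree, str):
        return tree  # words (leaves) are left unchanged
    label, *children = tree
    new_label = f"{label}^{parent}" if parent else label
    # Children receive the original (unannotated) label of this node.
    return (new_label, *(parent_annotate(child, label) for child in children))

tree = ("S", ("NP", ("N", "Ali")), ("VP", ("V", "came")))
annotated = parent_annotate(tree)
```

After this transformation an NP under S and an NP under VP become distinct symbols (NP^S and NP^VP), so a PCFG estimated from the annotated trees can assign them different rule probabilities.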
In grammar, a conjunction (conj) is a part of speech that connects words, phrases or clauses, which are called the conjuncts of the conjunction. In this study, we examined conjunction phrases in the Persian tree bank. The results show that adding structural dependencies to the grammar and modifying the basic rules can remove conjunction ambiguity and increase the accuracy of a probabilistic context-free grammar parser.
When a part-of-speech (POS) tagger assigns word class labels to tokens, it has to select from a set of possible labels whose size usually ranges from fifty to several hundred, depending on the language. In this study, we investigated the effect of fine-grained and coarse-grained POS tags and of merging non-terminals on a Persian PCFG parser.
A huge volume of network traffic is generated by various applications on the Internet. In handling this volume of traffic, network management plays a crucial role. Traffic classification is a basic technique used by Internet service providers (ISPs) to manage network resources and to guarantee Internet security. In addition, growing bandwidth usage on the one hand and the limited physical capacity of communication lines on the other lead providers to improve the utilization of network resources. Classification or identification of network traffic is a critical task in network processing for traffic management, anomaly detection and improving network quality of service (QoS). Port-based and payload-based methods are two classical techniques that are applicable under traditional network conditions. However, many Internet applications use dynamic port numbers for communication, which makes it difficult to identify traffic by port number. Many applications also encrypt their data before transmission to avoid detection, so payload-based techniques are ineffective for such traffic. In recent years, statistical feature-based traffic flow identification methods (STFIM) have attracted the interest of many researchers. The most important part of an STFIM is the selection of effective statistical features.
Preliminary analysis shows that packet loss during data transmission is one of the major challenges in employing STFIM for network traffic identification. Packet loss affects the statistical characteristics of packets, such as the time interval between successive application packets, and in some cases significantly reduces the accuracy of traffic identification. The main goal of this paper is to examine the effects of packet loss on statistical features, and therefore on the accuracy of identifying applications, and to extract appropriate features that overcome these effects. For this purpose, the behavior of four statistical features is investigated: the packet size, the time interval between sending and receiving packets, the duration of the flows and the packet sending rate. Application traffic is then identified by considering the characteristics of the distributions of these features.
We collected a database of network traffic flows from seven applications with different packet loss rates. We fed the extracted features to a multilayer neural network classifier to distinguish between the traffic of different applications. Experimental results show that the extracted features are robust against packet loss, and that the accuracy of network traffic identification is close to the ideal case (traffic flows with no packet loss).
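The kind of per-flow statistics discussed above, and their sensitivity to packet loss, can be illustrated with a small sketch. The `(timestamp, size)` record format, the feature names and the use of simple means are assumptions for illustration; the abstract does not list the exact feature definitions the authors use.

```python
from statistics import mean

def flow_features(packets):
    """Basic statistical features of one flow.

    packets: non-empty list of (timestamp_seconds, size_bytes) tuples,
    sorted by time (hypothetical record format).
    """
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    duration = times[-1] - times[0]
    return {
        "mean_size": mean(sizes),
        "mean_gap": mean(gaps) if gaps else 0.0,   # mean inter-arrival time
        "duration": duration,
        "rate": len(packets) / duration if duration > 0 else 0.0,
    }

complete = flow_features([(0.0, 100), (0.5, 200), (1.0, 300), (2.0, 200)])
# The same flow with the second packet lost in transit.
lossy = flow_features([(0.0, 100), (1.0, 300), (2.0, 200)])
```

In this toy example, losing one packet raises the mean inter-arrival time from about 0.67 s to 1.0 s while leaving the duration unchanged, which shows why some features are more robust to loss than others.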
In recent years, with the growing number of online social networks, these networks have become one of the best markets for advertising and commerce, so studying them is very important. Most online social networks grow and change as new connections (new edges) form. Forecasting new edges in online social networks can give us a better understanding of how these networks grow. Link prediction has many important applications, including predicting future interactions in social networks, managing and designing useful organizational communications, and predicting and preventing relationships in terrorist groups.
There have been many studies of link prediction in engineering and the humanities. Scientists attribute the formation of a new relationship between two individuals to two causes: 1) proximity in the graph (structure) and 2) similar properties of the two individuals (the homophily principle). Many studies have been carried out based on these two approaches, and researchers have presented different similarity metrics for each category. However, the impact of the two approaches working together to create new edges remains an open problem.
Similarity metrics can also be divided into two categories: neighborhood-based and path-based. Neighborhood-based metrics have the advantage that they can be computed without access to the whole graph, whereas path-based metrics require the whole graph to be available at once.
So far, the two theoretical approaches above (proximity and homophily) have not been used together in neighborhood-based metrics. In this paper, we first provide a way to determine the importance of proximity in the graph and of similar features in the formation of connections in the graph. The resulting weights are assigned to proximity and homophily. The best similarity metric for each approach is then determined. Finally, the selected homophily similarity metric and structural similarity metric are combined using the obtained weights.
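The final combination step can be pictured as a weighted sum of a structural score and an attribute (homophily) score. In the sketch below, Jaccard similarity of neighbor sets, the fraction of matching profile fields, and the weights 0.6 and 0.4 are all stand-ins chosen for illustration; the abstract does not name the metrics or weights the authors select.

```python
def structural_similarity(graph, u, v):
    """Jaccard similarity of the neighbor sets of u and v (neighborhood-based)."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0

def homophily_similarity(profiles, u, v):
    """Fraction of shared profile fields on which u and v have the same value."""
    shared = profiles[u].keys() & profiles[v].keys()
    if not shared:
        return 0.0
    return sum(profiles[u][k] == profiles[v][k] for k in shared) / len(shared)

def combined_score(graph, profiles, u, v, w_struct, w_homo):
    """Weighted combination of the structural and homophily scores."""
    return (w_struct * structural_similarity(graph, u, v)
            + w_homo * homophily_similarity(profiles, u, v))

# Toy undirected graph as adjacency sets, plus made-up user attributes.
graph = {"a": {"c", "d"}, "b": {"c"}, "c": {"a", "b"}, "d": {"a"}}
profiles = {
    "a": {"city": "X", "major": "CS"},
    "b": {"city": "X", "major": "CS"},
}
score = combined_score(graph, profiles, "a", "b", w_struct=0.6, w_homo=0.4)
```

Both components need only the neighbors and attributes of the two nodes in question, so the combined score keeps the locality advantage of neighborhood-based metrics.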
The method was evaluated on two datasets: the Zanjan University Graduate School of Social Sciences network and the Pokec online social network. The first dataset was collected for this study using questionnaires. Since it is one of the few Iranian datasets compiled together with its users' attributes, it can be of great value. In this paper, we were able to increase the accuracy of a neighborhood-based similarity metric by using both the proximity-in-graph and homophily approaches.
In this study, a Brain-Computer Interface (BCI) for a Silent-Talk application was implemented. The goal was an electroencephalography (EEG) classifier for three classes: two imagined words (Man and Red) and silence. During the experiment, subjects were asked to silently repeat one of the two words or do nothing, in a pre-selected random order. EEG signals were recorded with a 14-channel EMOTIV wireless headset. Two combinations of features and classifiers were used: Discrete Wavelet Transform (DWT) features with a Support Vector Machine (SVM) classifier, and Principal Component Analysis (PCA) features with a minimum-distance classifier. Both combinations discriminated between the three classes much better than chance (33.3%), although neither was reliable and accurate enough for a real application. The first method (DWT+SVM) showed better results. In this case, the feature set consisted of the D2, D3, D4 and A4 coefficients of a 4-level DWT decomposition of the EEG signals, roughly corresponding to the major frequency bands (Delta, Theta, Alpha and Beta) of these signals. Three binary SVMs were used, each trained to discriminate between two of the three classes: Man/Red, Man/Silence or Red/Silence. A majority rule was used to determine the final class: when two of these classifiers output the true class, a win (correct classification) was counted; otherwise a loss (false classification) was recorded. Finally, Monte Carlo cross-validation showed an overall performance of about 56.8% correct classification, which is comparable with the results reported for similar experiments.
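The majority rule over the three pairwise classifiers can be sketched as follows. The function name and the use of `None` for an unresolved three-way tie are assumptions for illustration; the abstract only states that a trial counts as correct when two classifiers output the true class.

```python
from collections import Counter

def majority_vote(pairwise_predictions):
    """Return the class chosen by at least two pairwise classifiers.

    pairwise_predictions: outputs of the Man/Red, Man/Silence and
    Red/Silence classifiers. Returns None on a three-way tie.
    """
    label, count = Counter(pairwise_predictions).most_common(1)[0]
    return label if count >= 2 else None

def is_win(pairwise_predictions, true_label):
    """A trial is a win when the majority class equals the true class."""
    return majority_vote(pairwise_predictions) == true_label

decision = majority_vote(["Man", "Man", "Red"])
tie = majority_vote(["Man", "Silence", "Red"])
```

With three classes and three pairwise classifiers, a class can receive at most two votes, so a win requires both classifiers that involve the true class to pick it.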
© 2015 All Rights Reserved | Signal and Data Processing

