Volume 13, Issue 4 (3-2017)                   JSDP 2017, 13(4): 43-62


Vafaei Jahan M. Feature Extraction of Computer Files Structure by Statistical Analysis. JSDP 2017; 13(4): 43-62
URL: http://jsdp.rcisp.ac.ir/article-1-141-en.html
Islamic Azad University, Mashhad Branch
Abstract:

Files are among the most important sources of information and come in many formats, such as text, audio, video, images, and web pages. In-depth analysis of files, to recognize them and to investigate their distinguishing characteristics, is a significant problem in personal security, information security, file-type identification, and code structure analysis. This paper adopts a statistical approach based on the n-gram model, applied to the binary contents of files, in order to investigate the full range of a file's characteristics. To reduce the computational cost and the memory required by the n-gram model, word clustering is used. File contents are then analyzed in two modes, "full" and "blocking". In the "full" mode, characteristics such as chi-square, autocorrelation, weighted term frequency-inverse document frequency (TF-IDF), and fractal dimension are studied; in the "blocking" mode, properties such as the entropy rate and the distance are examined. The results indicate that the characteristics extracted by the first method reflect the unique properties of jpg, mp3, swf, and html files well, while those extracted by the second method clearly reflect the properties of doc, html, and pdf files.
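To make the kinds of statistics named in the abstract concrete, the following is a minimal Python sketch, not the paper's implementation. It counts overlapping byte n-grams, computes a chi-square statistic of the byte histogram against a uniform distribution (a "full" mode feature), and computes Shannon entropy per fixed-size block (a simple stand-in for the "blocking" mode entropy feature; the paper's exact entropy-rate definition is not given in the abstract). All function names, the uniform reference distribution, and the block size of 256 bytes are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(data: bytes, n: int = 1) -> Counter:
    """Count overlapping byte n-grams (keys are bytes objects of length n)."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def chi_square_uniform(data: bytes) -> float:
    """Chi-square statistic of the byte histogram against a uniform
    distribution over the 256 byte values (assumed reference distribution)."""
    if not data:
        return 0.0
    expected = len(data) / 256
    counts = Counter(data)  # iterating bytes yields ints in 0..255
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())


def block_entropies(data: bytes, block_size: int = 256) -> List[float]:
    """Entropy of each consecutive block; the last block may be shorter.
    The block size is an illustrative choice, not taken from the paper."""
    return [
        shannon_entropy(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]
```

In this sketch, a file read with `open(path, "rb").read()` can be passed directly to any of the functions. Compressed formats such as jpg or mp3 would be expected to show block entropies close to 8 bits per byte, while markup such as html would typically score lower, which is the sort of contrast such features are meant to expose.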

Type of Study: Research | Subject: Paper
Received: 2013/07/3 | Accepted: 2016/10/5 | Published: 2017/06/6 | ePublished: 2017/06/6


Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2015 All Rights Reserved | Signal and Data Processing