Steganalysis of Compressed Audio Files Based on Machine Learning

Soleimani, Mohsen; Chehel Amirani, Mahdi; Kabodian, Seied Jahanshah

doi:10.61186/jsdp.21.2.55

Volume 21, Issue 2 (10-2024) JSDP 2024, 21(2): 55-66

Soleimani M, Chehel Amirani M, Kabodian S J. Steganalysis of Compressed Audio Files Based on Machine Learning. JSDP 2024; 21(2): 5
URL: http://jsdp.rcisp.ac.ir/article-1-1273-en.html

Mohsen Soleimani, Mahdi Chehel Amirani, Seied Jahanshah Kabodian*

Razi University

Abstract:

The science of hiding a message in a carrier medium is called steganography, and the attempt to detect the presence or absence of a hidden message in a cover medium is called steganalysis. Among audio formats, MP3 is a widely used and convenient host for hidden information, and various embedding methods have been designed for it. This research presents an algorithm for audio steganalysis of compressed audio files in MP3 format in which data has been embedded using the MP3stego software. Text files with random content are used as the hidden payload. First, features are extracted from the side information of the MP3 files, and the audio data, which comprises stego files and clean files, is divided into training data and test data. Then a support vector machine is trained to distinguish stego files from clean files, and finally the performance of the system is measured on the test data. The paper introduces a new feature, spectral peakiness (SPK), extracted from the side information of the MP3 file. The proposed system was tested on separate test data containing clean files and stego files with various embedding capacities, and it distinguished clean from stego files with 100% accuracy. The results indicate perfect classification of stego and clean files on this test data, with lower computational complexity and faster steganalysis than other methods.
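The train/test procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the SVM kernel or the split ratio, so the sketch assumes a linear SVM trained by batch subgradient descent on the hinge loss and a deterministic hold-out of every fourth file per class. The feature vectors are synthetic placeholders, and all function names are hypothetical.

```python
def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def stratified_holdout(X, y, every=4):
    """Hold out every `every`-th sample of each class as test data."""
    train, test, seen = [], [], {}
    for xi, yi in zip(X, y):
        seen[yi] = seen.get(yi, 0) + 1
        (test if seen[yi] % every == 0 else train).append((xi, yi))
    return train, test

def train_linear_svm(train, lam=0.01, lr=0.1, steps=200):
    """Batch subgradient descent on the L2-regularized hinge loss (assumed trainer)."""
    n, dim = len(train), len(train[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        # Samples inside the margin contribute to the hinge-loss subgradient.
        viol = [(x, y) for x, y in train if y * (dot(w, x) + b) < 1]
        grad_w = [lam * w[j] - sum(y * x[j] for x, y in viol) / n for j in range(dim)]
        grad_b = -sum(y for _, y in viol) / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(model, x):
    w, b = model
    return 1 if dot(w, x) + b >= 0 else -1

def accuracy(model, data):
    return sum(predict(model, x) == y for x, y in data) / len(data)

# Synthetic, standardized 3-D feature vectors (placeholders, not the paper's data).
# Label +1 = stego file, -1 = clean file.
stego = [[1.8 + 0.1 * k, 2.0 - 0.05 * k, 1.5 + 0.2 * (k % 3)] for k in range(8)]
clean = [[-v for v in row] for row in stego]
X = stego + clean
y = [1] * len(stego) + [-1] * len(clean)

train, test = stratified_holdout(X, y)
model = train_linear_svm(train)
test_accuracy = accuracy(model, test)
```

On this toy data the two classes are far apart, so `test_accuracy` is 1.0; that reflects the placeholder data, not the paper's result. In practice a library SVM, for example scikit-learn's `sklearn.svm.SVC`, would normally replace the hand-written trainer.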
Instead of the audio signal data stored in the MP3 file, the proposed method uses the file's side information, which depends less on the audio content. The MDB side information in the compressed audio file is treated as a sequence, from which a new frequency-domain feature called spectral peakiness is calculated. This simple yet powerful feature is combined with the temporal average and spectral average of the MDB sequence to form a low-dimensional (three-dimensional) feature vector. A support vector machine (SVM) classifier then labels the feature vector as a suspicious (stego) file or a normal (clean) file. The feature extraction is simple and computationally cheap, yet the method reaches 100% accuracy (detection without any error) on MP3 files, even when the amount of hidden information in the audio file is very low.
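A minimal sketch of the three-dimensional feature vector described above, under stated assumptions. The abstract does not give the formula for spectral peakiness, so it is assumed here to be the peak-to-mean ratio of the magnitude spectrum of the MDB sequence, with the DC bin excluded. The input is a hypothetical list holding one MDB value per MP3 frame; parsing MP3 frames is out of scope, and the function names are illustrative only.

```python
import cmath
import math

def magnitude_spectrum(seq):
    """Magnitudes of DFT bins 1..N//2 of a real sequence (DC bin excluded)."""
    n = len(seq)
    mags = []
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(seq))
        mags.append(abs(coeff))
    return mags

def side_info_features(mdb):
    """Return (temporal mean, spectral mean, spectral peakiness) for one file."""
    mags = magnitude_spectrum(mdb)
    temporal_mean = sum(mdb) / len(mdb)
    spectral_mean = sum(mags) / len(mags)
    # Assumed definition: peak-to-mean ratio of the magnitude spectrum.
    peakiness = max(mags) / spectral_mean if spectral_mean > 0 else 0.0
    return temporal_mean, spectral_mean, peakiness

# Toy sequences (not real MP3 side information):
# a strictly alternating sequence concentrates its energy in a single bin,
# while an impulse spreads it evenly over all bins.
alternating = side_info_features([0, 4, 0, 4, 0, 4, 0, 4])
impulse = side_info_features([8, 0, 0, 0, 0, 0, 0, 0])
```

Under this assumed definition the alternating sequence yields a peakiness close to 4 (one dominant bin out of four), while the impulse yields 1 (a flat spectrum). The direct DFT costs O(N²) per file; for long sequences an FFT routine would be the usual replacement.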

Article number: 5

Keywords: Compressed Audio File, Audio Steganography, Audio Steganalysis, MP3, MP3stego

Type of Study: Research | Subject: Paper
Received: 2021/10/10 | Accepted: 2022/02/20 | Published: 2024/11/4 | ePublished: 2024/11/4

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Signal and Data Processing
