Steganography is the science of hiding a message in a carrier medium, and steganalysis is the attempt to detect whether a cover medium contains a hidden message. Among audio formats, MP3 compression has been widely used as a suitable host for hidden information, and various embedding methods have been designed for it. This research presents an algorithm for audio steganalysis of compressed MP3 files in which data has been embedded using the MP3Stego software. Text files with random content were used to prepare the embedded data. First, the necessary features are extracted from the side information of the MP3 files, and the audio data, which comprises stego files and clean files, is divided into training data and test data. Then a detection system that separates stego files from clean files is built with a machine learning technique (a support vector machine), and finally its efficiency is measured on the test data. The paper introduces a new feature, spectral peakiness (SPK), extracted from the side information of the MP3 file. The proposed system was tested on separate test data containing clean files and stego files with various embedding capacities, and it distinguished clean from stego files with 100% accuracy. The results indicate perfect classification of stego and clean files, with lower computational complexity and faster steganalysis than other methods.
Instead of using the audio signal information stored in the MP3 file, the proposed method uses the file's side information, which depends less on the audio content. The MDB side information in the compressed audio file is treated as a sequence, and a new frequency-domain feature called spectral peakiness is computed from it. This simple yet powerful feature is combined with the temporal average and spectral average of the MDB sequence to form a low-dimensional (three-dimensional) feature vector. A support vector machine (SVM) classifier then labels the feature vector as a suspicious file or a normal file. The feature extraction is simple and computationally cheap, yet it achieves 100% accuracy (recognition without any error) for MP3 files, even when the amount of hidden information in the audio file is very low.
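The three-dimensional feature vector described above might be computed along the following lines. This is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not define spectral peakiness, so the peak-to-mean ratio of the magnitude spectrum is used here as a stand-in, and removing the mean before the transform is likewise an assumption. The function names `magnitude_spectrum` and `mdb_features` are hypothetical, and the MDB sequence is assumed to have already been parsed from the MP3 side information into a list of numbers.

```python
import cmath
import math
import random
from statistics import fmean


def magnitude_spectrum(seq):
    """One-sided DFT magnitudes of the mean-removed sequence (naive O(n^2) DFT)."""
    n = len(seq)
    mean = fmean(seq)
    centered = [x - mean for x in seq]  # assumption: drop the DC offset first
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(acc))
    return mags


def mdb_features(mdb):
    """Return (temporal mean, spectral mean, spectral peakiness) for one file.

    `mdb` is the per-frame MDB sequence of a single MP3 file.
    """
    if len(mdb) < 2:
        raise ValueError("need at least two frames")
    mags = magnitude_spectrum(mdb)
    temporal_mean = fmean(mdb)
    spectral_mean = fmean(mags)
    # Assumed definition of SPK: largest spectral magnitude over the mean
    # magnitude. A flat (constant) sequence has no spectral peak, so it gets 0.
    spk = max(mags) / spectral_mean if spectral_mean > 0 else 0.0
    return temporal_mean, spectral_mean, spk


if __name__ == "__main__":
    # Synthetic sequences for illustration only; these are not real MP3 data.
    rng = random.Random(0)
    periodic = [0, 10] * 16
    irregular = [rng.randint(0, 511) for _ in range(32)]
    print("periodic :", mdb_features(periodic))
    print("irregular:", mdb_features(irregular))
```

Each file's tuple would then form one row of the training or test matrix passed to an SVM classifier; that step is omitted here because the abstract does not specify the kernel or its parameters.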
Rights and permissions

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.