Enhancing Audio Deepfake Detection using Support Vector Machines and Mel-Frequency Cepstral Coefficients
Abstract
This paper presents a machine learning system that distinguishes genuine from synthetic speech using a Support Vector Machine (SVM) classifier. The system extracts Mel-Frequency Cepstral Coefficients (MFCCs) as features and is trained on the 'for-original' version of the Fake-or-Real (FoR) dataset, which contains over 195,000 genuine and computer-generated utterances. In evaluation, the system reaches an accuracy of 97.28%, which suggests it could be effective in real-world applications. By highlighting the importance of raw data in classifier training for deepfake detection, the work lays a foundation for future improvements in detection robustness and reliability.
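The pipeline the abstract describes (MFCC extraction followed by SVM classification) can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the frame size, hop length, number of mel bands and coefficients, the per-utterance mean and standard deviation summary, the RBF kernel, and the helper names `mfcc`, `utterance_features`, and `toy_clip`. The audio is synthetic placeholder data, not the FoR dataset, so the resulting score says nothing about the reported 97.28%.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def mfcc(signal, sr=16000, n_fft=512, hop=160, n_mels=26, n_mfcc=13):
    """Return an (n_frames, n_mfcc) array of MFCCs (all sizes are assumed values)."""
    # Slice the signal into overlapping, Hann-windowed frames.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    # Power spectrum -> mel band energies -> log compression.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)
    # Unnormalised DCT-II over the mel axis, keeping the first n_mfcc coefficients.
    basis = np.cos(np.pi * np.outer(np.arange(n_mfcc), 2 * np.arange(n_mels) + 1) / (2 * n_mels))
    return log_mel @ basis.T


def utterance_features(signal, sr=16000):
    """Summarise frame-level MFCCs as one fixed-length vector (an assumed choice)."""
    coeffs = mfcc(signal, sr)
    return np.concatenate([coeffs.mean(axis=0), coeffs.std(axis=0)])


rng = np.random.default_rng(0)


def toy_clip(label, sr=16000):
    """Placeholder audio: label 0 is white noise, label 1 is a tone plus faint noise."""
    t = np.arange(sr) / sr
    noise = rng.normal(size=sr)
    if label == 1:
        return 0.1 * noise + np.sin(2 * np.pi * rng.uniform(200, 400) * t)
    return noise


labels = np.array([0, 1] * 20)
X = np.stack([utterance_features(toy_clip(y)) for y in labels])

# Standardise the features, then fit an SVM (the RBF kernel is an assumption).
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X[:30], labels[:30])
toy_accuracy = clf.score(X[30:], labels[30:])
```

With real recordings, `toy_clip` would be replaced by code that loads each utterance as a mono floating-point array at a known sample rate. The held-out split would also need to keep the training and test utterances separate, for example by speaker or by synthesis system.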