Deep Learning and Fourier Transform for Speaker Recognition (DLFSR)
Fayoum University Journal of Engineering
Volume 8, Issue 1, January 2025, Pages 143-151
Document Type: Original Article
DOI: 10.21608/fuje.2024.313518.1090
Authors
Taqwa Mahmoud Sayed
1 Tamiyyah, Fayoum, Egypt
2 Kyman Faryes, Faculty of Engineering
3 Computers and Systems Engineering Department, Faculty of Engineering, Fayoum University, Fayoum, Egypt
Abstract
Automatic speaker recognition (ASR) and verification have gained increased visibility and significance in society as a speech technology. Deep learning techniques, specifically deep neural networks (DNNs), have revolutionized speaker recognition. Models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) can learn discriminative features directly from unprocessed speech signals, without manual feature extraction. End-to-end speaker recognition models are increasingly used because they perform well and map speech waveforms directly to speaker IDs. Such systems recognize and authenticate people based on their distinct vocal traits. Applications of automatic speaker recognition include voice-based authentication on digital devices, forensic analysis of audio recordings, access control, and phone-based customer support identification. In this study, we introduce a Deep Learning and Fourier Transform for Speaker Recognition (DLFSR) model based on the Short-Time Fourier Transform (STFT): the input speech is transformed into a spectrogram, and a convolutional neural network (CNN) is then applied to the spectrogram images to extract features and classify the speaker. Training and validation are carried out on the 16000pcm speaker recognition dataset. The model achieves 98.8% correct identification and classification.
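The first stage of the pipeline described in the abstract, turning a speech waveform into a spectrogram with the STFT, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `stft_spectrogram`, the 25 ms Hann window, the 10 ms hop, the 512-point FFT, and the log compression are all assumptions chosen for a 16 kHz signal, and the CNN classification stage is omitted.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=400, hop=160, n_fft=512):
    """Log-magnitude spectrogram of shape (n_frames, n_fft // 2 + 1).

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping frames and apply the window.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One real FFT per frame (zero-padded to n_fft) gives the STFT.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    # Log compression makes the magnitudes easier to treat as an image.
    return np.log1p(np.abs(spectrum))

# Synthetic stand-in for one second of 16 kHz speech: a 1 kHz tone.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = stft_spectrogram(tone)
```

The resulting 2-D array has one row per time frame and one column per frequency bin, which is the kind of image-like input a CNN classifier would consume in a pipeline of this type.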
Keywords
ASR; STFT; CNN; RNN; DLFTSR; pcm dataset