


International Journal of Speech Technology, Volume 28
Volume 28, Number 1, March 2025
- Hind Ait Mait, Noureddine Aboutabit: Unsupervised phoneme segmentation of continuous Arabic speech. 1-12
- K. V. Aljinu Khadar, R. K. Sunil Kumar, V. V. Sameer: Speaker diarization based on X vector extracted from time-delay neural networks (TDNN) using agglomerative hierarchical clustering in noisy environment. 13-26
- Mohamed Daouad, Fadoua Ataa-Allah, El Wardani Dadi: Optimizing Whisper models for Amazigh ASR: a comparative analysis. 27-37
- Suresh Veesa, Madhusudan Singh: Deep learning countermeasures for detecting replay speech attacks: a review. 39-51
- Hossam Boulal, Farida Bouroumane, Mohamed Hamidi, Jamal Barkani, Mustapha Abarkan: Exploring data augmentation for Amazigh speech recognition with convolutional neural networks. 53-65
- Ahlam Khan, Abdul Malik Abbasi, Illahi Bakhsh, Neda Kameh Khosh: An acoustic analysis of sentence level prominence in Pakistani English speech. 67-83
- Marwah Alian, Ibtisam Hussein, Maisa Al-Khazaleh: Assessment of ChatGPT in extracting Arabic morphology: active and passive participles. 85-97
- Sahar Farazi, Yasser Shekofteh: Evaluation of phone posterior probabilities for pathology detection in speech data using deep learning models. 99-116
- A. Revathi, N. Sasikaladevi: Robust sound-based bird classification using multiple features and random forest classifier. 117-127
- Zhipeng Yin, Xinzhou Xu, Björn W. Schuller: Request and complaint recognition in call-center speech using a pointwise-convolution recurrent network. 129-139
- Qiong Hu: Sentiment analysis algorithm based on word embedding in text mining. 141-151
- Mahadevaswamy Shanthamallappa, B. P. Pradeep Kumar: Enhanced perceptual wavelet packet features for spontaneous Kannada sentence recognition under uncontrolled conditions. 153-174
- Khamis A. Al-Karawi: Convolutional neural network-based detection of audio replay attacks in speaker verification systems. 175-184
- Sachin Anap, Satish R. Jondhale, Balasaheb S. Agarkar, Sachin Chaudhari: Parkinson's disease detection from speech using combination of empirical wavelet transform and Hilbert transform. 185-194
- Raja Bhargava, N. Arivazhagan, Kunchala Suresh Babu: Hybrid RMDL-CNN for speech recognition from unclear speech signal. 195-217
- Ghizlane Bourahouat, Manar Abourezq, Najima Daoudi: Enhancing Arabic text summarisation using Arabic bidirectional encoder representations from transformers-based models together with POS. 219-230
- Boumegouas Rahil, Djendi Mohamed: Blind audio-denoising algorithm based on the double-sensor affine projection approach. 231-244
- Manish Tiwari, Deepak Kumar Verma: Gender recognition in text-independent speaker identification using MFCC, spectrogram, Bi-LSTM, and rat swarm evolutionary algorithm optimization. 245-260
- Sirinda Palahan, Yuwaree Tongvivat: Toward automating the rhythmic analysis of speech: a comparative study of English spoken by American and Thai speakers. 261-279
- Akanksha Yadav, Namrata Dhanda, Debabrata Singh: Intelligent cognitive health monitoring for seniors using natural language processing-integrated GRU attention and Q-reinforcement learning. 281-298
- Meryam Telmem, Naouar Laaidi, Hassan Satori: The impact of MFCC, spectrogram, and Mel-Spectrogram on deep learning models for Amazigh speech recognition system. 299-312
- Joan L. Imbwaga, Nagaratna B. Chittaragi, Shashidhar G. Koolagudi: Correction: Automatic hate speech detection in audio using machine learning algorithms. 313
- Retraction Note: Efficient diabetes mellitus prediction with grid based random forest classifier in association with natural language processing. 315
Volume 28, Number 2, June 2025
- Benjamin O'Brien, Anna Marcyzk: A spectrotemporal modulation application for distinguishing modal and whistled speech. 317-324
- Dip Kumar Saha, Tushar Deb Nath: A lightweight CNN-based ensemble approach for early detecting Parkinson's disease with enhanced features. 325-339
- S. Satheeswara Reddy, Sheik Khadar Ahmad Mnoj, Karu Prasada Rao: DNNT (Deep Neural Network for Telugu): a framework for speech recognition of Telugu language with parallel computing approach. 341-349
- Aonmoy Das, Ananna Dev Aishi, Masbah Uddin Toha, Md Fazlul Kader: Bangla sign language translator for deaf and speech impaired people using deep LSTM. 351-368
- Shubhojeet Paul, Vandana Bhattacharjee, Sujan Kumar Saha: Towards development of the first continuous speech recognition system in Indian language Nagpuri. 369-380
- K. V. Aljinu Khadar, R. K. Sunil Kumar, V. V. Sameer: Speakers' height, weight, and speaking position identification from harmonic-related features. 381-396
- Mohit Kumar, Sushant, Arun Kumar Yadav: Speech signal's phase information based Alzheimer's disease detection using deep learning. 397-410
- Abhishek Nandal, Mohit Dua: A hybrid approach to secure automatic speaker verification: integrating clone detection and speaker identification. 411-429
- M. R. Prasad, Manjunath B. Talawar, N. Jagadisha, Sharana Basavana Gowda: Evaluating and integrating superior noise suppression algorithm into speech coding system. 431-441
- Ling Pan: The importance of deep learning models in speech signal processing: fundamentals, strategies, and future research directions. 443-459
- Amina Salifu, Henry Nunoo-Mensah, Eric Tutu Tchao, Francisca Adoma Acheampong, Andrew Selasi Agbemenu, Jerry John Kponyo: Enhancing speech recognition through diverse shared features accent classification. 461-481
- A. Femina Jalin, J. Jaya Kumari: An efficient text to speech system for Tamil language based on deep learning approach. 483-497
- Subreena Mushtaq, Samrah Mehraj, Shabir A. Parah: Robust audio watermarking using adaptive embedding in intrinsic modes. 499-514
- Ammar El-Hassan, Mohammad Azzeh, Bashar El-Rashdan: NQF: A deep learning model for classifying course outcomes in the national qualification framework. 515-529
- M. Mahesh, Keshav M. V. Pradosh: Mathematical and prosodic analysis of intonation in Malayalam. 531-540
- Ajay Babasaheb Kapase, Nilesh Uke: A comprehensive review in affective computing: an exploration of artificial intelligence in unimodal and multimodal emotion recognition systems. 541-563
- Manju Suresh, Rajeev Rajan, Joshua Thomas: Speaker independent dysarthria severity classification using synthesis-based augmentation. 565-580
- Ramzi Haraty, Mira Chehade: Transfer learning and sentiment analysis of lebanese dialect data using a multilingual deep learning approach. 581-595
- Aishwarya Gupta, Archana Purwar: Direct normalized cut clustering using a novel robust cluster estimation technique for multi-speaker diarization. 597-609
- Retraction Note: Predicting the supreme court decision on appeal cases using hierarchical convolutional neural network. 611
- Retraction Note: Firefly algorithm: an optimization solution in big data processing for the healthcare and engineering sector. 613
Volume 28, Number 3, September 2025
- Khamis A. Al-Karawi, Mahmoud M. Abdelwahab, Abdulrahman Saqer Alenizi: Comprehensive review of automatic speaker verification with spoofing detection techniques attacks. 615-638
- Naouar Laaidi, Abderrahim Ezzine, Hassan Satori: Embedded speech recognition for Arabic language in industrial command. 639-651
- Aparna Vyakaranam, Tomas Maul, Bavani Ramayah: Comparison of three hybrid architectures using 1D, 2D, and 3D CNNs for speech emotion recognition. 653-669
- Abdellah Kacha, Francis Grenez: Cepstral analysis of noisy speech for disordered voices assessment. 671-684
- H. M. Chandrashekar, Veena Karjigi, N. Sreedevi: Significance of magnitude and phase components for intelligibility assessment of pathological speech. 685-697
- Antor Mahamudul Hashan: Enhancing hyperkinetic dysarthria speech recognition through deep cascade convolution and integrated attention mechanism. 699-707
- Hadhami Aouani, Yassine Ben Ayed: The influence of optimizer function for emotion detection from speech with data augmentation. 709-728
- Yaping Yang, Zain Ul Abideen, Amir Ali, Muhammad Aoun, Tehseen Mazhar, Tariq Shahzad: A hybrid model combining GCN transformer and Word2Vec for Chinese sequence labeling with deep linguistic knowledge. 729-743
- Abdelkader Benzirar, Mohamed Hamidi, Mouncef Filali Bouami: Building a speech emotion recognition system using RNN, GRU and LSTM. 745-759
- Muhammad Owais, Khadija Sarwar, Khurram Khan Jadoon, Junaid Yousaf: Emotion detection in Urdu speech: a deep hybrid learning approach. 761-772
