25th Interspeech 2024: Kos, Greece

Keynote 1 ISCA Medallist

L2 Speech, Bilingualism and Code-Switching

Speaker Diarization 1

Speech and Audio Analysis and Representations

Acoustic Event Detection and Classification 2

Detection and Classification of Bioacoustic Signals

Acoustic Echo Cancellation

Speech Synthesis: Voice Conversion 1

Neural Network Architectures for ASR 2

Decoding Algorithms

Pronunciation Assessment

Spoken Language Processing

Spoken Machine Translation 2

Biosignal-enabled Spoken Communication

Individual and Social Factors in Phonetics

Paralinguistics

Speaker Recognition: Adversarial and Spoofing Attacks

Audio Event Detection and Classification 1

Source Separation 2

Noise Reduction, Dereverberation, and Echo Cancellation

Computationally-Efficient Speech Enhancement

Zero-shot TTS

Noise Robustness, Far-Field, and Multi-Talker ASR

Contextual Biasing and Adaptation

Spoken Language Understanding

Spoken Machine Translation 1

Hearing Disorders

Speech Disorders 2

TAUKADIAL Challenge: Speech-Based Cognitive Assessment in Chinese and English (Special Session)

Show and Tell 1

Keynote 2

Phonetics and Phonology of Second Language Acquisition

Corpora-based Approaches in Automatic Emotion Recognition

Analysis of Speakers States and Traits

Spoofing and Deepfake Detection

Audio Captioning, Tagging, and Audio-Text Retrieval

Generative Speech Enhancement

Speech Synthesis: Evaluation

Multilingual ASR

General Topics in ASR

Spoken Language Understanding

Speech and Multimodal Resources

Pathological Speech Analysis 1

Speech and Language in Health: from Remote Monitoring to Medical Conversations - 1 (Special Session)

Speech and Brain

Innovative Methods in Phonetics and Phonology

Voice, Tones and F0

Emotion Recognition: Resources and Benchmarks

Speaker and Language Identification and Diarization

Audio-Text Retrieval

Speech Enhancement

Speech Coding

Speech Synthesis: Expressivity and Emotion

Speech Synthesis: Tools and Data

Speech Synthesis: Singing Voice Synthesis

LLM in ASR

Vision and Speech

Spoken Document Summarization

Speech and Language in Health: from Remote Monitoring to Medical Conversations - 2 (Special Session)

Show and Tell 2

Prosody

Foundational Models for Deepfake and Spoofed Speech Detection

Speaker Recognition 1

Source Separation 1

Audio-Visual and Generative Speech Enhancement

Speech Privacy and Bandwidth Expansion

Speech Synthesis: Prosody

Accented Speech, Prosodic Features, Dialect, Emotion, Sound Classification

Neural Network Adaptation

ASR and LLMs

Pathological Speech Analysis 3

Speech Disorders 3

Speech Recognition with Large Pretrained Speech Models for Under-represented Languages (Special Session)

Speech Processing Using Discrete Speech Units (Special Session)

Keynote 3

Databases and Progress in Methodology

Articulation, Convergence and Perception

Speech Emotion Recognition

Self-Supervised Models in Speaker Recognition

Speech Quality Assessment

Privacy and Security in Speech Communication 1

Speech Synthesis: Voice Conversion 2

Speech Synthesis: Text Processing

Training Methods, Self-Supervised Learning, Adaptation

Novel Architectures for ASR

Multimodality and Foundation Models

Spoken Dialogue Systems and Conversational Analysis 1

Speech Technology

Pathological Speech Analysis 2

Speech Science, Speech Technology, and Gender (Special Session)

Speech Production and Perception

Phonetics and Phonology: Segmentals and Suprasegmentals

Topics in Paralinguistics

Emotion Recognition: Fairness, Variability, Uncertainty

Speaker Verification

Spatial Audio and Acoustics

Generative Models for Speech and Audio

Speech and Audio Modelling

Multi-Channel Speech Enhancement

Speech Synthesis: Paradigms and Methods 1

Speech Synthesis: Paradigms and Methods 2

Neural Network Architectures for ASR 1

Error Correction and Rescoring

Spoken Language Understanding

Spoken Dialogue Systems and Conversational Analysis 2

Computational Models of Human Language Acquisition, Perception, and Production (Special Session)

Show and Tell 3

Phonetics, Phonology and Prosody

Segmentals

New Avenues in Emotion Recognition

Speaker Diarization 2

Speaker Recognition 2

Speech and Audio Analysis

Speech Quality and Intelligibility: Prediction and Enhancement

Speech Synthesis: Vocoders

ASR Model Training Methods

Cross-Lingual and Multilingual Processing

Speech Assessment

Question Answering from Speech and Spoken Dialogue Systems

Spoken Dialogue Systems and Conversational Analysis 3

Dysarthric Speech Assessment

Spoken Language Models for Universal Speech Processing (Special Session)

Keynote 4

L1/L2 Acquisition and Cross-Linguistic Factors

Speaker Stance, Emotion and Language-External Factors

Experimental Phonetics and Laboratory Phonology

Speaker Recognition Evaluation and Resources

Speech Type Classification

Target Speaker Extraction

Speech Synthesis: Voice Conversion 3

Speech Synthesis: Paradigms and Methods 3

Privacy and Security in Speech Communication 2

Streaming ASR

Computational Resource Constrained ASR

Evaluation of Speech Technology Systems

Neural Network Training for Speech Recognition

Leveraging Large Language Models and Contextual Features for Phonetic Analysis (Special Session)

Responsible Speech Foundation Models (Special Session)

Multimodal Paralinguistics

Automatic Emotion Recognition

Self and Weakly-Labelled Speaker Verification

Acoustic Event Detection, Segmentation and Classification

Speech and Audio Modelling

Fake Audio Detection

Deep Learning-Based Speech Enhancement: Approaches, Scalability, and Evaluation

Speech Synthesis: Other Topics 1

Speech Synthesis: Other Topics 2

Speech Synthesis: Cross-Lingual and Multilingual Aspects

Noise, Far-Field, Multi-Talker, Enhancement, Audio Classification

Self-Supervised Learning for ASR

Spoken Term Detection and Speech Retrieval

Speech Disorders 1

Connecting Speech-science and Speech-technology for Children's Speech (Special Session)

Show and Tell 4