


ICSLP 1996: Philadelphia, PA, USA
The 4th International Conference on Spoken Language Processing, Philadelphia, PA, USA, October 3-6, 1996. ISCA 1996

Plenary Lectures
- Anne Cutler: The comparative study of spoken-language processing. 1
- James L. Flanagan: Natural communication with machines - progress and challenge. 2522

Large Vocabulary
- Zhishun Li, Michel Héon, Douglas D. O'Shaughnessy: New developments in the INRS continuous speech recognition system. 2-5
- Lori Lamel, Gilles Adda: On designing pronunciation lexicons for large vocabulary, continuous speech recognition. 6-9
- Pablo Fetter, Frédéric Dandurand, Peter Regel-Brietzmann: Word graph rescoring using confidence measures. 10-13
- Xavier L. Aubert, Peter Beyerlein, Meinhard Ullrich: A bottom-up approach for handling unseen triphones in large vocabulary continuous speech recognition. 14-17
- V. Valtchev, Philip C. Woodland, Steve J. Young: Discriminative optimisation of large vocabulary recognition systems. 18-21
- Tatsuo Matsuoka, Katsutoshi Ohtsuki, Takeshi Mori, Sadaoki Furui, Katsuhiko Shirai: Japanese large-vocabulary continuous-speech recognition using a business-newspaper corpus. 22-25
- David M. Carter, Jaan Kaja, Leonardo Neumeyer, Manny Rayner, Fuliang Weng, Mats Wirén: Handling compound nouns in a Swedish speech-understanding system. 26-29
- Javier Macías Guarasa, Ascensión Gallardo-Antolín, Javier Ferreiros, José Manuel Pardo, Luis Villarrubia Grande: Initial evaluation of a preselection module for a flexible large vocabulary speech recognition system in. 30-33

Multimodal ASR (Face and Lips)
- Mamoun Alissali, Paul Deléglise, Alexandrina Rogozan: Asynchronous integration of visual information in an automatic speech recognition system. 34-37
- Iain A. Matthews, J. Andrew Bangham, Stephen J. Cox: Audiovisual speech recognition using multiscale nonlinear image decomposition. 38-41
- Qin Su, Peter L. Silsbee: Robust audiovisual integration using semicontinuous hidden Markov models. 42-45
- Richard P. Schumeyer, Kenneth E. Barner: The effect of visual information on word initial consonant perception of dysarthric speech. 46-49
- Devi Chandramohan, Peter L. Silsbee: A multiple deformable template approach for visual speech recognition. 50-53
- Piero Cosi, Emanuela Magno Caldognetto, Franco Ferrero, M. Dugatto, Kyriaki Vagges: Speaker independent bimodal phonetic recognition experiments. 54-57
- Juergen Luettin, Neil A. Thacker, Steve W. Beet: Speechreading using shape and intensity information. 58-61
- Juergen Luettin, Neil A. Thacker, Steve W. Beet: Speaker identification by lipreading. 62-65

Perception of Words
- David W. Gow Jr., Janis Melvold, Sharon Manuel: How word onsets drive lexical access and segmentation: evidence from acoustics, phonology and processing. 66-69
- David van Kuijk, Peter Wittenburg, Ton Dijkstra: RAW: a real-speech model for human word recognition. 70-73
- Mehdi Meftah, Sami Boudelaa: How facilitatory can lexical information be during word recognition? Evidence from Moroccan Arabic. 74-77
- Alette P. Haveman: Effects of frequency on the auditory perception of open- versus closed-class words. 78-81
- Michael S. Vitevitch, Paul A. Luce, Jan Charles-Luce, David Kemmerer: Phonotactic and metrical influences on adult ratings of spoken nonsense words. 82-85
- Edward T. Auer, Lynne E. Bernstein: Lipreading supplemented by voice fundamental frequency: to what extent does the addition of voicing increase lexical uniqueness for the lipreader? 86-89
- Saskia te Riele, Sieb G. Nooteboom, Hugo Quené: Strategies used in rhyme-monitoring. 90-93
- Wilma van Donselaar, Cecile T. L. Kuijpers, Anne Cutler: How do Dutch listeners process words with epenthetic schwa? 94-97

Phonetics, Transcription, and Analysis
- Patrick Juola, Philip Zimmermann: Whole-word phonetic distances and the PGPfone alphabet. 98-101
- Shuping Ran, J. Bruce Millar, Phil Rose: Automatic vowel quality description using a variable mapping to an eight cardinal vowel reference set. 102-105
- Andreas Kipp, Maria-Barbara Wesenick, Florian Schiel: Automatic detection and segmentation of pronunciation variants in German speech corpora. 106-109
- Stephanie Seneff, Raymond Lau, Helen M. Meng: ANGIE: a new framework for speech analysis based on morpho-phonological modelling. 110-113
- Byunggon Yang: Perceptual contrast in the Korean and English vowel system normalized. 114-117
- Yong-Ju Lee, Sook-Hyang Lee: On phonetic characteristics of pause in the Korean read speech. 118-120
- Sami Boudelaa, Mehdi Meftah: Cross-language effects of lexical stress in word recognition: the case of Arabic English bilinguals. 121-124
- Maria-Barbara Wesenick: Automatic generation of German pronunciation variants. 125-128
- Maria-Barbara Wesenick, Andreas Kipp: Estimating the quality of phonetic transcriptions and segmentations of speech signals. 129-132
- Bojan Petek, Rastislav Sustarsic, Smiljana Komar: An acoustic analysis of contemporary vowels of the standard Slovenian language. 133-136
- Sandrine Robbe, Anne Bonneau, Sylvie Coste, Yves Laprie: Using decision trees to construct optimal acoustic cues. 137-140
- Donna Erickson, Osamu Fujimura: Maximum jaw displacement in contrastive emphasis. 141-144
- Rebecca Herman, Mary E. Beckman, Kiyoshi Honda: Subglottal pressure and final lowering in English. 145-148
- Cecile T. L. Kuijpers, Wilma van Donselaar, Anne Cutler: Phonological variation: epenthesis and deletion of schwa in Dutch. 149-152

Spoken Language Processing for Special Populations
- James J. Mahshie: Feedback considerations for speech training systems. 153-156
- Anne-Marie Öster: Clinical applications of computer-based speech training for children with hearing impairment. 157-160
- Valérie Hazan, Andrew Simpson: Enhancing information-rich regions of natural VCV and sentence materials presented in noise. 161-164
- Valérie Hazan, Alan Adlard: Speech perceptual abilities of children with specific reading difficulty (dyslexia). 165-168
- Larry D. Paarmann, Michael K. Wynne: Bimodal perception of spectrum compressed speech. 169-172
- Dragana Barac-Cikoja, Sally Revoile: Effect of sentential context on syllabic stress perception by hearing-impaired listeners. 173-175
- Martin J. Russell, Catherine Brown, Adrian Skilling, Robert W. Series, Julie L. Wallace, Bill Bohnam, Paul Barker: Applications of automatic speech recognition to speech and language development in young children. 176-179
- D. R. Campbell: Sub-band adaptive speech enhancement for hearing aids. 180-183
- Thomas Portele, Jürgen Krämer: Adapting a TTS system to a reading machine for the blind. 184-187

Dialogue Special Sessions
- Katsuhiko Shirai: Modeling of spoken dialogue with and without visual information. 188-191
- Stephanie Seneff, David Goddeau, Christine Pao, Joseph Polifroni: Multimodal discourse modelling in a multi-user multi-domain environment. 192-195
- Kenji Kita, Yoshikazu Fukui, Masaaki Nagata, Tsuyoshi Morimoto: Automatic acquisition of probabilistic dialogue models. 196-199
- Paul Heisterkamp, Scott McGlashan: Units of dialogue management: an example. 200-203
- Sharon L. Oviatt, Robert VanGent: Error resolution during multimodal human-computer interaction. 204-207
- Ramesh R. Sarukkai, Dana H. Ballard: Improved spontaneous dialogue recognition using dialogue and utterance triggers by adaptive probability boosting. 208-211
- Kai Hübener, Uwe Jost, Henrik Heine: Speech recognition for spontaneously spoken German dialogues. 212-215
- Paul Taylor, Hiroshi Shimodaira, Stephen Isard, Simon King, Jacqueline C. Kowtko: Using prosodic information to constrain language models for spoken dialogue. 216-219
- Peter A. Heeman, Kyung-ho Loken-Kim, James F. Allen: Combining the detection and correction of speech repairs. 362-365
- Yuji Sagawa, Wataru Sugimoto, Noboru Ohnishi: Generating spontaneous elliptical utterance. 366-369
- Gösta Bruce, Marcus Filipsson, Johan Frid, Björn Granström, Kjell Gustafson, Merle Horne, David House, Birgitta Lastow, Paul Touati: Developing the modelling of Swedish prosody in spontaneous dialogue. 370-373
- Shimei Pan, Kathleen R. McKeown: Spoken language generation in a multimedia system. 374-377
- Keikichi Hirose, Mayumi Sakata, Hiromichi Kawanami: Synthesizing dialogue speech of Japanese based on the quantitative analysis of prosodic features. 378-381
- Shuichi Tanaka, Shu Nakazato, Keiichiro Hoashi, Katsuhiko Shirai: Spoken dialogue interface in a dual task situation. 382-385
- Yasuhisa Niimi, Yutaka Kobayashi: A dialogue control strategy based on the reliability of speech recognition. 534-537
- Alexander I. Rudnicky, Stephen Reed, Eric H. Thayer: Speechwear: a mobile speech system. 538-541
- Helen M. Meng, Senis Busayapongchai, James R. Glass, David Goddeau, I. Lee Hetherington, Edward Hurley, Christine Pao, Joseph Polifroni, Stephanie Seneff, Victor Zue: WHEELS: a conversational system in the automobile classifieds domain. 542-545
- M. David Sadek, Alexandre Ferrieux, A. Cozannet, Philippe Bretier, Franck Panaget, J. Simonin: Effective human-computer cooperative spoken dialogue: the AGS demonstrator. 546-549
- Samir Bennacef, Laurence Devillers, Sophie Rosset, Lori Lamel: Dialog in the RAILTEL telephone-based system. 550-553
- Alon Lavie, Lori S. Levin, Yan Qu, Alex Waibel, Donna Gates, Marsal Gavaldà, Laura Mayfield, Maite Taboada: Dialogue processing in a conversational speech translation system. 554-557

Language Modeling
- Thomas Niesler, Philip C. Woodland: Combination of word-based and category-based language models. 220-223
- Francisco J. Valverde-Albacete, José Manuel Pardo: A multi-level lexical-semantics based language model design for guided integrated continuous speech recognition. 224-227
- Florian Gallwitz, Elmar Nöth, Heinrich Niemann: A category based approach for recognition of out-of-vocabulary words. 228-231
- Kristie Seymore, Ronald Rosenfeld: Scalable backoff language models. 232-235
- Rukmini Iyer, Mari Ostendorf: Modeling long distance dependence in language: topic mixtures vs. dynamic cache models. 236-239
- Marcello Federico: Bayesian estimation methods for n-gram language model adaptation. 240-243
- Man-Hung Siu, Mari Ostendorf: Modeling disfluencies in conversational speech. 386-389
- John Miller, Fil Alleva: Evaluation of a language model using a clustered model backoff. 390-393
- Antonio Bonafonte, José B. Mariño: Language modeling using x-grams. 394-397
- Klaus Ries, Finn Dag Buø, Alex Waibel: Class phrase models for language modelling. 398-401
- Petra Geutner: Introducing linguistic constraints into statistical language modeling. 402-405
- Jianying Hu, William Turin, Michael K. Brown: Language modeling with stochastic automata. 406-409

Feature Extraction for Speech Recognition
- Don X. Sun: Feature dimension reduction using reduced-rank maximum likelihood estimation for hidden Markov models. 244-247
- Kai Hübener: Using multi-level segmentation coefficients to improve HMM speech recognition. 248-251
- Thomas Eisele, Reinhold Haeb-Umbach, Detlev Langmann: A comparative study of linear feature transformation techniques for automatic speech recognition. 252-255
- Ben Milner: Inclusion of temporal information into features for speech recognition. 256-259
- Hubert Wassner, Gérard Chollet: New cepstral representation using wavelet analysis and spectral transformation for robust speech recognition. 260-263
- Christopher John Long, Sekharajit Datta: Wavelet based feature extraction for phoneme recognition. 264-267
- Andrzej Drygajlo: New fast wavelet packet transform algorithms for frame synchronized speech processing. 410-413
- Srinivasan Umesh, Leon Cohen, Nenad Marinovic, Douglas J. Nelson: Frequency-warping in speech. 414-417
- Daisuke Kobayashi, Shoji Kajita, Kazuya Takeda, Fumitada Itakura: Extracting speech features from human speech-like noise. 418-421
- Shoji Kajita, Kazuya Takeda, Fumitada Itakura: Subband-crosscorrelation analysis for robust speech recognition. 422-425
- Hervé Bourlard, Stéphane Dupont: A new ASR approach based on independent processing and recombination of partial frequency bands. 426-429
- Climent Nadeu, José B. Mariño, Javier Hernando, Albino Nogueiras: Frequency and time filtering of filter-bank energies for HMM speech recognition. 430-433

Speech Production - Measurement and Modeling
- Yves Laprie, Marie-Odile Berger: Extraction of tongue contours in x-ray images with minimal user interaction. 268-271
- Didier Demolin, Thierry Metens, Alain Soquet: Three-dimensional measurement of the vocal tract by MRI. 272-275
- Philip Gleason, Betty Tuller, J. A. Scott Kelso: Syllable affiliation of final consonant clusters undergoes a phase transition over speaking rates. 276-278
- Arthur Lobo, Michael H. O'Malley: Towards a biomechanical model of the larynx. 279-282
- Yann Morlec, Gérard Bailly, Véronique Aubergé: Generating intonation by superposing gestures. 283-286
- Hideki Kawahara, Hiroko Kato, J. C. Williams: Effects of auditory feedback on F0 trajectory generation. 287-290

Speech Coding / HMMs and NNs in ASR
- Ian S. Burnett, John J. Parry: On the effects of accent and language on low rate speech coders. 291-294
- Jeng-Shyang Pan, Fergus R. McInnes, Mervyn A. Jack: VQ codevector index assignment using genetic algorithms for noisy channels. 295-298
- Gavin C. Cawley: An improved vector quantization algorithm for speech transmission over noisy channels. 299-301
- C. Murgia, Gang Feng, Alain Le Guyader, Catherine Quinquis: Very low delay and high quality coding of 20 Hz-15 kHz speech signals at 64 kbit/s. 302-305
- Carlos M. Ribeiro, Isabel Trancoso: Application of speaker modification techniques to phonetic vocoding. 306-309
- Tadashi Yonezaki, Kiyohiro Shikano: Entropy coded vector quantization with hidden Markov models. 310-313
- Minoru Kohata: An application of recurrent neural networks to low bit rate speech coding. 314-317
- Kazuhito Koishida, Keiichi Tokuda, Takao Kobayashi, Satoshi Imai: CELP coding system based on mel-generalized cepstral analysis. 318-321
- Cheung-Fat Chan, Wai-Kwong Hui: Wideband re-synthesis of narrowband CELP-coded speech using multiband excitation model. 322-325
- Takuya Koizumi, Mikio Mori, Shuji Taniguchi, Mitsutoshi Maruya: Recurrent neural networks for phoneme recognition. 326-329
- M. A. Mokhtar, A. Zein-el-Abddin: A model for the acoustic phonetic structure of Arabic language using a single ergodic hidden Markov model. 330-333
- Yifan Gong, Irina Illina, Jean Paul Haton: Modelling long term variability information in mixture stochastic trajectory framework. 334-337
- Thierry Moudenc, Robert Sokol, Guy Mercier: Segmental phonetic features recognition by means of neural-fuzzy networks and integration in an n-best solutions post-processing. 338-341
- Irina Illina, Yifan Gong: Stochastic trajectory model with state-mixture for continuous speech recognition. 342-345
- Hermann Hild, Alex Waibel: Recognition of spelled names over the telephone. 346-349
- Gilles Boulianne, Patrick Kenny: Optimal tying of HMM mixture densities using decision trees. 350-353
- Hwan Jin Choi, Yung-Hwan Oh: Speech recognition using an enhanced FVQ based on a codeword dependent distribution normalization and codeword weighting by fuzzy objective function. 354-357
- Mikko Kurimo, Panu Somervuo: Using the self-organizing map to speed up the probability density estimation for speech recognition with mixture density HMMs. 358-361

Vowels
- Carrie E. Lang, John J. Ohala: Temporal cues for vowels and universals of vowel inventories. 434-437
- Ann K. Syrdal: Acoustic variability in spontaneous conversational speech of American English talkers. 438-441
- Raquel Willerman, Patricia K. Kuhl: Cross-language speech perception: Swedish, English, and Spanish speakers' perception of front rounded vowels. 442-445
- John C. L. Ingram, See-Gyoon Park: Inter-language vowel perception and production by Korean and Japanese listeners. 446-449
- Diane Kewley-Port, Reiko Akahane-Yamada, Kiyoaki Aikawa: Intelligibility and acoustic correlates of Japanese accented English vowels. 450-453
- Kiyoko Yoneyama: Segmentation strategies for spoken language recognition: evidence from semi-bilingual Japanese speakers of English. 454-457

NNs and Stochastic Modeling
- Geunbae Lee, Jong-Hyeok Lee, Kyubong Park, Byung-Chang Kim: Integrating connectionist, statistical and symbolic approaches for continuous spoken Korean processing. 458-461
- Hynek Hermansky, Sangita Tibrewala, Misha Pavel: Towards ASR on partially corrupted speech. 462-465
- Herbert Gish, Kenney Ng: Parametric trajectory models for speech recognition. 466-469
- Kate M. Knill, Mark J. F. Gales, Steve J. Young: Use of Gaussian selection in large vocabulary continuous speech recognition using HMMs. 470-473
- Jesper Högberg, Kåre Sjölander: Cross phone state clustering using lexical stress and context. 474-477
- Eduardo Lleida-Solano, Richard C. Rose: Likelihood ratio decoding and confidence measures for continuous speech recognition. 478-481
- Xiaohui Ma, Yifan Gong, Yuqing Fu, Jiren Lu, Jean Paul Haton: A study on continuous Chinese speech recognition based on stochastic trajectory models. 482-485
- Yoshiaki Itoh, Jiro Kiyama, Hiroshi Kojima, Susumu Seki, Ryuichi Oka: A proposal for a new algorithm of reference interval-free continuous DP for real-time speech or text retrieval. 486-489
- Akinori Ito, Masaki Kohda: Language modeling by string pattern n-gram for Japanese speech recognition. 490-493
- Reinhard Kneser: Statistical language modeling using a variable context length. 494-497
- Finn Tore Johansen: A comparison of hybrid HMM architectures using global discriminative training. 498-501
- Wei Wei, Etienne Barnard, Mark A. Fanty: Improved probability estimation with neural network models. 502-505
- Ha-Jin Yu, Yung-Hwan Oh: A neural network using acoustic sub-word units for continuous speech recognition. 506-509
- Louis ten Bosch, Roel Smits: On the error criteria in neural networks as a tool for human classification modelling. 510-513
- Gordon Ramsay: A non-linear filtering approach to stochastic training of the articulatory-acoustic mapping using the EM algorithm. 514-517
- Y. P. Yang, John R. Deller Jr.: A tool for automated design of language models. 518-521
- Felix Freitag, Enric Monte: Acoustic-phonetic decoding based on Elman predictive neural networks. 522-525
- Tan Lee, P. C. Ching: On improving discrimination capability of an RNN based recognizer. 526-529
- Yumi Wakita, Jun Kawai, Hitoshi Iida: An evaluation of statistical language modeling for speech recognition using a mixed category of both words and parts-of-speech. 530-533

Neural Models of Speech Processing
- Boris Aleksandrovsky, James Whitson, Gretchen Andes, Gary Lynch, Richard Granger: Novel speech processing mechanism derived from auditory neocortical circuit analysis. 558-561
- Ping Tang, Jean Rouat: Modeling neurons in the anteroventral cochlear nucleus for amplitude modulation (AM) processing: application to speech sound. 562-565
- Halewijn Vereecken, Jean-Pierre Martens: Noise suppression and loudness normalization in an auditory model-based acoustic front-end. 566-569
- James J. Hant, Brian Strope, Abeer Alwan: A psychoacoustic model for the noise masking of voiceless plosive bursts. 570-573
- Martin Hunke, Thomas Holton: Training machine classifiers to match the performance of human listeners in a natural vowel classification task. 574-577
- Kiyoaki Aikawa, Hideki Kawahara, Minoru Tsuzaki: A neural matrix model for active tracking of frequency-modulated tones. 578-581

Utterance Verification and Word Spotting
- Richard C. Rose, Eduardo Lleida-Solano, G. W. Erhart, R. V. Grubbe: A user-configurable system for voice label recognition. 582-585
- Philippe Gelin, Christian Wellekens: Keyword spotting enhancement for video soundtrack indexing. 586-589
- Rachida El Méliani, Douglas D. O'Shaughnessy: New efficient fillers for unlimited word recognition and keyword spotting. 590-593
- Michelle S. Spina, Victor Zue: Automatic transcription of general audio data: preliminary analyses. 594-597
- Francis Kubala, Tasos Anastasakos, Hubert Jin, Long Nguyen, Richard M. Schwartz: Transcribing radio news. 598-601
- Anand R. Setlur, Rafid A. Sukkar, John Jacob: Correcting recognition errors via discriminative utterance verification. 602-605

Acquisition/Learning Training L2 Learners
- Reiko Akahane-Yamada, Yoh'ichi Tohkura, Ann R. Bradlow, David B. Pisoni: Does training in speech perception modify speech production? 606-609
- Motoko Ueyama: Phrase-final lengthening and stress-timed shortening in the speech of native speakers and Japanese learners of English. 610-613
- Nobuko Yamada: Japanese accentuations by foreign students and Japanese speakers of non-Tokyo dialect. 614-617
- J. Kevin Varden, Tsutomu Sato: Devoicing of Japanese vowels by Taiwanese learners of Japanese. 618-621
- Danièle Archambault, Catherine Foucher, Blagovesta Maneva: Fluency and use of segmental dialect features in the acquisition of a second language (French) by English speakers. 622-625
- P. Martland, Sandra P. Whiteside, Steve W. Beet, Ladan Baghai-Ravary: Estimating child and adolescent formant frequency values from adult data. 626-629

Focus, Stress and Accent
- Agaath M. C. Sluijter, Vincent J. van Heuven: Acoustic correlates of linguistic stress and accent in Dutch and American English. 630-633
- Hiroya Fujisaki, Sumio Ohno, Osamu Tomita: On the levels of accentuation in spoken Japanese. 634-637
- Linda Thibault, Marise Ouellet: Tonal distinctions between emphatic stress and pretonic lengthening in Quebec French. 638-641
- Anja (Petzold) Elsner: Distinction between 'normal' focus and 'contrastive/emphatic' focus. 642-645
- Yukihiro Nishinuma, Masako Arai, Takako Ayusawa: Perception of tonal accent by Americans learning Japanese. 646-649
- Elizabeth Shriberg, D. Robert Ladd, Jacques M. B. Terken: Modeling intra-speaker pitch range variation: predicting F0 targets when "speaking up". 650-653

Spoken Language Dialogue and Conversation
- Norbert Reithinger, Ralf Engel, Michael Kipp, Martin Klesen: Predicting dialogue acts for a speech-to-speech translation system. 654-657
- Johannes Müller, Holger Stahl, Manfred K. Lang: Automatic speech translation based on the semantic structure. 658-661
- Lewis M. Norton, Carl Weir, K. W. Scholz, Deborah A. Dahl, Ahmed Bouzid: A methodology for application development for spoken language systems. 662-664
- Stephanie Seneff, Joseph Polifroni: A new restaurant guide conversational system: issues in rapid prototyping for specialized domains. 665-668
- Tadahiko Kumamoto, Akira Ito: Semantic interpretation of a Japanese complex sentence in an advisory dialogue - focused on the postpositional word "KEDO," which works as a conjunction between clauses. 669-672
- Youngkuk Hong, Myoung-Wan Koo, Gijoo Yang: A Korean morphological analyzer for speech translation system. 673-676
- Rolf Carlson, Sheri Hunnicutt: Generic and domain-specific aspects of the Waxholm NLP and dialog modules. 677-680
- Megumi Kameyama, Goh Kawai, Isao Arima: A real-time system for summarizing human-human spontaneous spoken dialogues. 681-684
- Bernd Hildebrandt, Heike Rautenstrauch, Gerhard Sagerer: Evaluation of spoken language understanding and dialogue systems. 685-688
- Kuniko Kakita: Inter-speaker interaction of F0 in dialogs. 689-692
- Hans Brandt-Pook, Gernot A. Fink, Bernd Hildebrandt, Franz Kummert, Gerhard Sagerer: A robust dialogue system for making an appointment. 693-696
- Kazuyuki Takagi, Shuichi Itahashi: Segmentation of spoken dialogue by interjections, disfluent utterances and pauses. 697-700
- David Goddeau, Helen M. Meng, Joseph Polifroni, Stephanie Seneff, Senis Busayapongchai: A form-based dialogue manager for spoken language applications. 701-704
- Steve Whittaker, David Attwater: The design of complex telephony applications using large vocabulary speech technology. 705-708
- Stephen Sutton, David G. Novick, Ronald A. Cole, Pieter J. E. Vermeulen, Jacques de Villiers, Johan Schalkwyk, Mark A. Fanty: Building 10,000 spoken dialogue systems. 709-712
- Yen-Ju Yang, Lee-Feng Chien, Lin-Shan Lee: Speaker intention modeling for large vocabulary Mandarin spoken dialogues. 713-716
- P. E. Kenne, Mary O'Kane: Hybrid language models and spontaneous legal discourse. 717-720
- P. E. Kenne, Mary O'Kane: Topic change and local perplexity in spoken legal dialogue. 721-724
- Jennifer J. Venditti, Marc Swerts: Intonational cues to discourse structure in Japanese. 725-728
- Niels Ole Bernsen, Hans Dybkjær, Laila Dybkjær: Principles for the design of cooperative spoken human-machine dialogue. 729-732
- Karen L. Jenkin, Michael S. Scordilis: Development and comparison of three syllable stress classifiers. 733-736

Speech Disorders
- Donald G. Jamieson, Li Deng, M. Price, Vijay Parsa, J. Till: Interaction of speech disorders with speech coders: effects on speech intelligibility. 737-740
- Maurílio Nunes Vieira, Arnold G. D. Maran, Fergus R. McInnes, Mervyn A. Jack: Detecting arytenoid cartilage misplacement through acoustic and electroglottographic jitter analysis. 741-744
- Maurílio Nunes Vieira, Fergus R. McInnes, Mervyn A. Jack: Robust F0 and jitter estimation in pathological voices. 745-748
- Fabrice Plante, H. Kessler, Barry M. G. Cheetham, J. E. Earis: Speech monitoring of infective laryngitis. 749-752
- Jean Schoentgen, Raoul De Guchteneere: Searching for nonlinear relations in whitened jitter time series. 753-756
- Liliana Gavidia-Ceballos, John H. L. Hansen, James F. Kaiser: Vocal fold pathology assessment using AM autocorrelation analysis of the Teager energy operator. 757-760
- David P. Kuehn: Continuous positive airway pressure (CPAP) in the treatment of hypernasality. 761-763
- Carol Y. Espy-Wilson, Venkatesh R. Chari, Caroline B. Huang: Enhancement of alaryngeal speech by adaptive filtering. 764-767
- Li Deng, Xuemin Shen, Donald G. Jamieson, J. Till: Simulation of disordered speech using a frequency-domain vocal tract model. 768-771
- Yasuo Endo, Hideki Kasuya: A stochastic model of fundamental period perturbation and its application to perception of pathological voice quality. 772-775
- Eric J. Wallen, John H. L. Hansen: A screening test for speech pathology assessment using objective quality measures. 776-779
- Douglas A. Cairns, John H. L. Hansen, James F. Kaiser: Recent advances in hypernasal speech detection using the nonlinear Teager energy operator. 780-783

Vocal Tract Geometry
- Kiyoshi Honda, Shinji Maeda, Michiko Hashi, Jim Dembowski, John R. Westbury: Human palate and related structures: their articulatory consequences. 784-787
- Edward P. Davis, Andrew Douglas, Maureen L. Stone: A continuum mechanics representation of tongue deformation. 788-792
- Philbert Bangayan, Abeer Alwan, Shrikanth S. Narayanan: From MRI and acoustic data to articulatory synthesis: a case study of the lateral approximants in American English. 793-796
- Shrikanth S. Narayanan, Abigail Kaun, Dani Byrd, Peter Ladefoged, Abeer Alwan: Liquids in Tamil. 797-800
- Chang-Sheng Yang, Hideki Kasuya: Speaker individualities of vocal tract shapes of Japanese vowels measured by magnetic resonance images. 949-952
- Samir El-Masri, Xavier Pelorson, Pierre Saguet, Pierre Badin: Vocal tract acoustics using the transmission line matrix (TLM) method. 953-956
- Gérard Bailly: Building sensori-motor prototypes from audiovisual exemplars. 957-960
- Mats Båvegård, Gunnar Fant: Parameterized VT area function inversion. 961-964
- Jianwu Dang, Kiyoshi Honda: An improved vocal tract model of vowel production implementing piriform resonance and transvelar nasal coupling. 965-968
- C. Simon Blackburn, Steve J. Young: Pseudo-articulatory speech synthesis for recognition using automatic feature extraction from x-ray data. 969-972

Prosody in ASR and Segmentation
- Sharon L. Oviatt, Gina-Anne Levow, Margaret MacEachern, Karen Kuhn: Modeling hyperarticulate speech during human-computer error resolution. 801-804
- Siripong Potisuk, Mary P. Harper, Jackson T. Gandour: Using stress to disambiguate spoken Thai sentences containing syntactic ambiguity. 805-808
- Hung-Yun Hsieh, Ren-Yuan Lyu, Lin-Shan Lee: Use of prosodic information to integrate acoustic and linguistic knowledge in continuous Mandarin speech recognition with very large vocabulary. 809-812
- G. V. Ramana Rao, J. Srichand: Word boundary detection using pitch variations. 813-816
- Atsuhiro Sakurai, Keikichi Hirose: Detection of phrase boundaries in Japanese by low-pass filtering of fundamental frequency contours. 817-820
- Vincent Pagel, Noelle Carbonell, Yves Laprie: A new method for speech delexicalization, and its application to the perception of French prosody. 821-824

Acquisition and Learning by Machine
- Udo Bub: Task adaptation for dialogues via telephone lines. 825-828
- Ronald A. Cole, Yonghong Yan, Troy Bailey: The influence of bigram constraints on word recognition by humans: implications for computer speech recognition. 829-832
- Tetsunori Kobayashi: ALICE: acquisition of language in conversational environment - an approach to weakly supervised training of spoken language system for language porting. 833-836
- Takashi Yoshimura, Satoru Hayamizu, Hiroshi Ohmura, Kazuyo Tanaka: Pitch pattern clustering of user utterances in human-machine dialogue. 837-840
- Juan-Carlos Amengual, Enrique Vidal, José-Miguel Benedí: Simplifying language through error-correcting decoding. 841-844
- Mauro Cettolo, Anna Corazza, Renato de Mori: A mixed approach to speech understanding. 845-848

Dialogue Systems
- Jean-Luc Gauvain, Jean-Jacques Gangolf, Lori Lamel: Speech recognition for an information kiosk. 849-852
- Helmer Strik, Albert Russel, Henk van den Heuvel, Catia Cucchiarini, Lou Boves: Localizing an automatic inquiry system for public transport information. 853-856
- Stephen M. Marcus, Deborah W. Brown, Randy G. Goldberg, Max S. Schoeffler, William R. Wetzel, Richard R. Rosinski: Prompt constrained natural language - evolving the next generation of telephony services. 857-860
- Tatsuya Kawahara, Chin-Hui Lee, Biing-Hwang Juang: Key-phrase detection and verification for flexible speech understanding. 861-864
- Bernhard Suhm, Brad A. Myers, Alex Waibel: Interactive recovery from speech recognition errors in speech user interfaces. 865-868
- Sunil Issar: Estimation of language models for new spoken language applications. 869-872

Speech Enhancement and Robust Processing
- Xuemin Shen, Li Deng, Anisa Yasmin: H-infinity filtering for speech enhancement. 873-876
- Saeed Vaseghi, Ben P. Milner: A comparative analysis of channel-robust features and channel equalization methods for speech recognition. 877-880
- Jia-Lin Shen, Wen-Liang Hwang, Lin-Shan Lee: Robust speech recognition features based on temporal trajectory filtering of frequency band spectrum. 881-884
- Kevin Power: Durational modelling for improved connected digit recognition. 885-888
- Carlos Avendaño, Hynek Hermansky: Study on the dereverberation of speech based on temporal envelope filtering. 889-892
- Thorsten Brants: Estimating Markov model structures. 893-896
- Eric K. Ringger, James F. Allen: A fertility channel model for post-correction of continuous speech recognition. 897-900
- Hiroshi Yasukawa: Restoration of wide band signal from telephone speech using linear prediction error processing. 901-904
- Hiroshi Matsumoto, Noboru Naitoh: Smoothed spectral subtraction for a frequency-weighted HMM in noisy speech recognition. 905-908
- William S. Woods, Martin Hansen, Thomas Wittkop, Birger Kollmeier: A simple architecture for using multiple cues in sound separation. 909-912
- Bojan Petek, Ove Andersen, Paul Dalsgaard: On the robust automatic segmentation of spontaneous speech. 913-916
- C. G. Miglietta, Chafic Mokbel, Denis Jouvet, Jean Monné: Bayesian adaptation of speech recognizers to field speech data. 917-920
- A. J. Darlington, D. J. Campbell: Sub-band adaptive filtering applied to speech enhancement. 921-924
- John P. Openshaw, John S. Mason: Noise robust estimate of speech dynamics for speaker recognition. 925-928
- Javier Ortega-Garcia, Joaquin Gonzalez-Rodriguez: Overview of speech enhancement techniques for automatic speaker recognition. 929-932
- Naomi Harte, Saeed Vaseghi, Ben P. Milner: Dynamic features for segmental speech recognition. 933-936
- Takuya Koizumi, Mikio Mori, Shuji Taniguchi: Speech recognition based on a model of human auditory system. 937-940
- Josep M. Salavedra, Enrique Masgrau: APVQ encoder applied to wideband speech coding. 941-944
- Jin Zhou, Yair Shoham, Ali N. Akansu: Simple fast vector quantization of the line spectral frequencies. 945-948
Speaker Adaptation and Normalization I
- Tomoko Matsui, Sadaoki Furui:

N-best-based instantaneous speaker adaptation method for speech recognition. 973-976 - Claude Montacié, Marie-José Caraty, Claude Barras:

Mixture splitting technique and temporal control in an HMM-based recognition system. 977-980 - Lei Yao, Dong Yu, Taiyi Huang:

A unified spectral transformation adaptation approach for robust speech recognition. 981-984 - Qiang Huo, Chin-Hui Lee:

On-line adaptive learning of the correlated continuous density hidden Markov models for speech recognition. 985-988 - Nikko Ström:

Speaker adaptation by modeling the speaker variation in a continuous speech recognition system. 989-992 - Yasuo Ariki, Shigeaki Tagashira:

An enquiring system of unknown words in TV news by spontaneous repetition (application of speaker normalization by speaker subspace projection). 993-996 - Jinsong Zhang, Beiqian Dai, Changfu Wang, HingKeung Kwan, Keikichi Hirose:

Adaptive recognition method based on posterior use of distribution pattern of output probabilities. 1129-1132 - Philip C. Woodland, David Pye, Mark J. F. Gales:

Iterative unsupervised adaptation using maximum likelihood linear regression. 1133-1136 - Tasos Anastasakos, John W. McDonough, Richard M. Schwartz, John Makhoul:

A compact model for speaker-adaptive training. 1137-1140 - Shigeru Homma, Jun-ichi Takahashi, Shigeki Sagayama:

Iterative unsupervised speaker adaptation for batch dictation. 1141-1144 - Daniel C. Burnett, Mark A. Fanty:

Rapid unsupervised adaptation to children's speech on a connected-digit task. 1145-1148 - Jun Ishii, Masahiro Tonomura, Shoichi Matsunaga:

Speaker adaptation using tree structured shared-state HMMs. 1149-1152
Spoken Language and NLP
- Richard M. Schwartz, Scott Miller, David Stallard, John Makhoul:

Language understanding using hidden understanding models. 997-1000 - Allen L. Gorin:

Processing of semantic information in fluently spoken language. 1001-1004 - Andreas Stolcke, Elizabeth Shriberg:

Automatic linguistic segmentation of conversational speech. 1005-1008 - Manuela Boros, Wieland Eckert, Florian Gallwitz, Günther Görz, Gerhard Hanrieder, Heinrich Niemann:

Towards understanding spontaneous speech: word accuracy vs. concept accuracy. 1009-1012 - Wolfgang Minker, Samir Bennacef, Jean-Luc Gauvain:

A stochastic case frame approach for natural language understanding. 1013-1016 - Frank Seide, Bernhard Rueber, Andreas Kellner:

Improving speech understanding by incorporating database constraints and dialogue history. 1017-1020 - Finn Dag Buø, Alex Waibel:

Learning to parse spontaneous speech. 1153-1156 - Jean-Yves Antoine:

Spontaneous speech and natural language processing ALPES: a robust semantic-led parser. 1157-1160 - Jorge Alvarez-Cercadillo, F. Javier Caminero-Gil, Carlos Crespo-Casas, Daniel Tapias Merino:

The natural language processing module for a voice assisted operator at Telefónica I+D. 1161-1164 - André Berton, Pablo Fetter, Peter Regel-Brietzmann:

Compound words in large-vocabulary German speech recognition systems. 1165-1168 - Anton Batliner, Anke Feldhaus, Stefan Geißler, Tibor Kiss, Ralf Kompe, Elmar Nöth:

Prosody, empty categories and parsing - a success story. 1169-1172 - B. Srinivas:

"Almost parsing" technique for language modeling. 1173-1176
Spoken Discourse Analysis/Synthesis
- Tetsuro Chino, Hiroyuki Tsuboi:

A new discourse structure model for spontaneous spoken dialogue. 1021-1024 - David Duff, Barbara Gates, Susann LuperFoy:

An architecture for spoken dialogue management. 1025-1028 - Monique E. van Donzel, Florien J. Koopmans-van Beinum:

Pausing strategies in discourse in Dutch. 1029-1032 - Marc Swerts, Anne Wichmann, Robbert-Jan Beun:

Filled pauses as markers of discourse structure. 1033-1036 - Cheol-jae Seong, Minsoo Hahn:

The prosodic analysis of Korean dialogue speech - through a comparative study with read speech. 1037-1040 - Mary O'Kane, P. E. Kenne:

Changing the topic: how long does it take? 1041-1044
Acoustic Modeling
- Christian-Michael Westendorf, Jens Jelitto:

Learning pronunciation dictionary from speech data. 1045-1048 - Ariane Lazaridès, Yves Normandin, Roland Kuhn:

Improving decision trees for acoustic modeling. 1053-1056 - Gongjun Li, Taiyi Huang:

An improved training algorithm in HMM-based speech recognition. 1057-1060 - Ji Ming, Peter O'Boyle, John G. McMahon, Francis Jack Smith:

Speech recognition using a strong correlation assumption for the instantaneous spectra. 1061-1064 - Pau Pachès-Leal, Climent Nadeu:

On parameter filtering in continuous subword-unit-based speech recognition. 1065-1068 - Shigeki Okawa, Katsuhiko Shirai:

Estimation of statistical phoneme center considering phonemic environments. 1069-1072 - Xue Wang, Louis ten Bosch, Louis C. W. Pols:

Integration of context-dependent durational knowledge into HMM-based speech recognition. 1073-1076 - Toshiaki Fukada, Michiel Bacchiani, Kuldip K. Paliwal, Yoshinori Sagisaka:

Speech recognition based on acoustically derived segment units. 1077-1080 - Rivarol Vergin, Azarshid Farhat, Douglas D. O'Shaughnessy:

Robust gender-dependent acoustic-phonetic modelling in continuous speech recognition based on a new automatic male/female classification. 1081-1084 - Tae-Young Yang, Won-Ho Shin, Weon-Goo Kim, Dae Hee Youn:

A codebook adaptation algorithm for SCHMM using formant distribution. 1085-1088 - Jacques Simonin, S. Bodin, Denis Jouvet, Katarina Bartkova:

Parameter tying for flexible speech recognition. 1089-1092 - Tsuneo Nitta, Shin'ichi Tanaka, Yasuyuki Masai, Hiroshi Matsuura:

Word-spotting based on inter-word and intra-word diphone models. 1093-1096 - Antonio Bonafonte, Josep Vidal, Albino Nogueiras:

Duration modeling with expanded HMM applied to speech recognition. 1097-1100 - Ricardo de Córdoba, José Manuel Pardo:

Different strategies for distribution clustering using discrete, semicontinuous and continuous HMMs in CSR. 1101-1104 - Ilija Zeljkovic, Shrikanth S. Narayanan:

Improved HMM phone and triphone models for realtime ASR telephony applications. 1105-1108 - Yasuhiro Minami, Sadaoki Furui:

Improved extended HMM composition by incorporating power variance. 1109-1112 - Gordon Ramsay, Li Deng:

Optimal filtering and smoothing for speech recognition using a stochastic target model. 1113-1116 - Zhihong Hu, Johan Schalkwyk, Etienne Barnard, Ronald A. Cole:

Speech recognition using syllable-like units. 1117-1120 - Jean-Claude Junqua, Lorenzo Vassallo:

Context modeling and clustering in continuous speech recognition. 2262-2265 - Li Deng, Jim Jian-Xiong Wu:

Hierarchical partition of the articulatory state space for overlapping-feature based speech recognition. 2266-2269 - Olivier Oppizzi, David Fournier, Philippe Gilles, Henri Meloni:

A fuzzy acoustic-phonetic decoder for speech recognition. 2270-2273 - Katrin Kirchhoff:

Syllable-level desynchronisation of phonetic features for speech recognition. 2274-2276 - James R. Glass, Jane W. Chang, Michael K. McCandless:

A probabilistic framework for feature-based speech recognition. 2277-2280 - Jim Jian-Xiong Wu, Li Deng, Jacky Chan:

Modeling context-dependent phonetic units in a continuous speech recognition system for Mandarin Chinese. 2281-2284
Physics and Simulation of the Vocal Tract
- Cecil H. Coker, Michael H. Krane, B. Y. Reis, R. A. Kubli:

Search for unexplored effects in speech production. 1121-1124 - Pierre Badin, Christian Abry:

Articulatory synthesis from x-rays and inversion for an adaptive speech robot. 1125-1128 - Hisayoshi Suzuki, Takayoshi Nakai, Hiroshi Sakakibara:

Analysis of acoustic properties of the nasal tract using 3-d FEM. 1285-1288 - Johan Liljencrants:

Experiments with analysis by synthesis of glottal airflow. 1289-1292
Duration and Rhythm
- Marise Ouellet, Benoît Tardif:

From segmental duration properties to rhythmic structure: a study of interactions between high and low level. 1177-1180 - Xue Wang, Louis C. W. Pols, Louis ten Bosch:

Analysis of context-dependent segmental duration for automatic speech recognition. 1181-1184 - Delphine Dahan:

The role of the rhythmic groups in the segmentation of continuous French speech. 1185-1188 - Zita McRobbie-Utasi:

The implications of temporal patterns for the prosody of boundary signaling in connected speech. 1189-1192 - Hyunbok Lee, Cheol-jae Seong:

Experimental phonetic study of the syllable duration of Korean with respect to the positional effect. 1193-1196 - Dik J. Hermes:

Timing of pitch movements and accentuation of syllables. 1197-1200
Acoustic Analysis
- Goangshiuan S. Ying, Leah H. Jamieson, Carl D. Mitchell:

A probabilistic approach to AMDF pitch detection. 1201-1204 - Alain Soquet, Véronique Lecuit, Thierry Metens, Didier Demolin:

From sagittal cut to area function: an RMI investigation. 1205-1208 - Léonard Janer, Juan José Bonet, Eduardo Lleida-Solano:

Pitch detection and voiced/unvoiced decision algorithm based on wavelet transforms. 1209-1212 - Yannis Stylianou:

Decomposition of speech signals into a deterministic and a stochastic part. 1213-1216 - Cheol-Woo Jo, Ho-Gyun Bang, William A. Ainsworth:

Improved glottal closure instant detector based on linear prediction and standard pitch concept. 1217-1220 - Xihong Wang, Stephen A. Zahorian, Stefan Auberg:

Analysis of speech segments using variable spectral/temporal resolution. 1221-1224 - Brian Eberman, William Goldenthal:

Time-based clustering for phonetic segmentation. 1225-1228 - Parham Zolfaghari, Tony Robinson:

Formant analysis using mixtures of Gaussians. 1229-1232 - Hywel B. Richards, John S. Mason, Melvyn J. Hunt, John S. Bridle:

Deriving articulatory representations from speech with various excitation modes. 1233-1236 - Manish Sharma, Richard J. Mammone:

"blind" speech segmentation: automatic segmentation of speech without linguistic knowledge. 1237-1240 - Hiroshi Ohmura, Kazuyo Tanaka:

Speech synthesis using a nonlinear energy damping model for the vocal folds vibration effect. 1241-1244 - Munehiro Namba, Hiroyuki Kamata, Yoshihisa Ishida:

Neural networks learning with L1 criteria and its efficiency in linear prediction of speech signals. 1245-1248 - Anna Esposito, Eugène C. Ezin, M. Ceccarelli:

Preprocessing and neural classification of English stop consonants [b, d, g, p, t, k]. 1249-1252 - K. S. Ananthakrishnan:

A comparison of modified k-means(MKM) and NN based real time adaptive clustering algorithms for articulatory space codebook formation. 1253-1256 - Wen Ding, Hideki Kasuya:

A novel approach to the estimation of voice source and vocal tract parameters from speech signals. 1257-1260 - Hartmut R. Pfitzinger, Susanne Burger, Sebastian Heid:

Syllable detection in read and spontaneous speech. 1261-1264 - Kuansan Wang, Chin-Hui Lee, Biing-Hwang Juang:

Maximum likelihood learning of auditory feature maps for stationary vowels. 1265-1268 - Antonio Bonafonte, Albino Nogueiras, Antonio Rodriguez-Garrido:

Explicit segmentation of speech using Gaussian models. 1269-1272 - E. Mousset, William A. Ainsworth, José A. R. Fonollosa:

A comparison of several recent methods of fundamental frequency and voicing decision estimation. 1273-1276 - Toshihiko Abe, Takao Kobayashi, Satoshi Imai:

Robust pitch estimation with harmonics enhancement in noisy environments based on instantaneous frequency. 1277-1280 - Asunción Moreno, Miquel Rutllán:

Integrated polispectrum on speech recognition. 1281-1284
Speech Recognition Using HMMs and NNs
- Joao P. Neto, Ciro Martins, Luís B. Almeida:

An incremental speaker-adaptation technique for hybrid HMM-MLP recognizer. 1293-1296 - Youngjoo Suh, Youngjik Lee:

Phoneme segmentation of continuous speech using multi-layer perceptron. 1297-1300 - Jeff A. Bilmes, Nelson Morgan, Su-Lin Wu, Hervé Bourlard:

Stochastic perceptual speech models with durational dependence. 1301-1304 - Gary D. Cook, Anthony J. Robinson:

Boosting the performance of connectionist large vocabulary speech recognition. 1305-1308 - Nicolas Pican, Dominique Fohr, Jean-François Mari:

HMMs and OWE neural network for continuous speech recognition. 1309-1312 - Steve R. Waterhouse, Dan J. Kershaw, Tony Robinson:

Smoothed local adaptation of connectionist systems. 1313-1316
Adverse Environments and Multiple Microphones
- Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano:

Robust speech recognition with speaker localization by a microphone array. 1317-1320 - Ea-Ee Jan, James L. Flanagan:

Sound source localization in reverberant environments using an outlier elimination algorithm. 1321-1324 - Dan J. Kershaw, Tony Robinson, Steve Renals:

The 1995 abbot LVCSR system for multiple unknown microphones. 1325-1328 - Diego Giuliani, Maurizio Omologo, Piergiorgio Svaizer:

Experiments of speech recognition in a noisy and reverberant environment using a microphone array and HMM. 1329-1332 - Joaquin Gonzalez-Rodriguez, Javier Ortega-Garcia, César Martin, Luis Hernández:

Increasing robustness in GMM speaker recognition systems for noisy and reverberant speech with low complexity microphone arrays. 1333-1336 - Kuan-Chieh Yen, Yunxin Zhao:

Robust automatic speech recognition using a multi-channel signal separation front-end. 1337-1340
Prosodic Synthesis in Dialogue
- Anders Lindström, Ivan Bretan, Mats Ljungqvist:

Prosody generation in text-to-speech conversion using dependency graphs. 1341-1344 - Hisako Asano, Hisashi Ohara, Yoshifumi Ooyama:

Extraction method of non-restrictive modification in Japanese as a marked factor of prosody. 1345-1348 - Scott Prevost:

Modeling contrast in the generation and synthesis of spoken language. 1349-1352 - Hajime Tsukada:

A left-to-right processing model of pausing in Japanese based on limited syntactic information. 1353-1356 - Dimitrios Galanis, Vassilios Darsinos, George Kokkinakis:

Modeling of intonation bearing emphasis for TTS-synthesis of Greek dialogues. 1357-1360 - Barbara Heuft, Thomas Portele:

Synthesizing prosody: a prominence-based approach. 1361-1364
Speech Synthesis
- Richard Sproat:

Multilingual text analysis for text-to-speech synthesis. 1365-1368 - Yoshifumi Ooyama, Hisako Asano, Koji Matsuoka:

Spoken-style explanation generator for Japanese kanji using a text-to-speech system. 1369-1372 - Ken-ichi Magata, Tomoki Hamagami, Mitsuo Komura:

A method for estimating prosodic symbol from text for Japanese text-to-speech synthesis. 1373-1376 - Eduardo López Gonzalo, Jose M. Rodriguez-Garcia:

Statistical methods in data-driven modeling of Spanish prosody for text to speech. 1377-1380 - Jung-Chul Lee, Youngjik Lee, Sanghun Kim, Minsoo Hahn:

Intonation processing for TTS using stylization and neural network learning method. 1381-1384 - Alan W. Black, Andrew J. Hunt:

Generating F0 contours from ToBI labels using linear regression. 1385-1388 - Wern-Jun Wang, Shaw-Hwa Hwang, Sin-Horng Chen:

The broad study of homograph disambiguity for Mandarin speech synthesis. 1389-1392 - Thierry Dutoit, Vincent Pagel, Nicolas Pierret, F. Bataille, Olivier van der Vrecken:

The MBROLA project: towards a set of high quality speech synthesizers free of use for non commercial purposes. 1393-1396 - Makoto Hashimoto, Norio Higuchi:

Training data selection for voice conversion using speaker selection and vector field smoothing. 1397-1400 - Ki-Seung Lee, Dae Hee Youn, Il-Whan Cha:

A new voice transformation method based on both linear and nonlinear prediction analysis. 1401-1404 - Geneviève Baudoin, Yannis Stylianou:

On the transformation of the speech spectrum for voice conversion. 1405-1408 - Cristina Delogu, Andrea Paoloni, Susanna Ragazzini, Paola Ridolfi:

Spectral analysis of synthetic speech and natural speech with noise over the telephone line. 1409-1412 - Weizhong Zhu, Hideki Kasuya:

A new speech synthesis system based on the ARX speech production model. 1413-1416 - Geraldo Lino de Campos, Evandro B. Gouvêa:

Speech synthesis using the CELP algorithm. 1417-1420 - Shaw-Hwa Hwang, Sin-Horng Chen, Yih-Ru Wang:

A Mandarin text-to-speech system. 1421-1424 - Mike D. Edgington, A. Lowry:

Residual-based speech modification algorithms for text-to-speech synthesis. 1425-1428 - Per Olav Heggtveit:

A generalized LR parser for text-to-speech synthesis. 1429-1432 - Mat P. Pollard, Barry M. G. Cheetham, Colin C. Goodyear, Mike D. Edgington, A. Lowry:

Enhanced shape-invariant pitch and time-scale modification for concatenative speech synthesis. 1433-1436 - Yasuhiko Arai, Ryo Mochizuki, Hirofumi Nishimura, Takashi Honda:

An excitation synchronous pitch waveform extraction method and its application to the VCV-concatenation synthesis of Japanese spoken words. 1437-1440 - Ren-Hua Wang, Qingfeng Liu, Difei Tang:

A new Chinese text-to-speech system with high naturalness. 1441-1444 - Ansgar Rinscheid:

Voice conversion based on topological feature maps and time-variant filtering. 1445-1448
Instructional Technology for Spoken Language
- Yoram Meron, Keikichi Hirose:

Language training system utilizing speech modification. 1449-1452 - Donald G. Jamieson, K. Yu:

Perception of English /r/ and /l/ speech contrasts by native Korean listeners with extensive English-language experience. 1453-1456 - Leonardo Neumeyer, Horacio Franco, Mitchel Weintraub, Patti Price:

Automatic text-independent pronunciation scoring of foreign language student speech. 1457-1460 - Antônio Simoes:

Assessing the contribution of instructional technology in the teaching of pronunciation. 1461-1464 - Maxine Eskénazi:

Detection of foreign speakers' pronunciation errors for second language training - preliminary results. 1465-1468 - Hansjörg Mixdorff:

Foreign accent in intonation patterns - a contrastive study applying a quantitative model of the F0 contour. 1469-1472 - Duncan J. Markham, Yasuko Nagano-Madsen:

Input modality effects in foreign accent. 1473-1476
Multimodal Spoken Language Processing
- Lynne E. Bernstein, Christian Benoît:

For speech perception by humans or machines, three senses are better than one. 1477-1480 - Kaoru Sekiyama, Yoh'ichi Tohkura, Michio Umeda:

A few factors which affect the degree of incorporating lip-read information into speech perception. 1481-1484 - Eric Vatikiotis-Bateson, Kevin G. Munhall, Y. Kasahara, Frederique Garcia, Hani Yehia:

Characterizing audiovisual information during speech. 1485-1488 - Charlotte M. Reed:

The implications of the Tadoma method of speechreading for spoken language processing. 1489-1492 - Ruth Campbell:

Seeing speech in space and time: psychological and neurological findings. 1493-1496 - Kerry P. Green:

Studies of the McGurk effect: implications for theories of speech perception. 1652-1655 - N. Michael Brooke:

Using the visual component in automatic speech recognition. 1656-1659 - Robert E. Remez:

Perceptual organization of speech in one and several modalities: common functions, common resources. 1660-1663 - David B. Pisoni, Helena M. Saldaña, Sonya M. Sheffert:

Multi-modal encoding of speech in memory: a first report. 1664-1667
Prosody - Phonological/Phonetic Measures
- Volker Strom, Christina Widera:

What's in the "pure" prosody? 1497-1500 - Marc Swerts, Eva Strangert, Mattias Heldner:

F0 declination in read-aloud and spontaneous speech. 1501-1504 - Yeon-Jun Kim, Yung-Hwan Oh:

Prediction of prosodic phrase boundaries considering variable speaking rate. 1505-1508 - Yoichi Yamashita, Riichiro Mizoguchi:

Prediction of F0 parameter of contextualized utterances in dialogue. 1509-1512 - Veronika Makarova, J. Matsui:

The production and perception of potentially ambiguous intonation contours by speakers of Russian and Japanese. 1513-1516 - Robert Eklund:

What is invariant and what is optional in the realization of a FOCUSED word? a cross-dialectal study of Swedish sentences with moving focus. 1517-1520
Phonetics and Perception
- Christine H. Shadle, Sheila J. Mair:

Quantifying spectral characteristics of fricatives. 1521-1524 - Natasha Warner:

Acoustic characteristics of ejectives in Ingush. 1525-1528 - R. J. J. H. van Son, Louis C. W. Pols:

An acoustic profile of consonant reduction. 1529-1532 - Danièle Archambault, Blagovesta Maneva:

Devoicing in post-vocalic Canadian-French obstruents. 1533-1536 - Alexander L. Francis, Howard C. Nusbaum:

Paying attention to speaking rate. 1537-1540 - Irene Appelbaum:

The lack of invariance problem and the goal of speech perception. 1541-1544
Language Acquisition
- Jean E. Andruski, Patricia K. Kuhl:

The acoustic structure of vowels in mothers' speech to infants and adults. 1545-1548 - Chris J. Clement, Florien J. Koopmans-van Beinum, Louis C. W. Pols:

Acoustical characteristics of sound production of deaf and normally hearing infants. 1549-1552 - John Kingston, Christine Bartels, José Benkí, Deanna Moore, Jeremy Rice, Rachel Thorburn, Neil Macmillan:

Learning non-native vowel categories. - Pierre A. Hallé, Toshisada Deguchi, Yuji Tamekawa, Benedicte de Boysson-Bardies, Shigeru Kiritani:

Word recognition by Japanese infants. 1557-1560 - Peter W. Jusczyk:

Investigations of the word segmentation abilities of infants. 1561-1564 - Akiko Hayashi, Yuji Tamekawa, Toshisada Deguchi, Shigeru Kiritani:

Developmental change in perception of clause boundaries by 6- and 10-month-old Japanese infants. 1565-1568
Production and Prosody Posters
- Paavo Alku, Erkki Vilkman:

A frequency domain method for parametrization of the voice source. 1569-1572 - Krzysztof Marasek:

Glottal correlates of the word stress and the tense/lax opposition in German. 1573-1576 - Suzanne Boyce, Carol Y. Espy-Wilson:

Coarticulatory stability in American English /r/. 1577-1580 - Shinobu Masaki, Reiko Akahane-Yamada, Mark K. Tiede, Yasuhiro Shimada, Ichiro Fujimoto:

An MRI-based analysis of the English /r/ and /l/ articulations. 1581-1584 - David van Kuijk:

Does lexical stress or metrical stress better predict word boundaries in Dutch? 1585-1588 - Alan Wrench, Alan D. McIntosh, William J. Hardcastle:

Optopalatograph (OPG): a new apparatus for speech production analysis. 1589-1592 - René Carré:

Prediction of vowel systems using a deductive approach. 1593-1596 - Sheila J. Mair, Celia Scully, Christine H. Shadle:

Distinctions between [t] and [tch] using electropalatography data. 1597-1600 - Michiko Hashi, Raymond D. Kent, John R. Westbury, Mary J. Lindstrom:

Relating formants and articulation in intelligibility test words. 1601-1604 - Imad Znagui, Mohamed Yeou:

The role of coarticulation in the perception of vowel quality in modern standard Arabic. 1605-1608 - Simon Arnfield, Wilf Jones:

Updating the Reading EPG. 1609-1611 - Goangshiuan S. Ying, Leah H. Jamieson, Ruxin Chen, Carl D. Mitchell:

Lexical stress detection on stress-minimal word pairs. 1612-1615 - Jing Wang:

An acoustic study of the interaction between stressed and unstressed syllables in spoken Mandarin. 1616-1619 - Nobuaki Minematsu, Seiichi Nakagawa:

Automatic detection of accent nuclei at the head of words for speech recognition. 1620-1623 - Fu-Chiang Chou, Chiu-yu Tseng, Lin-Shan Lee:

Automatic generation of prosodic structure for high quality Mandarin speech synthesis. 1624-1627 - Tomoki Hamagami, Ken-ichi Magata, Mitsuo Komura:

A study on Japanese prosodic pattern and its modeling in restricted speech. 1628-1631 - Steve Hoskins:

A phonetic study of focus in intransitive verb sentences. 1632-1635 - Stefan Rapp:

Goethe for prosody. 1636-1639 - K. A. Straub:

Prosodic cues in syntactically ambiguous strings: an interactive speech planning mechanism. 1640-1643 - Jinfu Ni, Ren-Hua Wang, Deyu Xia:

A functional model for generation of the local components of F0 contours in Chinese. 1644-1647 - Marie Fellbaum:

The acquisition of voiceless stops in the interlanguage of second language learners of English and Spanish. 1648-1651
User-Machine Interfaces
- Brian Mellor, Chris Baber, C. Tunley:

Evaluating automatic speech recognition as a component of a multi-input device human-computer interface. 1668-1671 - Andrew Life, Ian Salter, Jean-Noël Temem, Franck Bernard, Sophie Rosset, Samir Bennacef, Lori Lamel:

Data collection for the MASK kiosk: WOz vs prototype system. 1672-1675 - Murat Karaorman, Ted H. Applebaum, Tatsuro Itoh, Mitsuru Endo, Yoshio Ohno, Masakatsu Hoshimi, Takahiro Kamai, Kenji Matsui, Kazue Hata, Steve Pearson, Jean-Claude Junqua:

An experimental Japanese/English interpreting video phone system. 1676-1679 - Sara Basson, Stephen Springer, Cynthia Fong, Hong C. Leung, Edward Man, Michele Olson, John F. Pitrelli, Ranvir Singh, Suk Wong:

User participation and compliance in speech automated telecommunications applications. 1680-1683 - Samuel Bayer:

Embedding speech in web interfaces. 1684-1687 - Toshihiro Isobe, Masatoshi Morishima, Fuminori Yoshitani, Nobuo Koizumi, Ken'ya Murakami:

Voice-activated home banking system and its field trial. 1688-1691
TTS Systems and Rules
- Sangho Lee, Yung-Hwan Oh:

A text analyzer for Korean text-to-speech systems. 1692-1695 - Helen E. Karn:

Design and evaluation of a phonological phrase parser for Spanish text-to-speech. 1696-1699 - Ove Andersen, Roland Kuhn, Ariane Lazaridès, Paul Dalsgaard, Jürgen Haas, Elmar Nöth:

Comparison of two tree-structured approaches for grapheme-to-phoneme conversion. 1700-1703 - Martin J. Adamson, Robert I. Damper:

A recurrent network that learns to pronounce English text. 1704-1707 - Eleonora Cavalcante Albano, Agnaldo Antonio Moreira:

Archisegment-based letter-to-phone conversion for concatenative speech synthesis in Portuguese. 1708-1711 - Yuki Yoshida, Shin'ya Nakajima, Kazuo Hakoda, Tomohisa Hirokawa:

A new method of generating speech synthesis units based on phonological knowledge and clustering technique. 1712-1715
Prosody and Labeling
- Martine Grice, Matthias Reyelt, Ralf Benzmüller, Jörg Mayer, Anton Batliner:

Consistency in transcription and labelling of German intonation with GToBI. 1716-1719 - Anton Batliner, Ralf Kompe, Andreas Kießling, Heinrich Niemann, Elmar Nöth:

Syntactic-prosodic labeling of large spontaneous speech data-bases. 1720-1723 - Florien J. Koopmans-van Beinum, Monique E. van Donzel:

Relationship between discourse structure and dynamic speech rate. 1724-1727 - Nigel Ward:

Using prosodic clues to decide when to produce back-channel utterances. 1728-1731 - Marion Mast, Ralf Kompe, Stefan Harbeck, Andreas Kießling, Heinrich Niemann, Elmar Nöth, Ernst Günter Schukat-Talamazzini, Volker Warnke:

Dialog act classification with the help of prosody. 1732-1735 - David van Kuijk, Henk van den Heuvel, Lou Boves:

Using lexical stress in continuous speech recognition for Dutch. 1736-1739
Speaker/Language Identification and Verification
- Karsten Kumpf, Robin W. King:

Automatic accent classification of foreign-accented Australian English speech. 1740-1743 - Filipp Korkmazskiy, Biing-Hwang Juang:

Discriminative adaptation for speaker verification. 1744-1747 - Verna Stockmal, D. Muljani, Zinny S. Bond:

Perceptual features of unknown foreign languages as revealed by multi-dimensional scaling. 1748-1751 - Kin Yu, John S. Mason:

On-line incremental adaptation for speaker verification using maximum likelihood estimates of CDHMM parameters. 1752-1755 - Dominique Genoud, Frédéric Bimbot, Guillaume Gravier, Gérard Chollet:

Combining methods to improve speaker verification decision. 1756-1759 - Cesar Martín del Alamo, J. Álvarez, Celinda de la Torre, F. J. Poyatos, Luis Hernández:

Incremental speaker adaptation with minimum error discriminative training for speaker identification. 1760-1763 - Konstantin P. Markov, Seiichi Nakagawa:

Frame level likelihood normalization for text-independent speaker identification using Gaussian mixture models. 1764-1767 - Ann E. Thymé-Gobbel, Sandra E. Hutchins:

On using prosodic cues in automatic language identification. 1768-1771 - Tadashi Kitamura, Shinsai Takei:

Speaker recognition model using two-dimensional mel-cepstrum and predictive neural network. 1772-1775 - HingKeung Kwan, Keikichi Hirose:

Unknown language rejection in language identification system. 1776-1779 - James Hieronymus, Shubha Kadambe:

Spoken language identification using large vocabulary speech recognition. 1780-1783 - Carlos Teixeira, Isabel Trancoso, António Joaquim Serralheiro:

Accent identification. 1784-1787 - Sarel van Vuuren:

Comparison of text-independent speaker recognition methods on telephone speech with acoustic mismatch. 1788-1791 - Xue Yang, J. Bruce Millar, Iain MacLeod:

On the sources of inter- and intra-speaker variability in the acoustic dynamics of speech. 1792-1795 - Kay M. Berkling, Etienne Barnard:

Language identification with inaccurate string matching. 1796-1799 - Michael J. Carey, Eluned S. Parris, Harvey Lloyd-Thomas, Stephen J. Bennett:

Robust prosodic features for speaker identification. 1800-1803 - Enric Monte, Javier Hernando Pericas, Xavier Miró, A. Adolf:

Text independent speaker identification on noisy environments by means of self organizing maps. 1804-1807 - Paul Dalsgaard, Ove Andersen, Hanne Hesselager, Bojan Petek:

Language identification using language-dependent phonemes and language-independent speech units. 1808-1811
Emotion in Recognition and Synthesis
- Klaus R. Scherer:

Adding the affective dimension: a new look in speech analysis and synthesis. 1811 - John J. Ohala:

Ethological theory and the expression of emotion in the voice. 1812-1815 - Iain R. Murray, John L. Arnott:

Synthesizing emotions in speech: is it time to get excited? 1816-1819 - Frank Dellaert, Thomas Polzin, Alex Waibel:

Recognizing emotion in speech. 1970-1973 - Barbara Heuft, Thomas Portele, Monika Rauth:

Emotions in time domain synthesis. 1974-1977 - Simon Arnfield:

Word class driven synthesis of prosodic annotations. 1978-1980 - Michael Banbrook, Steve McLaughlin:

Dynamical modelling of vowel sounds as a synthesis tool. 1981-1984 - Tom Johnstone:

Emotional speech elicited using computer games. 1985-1988 - Roddy Cowie, Ellen Douglas-Cowie:

Automatic statistical analysis of the signal and prosodic signs of emotion in speech. 1989-1992
Stochastic Techniques in Robust Speech Recognition
- Chin-Hui Lee, Biing-Hwang Juang, Wu Chou, J. J. Molina-Perez:

A study on task-independent subword selection and modeling for speech recognition. 1820-1823 - Mazin G. Rahim, Chin-Hui Lee:

Simultaneous ANN feature and HMM recognizer design using string-based minimum classification error (MCE) training. 1824-1827 - Sunil K. Gupta, Frank K. Soong, Raziel Haimi-Cohen:

Quantizing mixture-weights in a tied-mixture HMM. 1828-1831 - Mark J. F. Gales, David Pye, Philip C. Woodland:

Variance compensation within the MLLR framework for robust speech recognition and speaker adaptation. 1832-1835 - Arun C. Surendran, Chin-Hui Lee, Mazin G. Rahim:

Maximum-likelihood stochastic matching approach to non-linear equalization for robust speech recognition. 1836-1839 - Jen-Tzung Chien, Hsiao-Chuan Wang, Lee-Min Lee:

Estimation of channel bias for telephone speech recognition. 1840-1843
Prosodic Synthesis in Text to Speech
- M. E. Johnson:

Synthesis of English intonation using explicit models of reading and spontaneous speech. 1844-1847 - Merle Horne, Marcus Filipsson:

Implementation and evaluation of a model for synthesis of Swedish intonation. 1848-1851 - Nobuyuki Katae, Shinta Kimura:

Natural prosody generation for domain specific text-to-speech systems. 1852-1855 - Mark Tatham, Eric Lewis:

Improving text-to-speech synthesis. 1856-1859 - Sahar E. Bou-Ghazale, John H. L. Hansen:

Synthesis of stressed speech from isolated neutral speech using HMM-based models. 1860-1863 - Ales Dobnikar:

Modeling segment intonation for Slovene TTS system. 1864-1867
Dialogue Events
- Elizabeth Shriberg, Andreas Stolcke:

Word predictability after hesitations: a corpus-based study. 1868-1871 - Li-chiung Yang:

Interruptions and intonation. 1872-1875 - Robin J. Lickley, Ellen Gurman Bard:

On not recognizing disfluencies in dialogue. 1876-1879 - Philip N. Garner, Sue Browning, Roger K. Moore, Martin J. Russell:

A theory of word frequencies and its application to dialogue move recognition. 1880-1883 - David R. Traum, Peter A. Heeman:

Utterance units and grounding in spoken dialogue. 1884-1887 - David G. Novick, Brian Hansen, Karen Ward:

Coordinating turn-taking with gaze. 1888-1891
Databases and Tools
- Peter Roach, Simon Arnfield, William J. Barry, J. Baltova, Marian Boldea, Adrian Fourcin, Wiktor Gonet, Ryszard Gubrynowicz, E. Hallum, Lori Lamel, Krzysztof Marasek, Alain Marchal, Einar Meister, Klára Vicsi:

BABEL: an Eastern European multi-language database. 1892-1893 - Ren-Hua Wang, Deyu Xia, Jinfu Ni, Bicheng Liu:

USTC95 - a Putonghua corpus. 1894-1897 - Edward Hurley, Joseph Polifroni, James R. Glass:

Telephone data collection using the World Wide Web. 1898-1901 - M. Falcone, A. Gallo:

The "SIVA" speech database for speaker verification: description and evaluation. 1902-1905 - Christoph Draxler:

A multi-level description of date expressions in German telephone speech. 1906-1909 - Robert H. Halstead Jr., Ben Serridge, Jean-Manuel Van Thong, William Goldenthal:

Viterbi search visualization using Vista: a generic performance visualization tool. 1910-1913 - Toomas Altosaar, Matti Karjalainen, Martti Vainio:

A multilingual phonetic representation and analysis system for different speech databases. 1914-1917 - Detlev Langmann, Reinhold Haeb-Umbach, Lou Boves, Els den Os:

FRESCO: the French telephone speech data collection - part of the european Speechdat(m) project. 1918-1921 - Johannes Müller, Holger Stahl, Manfred K. Lang:

Predicting the out-of-vocabulary rate and the required vocabulary size for speech processing applications. 1922-1925 - Nathalie Parlangeau, Alain Marchal:

AMULET: automatic MUltisensor speech labelling and event tracking: study of the spatio-temporal correlations in voiceless plosive production. 1926-1929 - Minsoo Hahn, Sanghun Kim, Jung-Chul Lee, Yong-Ju Lee:

Constructing multi-level speech database for spontaneous speech processing. 1930-1933 - Marian Boldea, Alin Doroga, Tiberiu Dumitrescu, Maria Pescaru:

Preliminaries to a Romanian speech database. 1934-1937 - Klaus J. Kohler:

Labelled data bank of spoken standard German: the Kiel corpus of read/spontaneous speech. 1938-1941 - I. Lee Hetherington, Michael K. McCandless:

SAPPHIRE: an extensible speech analysis and recognition tool based on tcl/tk. 1942-1945 - Jiro Kiyama, Yoshiaki Itoh, Ryuichi Oka:

Automatic detection of topic boundaries and keywords in arbitrary speech using incremental reference interval-free continuous DP. 1946-1949 - Bo-Ren Bai, Lee-Feng Chien, Lin-Shan Lee:

Very-large-vocabulary Mandarin voice message file retrieval using speech queries. 1950-1953 - Håkan Melin:

Gandalf - a Swedish telephone speaker verification database. 1954-1957 - Ellen Gurman Bard, Catherine Sotillo, Anne H. Anderson, M. M. Taylor:

The DCIEM map task corpus: spontaneous dialogue under sleep deprivation and drug treatment. 1958-1961 - Xavier Menéndez-Pidal, James B. Polikoff, Shirley M. Peters, Jennie E. Leonzio, H. Timothy Bunnell:

The nemours database of dysarthric speech. 1962-1965 - Jean Hennebert, Dijana Petrovska-Delacrétaz:

POST: parallel object-oriented speech toolkit. 1966-1969
Robust Speech Processing
- Xiaoyu Zhang, Richard J. Mammone:

Channel and noise normalization using affine transformed cepstrum. 1993-1996 - Tom Claes, Fei Xie, Dirk Van Compernolle:

Spectral estimation and normalisation for robust speech recognition. 1997-2000 - Wu Chou, Nambi Seshadri, Mazin G. Rahim:

Trellis encoded vector quantization for robust speech recognition. 2001-2004 - Brian Mak, Etienne Barnard:

Phone clustering using the Bhattacharyya distance. 2005-2008 - Atsushi Wakao, Kazuya Takeda, Fumitada Itakura:

Variability of Lombard effects under different noise conditions. 2009-2012 - Sang-Mun Chi, Yung-Hwan Oh:

Lombard effect compensation and noise suppression for noisy Lombard speech recognition. 2013-2016
Dialects and Speaking Styles
- A. W. F. Huggins, Yogen Patel:

The use of shibboleth words for automatically classifying speakers by dialect. 2017-2020 - Ikuo Kudo, Takao Nakama, Tomoko Watanabe, Reiko Kameyama:

Data collection of Japanese dialects and its influence on speech recognition. 2021-2024 - David R. Miller, James Trischitta:

Statistical dialect classification based on mean phonetic features. 2025-2027 - Knut Kvale:

Norwegian numerals: a challenge to automatic speech recognition. 2028-2031 - Celinda de la Torre, F. Javier Caminero-Gil, Jorge Alvarez-Cercadillo, Cesar Martín del Alamo, Luis A. Hernández Gómez:

Evaluation of the Telefónica I+D natural numbers recognizer over different dialects of Spanish from Spain and America. 2032-2035
Production and Perception of Prosody
- Fred Cummins, Robert F. Port:

Rhythmic constraints on English stress timing. 2036-2039 - Irene Vogel, Steve Hoskins:

On the interaction of clash, focus and phonological phrasing. 2040-2043 - Gunnar Fant, Anita Kruckenberg:

On the quantal nature of speech timing. 2044-2047 - David House:

Differential perception of tonal contours through the syllable. 2048-2051 - Martti Vainio, Toomas Altosaar:

Pitch, loudness, and segmental duration correlates: towards a model for the phonetic aspects of Finnish prosody. 2052-2055 - Nobuaki Minematsu, Seiichi Nakagawa, Keikichi Hirose:

Prosodic manipulation system of speech material for perceptual experiments. 2056-2059
Topics in ASR and Search
- Joerg P. Ueberla, I. R. Gransden:

Clustered language models with context-equivalent states. 2060-2062 - Yuji Yonezawa, Masato Akagi:

Modeling of contextual effects and its application to word spotting. 2063-2066 - Jochen Junkawitsch, L. Neubauer, Harald Höge, Günther Ruske:

A new keyword spotting algorithm with pre-calculated optimal thresholds. 2067-2070 - Roxane Lacouture, Yves Normandin:

Detection of ambiguous portions of signal corresponding to OOV words or misrecognized portions of input. 2071-2074 - Fabio Brugnara, Marcello Federico:

Techniques for approximating a trigram language model. 2075-2078 - Keizaburo Takagi, Koichi Shinoda, Hiroaki Hattori, Takao Watanabe:

Unsupervised and incremental speaker adaptation under adverse environmental conditions. 2079-2082 - Hugo Van hamme, Filip Van Aelten:

An adaptive-beam pruning technique for continuous speech recognition. 2083-2086 - Carlos Avendaño, Sarel van Vuuren, Hynek Hermansky:

Data based filter design for RASTA-like channel normalization in ASR. 2087-2090 - Stefan Ortmanns, Hermann Ney, Frank Seide, Ingo Lindam:

A comparison of time conditioned and word conditioned search techniques for large vocabulary speech recognition. 2091-2094 - Stefan Ortmanns, Hermann Ney, Andreas Eiden:

Language-model look-ahead for large vocabulary speech recognition. 2095-2098 - Jean-Luc Husson, Yves Laprie:

A new search algorithm in segmentation lattices of speech signals. 2099-2102 - Tomokazu Yamada, Shigeki Sagayama:

LR-parser-driven Viterbi search with hypotheses merging mechanism using context-dependent phone models. 2103-2106 - Jan Nouza:

Discrete-utterance recognition with a fast match based on total data reduction. 2107-2110 - F. Javier Caminero-Gil, Celinda de la Torre, Luis Villarrubia, Cesar Martín del Alamo, Lis Hernández:

On-line garbage modeling with discriminant analysis for utterance verification. 2111-2114 - Paul Placeway, John D. Lafferty:

Cheating with imperfect transcripts. 2115-2118 - Naoto Iwahashi:

Novel training method for classifiers used in speaker adaptation. 2119-2122 - Katsuki Minamino:

Large vocabulary word recognition based on a graph-structured dictionary. 2123-2126 - Bach-Hiep Tran, Frank Seide, Volker Steinbiss:

A word graph based n-best search in continuous speech recognition. 2127-2130 - David M. Goblirsch:

Viterbi beam search with layered bigrams. 2131-2134 - Eric R. Buhrke, Wu Chou, Qiru Zhou:

A wave decoder for continuous speech recognition. 2135-2138 - Eric Thelen:

Long term on-line speaker adaptation for large vocabulary dictation. 2139-2142 - Gerhard Sagerer, Heike Rautenstrauch, Gernot A. Fink, Bernd Hildebrandt, A. Jusek, Franz Kummert:

Incremental generation of word graphs. 2143-2146 - Irina Illina, Yifan Gong:

Improvement in n-best search for continuous speech recognition. 2147-2150 - Antonio Bonafonte, José B. Mariño, Albino Nogueiras:

Sethos: the UPC speech understanding system. 2151-2154 - Pietro Laface, Luciano Fissore, A. Maro, Franco Ravera:

Segmental search for continuous speech recognition. 2155-2158
Multimodal Dialogue/HCI
- Andrew P. Breen, E. Bowers, W. Welsh:

An investigation into the generation of mouth shapes for a talking head. 2159-2162 - Bertrand Le Goff, Christian Benoît:

A text-to-audiovisual-speech synthesizer for French. 2163-2166 - Yuri Iwano, Shioya Kageyama, Emi Morikawa, Shu Nakazato, Katsuhiko Shirai:

Analysis of head movements and its role in spoken dialogue. 2167-2170 - Satoru Hayamizu, Osamu Hasegawa, Katunobu Itou, Katsuhiko Sakaue, Kazuyo Tanaka, Shigeki Nagaya, Masayuki Nakazawa, T. Endoh, Fumio Togawa, Kenji Sakamoto, Kazuhiko Yamamoto:

RWC multimodal database for interactions by integration of spoken language and visual information. 2171-2174 - Christian Cavé, Isabelle Guaïtella, Roxane Bertrand, Serge Santi, Françoise Harlay, Robert Espesser:

About the relationship between eyebrow movements and F0 variations. 2175-2178 - Laurel Fais, Kyung-ho Loken-Kim, Tsuyoshi Morimoto:

How many words is a picture really worth? 2179-2182 - A. Lagana, Fabio Lavagetto, A. Storace:

Visual synthesis of source acoustic speech through Kohonen neural networks. 2183-2186 - Helena M. Saldaña, David B. Pisoni, Jennifer M. Fellowes, Robert E. Remez:

Audio-visual speech perception without speech cues. 2187-2190
Multilingual Speech Processing
- Jim Barnett, Andrés Corrada, G. Gao, Larry Gillick, Yoshiko Ito, Steve Lowe, Linda Manganaro, Barbara Peskin:

Multilingual speech recognition at dragon systems. 2191-2194 - Joachim Köhler:

Multi-lingual phoneme recognition exploiting acoustic-phonetic similarities of sounds. 2195-2198 - Atsushi Nakamura, Shoichi Matsunaga, Tohru Shimizu, Masahiro Tonomura, Yoshinori Sagisaka:

Japanese speech databases for robust speech recognition. 2199-2202 - Lori Lamel, Martine Adda-Decker, Jean-Luc Gauvain, Gilles Adda:

Spoken language processing in a multilingual context. 2203-2206 - Victor Zue, Stephanie Seneff, Joseph Polifroni, Helen M. Meng, James R. Glass:

Multilingual human-computer interactions: from information access to language learning. 2207-2210 - Ulla Ackermann, Bianca Angelini, Fabio Brugnara, Marcello Federico, Diego Giuliani, Roberto Gretter, Gianni Lazzari, Heinrich Niemann:

Speedata: multilingual spoken data entry. 2211-2214 - Hiyan Alshawi:

Head automata for speech translation. 2360-2363 - Ye-Yi Wang, John D. Lafferty, Alex Waibel:

Word clustering with parallel spoken language corpora. 2364-2367 - Jae-Woo Yang, Youngjik Lee:

Toward translating Korean speech into other languages. 2368-2370 - Thomas Bub, Johannes Schwinn:

VERBMOBIL: the evolution of a complex large speech-to-speech translation system. 2371-2374 - Alon Lavie, Alex Waibel, Lori S. Levin, Donna Gates, Marsal Gavaldà, Torsten Zeppenfeld, Puming Zhan, Oren Glickman:

Translation of conversational speech with JANUS-II. 2375-2378
Acoustics in Synthesis
- William H. Edmondson, Jon P. Iles, Dorota J. Iskra:

Pseudo-articulatory representations in speech synthesis and recognition. 2215-2218 - David R. Williams:

Synthesis of initial (/s/-) stop-liquid clusters using HLsyn. 2219-2222 - Chilin Shih:

Synthesis of trill. 2223-2226 - Wai Kit Lo, P. C. Ching:

Phone-based speech synthesis with neural network and articulatory control. 2227-2230 - P. Martland, Sandra P. Whiteside, Steve W. Beet, Ladan Baghai-Ravary:

Analysis of ten vowel sounds across gender and regional/cultural accent. 2231-2234 - Masanobu Abe:

Speech morphing by gradually changing spectrum parameter and fundamental frequency. 2235-2238
Pitch and Rate
- Edouard Geoffrois:

The multi-lag-window method for robust extended-range F0 determination. 2239-2242 - Kenneth E. Barner:

Nonlinear estimation of DEGG signals with applications to speech pitch detection. 2243-2246 - John A. Maidment, María Luisa García Lecumberri:

Pitch analysis methods for cross-speaker comparison. 2247-2249 - Steve W. Beet, Ladan Baghai-Ravary:

Continuous adaptation of linear models with impulsive excitation. 2250-2253 - Sumio Ohno, Masamichi Fukumiya, Hiroya Fujisaki:

Quantitative analysis of the local speech rate and its application to speech synthesis. 2254-2257 - Jan P. Verhasselt, Jean-Pierre Martens:

A fast and reliable rate of speech detector. 2258-2261
General ASR Posters
- Puming Zhan, Klaus Ries, Marsal Gavaldà, Donna Gates, Alon Lavie, Alex Waibel:

JANUS-II: towards spontaneous Spanish speech recognition. 2285-2288 - Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle:

Reduced semi-continuous models for large vocabulary continuous speech recognition in Dutch. 2289-2292 - Andrei Constantinescu, Olivier Bornet, Gilles Caloz, Gérard Chollet:

Validating different flexible vocabulary approaches on the Swiss French Polyphone and Polyvar databases. 2293-2296 - Néstor Becerra Yoma, Fergus R. McInnes, Mervyn A. Jack:

Use of a reliability coefficient in noise cancelling by neural net and weighted matching algorithms. 2297-2300 - Kazuhiko Ozeki:

Likelihood normalization using an ergodic HMM for continuous speech recognition. 2301-2304 - Laurence Candille, Henri Meloni:

Dynamic control of a production model. 2305-2308 - Hiroaki Hattori, Eiko Yamada:

Speech recognition using sub-word units dependent on phonetic contexts of both training and recognition vocabularies. 2309-2312 - Bruno Jacob, Christine Sénac:

Hidden Markov models merging acoustic and articulatory information to automatic speech recognition. 2313-2315 - Mats Blomberg, Kjell Elenius:

Creation of unseen triphones from diphones and monophones using a speech production approach. 2316-2319 - Bo Xu, Bing Ma, Shuwu Zhang, Fei Qu, Taiyi Huang:

Speaker-independent dictation of Chinese speech with 32k vocabulary. 2320-2323 - Jason J. Humphries, Philip C. Woodland, David J. B. Pearce:

Using accent-specific pronunciation modelling for robust speech recognition. 2324-2327 - Tilo Sloboda, Alex Waibel:

Dictionary learning for spontaneous speech recognition. 2328-2331 - Johan de Veth, Lou Boves:

Comparison of channel normalisation techniques for automatic speech recognition over the phone. 2332-2335 - Manuel A. Leandro, José Manuel Pardo:

Anchor point detection for continuous speech recognition in Spanish: the spotting of phonetic events. 2336-2339 - Bhiksha Raj, Evandro Bacci Gouvêa, Pedro J. Moreno, Richard M. Stern:

Cepstral compensation by polynomial approximation for environment-independent speech recognition. 2340-2343 - B. T. Lilly, Kuldip K. Paliwal:

Effect of speech coders on speech recognition performance. 2344-2347 - Léonard Janer, Josep Martí, Climent Nadeu, Eduardo Lleida-Solano:

Wavelet transforms for non-uniform speech recognition systems. 2348-2351 - Tsuyoshi Usagawa, Markus Bodden, Klaus Rateitschek:

A binaural model as a front-end for isolated word recognition. 2352-2355 - Hiroshi G. Okuno, Tomohiro Nakatani, Takeshi Kawabata:

A new speech enhancement: speech stream segregation. 2356-2359
Data-based Synthesis
- Andrew Slater, John Coleman:

Non-segmental analysis and synthesis based on a speech database. 2379-2382 - Ralf Benzmüller, William J. Barry:

Microsegment synthesis - economic principles in a low-cost solution. 2383-2386 - Xuedong Huang, Alex Acero, J. Adcock, Hsiao-Wuen Hon, John Goldsmith, Jingsong Liu, Mike Plumpe:

Whistler: a trainable text-to-speech system. 2387-2390 - Thomas Portele, Karlheinz Stöber, Horst Meyer, Wolfgang Hess:

Generation of multiple synthesis inventories by a bootstrapping procedure. 2391-2394 - Bernd Möbius, Jan P. H. van Santen:

Modeling segmental duration in German text-to-speech synthesis. 2395-2398 - Nick Campbell:

Autolabelling Japanese ToBI. 2399-2402
Speaker Identification and Verification
- Sarangarajan Parthasarathy, Aaron E. Rosenberg:

General phrase speaker verification using sub-word background models and likelihood-ratio scoring. 2403-2406 - Jin'ichi Murakami, Masahide Sugiyama, Hideyuki Watanabe:

Unknown-multiple signal source clustering problem using ergodic HMM and applied to speaker classification. 2407-2410 - Jean-Luc Le Floch, Claude Montacié, Marie-José Caraty:

GMM and ARVM cooperation and competition for text-independent speaker recognition on telephone speech. 2411-2414 - Qiguang Lin, Ea-Ee Jan, ChiWei Che, Dong-Suk Yuk, James L. Flanagan:

Selective use of the speech spectrum and a VQGMM method for speaker identification. 2415-2418 - Michael Newman, Larry Gillick, Yoshiko Ito, Don McAllaster, Barbara Peskin:

Speaker verification through large vocabulary continuous speech recognition. 2419-2422 - Andrea Paoloni, Susanna Ragazzini, Giacomo Ravaioli:

Predictive neural networks in text independent speaker verification: an evaluation on the SIVA database. 2423-2426
Acoustic Phonetics
- Nisheeth Shrotriya, Rajesh Verma, Sunil K. Gupta, S. S. Agrawal:

Durational characteristics of Hindi consonant clusters. 2427-2430 - Beng T. Tan, Minyue Fu, Andrew Spray, Phillip Dermody:

The use of wavelet transforms in phoneme recognition. 2431-2434 - Hisao Kuwabara:

Acoustic properties of phonemes in continuous speech for different speaking rates. 2435-2438 - Hiroya Fujisaki, Sumio Ohno:

Prosodic parameterization of spoken Japanese based on a model of the generation process of F0 contours. 2439-2442 - Arman Maghbouleh:

A logistic regression model for detecting prominences. 2443-2445 - Beat Pfister:

High-quality prosodic modification of speech signals. 2446-2449
Perception of Vowels and Consonants
- Jialu Zhang:

On the syllable structures of Chinese relating to speech recognition. 2450-2453 - Takashi Otake, Kiyoko Yoneyama:

Can a moraic nasal occur word-initially in Japanese? 2454-2457 - Winifred Strange, Reiko Akahane-Yamada, B. H. Fitzgerald, Rieko Kubo:

Perceptual assimilation of American English vowels by Japanese listeners. 2458-2461 - Winifred Strange, Ocke-Schwen Bohn, S. A. Trent, M. C. McNair, K. C. Bielec:

Context and speaker effects in the perceptual assimilation of German vowels by American listeners. 2462-2465 - Mohamed Zahid:

Examination of a perceptual non-native speech contrast: pharyngealized/non-pharyngealized discrimination by French-speaking adults. 2466-2469 - Roel Smits:

Context-dependent relevance of burst and transitions for perceived place in stops: it's in production, not perception. 2470-2473 - Ryoji Baba, Kaori Omuro, Hiromitsu Miyazono, Tsuyoshi Usagawa, Masahiko Higuchi:

The perception of morae in long vowels: comparison among Japanese, Korean and English speakers. 2474-2477 - Robin J. Lickley:

Juncture cues to disfluency. 2478-2481 - James R. Sawusch:

Effects of duration and formant movement on vowel perception. 2482-2485 - Neeraj Deshmukh, Richard Duncan, Aravind Ganapathiraju, Joseph Picone:

Benchmarking human performance for continuous speech recognition. 2486-2489 - Takayuki Arai, Misha Pavel, Hynek Hermansky, Carlos Avendaño:

Intelligibility of speech with filtered time trajectories of spectral envelopes. 2490-2493 - Douglas H. Whalen, Sonya M. Sheffert:

Perceptual use of vowel and speaker information in breath sounds. 2494-2497 - Philippe Mousty, Monique Radeau, Ronald Peereman, Paul Bertelson:

The role of neighborhood relative frequency in spoken word recognition. 2498-2501 - James M. McQueen, Mark A. Pitt:

Transitional probability and phoneme monitoring. 2502-2505 - Anne Bonneau:

Identification of vowel features from French stop bursts. 2506-2509 - Zinny S. Bond, Thomas J. Moore, Beverley Gable:

Listening in a second language. 2510-2513 - Denis Burnham, Elizabeth Francis, Di Webster, Sudaporn Luksaneeyanawin, Chayada Attapaiboon, Francisco Lacerda, Peter Keller:

Perception of lexical tone across languages: evidence for a linguistic mode of processing. 2514-2517 - James S. Magnuson, Reiko Akahane-Yamada:

Acoustic correlates to the effects of talker variability on the perception of English /r/ and /l/ by Japanese listeners. 2518-2521
