ICASSP 1998: Seattle, Washington, USA
- Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98, Seattle, Washington, USA, May 12-15, 1998. IEEE 1998, ISBN 0-7803-4428-6
Volume 1
Features for Automatic Speech Recognition I
- Philip N. Garner, Wendy J. Holmes:
On the robust incorporation of formant features into hidden Markov models for automatic speech recognition. 1-4
- Christoph Neukirchen, Daniel Willett, Stefan Eickeler, Stefan Müller:
Exploiting acoustic feature correlations by joint neural vector quantizer design in a discrete HMM system. 5-8
- Gerhard Rigoll, Daniel Willett:
A NN/HMM hybrid for continuous speech recognition with a discriminant nonlinear feature extraction. 9-12
- Partha Niyogi, Padma Ramesh:
Incorporating voice onset time to improve letter recognition accuracies. 13-16
- Rathinavelu Chengalvarayan:
On the use of normalized LPC error towards better large vocabulary speech recognition systems. 17-20
- David L. Thomson, Rathinavelu Chengalvarayan:
Use of periodicity and jitter as speech recognition features. 21-24
- Keikichi Hirose, Koji Iwano:
Accent type recognition and syntactic boundary detection of Japanese using statistical modeling of moraic transitions of fundamental frequency contours. 25-28
- Tsuneo Nitta:
A novel feature-extraction for speech recognition based on multiple acoustic-feature planes. 29-32
Spectral Quantization
- Tadashi Yonezaki, Koji Yoshida, Toshio Yagi:
An error correction approach based on the MAP algorithm combined with hidden Markov models. 33-36
- Thomas Eriksson, Hong-Goo Kang, Yannis Stylianou:
Quantization of the spectral envelope for sinusoidal coders. 37-40
- Srinivas Nandkumar, Kumar Swaminathan, Udaya Bhaskar:
Robust speech mode based LSF vector quantization for low bit rate coders. 41-44
- Hai Le Vu, László Lois:
A new general distance measure for quantization of LSF and their transformed coefficients. 45-48
- Pasi Ojala, Ari Lakaniemi:
Variable model order LPC quantization. 49-52
- Damith J. Mudugamuwa, Alan B. Bradley:
Optimal transform for segmented parametric speech coding. 53-56
- Sridha Sridharan, John Leis:
Two novel lossless algorithms to exploit index redundancy in VQ speech compression. 57-60
- Costas S. Xydeas, Thomas M. Chapman:
Multicodebook vector quantization of LPC parameters. 61-64
- Wenhui Jin, Wai-Yip Chan:
Personal speech coding. 65-68
- Lin Yu Tseng, Shiueng Bien Yang:
A genetic approach to the design of general-tree-structured vector quantizers for speech coding. 69-72
Speaker Adaptation and Normalization in Adverse Environments
- Mohamed Afify, Jean-Paul Haton:
Minimum cross-entropy adaptation of hidden Markov models. 73-76
- Hui Jiang, Keikichi Hirose, Qiang Huo:
Improving Viterbi Bayesian predictive classification via sequential Bayesian learning in robust speech recognition. 77-80
- Jiqing Han, Munsung Han, Gyu-Bong Park, Jeongue Park, Wen Gao, Doosung Hwang:
Discriminative learning of additive noise and channel distortions for robust speech recognition. 81-84
- Kari Laurila, Marcel Vasilache, Olli Viikki:
A combination of discriminative and maximum likelihood techniques for noise robust speech recognition. 85-88
- Lionel Delphin-Poulat, Chafic Mokbel, Jérôme Idier:
Frame-synchronous stochastic matching based on the Kullback-Leibler information. 89-92
- Yasuo Ariki, Miharu Sakuragi:
Unsupervised speaker normalization using canonical correlation analysis. 93-96
- Jun Ishii, Toshiaki Fukuda:
Speaker independent acoustic modeling using speaker normalization. 97-100
- Theodoros Salonidis, Vassilios Digalakis:
Robust speech recognition for multiple topological scenarios of the GSM mobile phone system. 101-104
Speaker Recognition
- Aaron E. Rosenberg, Olivier Siohan, Sarangarajan Parthasarathy:
Speaker verification using minimum verification error training. 105-108
- Olivier Siohan, Aaron E. Rosenberg, Sarangarajan Parthasarathy:
Speaker identification using minimum classification error training. 109-112
- William Mistretta, Kevin R. Farrell:
Model adaptation methods for speaker verification. 113-116
- Tomoko Matsui, Kiyoaki Aikawa:
Robust model for speaker verification against session-dependent utterance variation. 117-120
- Andrzej Drygajlo, Mounir El-Maliki:
Speaker verification in noisy environments with combined spectral subtraction and missing feature theory. 121-124
- Jean-Benoît Pierrot, Johan Lindberg, Johan Koolwaaij, Hans-Peter Hutter, Dominique Genoud, Mats Blomberg, Frédéric Bimbot:
A comparison of a priori threshold setting procedures for speaker verification in the CAVE project. 125-128
- Dominique Genoud, Miguel Moreira, Eddy Mayoraz:
Text dependent speaker verification using binary classifiers. 129-132
- Qi Li, Biing-Hwang Juang:
Speaker verification using verbal information verification for automatic enrolment. 133-136
CELP Coding
- Hironori Ito, Masahiro Serizawa, Kazunori Ozawa, Toshiyuki Nomura:
An adaptive multi-rate speech codec based on MP-CELP coding algorithm for ETSI AMR standard. 137-140
- Janne Vainio, Hannu Mikkola, Kari Järvinen, Petri Haavisto:
GSM EFR based multi-rate codec family. 141-144
- Roar Hagen, Erik Ekudden, Björn Johansson, W. Bastiaan Kleijn:
Removal of sparse-excitation artifacts in CELP. 145-148
- Hong Kook Kim:
Adaptive encoding of fixed codebook in CELP coders. 149-152
- Kazunori Ozawa, Masahiro Serizawa:
High quality multi-pulse based CELP speech coding at 6.4 kb/s and its subjective evaluation. 153-156
- Jürgen Schnitzler:
A 13.0 kbit/s wideband speech codec based on SB-ACELP. 157-160
- Kazuhito Koishida, Gou Hirabayashi, Keiichi Tokuda, Takao Kobayashi:
A wideband CELP speech coder at 16 kbit/s based on mel-generalized cepstral analysis. 161-164
- Anil Ubale, Allen Gersho:
A low-delay wideband speech coder at 24-kbps. 165-168
Language Modeling and Understanding
- Kae-Cherng Yang, Tai-Hsuan Ho, Lee-Feng Chien, Lin-Shan Lee:
Statistics-based segment pattern lexicon - a new direction for Chinese language modeling. 169-172
- Shuanhu Bai, Haizhou Li, Zhiwei Lin, Baosheng Yuan:
Building class-based language models with contextual statistics. 173-176
- Thomas Niesler, Edward W. D. Whittaker, Philip C. Woodland:
Comparison of part-of-speech and automatically derived category-based language models for speech recognition. 177-180
- Atsunori Ogawa, Kazuya Takeda, Fumitada Itakura:
Balancing acoustic and linguistic probabilities. 181-184
- Andreas Kellner:
Initial language models for spoken dialogue systems. 185-188
- Kishore Papineni, Salim Roukos, Todd Ward:
Maximum likelihood and discriminative training of direct translation models. 189-192
- Hsien-Chang Wang, Jhing-Fa Wang:
A telephone number inquiry system with dialog structure. 193-196
- Alexandros Potamianos, Shrikanth S. Narayanan:
Spoken dialog systems for children. 197-200
- Esther Levin, Roberto Pieraccini, Wieland Eckert:
Using Markov decision process for learning dialogue strategies. 201-204
- Nam-Yong Han, Un-Cheon Choi, Youngjik Lee:
An implementation of a partial parser in the spoken language translator. 205-208
Utterance Verification and Word Spotting
- Jon G. Vaver:
Experiments in confidence scoring using Spanish CallHome data. 209-212
- Myoung-Wan Koo, Chin-Hui Lee, Biing-Hwang Juang:
A new decoder based on a generalized confidence score. 213-216
- Takatoshi Jitsuhiro, Satoshi Takahashi, Kiyoaki Aikawa:
Rejection of out-of-vocabulary words using phoneme confidence likelihood. 217-220
- Jochen Junkawitsch, Harald Höge:
Keyword verification considering the correlation of succeeding feature vectors. 221-224
- Frank Wessel, Klaus Macherey, Ralf Schlüter:
Using word probabilities as confidence measures. 225-228
- Rafid A. Sukkar:
Subword-based minimum verification error (SB-MVE) training for task independent utterance verification. 229-232
- Satya Dharanipragada, Salim Roukos:
A fast vocabulary independent algorithm for spotting words in speech. 233-236
- Richard C. Rose, H. Yao, Giuseppe Riccardi, Jerry H. Wright:
Integration of utterance verification with statistical language modeling and spoken language understanding. 237-240
Techniques for Adverse Acoustic Environments
- Eduardo Lleida, Julián Fernández, Enrique Masgrau:
Robust continuous speech recognition system based on a microphone array. 241-244
- Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano:
Hands-free speech recognition based on 3-D Viterbi search using a microphone array. 245-248
- Tadd B. Hughes, Hong-Seok Kim, Joseph H. DiBiase, Harvey F. Silverman:
Using a real-time, tracking microphone array as input to an HMM speech recognizer. 249-252
- Franck Giron, Yasuhiro Minami, Masashi Tanaka, Ken'ichi Furuya:
Compensation of speaker directivity in speech recognition using HMM composition. 253-256
- Alexander Fischer, Volker Stahl:
Subword unit based speech recognition in car environments. 257-260
- Lamia Karray, Abdellatif Ben Jelloun, Chafic Mokbel:
Solutions for robust recognition over the GSM cellular network. 261-264
- Zhong-Hua Wang, Patrick Kenny:
Speech recognition in non-stationary adverse environments. 265-268
- Makoto Shozakai, Satoshi Nakamura, Kiyohiro Shikano:
Robust speech recognition in car environments. 269-272
Speech Synthesis and Voice Conversion
- Ann K. Syrdal, Yannis Stylianou, Laurie Garrison, Alistair Conkie, Jürgen Schröter:
TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis. 273-276
- Chu Min, P. C. Ching:
A hybrid approach to synthesize high quality Cantonese speech. 277-280
- Yannis Stylianou, Olivier Cappé:
A system for voice conversion based on probabilistic classification and a harmonic plus noise model. 281-284
- Alexander Kain, Michael W. Macon:
Spectral voice conversion for text-to-speech synthesis. 285-288
- Levent M. Arslan, David Talkin:
Speaker transformation using sentence HMM based alignments and detailed prosody modification. 289-292
- Hsiao-Wuen Hon, Alex Acero, Xuedong Huang, Jingsong Liu, Mike Plumpe:
Automatic generation of synthesis units for trainable text-to-speech systems. 293-296
- Ralf Haury, Martin Holzapfel:
Optimization of a neural network for speaker and task dependent F0-generation. 297-300
- E. Bryan George:
Practical high-quality speech and voice synthesis using fixed frame rate ABS/OLA sinusoidal modeling. 301-304
Lexical Modeling and Topic Spotting
- Laura Mayfield, Klaus Ries:
An automatic method for learning a Japanese lexicon for recognition of spontaneous speech. 305-308
- Bhuvana Ramabhadran, Lalit R. Bahl, Peter DeSouza, Mukund Padmanabhan:
Acoustics-only based automatic phonetic baseform generation. 309-312
- William Byrne, Michael Finke, Sanjeev Khudanpur, John W. McDonough, Harriet J. Nock, Michael Riley, Murat Saraçlar, Charles Wooters, George Zavaliagkos:
Pronunciation modelling using a hand-labelled corpus for conversational speech recognition. 313-316
- Jason J. Humphries, Philip C. Woodland:
The use of accent-specific pronunciation dictionaries in acoustic model training. 317-320
- Rachida El Méliani, Douglas D. O'Shaughnessy:
Specific language modelling for new-word detection in continuous-speech recognition. 321-324
- Kenney Ng, Victor W. Zue:
Phonetic recognition for spoken document retrieval. 325-328
- Katsutoshi Ohtsuki, T. Matsutoka, Shoichi Matsunaga, Sadaoki Furui:
Topic extraction with multiple topic-words in broadcast-news speech. 329-332
- Jon Yamron, Ira Carp, Larry Gillick, Steve Lowe, Paul van Mulbregt:
A hidden Markov model approach to text segmentation and event tracking. 333-336
Topics in Speech Coding I
- Sean A. Ramprashad:
A two stage hybrid embedded speech/audio coding structure. 337-340
- Toshiyuki Nomura, Masahiro Iwadare, Masahiro Serizawa, Kazunori Ozawa:
A bitrate and bandwidth scalable CELP coder. 341-344
- Marcos Faúndez-Zanuy, Francesc Vallverdú, Enrique Monte:
Nonlinear prediction with neural nets in ADPCM. 345-348
- Michele Covell, Margaret Withgott, Malcolm Slaney:
MACH1: nonuniform time-scale modification of speech. 349-352
- Philippe Lemmerling, Ioannis Dologlou, Sabine Van Huffel:
Speech compression based on exact modeling and structured total least norm optimization. 353-356
- David F. Marston:
Gender adapted speech coding. 357-360
- Mikael Skoglund, Jan Skoglund:
On nonlinear utilization of intervector dependency in vector quantization. 361-364
- Jongseo Sohn, Wonyong Sung:
A voice activity detector employing soft decision based noise spectrum adaptation. 365-368
- Manohar N. Murthi, Bhaskar D. Rao:
Towards a synergistic multistage speech coder. 369-372
- Tim Fingscheidt, Peter Vary, Jesús A. Andonegui:
Robust speech decoding: can error concealment be better than error correction? 373-376
Speech Enhancement I
- Jun Huang, Yunxin Zhao:
An energy-constrained signal subspace method for speech enhancement and recognition in colored noise. 377-380
- Eric A. Wan, Alex T. Nelson:
Removal of noise from speech using the dual EKF algorithm. 381-384
- Djamila Mahmoudi, Andrzej Drygajlo:
Combined Wiener and coherence filtering in wavelet domain for microphone array speech enhancement. 385-388
- Gaafar M. K. Saleh, Mahesan Niranjan:
Speech enhancement in a Bayesian framework. 389-392
- David A. Heide, George S. Kang:
Speech enhancement for bandlimited speech. 393-396
- Stefan Gustafsson, Peter Jax, Peter Vary:
A novel psychoacoustically motivated audio enhancement algorithm preserving background noise characteristics. 397-400
- Zenton Goh, Kah-Chye Tan, B. T. G. Tan:
Speech enhancement based on a voiced-unvoiced speech model. 401-404
- B. Yegnanarayana, P. Satyanarayana Murthy, Carlos Avendaño, Hynek Hermansky:
Enhancement of reverberant speech using LP residual. 405-408
Acoustic Modeling - Miscellaneous Topics
- Ji Ming, Francis Jack Smith:
Improved phone recognition using Bayesian triphone models. 409-412
- Cristobal Corredor-Ardoy, Lori Lamel, Martine Adda-Decker, Jean-Luc Gauvain:
Multilingual phone recognition of spontaneous telephone speech. 413-416
- Joachim Köhler:
Language adaptation of multilingual phone models for vocabulary independent speech recognition tasks. 417-420
- Jonathan Hamaker, Aravind Ganapathiraju, Joseph Picone, John J. Godfrey:
Advances in alphadigit recognition using syllables. 421-424
- Vaibhava Goel, William Byrne, Sanjeev Khudanpur:
LVCSR rescoring with modified loss functions: a decision theoretic perspective. 425-428
- Udo Bub, Harald Höge:
Boosting long-term adaptation of hidden-Markov-models: incremental splitting of probability density functions. 429-432
- Subrata K. Das, Don Nix, Michael Picheny:
Improvements in children's speech recognition performance. 433-436
- Toshiaki Fukada, Yoshinori Sagisaka:
Speaker normalized acoustic modeling based on 3-D Viterbi decoding. 437-440
- Karl E. Nelson, Michael A. Soderstrand:
Adaptive heterodyne filters (AHF) for detection and attenuation of narrow band signals. 441-444
- Bernhard Sick:
Online tool wear monitoring in turning using time-delay neural networks. 445-448
Discriminative Training I
- Cristina Chesta, Aldo Girardi, Pietro Laface, Mario Nigra:
Discriminative training of hidden Markov models using a classification measure criterion. 449-452
- Lalit R. Bahl, Mukund Padmanabhan:
A discriminant measure for model complexity adaptation. 453-456
- Malan B. Gandhi, John Jacob:
Natural number recognition using MCE trained inter-word context dependent acoustic models. 457-460
- Ajit V. Rao, Kenneth Rose, Allen Gersho:
Deterministically annealed design of speech recognizers and its performance on isolated letters. 461-464
- Jörg Rottland, Christoph Neukirchen, Gerhard Rigoll:
Speaker adaptation for hybrid MMI/connectionist speech-recognition systems. 465-468
- Jeff A. Bilmes:
Maximum mutual information based reduction strategies for cross-correlation based joint distributional modeling. 469-472
- Diego Giuliani, Marco Matassoni, Maurizio Omologo, Piergiorgio Svaizer:
Experiments of HMM adaptation for hands-free connected digit recognition. 473-476
- Albino Nogueiras Rodríguez, José B. Mariño:
Task independent minimum confusability training for continuous speech recognition. 477-480
Discriminative Training II
- Peter Beyerlein:
Discriminative model combination. 481-484
- Shawn M. Herman, Rafid A. Sukkar:
Joint MCE estimation of VQ and HMM parameters for Gaussian mixture selection. 485-488